fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-28 18:38:21 +02:00

Author	SHA1	Message	Date
Timur Kristóf	28eb481bc2	nouveau/nvc0: add extern keyword to nvc0_miptree_vtbl. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2020-01-21 17:36:36 +01:00
Icecream95	31bd3b5279	panfrost: Add ASTC texture formats Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>	2020-01-21 08:35:23 -05:00
Icecream95	960fe9daea	panfrost: Add ETC1/ETC2 texture formats Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>	2020-01-21 08:35:23 -05:00
Alyssa Rosenzweig	2091d311c9	panfrost: Rework linear<--->tiled conversions There's a lot going on here (it's a ton of commits squashed together since otherwise this would be impossible to review...) 1. We have a fast path for linear->tiled for whole (aligned) tiles, but we have to use a slow path for unaligned accesses. We can get a pretty major win for partial updates by using this slow path simply on the borders of the update region, and then hit the fast path for the tile-aligned interior. This does require some shuffling. 2. Mark the LUTs constant, which allows the compiler to inline them, which pairs well with loop unrolling (eliminating the memory accesses and just becoming some immediates.. which are not as immediate on aarch64 as I'd like..) 3. Add fast path for bpp1/2/8/16. These use the same algorithm and we have native types for them, so may as well get the fast path. 4. Drop generic path for bpp != 1/2/8/16, since these formats are generally awful and there's no way to tile them efficiently and honestly there's not a good reason to either. Lima doesn't support any of these formats; Panfrost can make the opinionated choice to make them linear. 5. Specialize the unaligned routines. They don't have to be fully generic, they just can't assume alignment. So now they should be nearly as fast as the aligned versions (which get some extra tricks to be even faster but the difference might be negligible on some workloads). 6. Specialize also for the size of the tile, to allow 4x4 tiling as well as 16x16 tiling. This allows compressed textures to be efficiently tiled with the same routines (so we add support for tiling ASTC/ETC textures while we're at it) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>	2020-01-21 08:35:19 -05:00
Alyssa Rosenzweig	f2d876b2b2	panfrost,lima: De-Galliumize tiling routines There's an implicit dependence on Gallium here that will add more complexity than needed when testing/optimizing out of driver as well as potentially Vulkanizing. We don't need a full pipe_box, just the x/y/w/h properties directly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>	2020-01-21 08:35:16 -05:00
Jan Zielinski	a24b3b228a	gallium/gallivm: enable linking lp_bld_printf function with C++ code To enable linking functions declared in lp_bld_printf.h file with C++, we need to add appropriate macros to the header. Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3470> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3470>	2020-01-21 11:00:18 +00:00
Danylo Piliaiev	3f9a6011a6	iris: Fix value of out-of-bounds accesses for vertex attributes Having VERTEX_BUFFER_STATE.BufferSize greater than the size of a bound vertex buffer allows shader to read uninitialized vertex attributes from BO, instead of allowing hardware to return zeroes on out-of-bounds access. OpenGL spec "6.4 Effects of Accessing Outside Buffer Bounds" says: "Robust buffer access can be enabled by creating a context with robust access enabled through the window system binding APIs. When enabled, any command unable to generate a GL error as described above, such as buffer object accesses from the active program, will not read or modify memory outside of the data store of the buffer object and will not result in GL interruption or termination. Out-of-bounds reads may return values from within the buffer object or zero values." Fixes three webgl tests: conformance/rendering/out-of-bounds-array-buffers.html conformance2/rendering/out-of-bounds-index-buffers-after-copying.html conformance2/rendering/element-index-uint.html See #1996 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3427> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3427>	2020-01-21 09:52:40 +00:00
Jan Vesely	87e1f8eca5	clover: Initialize Asm Parsers Fixes piglits that use AMDGCN inline assembly: program@execute@calls program@execute@amdgcn-mubuf-negative-vaddr CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2020-01-21 01:39:08 +00:00
Marek Olšák	735a3ba007	radeonsi/gfx10: enable GS fast launch for triangles and strips with NGG culling Only non-indexed triangle lists and strips are supported. This increases performance if there is something to cull. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	c377f45c18	radeonsi/gfx10: rewrite late alloc computation - Use conservative late alloc when the number of CUs <= 6. - Move the late alloc GS register to the GS shader state, so that it can be tuned for NGG culling. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	4e4b2d13f0	ac: add helper ac_build_triangle_strip_indices_to_triangle Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	8db00a51f8	radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	aa2d846604	radeonsi/gfx10: move GE_PC_ALLOC setting to shader states The value is not changed. I just use a different way to compute it. The value will vary with NGG culling. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	41fef6fc09	radeonsi/gfx10: don't initialize VGPRs not used by NGG passthrough v2: TES doesn't use the GS PrimitiveID Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	943d131e7d	radeonsi/gfx10: merge main and pos/param export IF blocks into one if possible Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	a966729c84	radeonsi/gfx10: export primitives at the beginning of VS/TES This decreases VGPR usage and will allow us to merge some IF blocks in shaders. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	5a0fcf11f0	radeonsi/gfx10: move s_sendmsg gs_alloc_req to the beginning of shaders This will allow us to merge some IF blocks in shaders. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	cf9f8d1ea2	radeonsi/gfx10: correct VS PrimitiveID implementation for NGG We didn't use the correct LDS pointer, though it probably doesn't matter, because I think that nothing else is using LDS here. This commit makes it consistent with all other esgs_ring use. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	b2326a7549	radeonsi/gfx10: update comments and remove invalid TODOs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	679b6244e1	radeonsi: turn an assertion into return in si_nir_store_output_tcs Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 15:40:13 -05:00
Marek Olšák	27cc7703d3	radeonsi: fix doubles and int64 Fixes: `57bd73e229` - radeonsi: remove llvm_type_is_64bit Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 15:40:10 -05:00
Marek Olšák	df34fa14bb	radeonsi: don't invoke decompression inside internal launch_grid Decompress resources properly but don't do it inside launch_grid to prevent recursion. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: 19.3 <mesa-stable@lists.freedesktop.org>	2020-01-20 15:40:08 -05:00
Marek Olšák	58c929be0d	radeonsi: clean up how internal compute dispatches are handled Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: 19.3 <mesa-stable@lists.freedesktop.org>	2020-01-20 15:40:07 -05:00
Marek Olšák	d69483270e	Revert "radeonsi: unbind image before compute clear" This reverts commit `3a527eda7c`. It's incorrect. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 15:40:05 -05:00
Daniel Stone	cf5fccb0d9	Revert "gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES" This reverts commit `bec9c90b5e`. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>	2020-01-20 12:33:29 +00:00
Daniel Stone	32d45733ae	Revert "st/dri: do FLUSH_VERTICES before calling flush_resource" This reverts commit `3ba16d36c9`. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>	2020-01-20 12:33:22 +00:00
Icecream95	d8a3501f1b	panfrost: Dynamically allocate shader variants This fixes a crash in LZDoom where over 16 shader variants are needed for a few shaders in some maps, and should also save a few kilobytes of RAM as most of the time only one or two variants of the 8 previously allocated are actually needed. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-18 11:47:34 -05:00
Alyssa Rosenzweig	bef716b56c	panfrost: Expose some functionality with dEQP flag These features are stable enough that they don't need to be hidden. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3464> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3464>	2020-01-18 14:57:52 +00:00
Icecream95	5e8386c606	panfrost: Compact the bo_access readers array Previously, the array bo_access->readers was only cleared when there were no unsignaled fences, which in some situations never happened. That resulted in the array having thousands of NULL pointers, but only a handful of active readers. With this patch, all the unsignaled readers are moved to the front of the array, effectively building a new array only containing the active readers in-place. This results in the readers array usually only having a couple of elements. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3419> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3419>	2020-01-18 13:58:43 +00:00
Erik Faye-Lund	c0ba9000d2	zink: support arrays of samplers Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	a9023ec566	zink: support sampling non-float textures Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	3e1acff560	zink: store image-type per texture Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	5fc1562a72	zink: avoid incorrect vector-construction Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	8112240d29	zink: support offset-variants of texturing Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	f1a5bcdc16	zink: implement nir_texop_txs Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Rob Clark	95187083c4	freedreno/a6xx: add PROG_FB_RAST stateobj For the handful of registers that depend on the union of program/ framebuffer/rasterizer state. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Rob Clark	6dc9b292d0	freedreno/a6xx: move dynamic program state to streaming stateobj Move the program state which we can't pre-bake to a streaming state object, rather than emitting directly in the draw cmdstream. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Rob Clark	d2fd6469c3	freedreno/a6xx: drop a few more per-draw registers Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Rob Clark	4d8f42c851	freedreno/a6xx: separate rast stateobj for prim restart This lets us move PC_PRIMITIVE_CNTL into the rasterizer stateobj, rather than unconditionally emitting it directly in the cmdstream on every draw. This also starts adding some tracking about previous draw state, so that following patches can limit some of the register writes we currently emit on every draw. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Rob Clark	0e063b3079	freedreno/a6xx: cleanup rasterizer state All but one of the reg values is only used in the stateobj, so we can inline the register value setup and stateobj construction. While we are at it, switch over to the new register builders. Prep work for next patch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Rob Clark	fba7e6f896	freedreno/a6xx: limit scratch/debug markers to debug builds The overhead does seem to matter when you have a high enough # of draw calls that affect few bins/pixels, because these writes would happen unconditionally (i.e. not part of a state-group). Possibly we could keep these if we moved them into a state-group so the register writes would be no-ops on bins with no geometry. OTOH I usually end up adding in a WFI when using the scratch reg values to track down a crash. (So add a WFI to mitigate the annoyance of needing to use a debug build to get scratch regs to locate the position of a crash/hang in the cmdstream.) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Jordan Justen	5d7381c645	iris: Fix some indentation in iris_init_render_context Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2020-01-17 15:28:07 -08:00
Marek Olšák	3ba16d36c9	st/dri: do FLUSH_VERTICES before calling flush_resource	2020-01-17 15:04:35 -05:00
Marek Olšák	bec9c90b5e	gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES	2020-01-17 15:04:35 -05:00
Krzysztof Raszkowski	ad820d5aca	gallium/swr: Disable showing detected arch message. When the swr driver is in use, it prints a detected-architecture message to std::err. This can be harmful when swr is used in multi-node environments. The message can be enabled by setting the env var SWR_PRINT_INFO to 1. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2020-01-17 16:41:53 +00:00
Pierre-Eric Pelloux-Prayer	5b1c4e1b75	util: call bind_sampler_states before setting sampler_views Fixes the following valgrind error: Invalid read of size 16 at 0x28F458A1: si_set_sampler_view_desc (in radeonsi_drv_video.so) by 0x28F4657E: si_set_sampler_views (in radeonsi_drv_video.so) by 0x28D62BF5: util_compute_blit (in radeonsi_drv_video.so) by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so) by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so) by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0) Address 0x18142a10 is 0 bytes inside a block of size 48 free'd at 0x48369AB: free (vg_replace_malloc.c:540) by 0x28D62D51: util_compute_blit (in radeonsi_drv_video.so) by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so) by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so) by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0) Block was alloc'd at at 0x4837B65: calloc (vg_replace_malloc.c:762) by 0x28EFB2EC: si_create_sampler_state (in radeonsi_drv_video.so) by 0x28D62C30: util_compute_blit (in radeonsi_drv_video.so) by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so) by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so) by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0) Fixes: `69430d7e59` ("va: use a compute shader for the blit") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2321 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428>	2020-01-17 10:14:57 +01:00
Andreas Baierl	2ebfc6db16	lima: Fix alpha blending Introduce separate helper functions to set the blendfactor bits. Lima uses bits 0-2 for the type, bit 3 sets the inverted function and bit 4 is set if alpha is used. alpha_src_factor and alpha_dst_factor don't need the alpha bit, so they are masked with 0xf. There is only place for 4 bits anyway. If alpha_src_factor is PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE, we need to change it to PIPE_BLENDFACTOR_ONE first. This is exactly what the blob does and we pass all dEQP-GLES2.functional.fragment_ops.blend.* tests now. Better than the blob btw... Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>	2020-01-16 16:43:41 +00:00
Tapani Pälli	3cec148455	iris: set depth stall enabled when depth flush enabled on gen12 This implements HW workaround #1409600907 for iris driver. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>	2020-01-16 14:05:54 +02:00
Lionel Landwerlin	9eca823cce	iris: implement another workaround for non pipelined states v2: add comment (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>	2020-01-16 11:51:22 +02:00
Lionel Landwerlin	e6e5cbac04	iris: handle new PIPE_CONTROL field Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>	2020-01-16 11:48:11 +02:00
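
The in-place compaction described in commit `5e8386c606` (moving all unsignaled readers to the front of the array) can be illustrated with a minimal sketch. This is not the Panfrost code: `struct example_fence`, its `signaled` flag, and `compact_readers` are hypothetical names chosen for illustration, and the real driver queries fence state differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence; not the Panfrost type. */
struct example_fence {
    bool signaled;
};

/* Move every non-NULL, unsignaled reader to the front of the array,
 * preserving relative order, and return the new element count.
 * Slots past the returned count are reset to NULL. */
static size_t
compact_readers(struct example_fence **readers, size_t count)
{
    assert(readers != NULL || count == 0);

    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        struct example_fence *f = readers[i];

        /* Skip holes and readers whose fence has already signaled. */
        if (f == NULL || f->signaled)
            continue;

        /* kept <= i always holds, so this never overwrites an
         * entry that has not been visited yet. */
        readers[kept++] = f;
    }

    /* Clear the tail so no stale pointers remain. */
    for (size_t i = kept; i < count; i++)
        readers[i] = NULL;

    return kept;
}
```

A caller would store the returned count as the new array length, so an array that had accumulated thousands of NULL slots shrinks back to just the active readers in a single pass.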

41323 commits