fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-11 10:48:06 +02:00

Author	SHA1	Message	Date
Christian Gmeiner	d48d8aefdf	docs: Move isaspec out of drivers/freedreno Let's put it under 'Developer Topics'. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Acked-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25452>	2023-10-02 07:20:13 +00:00
Iago Toral Quiroga	4afbf4ad31	v3d: get rid of shader_state pointer in v3d_key Having this pointer in the key is undesirable since it makes copying keys difficult and error prone (as seen in previous patches), also, it is only there for convenience and we don't strictly need it (in fact the vulkan driver doesn't use it at all), so let's just get rid of it so our v3d_key is fully static. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25418>	2023-10-02 06:35:07 +00:00
Iago Toral Quiroga	3fb9e27a3d	v3d: fix RAM shader cache The RAM shader cache was using the v3d_key for hashes and comparisons which is not correct. Particularly, this struct has a void pointer where we store a reference to an uncompiled shader with the NIR code, and that is of course not accounted for when hashing and comparing keys, which can lead to bogus cache hits. This patch introduces a v3d_cache_key that has both the v3d key and a sha1 of the uncompiled NIR. Now key hashing and comparison is done on the static part of the v3d key (that is, excluding the uncompiled shader pointer, which may be invalid in the cache if the original shader was deleted) and taking the sha1 from the uncompiled shader. This also makes sure the shader key we store in the cache has a NULL shader_state pointer to make it more clear that this field may not be used at all for caching purposes. This fixes GPU hangs with some OpenCL tests (through Rusticl) caused by incorrect RAM cache hits. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25418>	2023-10-02 06:35:07 +00:00
Iago Toral Quiroga	8a4bd328cf	v3d: use pre-computed shader sha1 for disk cache Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25418>	2023-10-02 06:35:06 +00:00
Iago Toral Quiroga	0ed36b524c	v3d: compute nir sha1 for uncompiled shader state Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25418>	2023-10-02 06:35:06 +00:00
Iago Toral Quiroga	adc63d2503	broadcom/compiler: add a couple of shader key helpers Our shader key includes a void pointer that we can't just memcmp, so add helpers that allow us to get the 'static' portion and size of a key. We will use this to fix up the shader cache in v3d in a later patch. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25418>	2023-10-02 06:35:06 +00:00
Sergi Blanch Torne	ccd3e68146	ci: disable Collabora's LAVA lab for maintenance This is to inform you of some planned downtime in the LAVA lab as follows: * Start: 2023-10-02 08:00 BST (07:00 UTC) * End: 2023-10-02 12:00 BST (11:00 UTC) Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25368>	2023-10-02 07:42:08 +02:00
Martin Roukala (né Peres)	b1156507ed	ci/vkcts-navi21: mark more of the RT handles checks as flakes We keep hitting more and more of them, so let's be more inclusive. Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25495>	2023-10-02 03:08:11 +00:00
Martin Roukala (né Peres)	a7ed839490	ci/vkcts-vangogh: mark dEQP-VK.dynamic_rendering.primary_cmd_buff.basic.* as flake This mirrors what we did on navi21, as there are just too many of these tests. Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25495>	2023-10-02 03:08:11 +00:00
Janne Grunau	dbe2230408	asahi: decode: Fix uint64_t format modifiers in agxdecode_stateful() Fixes i386 build. Fixes: `acd5ed0451` ("asahi: decode: Implement VDM call/ret") Signed-off-by: Janne Grunau <j@jannau.net>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	d99ed6d66d	asahi: Handle layered background programs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	3715586580	asahi: Generate layered EOT programs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	c87095e518	asahi: Use a 2D Array texture for array render targets Fixes KHR-GLES31.core.geometry_shader.layered_framebuffer.blending_support with eMRT forced. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	87a7b239e1	asahi: Write to cubes/etc attachments as 2D array To reduce shader variants, the tilebuffer lowering code does not know the actual texture targets of the spilled render targets, only whether they are layered or not. As such, all layered targets (3D, cube map, etc) get written out uniformly as 2D Arrays. For that to work, the driver needs to do the corresponding transform. Regular imageStore() instructions are not affected by any of this. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	0cbecc1ad1	asahi: Predicate layer ID reads Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	e2a0d64d52	asahi: Add pass to predicate layer ID reads Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	e518c92d26	asahi: Assume LAYER is flat-shaded It can't be anything else, this makes sure the varyings are sorted properly. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	68437eb0ba	asahi: Account for layering for attachment views Do not force a single-layer view, use an actual array attachment when there are multiple layers, since this corresponds to a layered framebuffer that will write to an array with the eMRT path. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	9dc87a00fd	asahi: Expose VS_LAYER_VIEWPORT behind a flag We can't technically expose the extension without a higher GL version, but the implemented subset should work and this lets us test with piglit. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	2396d3fe62	asahi: Use layered layouts For correct eMRT code. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	8a48af4f8f	agx/lower_tilebuffer: Support spilled layered RTs If we spill render targets with a layered framebuffer, our spilled targets are assumed to be 2D Arrays (in general). We need to use arrayed image operations to load/store from these. The layer is given by the layer as read in the fragment shader. This handles the eMRT portion of layered rendering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	041451b655	agx/tilebuffer: Support layered layouts Just add a flag for it. We don't care about the actual # of layers when calculating the layout, only the boolean fact of being layered or not. The reason we need this at all is because the eMRT implementation needs to account for layering and that is only keyed off the tilebuffer layout. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:37:55 -04:00
Alyssa Rosenzweig	b252630604	agx: Support packed layered rendering writes With the new pass. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	4a954dff07	asahi,agx: Select layered rendering outputs These 2 are together Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	88fd76d378	asahi: Add helper to get layer id in internal program For background/EOT only. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	7d94f2ee49	agx: Add pass to lower layer ID writes The hardware needs the layer ID and the viewport index packed together. That consumes an entire varying slot, if we want those available in the frag shader we need a separate slot. Add a pass to insert the extra packed write. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	175819eec6	agx: Handle layered block image stores Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	c3a208d6d9	agx: Pack block image store dim correctly Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	da0da5d6f8	agx/nir_lower_texture: Allow disabling layer clamping For background program with layered. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	10b9c2fa36	nir: Support arrays in block_image_store_agx For layered rendering, runs once per layer. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig	f4042afd57	nir: Add layer_id_written_agx sysval We'll implement layer ID reads in the frag shader with a varying read, but if the VS doesn't write the varying we need to return 0 per the spec. Add a sysval to detect that case so we can handle it at runtime without keys. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	d83d24e96a	agx: Insert jmp_exec_none instructions With the exception of the backwards branch for loops, all the control flow we insert during instruction selection just predicates instructions rather than actually jumping around. That means, for example, we execute both sides of the if even for a uniform condition! That's inefficient. The solution is insert jmp_exec_none instructions after control flow in order to skip unexecuted regions, which is much faster than predicating them out. However, jmp_exec_none is costly in itself, so we need to use a heuristic to determine when it's actually beneficial. This uses a very simple heuristic for this purpose. However, it is a massive performance speed-up for Dolphin uber shaders: 39fps -> 67fps at 2x resolution. Nearly a doubling of performance! Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	79c4d4213c	agx: Add agx_prev_block helper Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	dd6106c8bd	agx: Add jumps to block ends jmp_exec_none variant that jumps to the last instruction of the target block, rather than the beginning. This is convenient for skipping over elses, while still executing the block-final pop_exec instruction. Similarly for skipping over loop bodies while still executing the block-final pop_exec, after break instructions. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	22ab505a3d	agx: Augment if/else/while_cmp with a target Add an optional pointer to a target block for these instructions. This does NOT act like a logical branch, and does NOT get added to the logical control flow. It is ignored wholesale until after RA, when physical edges may be inserted by a pass we add later in this series. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	9a894c9a33	agx: Set PIPE_SHADER_CAP_CONT_SUPPORTED So we get adequate testing of continues, rather than lowering them in GLSL. We don't really /want/ to see continues but lowering them away will just make them harder to test... and besides, we should be optimizing them in NIR (not GLSL) so we can get the win on Vulkan too. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	d05657e0d6	agx: Hoist sample_mask/zs_emit Although this is well-motivated, perf effect seems to be negligible for Dolphin. It does prevent the scheduler from making things worse by sinking these instructions though, so as a way to prevent future problems this seems sensible. The kind of problem this affects (late discard) isn't modelled in shader-db. Nevertheless, nothing concerning there: total instructions in shared programs: 1756699 -> 1756722 (<.01%) instructions in affected programs: 10106 -> 10129 (0.23%) helped: 21 HURT: 41 Inconclusive result (value mean confidence interval includes 0). total bytes in shared programs: 11525404 -> 11525452 (<.01%) bytes in affected programs: 72900 -> 72948 (0.07%) helped: 27 HURT: 41 Inconclusive result (value mean confidence interval includes 0). total halfregs in shared programs: 483394 -> 483286 (-0.02%) halfregs in affected programs: 4945 -> 4837 (-2.18%) helped: 88 HURT: 78 Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	0d8362b842	agx: Align the reg file for 256-bit vectors This fixes live range splitting with 3D textureGrad(), which involves vectors larger than the natural 128-bit maximum and hence requires special handling. Fixes this assert with a combination of debug flags and new patches: unsigned int find_best_region_to_evict(struct ra_ctx , unsigned int, unsigned int , unsigned int *): Assertion `(rctx->bound % size) == 0 && "register file size must be aligned to the maximum vector size"' failed Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Alyssa Rosenzweig	cb14cddfa5	asahi: Clamp index buffer extent to what's read This makes for cleaner agxdecodes, I think this matches what I've seen on the macOS side but I might be misremembering. Certainly shouldn't hurt. This only applies for direct draws. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-10-01 12:32:11 -04:00
Friedrich Vock	2be9b66cdd	radv: Fix check in insert_block Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25496>	2023-10-01 13:11:50 +02:00
Friedrich Vock	a0fba17311	radv: Initialize shader freelist on allocation Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25496>	2023-10-01 13:11:43 +02:00
Vitaliy Triang3l Kuzmin	a43ee1ca50	r600: Replace R600_BIG_ENDIAN with UTIL_ARCH_BIG_ENDIAN In particular, removes the dependency of r600_formats.h on r600_pipe.h so it can be shared between Gallium and Vulkan. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24513>	2023-10-01 09:25:50 +00:00
Marek Olšák	43e7285069	winsys/amdgpu: pad gfx and compute IBs with a single NOP packet to minimize CP overhead Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25043>	2023-10-01 08:45:22 +00:00
Marek Olšák	4f660f9937	ac/gpu_info: pad IBs according to ib_size_alignment Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25043>	2023-10-01 08:45:22 +00:00
Marek Olšák	b6f435888b	ac/gpu_info: replace ib_alignment with per-IP IB base and size alignments Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25043>	2023-10-01 08:45:22 +00:00
Eric Engestrom	276caddbd9	ci/deqp-runner: restore exit-on-error after getting deqp-runner's exit code Signed-off-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24738>	2023-10-01 02:00:50 +00:00
Eric Engestrom	f8326d0950	ci/deqp-runner: fix indentation Signed-off-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24738>	2023-10-01 02:00:50 +00:00
Marek Olšák	6b29c16db8	amd: rename GFX110x to NAVI31-33 Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25492>	2023-09-30 23:08:47 +00:00
Marek Olšák	c7e08acd12	ac/llvm: fix flat PS input corruption Fixes: `0a54fbb5b4` - radeonsi/gfx11: interp changes for 32bit Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25492>	2023-09-30 23:08:47 +00:00
Marek Olšák	d50cc2e0cf	ac/gpu_info: don't align IBs to the GL2 cache line size PAL doesn't do it. If drivers want IBs not to share cache lines with other buffers, they should align the size manually. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25492>	2023-09-30 23:08:46 +00:00
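
The v3d shader cache commits above (`adc63d2503`, `3fb9e27a3d`) describe hashing and comparing only the static part of a shader key, paired with a sha1 of the uncompiled NIR, instead of the whole struct including its pointer. The following is a minimal sketch of that idea, not the actual Mesa code: every name here (`demo_key`, `demo_cache_key`, the `demo_*` helpers) is hypothetical, the digest is assumed to be computed elsewhere and passed in, and FNV-1a stands in for whatever hash the driver really uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SHA1_LEN 20

/* Hypothetical shader key: plain data first, convenience pointer last. */
struct demo_key {
   uint32_t num_samplers;
   uint32_t flags;
   void *shader_state; /* must not take part in hashing or comparison */
};

/* Hypothetical cache key: the key plus a digest of the uncompiled shader. */
struct demo_cache_key {
   struct demo_key key;
   uint8_t sha1[DEMO_SHA1_LEN];
};

/* Size of the 'static' portion: everything before the pointer. */
static size_t demo_key_static_size(void)
{
   return offsetof(struct demo_key, shader_state);
}

/* Build a cache key; the stored pointer is always NULL. */
static void demo_cache_key_init(struct demo_cache_key *out,
                                const struct demo_key *key,
                                const uint8_t sha1[DEMO_SHA1_LEN])
{
   assert(out != NULL && key != NULL && sha1 != NULL);
   memset(out, 0, sizeof(*out));
   memcpy(&out->key, key, demo_key_static_size());
   memcpy(out->sha1, sha1, DEMO_SHA1_LEN);
}

/* Compare the static part and the digest; the pointer is ignored. */
static bool demo_cache_key_equal(const struct demo_cache_key *a,
                                 const struct demo_cache_key *b)
{
   return memcmp(&a->key, &b->key, demo_key_static_size()) == 0 &&
          memcmp(a->sha1, b->sha1, DEMO_SHA1_LEN) == 0;
}

/* One FNV-1a pass over a byte range, continuing from hash value h. */
static uint32_t demo_fnv1a(uint32_t h, const void *data, size_t len)
{
   const uint8_t *bytes = data;
   for (size_t i = 0; i < len; i++) {
      h ^= bytes[i];
      h *= 16777619u;
   }
   return h;
}

/* Hash exactly the bytes that demo_cache_key_equal() compares. */
static uint32_t demo_cache_key_hash(const struct demo_cache_key *k)
{
   uint32_t h = demo_fnv1a(2166136261u, &k->key, demo_key_static_size());
   return demo_fnv1a(h, k->sha1, DEMO_SHA1_LEN);
}
```

With this layout, two keys that differ only in `shader_state` hash and compare as equal when their digests match, while keys with identical static fields but different shader digests do not collide, which is the class of bogus cache hit the fix addresses.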


178442 commits