fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 02:48:06 +02:00

Author	SHA1	Message	Date
Paulo Zanoni	4c366ef67b	anv/trtt: set every entry to NULL when we create an L2 table When we create sparse resources the first thing we do is a NULL bind on them, as the Vulkan spec mandates certain behavior even for unbound sparse resources. We do this with the minimal effort possible: if we can get away with marking an L2 pointer as NULL in the L3 table, we just do it and return, instead of going all the way to creating L1 tables and marking all the final entries as NULL. The strategy we were using had a bug that could lead to previously created NULL entries not being marked as NULL anymore. Let's give an example: (before proceeding, keep in mind that a NULL entry in the L3 and L2 tables has bit 1 set, it does not have the value 0) - Create a 64mb buffer that uses an entire L1 table (needs to be properly aligned), which triggers a NULL bind. - Our algorithm will just set the L3 entry (pointing to the L2 table) as NULL. - Create a 64kb buffer that uses the same L2 table (but a different L1 table). - The NULL bind triggered won't do anything as the L2 table is already NULL. - Bind the first buffer to actual memory. This will end up creating the L2 table and the L1 table. The only entry we will set in the L2 table will be the one pointing to the L1 table. All the other values will be 0 (so they will have neither the NULL nor the Invalid bit set: access to them will lead to page faults). - Try to use the second buffer, which is still unbound. It was relying on the fact that its L2 table pointer was NULL, but now it's not anymore, so the page walker will fetch the L1 entries in the L2 table and they will all be zero instead of having the NULL bit set. The fix is pretty simple: whenever we create a new L2 table, set every entry to NULL (except the one we're about to set to non-NULL). This preserves behavior for every other NULL resource relying on the L3 entry being set to NULL. 
We don't need to do this for the L1 table because its entries are different and instead of having bits to signal NULL entries we have a special TR-TT register that we can set that gets compared to check if an entry is NULL, and we conveniently program it to 0: see ANV_TRTT_L1_NULL_TILE_VAL. I am not aware of any real workloads that are triggering this behavior, I found this issue while investigating something else, running a custom sparse program in our pre-silicon environment, and it told us about the page faults. Cc: mesa-stable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30953>	2024-10-15 23:05:30 +00:00
M Henning	537ada2308	nak: Phi coalescing via biased register coloring Reduces code size by -29.08% on shaderdb + nvk-fossils-foss Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31498>	2024-10-15 22:29:11 +00:00
Dylan Baker	38f7ae5288	release: push 24.3 out two weeks I've had a couple of requests to push the release out 1-2 weeks. There have been various reasons for this, but the best one (IMHO) is that this is the week directly after XDC, and many people will be jetlagged and/or suffering from the post-XDC flu. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31637>	2024-10-15 14:59:50 -07:00
Karol Herbst	ff2c4e8f11	zink: add CL CTS result Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31614>	2024-10-15 21:07:07 +00:00
Juston Li	0c9ee0f2b9	android: look for debug/vendor prefixed options Properties from the vendor partition must use a "vendor." prefix from Android T+. Meanwhile the "debug." prefix can be used for local overrides. The order of precedence thus becomes: 1. getenv 2. debug.mesa.* 3. vendor.mesa.* 4. mesa.* (as a fallback for older versions) Signed-off-by: Juston Li <justonli@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31584>	2024-10-15 20:22:17 +00:00
Kenneth Graunke	4cb67cb07a	intel/brw: Use whole 512-bit registers in constant combining on Xe2 Xe2 increased the register size from 256-bits to 512-bits. So we can store 32 16-bit values in a register, rather than 16 values. Prior to this patch, we hadn't updated the pass, so the second half of each of our registers was unused. Backport-to: 24.2 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Kenneth Graunke	d9e5022650	intel/brw: Delete more Gfx8 code from brw_fs_combine_constants These platforms are supported by elk, not brw. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Kenneth Graunke	dea61b7399	intel/brw: Fix register and builder size in emit_barrier() for Xe2 We were manually allocating 1 REG_SIZE for the barrier payload, which is only half a register on Xe2. This should eventually get allocated to a whole register anyway, but it's awkward in the meantime. Also, we were zero-initializing the header using group(8, 0) which only initialized half the register. The rest of the fields are Reserved MBZ, so they're likely unused and unread anyway - but it's better to zero-initialize them so we don't get random undefined, miserable-to-debug behavior. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Kenneth Graunke	7c9eb8b289	intel/brw: Make a ubld temporary in emit_barrier() Saves typing .exec_all() in a lot of places. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Kenneth Graunke	a9d9488788	intel/brw: Delete Gfx7-8 code from emit_barrier() Those are supported by elk, not brw. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Kenneth Graunke	c747c1e1f4	intel/brw: Fix spill/fill count for load/store_scratch in SIMD32 Honestly, I don't know what I was thinking - we are emitting a single spill/fill message here, but were counting it as 2 spill/fills in SIMD32 shaders. So our eventual shader stat reporting would subtract the number of spills and fills from send_count, and get a negative number, wrapping around to just shy of UINT32_MAX. That's way too many sends. This is especially noticeable on Xe2 which often uses SIMD32 shaders. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Pavel Ondračka	58d6906f8c	r300/ci: update ci expectations after piglit uprev Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31663>	2024-10-15 17:43:00 +00:00
Faith Ekstrand	03a393d6ca	nak: Handle annotations in legalization Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31665>	2024-10-15 17:13:27 +00:00
Faith Ekstrand	36d9d11882	nak: Remove annotations before calc_instr_deps() Otherwise the annotations might throw off latency information which needs exact instruction counts. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31665>	2024-10-15 17:13:27 +00:00
Aleksi Sapon	9e769a0620	lavapipe: enable alpha-to-coverage dithering This is a common feature on hardware, both Nvidia and Apple GPUs have it always enabled. On OpenGL this can be controlled using NV_alpha_to_coverage_dither_control, but as far as I can tell there is no extension on Vulkan. Metal also has this feature without a control. Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31373>	2024-10-15 16:17:40 +00:00
Aleksi Sapon	ad4635d6ef	llvmpipe: implement alpha-to-coverage dithering Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31373>	2024-10-15 16:17:40 +00:00
Danylo Piliaiev	6d6d5b869c	freedreno/cffdec: Add option to dump bindless descriptors cffdump --bindless would dump bindless descriptors. We don't know what exactly is in the descriptors, so we dump all interpretations for each of them. Example: set[1]: UBO[0]: { BASE_LO = 0x23806420 } { BASE_HI = 0xc | SIZE = 0x2 } STORAGE/TEXEL/IMAGE[0]: { TILE_MODE = TILE6_LINEAR | SWIZ_X = A6XX_TEX_Z | SWIZ_Y = A6XX_TEX_X | SWIZ_Z = A6XX_TEX_Y | SWIZ_W = A6XX_TEX_W | MIPLVLS = 0 | SAMPLES = MSAA_ONE | FMT = FMT6_R8_G8B8_2PLANE_420_UNORM | SWAP = WZYX } { WIDTH = 12 | HEIGHT = 8 } { STRUCTSIZETEXELS = 1024 | STARTOFFSETTEXELS = 0 | PITCHALIGN = 1 | PITCH = 128 | TYPE = A6XX_TEX_2D } { ARRAY_PITCH = 4096 | MIN_LAYERSZ = 0 } { BASE_LO = 0xa5000 } { BASE_HI = 0x1 | DEPTH = 1 } { MIN_LOD_CLAMP = 0.000000 | PLANE_PITCH = 128 } { FLAG_LO = 0xa6000 } { FLAG_HI = 0x1 } { FLAG_BUFFER_ARRAY_PITCH = 327680 | 0xa0000 } { FLAG_BUFFER_PITCH = 64 | FLAG_BUFFER_LOGW = 0 | FLAG_BUFFER_LOGH = 0 } { 11 = 0 } { 12 = 0 } { 13 = 0 } { 14 = 0 } { 15 = 0 } SAMPLER[0]: { XY_MAG = A6XX_TEX_NEAREST | XY_MIN = A6XX_TEX_NEAREST | WRAP_S = A6XX_TEX_CLAMP_TO_EDGE | WRAP_T = A6XX_TEX_MIRROR_CLAMP | WRAP_R = A6XX_TEX_MIRROR_CLAMP | ANISO = A6XX_TEX_ANISO_2 | LOD_BIAS = 4.437500 } { COMPARE_FUNC = FUNC_GEQUAL | MAX_LOD = 4.000000 | MIN_LOD = 0.000000 } { REDUCTION_MODE = A6XX_REDUCTION_MODE_MIN | BCOLOR = 0x400080 } { 3 = 0x1 } Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31632>	2024-10-15 15:35:39 +00:00
Danylo Piliaiev	e2e9dd4f21	freedreno/rnndec: Consider array length when finding by reg name Otherwise we get a valid reg base for reg array with OOB index. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31632>	2024-10-15 15:35:39 +00:00
Deborah Brouwer	0007077c11	ci: remove xfail program@build@include-directories Now that build-piglit.sh is no longer removing ‘include_test.h’ this test `program@build@include-directories` is passing which is causing jobs to fail due to this unexpected improvement. Remove this test from expected fails so that the jobs can pass. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31379>	2024-10-15 15:50:47 +01:00
Collabora's Gfx CI Team	68aa78a858	Uprev Piglit to 7ce69da1199d12ed0ddaa251ed489750523798fb `e9ab30aeae...7ce69da119` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31379>	2024-10-15 15:50:47 +01:00
Mike Blumenkrantz	4ac4004816	llvmpipe: expose GL multiview extensions this is a no-op since lavapipe is already doing it Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31590>	2024-10-15 14:01:42 +00:00
Mike Blumenkrantz	f5bd39e0e3	gallium: delete duplicated viewmask member in draw info this was added for lavapipe, but it should have been in the framebuffer state since it is a framebuffer state now the GL multiview extensions are supported with viewmask in the framebuffer struct, which means this is all redundant and should be corrected/deleted Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31590>	2024-10-15 14:01:42 +00:00
Mike Blumenkrantz	8487ecfa44	iris: assert that viewmask is 0 this is not supported by the driver, so it doesn't need to be checked at runtime Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31590>	2024-10-15 14:01:42 +00:00
Mike Blumenkrantz	a82d8e638d	util/framebuffer: add viewmask compare for fb equal Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31590>	2024-10-15 14:01:42 +00:00
Boris Brezillon	e113ce0d87	panvk/csf: Fix the clear-only RUN_FRAGMENT case Issuing a RUN_FRAGMENT with no tiler descriptor is a valid use case when one just needs to clear attachments. Make sure we take that case into account in issue_fragment_jobs(). Fixes: `5544d39f44` ("panvk: Add a CSF backend for panvk_queue/cmd_buffer") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31625>	2024-10-15 13:16:07 +00:00
Boris Brezillon	e9462e77d8	panvk: Advertise dynamic rendering support This was already supported, but not yet exposed. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31625>	2024-10-15 13:16:07 +00:00
Boris Brezillon	66543a111c	panvk/csf: Fix a buffer/stack-overflow when PANVK_DEBUG=sync We're not allocating enough qsubmit slots when force_sync=true in panvk_queue_submit(). Fixes: `5544d39f44` ("panvk: Add a CSF backend for panvk_queue/cmd_buffer") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31625>	2024-10-15 13:16:07 +00:00
Boris Brezillon	195fd67910	panvk/csf: Fix cmd_emit_dcd() in the FB preload logic We need to mask the bound_attachments value with MESA_VK_RP_ATTACHMENT_ANY_COLOR_BITS otherwise we're passing depth/stencil attachments masks too. Fixes: `0bc3502ca3` ("panvk: Implement a custom FB preload logic") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31625>	2024-10-15 13:16:07 +00:00
Boris Brezillon	4199212ebe	panvk/csf: Fix dirty checking in prepare_ds() If the fragment shader changed, we need to re-emit the depth-stencil descriptor. Fixes: `5544d39f44` ("panvk: Add a CSF backend for panvk_queue/cmd_buffer") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31625>	2024-10-15 13:16:07 +00:00
Boris Brezillon	1096adb128	panvk/csf: Fix no-fragment IDVS Fragment shader program/resource table are only set when the shader or descriptor table is updated. But if the first RUN_IDVS happening on the command buffer doesn't require fragment shading, those registers won't be updated, and we might inherit values set by a previous command buffer executed on the same queue, leading to GPU faults if these descriptor buffers have been recycled. Fixes: `5544d39f44` ("panvk: Add a CSF backend for panvk_queue/cmd_buffer") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31625>	2024-10-15 13:16:07 +00:00
Boris Brezillon	ce1562e9cc	panvk: Make panvk_pool_free_mem() error proof It's pretty easy to pass the wrong pool to panvk_pool_free_mem() (was the case in panvk_shader_destroy() and panvk_internal_shader_destroy()), so let's make the existing interface more robust to this kind of mistake by storing the 'owned-by-pool' information at the panvk_priv_mem level. We use the lower 3 bits of the BO pointer for that, since a BO object is guaranteed to be aligned on 8-byte. Fixes: `ce14681ebf` ("panvk: Don't leak vertex shader program descriptors") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31625>	2024-10-15 13:16:07 +00:00
Georg Lehmann	40c4ec881d	radv: call nir_opt_remove_phis in radv_optimize_nir_algebraic Foz-DB Navi31: Totals from 3048 (3.84% of 79395) affected shaders: Instrs: 603535 -> 599281 (-0.70%); split: -0.74%, +0.03% CodeSize: 3074416 -> 3056236 (-0.59%); split: -0.60%, +0.01% Latency: 2851382 -> 2849808 (-0.06%); split: -0.07%, +0.01% InvThroughput: 294247 -> 294201 (-0.02%); split: -0.02%, +0.01% SClause: 18077 -> 18083 (+0.03%); split: -0.03%, +0.07% Copies: 63860 -> 59926 (-6.16%); split: -6.33%, +0.17% Branches: 15901 -> 15899 (-0.01%) PreSGPRs: 62441 -> 61353 (-1.74%) VALU: 291049 -> 291035 (-0.00%); split: -0.01%, +0.00% SALU: 96786 -> 92606 (-4.32%); split: -4.42%, +0.10% Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31360>	2024-10-15 10:01:43 +00:00
Pavel Ondračka	f94087be2c	r300/compiler: reformat using default mesa .clang-format rules Most notably switch from tabs to 3 spaces. Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Acked-by: Filip Gawin <filip@gawin.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23771>	2024-10-15 09:24:02 +00:00
Pavel Ondračka	4a6abbc9c1	r300: opt in to clang-format CI enforcement for the compiler Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Acked-by: Filip Gawin <filip@gawin.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23771>	2024-10-15 09:24:02 +00:00
Pavel Ondračka	4e4b124fa9	r300: add .clang-format file for the compiler Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Acked-by: Filip Gawin <filip@gawin.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23771>	2024-10-15 09:24:02 +00:00
Mary Guillemard	b12c294e7b	panvk: Define primitive size for RUN_TILER/RUN_IDVS We were ignoring line width with line topologies. This also forces a value of 1.0f when a point topology is in use and the shader does not write the point size, to respect maintenance5 requirements. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31623>	2024-10-15 08:50:19 +00:00
Iago Toral Quiroga	188f1c6cbe	v3dv: rewrite device identification Instead of trying to match device compatible strings like 'brcm,2712-v3d', which may change with product revisions, match the device name, like 'v3d'. This simplifies a bit the matching logic and allows us to have less diverging paths for hardware and simulator. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31619>	2024-10-15 07:57:51 +00:00
Iago Toral Quiroga	23432921b3	v3dv: drop device_id field This was added only to report the DRM device ID of the actual GPU used in the simulated environment but there is no real reason we need to do that, so let's just keep it simple and provide the device ID of the simulated device instead. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31619>	2024-10-15 07:57:51 +00:00
Tapani Pälli	a3c03b6a96	mesa: fix DXT1 support with EXT_texture_compression_dxt1 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11987 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: David Heidelberg <david@ixit.cz> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31540>	2024-10-15 07:19:55 +00:00
Utku Iseri	271fdedc5a	st/mesa: clamp reported max lod bias mesa clamps lod bias values to -32,31 during quantization, so the reported max value should also be limited to 31. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11977 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31525>	2024-10-15 06:38:48 +00:00
Marek Olšák	0727634443	nir/opt_load_store_vectorize: vectorize load_smem_amd radeonsi+ACO with the new vectorization callback: TOTALS FROM AFFECTED SHADERS (19508/58918) VGPRs: 708672 -> 708864 (0.03 %) Code Size: 31458688 -> 31217160 (-0.77 %) bytes Max Waves: 305960 -> 305952 (-0.00 %) Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	a44e5cfccf	nir/opt_load_store_vectorize: allow a 4-byte hole between 2 loads If there is a 4-byte hole between 2 loads, drivers can now optionally vectorize the loads by including the hole between them, e.g.: 4B load + 4B hole + 8B load --> 16B load All vectorize callbacks already reject all holes, but AMD will want to allow it. radeonsi+ACO with the new vectorization callback: TOTALS FROM AFFECTED SHADERS (25248/58918) VGPRs: 871116 -> 871872 (0.09 %) Spilled SGPRs: 397 -> 407 (2.52 %) Code Size: 43074536 -> 42496352 (-1.34 %) bytes Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	80c156422d	nir/opt_load_store_vectorize: allow overfetching, merge overfetched loads New load merging transformations (first, second), examples: (vec4, vec3) ==> vec8(read=0x7f) (because NIR doesn't have vec7) (vec1, vec8(read=0x7f)) ==> vec8(read=0xff) - the unused component at the end of vec8 is dropped Not merged: vec8(read=0xfe) + vec1 - unused components at the beginning are kept Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	65ace5649b	nir: reject unsupported component counts from all vectorize callbacks If you allow an unsupported component count in the callback for loads, nir_opt_load_store_vectorize will align num_components to the next supported vector size, essentially overfetching. This changes all callbacks to reject it. AMD will enable it in a later commit. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	02923e237d	nir: add hole_size parameter into the vectorize callback It will be used to allow merging loads with a hole between them. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	8ce43b7765	nir/opt_load_store_vectorize: add entry::num_components We will represent vec6..vec7, vec9..vec15 loads with 8 and 16 components respectively, so we need to track how many components we really use. This is a prerequisite for optimal merging up to vec16. Example: Step 1: vec4 + vec3 ==> vec7as8 (last component unused) Step 2: vec1 + vec7as8 ==> vec8 (last unused component dropped) Without using the number of components read, the same example would end up doing: Step 1: vec4 + vec3 ==> vec8 Step 2: vec1 + vec8 ==> vec9 (fail) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Alyssa Rosenzweig	e9303c0952	nir: extract round component helper another nir pass will use this. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Faith Ekstrand	c2684968de	nvk: Advertise 64-bit atomics on buffer views We also add an nvk_format_supports_atomics() helper. This helper lives in NVK for now because it's not just about the format and hardware but also about whether or not we have compiler support in NAK. Fixes: `1d10de539c` ("nvk: Implement VK_EXT_shader_image_atomic_int64") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31633>	2024-10-15 05:21:03 +00:00
Faith Ekstrand	d3d8271620	nvk: Re-sort the features table There were a couple of KHR extensions that got mixed in with the EXTs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31633>	2024-10-15 05:21:03 +00:00
Faith Ekstrand	681f807747	nvk: Only set texture/sampler tables and SLM for enabled engines Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31633>	2024-10-15 05:21:02 +00:00
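
The option lookup order described in commit 0c9ee0f2b9 above (getenv, then debug.mesa.*, then vendor.mesa.*, then mesa.*) can be sketched roughly as follows. This is a minimal illustration, not the code from the merge request: get_option_sketch, lookup_fn, the fake_* stores and the "example" option name are all hypothetical, and lookups are passed in as callbacks so the sketch builds without the Android property API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lookup callback: returns the value for `key`, or NULL if unset.
 * On a device this role would be played by getenv() and by the Android
 * system property API; callbacks keep the sketch testable off-device. */
typedef const char *(*lookup_fn)(const char *key);

/* Property prefixes in the precedence order given by the commit message. */
static const char *const prop_prefixes[] = {
   "debug.mesa.",  /* local overrides */
   "vendor.mesa.", /* vendor partition properties */
   "mesa.",        /* fallback for older versions */
};

/* Environment first, then each prefixed property in order; NULL if unset. */
static const char *
get_option_sketch(const char *env_name, const char *prop_suffix,
                  lookup_fn env_lookup, lookup_fn prop_lookup)
{
   const char *val = env_lookup(env_name);
   if (val)
      return val;

   for (size_t i = 0; i < sizeof(prop_prefixes) / sizeof(prop_prefixes[0]); i++) {
      char key[128];
      int n = snprintf(key, sizeof(key), "%s%s", prop_prefixes[i], prop_suffix);
      if (n < 0 || (size_t)n >= sizeof(key))
         continue; /* key would be truncated; skip this prefix */
      val = prop_lookup(key);
      if (val)
         return val;
   }
   return NULL;
}

/* Fake stores, for demonstration only. */
static const char *fake_env_empty(const char *key) { (void)key; return NULL; }
static const char *fake_env_set(const char *key) { (void)key; return "from-env"; }
static const char *fake_props(const char *key)
{
   if (strcmp(key, "vendor.mesa.example") == 0)
      return "from-vendor";
   if (strcmp(key, "mesa.example") == 0)
      return "from-legacy";
   return NULL;
}
```

With these fake stores, a lookup of "example" resolves to the vendor property when the environment is empty, and to the environment value when one is set.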

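The component-count bookkeeping from commit 8ce43b7765 above (vec4 + vec3 becomes a vec7 stored as vec8, which can then absorb a vec1) can be sketched as below. This is an illustration under stated assumptions, not the pass itself: load_sketch, round_up_components_sketch and merge_loads_sketch are hypothetical names, the set of supported sizes (1 to 5, 8, 16) is assumed here, and the sketch only counts components, assuming the loads are adjacent and any unused components sit at the end.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of one load: `alloc` is the vector size emitted,
 * `used` is how many leading components are really read. */
struct load_sketch {
   unsigned alloc;
   unsigned used;
};

/* Round a component count up to the next supported vector size.
 * Assumption for this sketch: 1..5, 8 and 16 are the supported sizes.
 * Returns 0 if the count does not fit in a vec16. */
static unsigned
round_up_components_sketch(unsigned n)
{
   if (n <= 5)
      return n;
   if (n <= 8)
      return 8;
   if (n <= 16)
      return 16;
   return 0;
}

/* Merge two adjacent loads by summing the components really used,
 * then rounding the sum up. Trailing unused components are dropped
 * because only `used` feeds into the sum. */
static bool
merge_loads_sketch(struct load_sketch first, struct load_sketch second,
                   struct load_sketch *out)
{
   unsigned used = first.used + second.used;
   unsigned alloc = round_up_components_sketch(used);
   if (alloc == 0)
      return false;
   out->used = used;
   out->alloc = alloc;
   return true;
}
```

Summing `alloc` instead of `used` in the second step would give 1 + 8 = 9 components, which is the failing case the commit message describes.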

196393 commits