fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-29 12:20:10 +01:00

Author	SHA1	Message	Date
Jordan Justen	0088aae481	intel/brw: Add new encode/decode for use with brw_data_type_float/int Rework: * Sushma: Add BF in brw_data_type_encode, brw_data_type_decode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:53 +00:00
Jordan Justen	46e843f76e	intel/brw: Add brw_data_type_float/brw_data_type_int These type encodings were first used in dpas instructions, but continue to be used in more places. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:52 +00:00
Sushma Venkatesh Reddy	54accefed2	brw: Add BRW_TYPE_BF8 and BRW_TYPE_HF8 for float8 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39007>	2025-12-19 00:09:52 +00:00
Mel Henning	c9ae59dec5	nvk: Set maxStorageBufferRange = maxBufferSize We were previously reporting a larger maxStorageBufferRange than our maxBufferSize, which is weird. Lower maxStorageBufferRange to match maxBufferSize. Fixes crucible stress.limits.buffer-update.range.storage.q0 Fixes: `65f12fde44` ("nvk: Improve address space and buffer size limits") Reviewed-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39021>	2025-12-18 23:51:50 +00:00
Ian Romanick	b967942b64	brw: Do cmod prop again after scheduling After selecting the scheduling mode, do cmod prop again. It's possible that doing cmod prop between performing a schedule and trying to register allocate would cause a different scheduling mode to be selected. However, this would require fully restoring the pre-schedule set of instructions (via cloning). I have tried to implement this, and it's harder than it looks. :( v2: Delete unused variable `progress`. Noticed by Marge. shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19967018 -> 19967006 (<.01%) instructions in affected programs: 10652 -> 10640 (-0.11%) helped: 4 / HURT: 0 total cycles in shared programs: 884129990 -> 884139590 (<.01%) cycles in affected programs: 20334512 -> 20344112 (0.05%) helped: 0 / HURT: 4 fossil-db: Lunar Lake Totals: Instrs: 924967191 -> 924963460 (-0.00%); split: -0.00%, +0.00% Cycle count: 105962414958 -> 105961925594 (-0.00%); split: -0.00%, +0.00% Spill count: 3423582 -> 3423564 (-0.00%); split: -0.00%, +0.00% Fill count: 4877121 -> 4876955 (-0.00%); split: -0.00%, +0.00% Totals from 2511 (0.12% of 2018786) affected shaders: Instrs: 12541707 -> 12537976 (-0.03%); split: -0.03%, +0.00% Cycle count: 4816359238 -> 4815869874 (-0.01%); split: -0.01%, +0.00% Spill count: 179536 -> 179518 (-0.01%); split: -0.03%, +0.02% Fill count: 279407 -> 279241 (-0.06%); split: -0.07%, +0.01% Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results.
(Meteor Lake shown) Totals: Instrs: 980252404 -> 980237686 (-0.00%); split: -0.00%, +0.00% Cycle count: 91758669556 -> 91764028404 (+0.01%); split: -0.00%, +0.01% Spill count: 3664771 -> 3664744 (-0.00%); split: -0.00%, +0.00% Fill count: 4962078 -> 4960482 (-0.03%); split: -0.04%, +0.01% Totals from 8472 (0.38% of 2251522) affected shaders: Instrs: 34977623 -> 34962905 (-0.04%); split: -0.04%, +0.00% Cycle count: 6251857553 -> 6257216401 (+0.09%); split: -0.04%, +0.13% Spill count: 480251 -> 480224 (-0.01%); split: -0.01%, +0.00% Fill count: 676539 -> 674943 (-0.24%); split: -0.28%, +0.05% Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	09450faf6a	brw: Do cmod prop again after post-RA scheduling shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19968728 -> 19963825 (-0.02%) instructions in affected programs: 788014 -> 783111 (-0.62%) helped: 2503 / HURT: 0 total cycles in shared programs: 884112912 -> 884093268 (<.01%) cycles in affected programs: 20017168 -> 19997524 (-0.10%) helped: 1830 / HURT: 52 LOST: 0 GAINED: 6 fossil-db: All Intel platforms had similar results. (Meteor Lake shown) Totals: Instrs: 980768016 -> 980172179 (-0.06%) Cycle count: 91762351767 -> 91757280093 (-0.01%); split: -0.01%, +0.00% Max dispatch width: 37602592 -> 37608768 (+0.02%) Totals from 157150 (6.98% of 2251329) affected shaders: Instrs: 107323207 -> 106727370 (-0.56%) Cycle count: 12696754006 -> 12691682332 (-0.04%); split: -0.04%, +0.00% Max dispatch width: 3708584 -> 3714760 (+0.17%) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	08d71730ca	brw/cmod: Propagate to an instruction with same source Detect cases like mov.nz.f0.0(8) null<1>D g66<8,8,1>D (+f0.0) sel(8) g123<1>UD g87<8,8,1>UD g84<8,8,1>UD mov.nz.f0.0(8) null<1>D g66<8,8,1>D (+f0.0) sel(8) g124<1>UD g88<8,8,1>UD g85<8,8,1>UD Either MOV instruction could also be an equivalent CMP. v2: Require no predicate, groups match, and flags written match. v3: Add some more unit tests. Suggested by Caio. shader-db: All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 17203627 -> 17203590 (<.01%) instructions in affected programs: 51432 -> 51395 (-0.07%) helped: 37 / HURT: 0 total cycles in shared programs: 879884982 -> 879884670 (<.01%) cycles in affected programs: 6014730 -> 6014418 (<.01%) helped: 25 / HURT: 4 fossil-db: Lunar Lake Totals: Instrs: 925092938 -> 925071952 (-0.00%); split: -0.00%, +0.00% Cycle count: 105972157149 -> 105966120894 (-0.01%); split: -0.01%, +0.00% Spill count: 3423592 -> 3423582 (-0.00%) Fill count: 4876743 -> 4877121 (+0.01%); split: -0.00%, +0.01% Max live registers: 193525293 -> 193525251 (-0.00%) Max dispatch width: 49047056 -> 49047088 (+0.00%); split: +0.00%, -0.00% Totals from 17714 (0.88% of 2018791) affected shaders: Instrs: 56708169 -> 56687183 (-0.04%); split: -0.04%, +0.00% Cycle count: 4560530879 -> 4554494624 (-0.13%); split: -0.15%, +0.01% Spill count: 434846 -> 434836 (-0.00%) Fill count: 807443 -> 807821 (+0.05%); split: -0.02%, +0.07% Max live registers: 4332542 -> 4332500 (-0.00%) Max dispatch width: 295248 -> 295280 (+0.01%); split: +0.02%, -0.01% Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 995075628 -> 995051291 (-0.00%); split: -0.00%, +0.00% Cycle count: 92060967154 -> 92059311640 (-0.00%); split: -0.00%, +0.00% Spill count: 3664664 -> 3664675 (+0.00%); split: -0.00%, +0.00% Fill count: 4961929 -> 4961874 (-0.00%); split: -0.00%, +0.00% Max live registers: 121480292 -> 121480184 (-0.00%) Max dispatch width: 37947528 -> 37947496 (-0.00%) Totals from 20569 (0.90% of 2278279) affected shaders: Instrs: 57437989 -> 57413652 (-0.04%); split: -0.04%, +0.00% Cycle count: 4297505238 -> 4295849724 (-0.04%); split: -0.06%, +0.03% Spill count: 487508 -> 487519 (+0.00%); split: -0.00%, +0.00% Fill count: 869228 -> 869173 (-0.01%); split: -0.01%, +0.00% Max live registers: 2413028 -> 2412920 (-0.00%) Max dispatch width: 239280 -> 239248 (-0.01%) Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) Totals: Instrs: 1012570598 -> 1012546137 (-0.00%); split: -0.00%, +0.00% Cycle count: 85579989052 -> 85589116671 (+0.01%); split: -0.00%, +0.01% Spill count: 3901755 -> 3901748 (-0.00%) Fill count: 6799383 -> 6799367 (-0.00%) Max live registers: 122288761 -> 122288658 (-0.00%) Totals from 20595 (0.90% of 2280449) affected shaders: Instrs: 57764192 -> 57739731 (-0.04%); split: -0.04%, +0.00% Cycle count: 3899898675 -> 3909026294 (+0.23%); split: -0.04%, +0.27% Spill count: 481262 -> 481255 (-0.00%) Fill count: 1057996 -> 1057980 (-0.00%) Max live registers: 2412395 -> 2412292 (-0.00%) Skylake Totals: Instrs: 516619178 -> 516617390 (-0.00%) Cycle count: 57593545602 -> 57592502019 (-0.00%); split: -0.00%, +0.00% Fill count: 860403 -> 860402 (-0.00%) Max live registers: 87553761 -> 87553649 (-0.00%) Totals from 1357 (0.08% of 1730068) affected shaders: Instrs: 3575640 -> 3573852 (-0.05%) Cycle count: 1772148559 -> 1771104976 (-0.06%); split: -0.06%, +0.00% Fill count: 68917 -> 68916 (-0.00%) Max live registers: 131237 -> 131125 (-0.09%) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: 
<https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	50f2cd7366	brw/dce: Don't generate more NULL destinations after brw_lower_3src_null_dest Later commits will call DCE after lowering has been performed. Creating more things that would need lowering is problematic. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	24cd8aa3b8	brw/cmod: Allow FIXED_GRF Later commits will call cmod prop after register allocation. At that time, there is only FIXED_GRF. No shader-db or fossil-db changes on any Intel platform. v2: FIXED_GRF uses subnr instead of offset. Add a unit test to demonstrate the issue. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	d7227b11a1	brw: elk: Disable can_do_cmod for MACH PRMs for G35 (Gfx4) through Ivy Bridge (Gfx7) all say that conditional modifiers are allowed for MACH. Starting with Haswell (Gfx7.5), this seems to be removed. This function doesn't have any way to know the platform, so false is returned for all platforms. No shader-db or fossil-db changes on any Intel platform. Prevents a failure in "brw: Do cmod prop again after post-RA scheduling" in piglit's builtin-uint-mad_sat-1.0.generated.cl. Cc: stable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	ba30794847	brw/cmod: Don't propagate between instructions in different groups The group implicitly selects which flags the instruction can write. This was discovered while working on another set of changes that could change some logical operations into predicated MOV instructions. Prevents regressions later in the series in dEQP-VK.graphicsfuzz.cov-loop-fragcoord-identical-condition. No shader-db or fossil-db changes on any Intel platform. v2: Update the comment in the test case. Suggested by Caio. Fixes: `95ac3b1dae` ("i965/fs: don't propagate cmod when the exec sizes differ") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Ian Romanick	c0fb93506b	brw: Add brw_reg::is_grf v2: Add a function comment. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>	2025-12-18 15:15:20 -08:00
Benjamin Cheng	fa8b0b6bbb	radv/video: Enable write combine for decode Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39025>	2025-12-18 15:25:57 -05:00
Dmitry Baryshkov	4315c28739	gfxstream: don't dump genvk.py args to generated files Full command lines include full path to the output file, which triggers reproducibility warnings (e.g. in Yocto builds). Drop the args and print only a basename of the script used to generate the file. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38875>	2025-12-18 18:52:19 +00:00
hwandy	ffbe6470a2	anv: fix a memory leak in slab allocator. An example of when the memory leak happens: requested_size = 4 and alignment = 65536 in anv_slab_bo_alloc: The alloc_size = 65536 and requested = 4 in this case. The group to allocate the entry is the group of size 65536 based on the entry size, while the group to reclaim the entry is the group of size 4 because bo->size is registered as the requested_size=4 and used in anv_slab_bo_free. That means the entry is allocated in group[order of size 65535]->free, moved from group[order of size 65535]->free to the user, and then moved to group[order of size 4]->reclaim, so the entries accumulate in group[order of size 4]->reclaim while group[order of size 65535] keeps allocating new entries, leading to OOM. The solution is to use `bo->actual_size` to get the group in pb_slab_bo_free using the allocation size. Fixes: `dabb012423` ("anv: Implement anv_slab_bo and enable memory pool") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14396 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: hwandy <hwandy@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38989>	2025-12-18 18:25:54 +00:00
Alyssa Rosenzweig	61dc9201a1	brw: constant fold before texture lowering This ensures we don't need dynamic stuff. Noticed when debugging weird regressions around the mcs lowering. ARL: total instructions in shared programs: 19857061 -> 19854964 (-0.01%) instructions in affected programs: 91768 -> 89671 (-2.29%) helped: 154 HURT: 0 helped stats (abs) min: 9.0 max: 33.0 x̄: 13.62 x̃: 13 helped stats (rel) min: 0.51% max: 40.91% x̄: 4.66% x̃: 3.36% 95% mean confidence interval for instructions value: -14.04 -13.19 95% mean confidence interval for instructions %-change: -5.49% -3.84% Instructions are helped. total cycles in shared programs: 884538769 -> 884485530 (<.01%) cycles in affected programs: 10508994 -> 10455755 (-0.51%) helped: 116 HURT: 38 helped stats (abs) min: 4.0 max: 15238.0 x̄: 666.22 x̃: 148 helped stats (rel) min: 0.01% max: 34.53% x̄: 2.58% x̃: 1.07% HURT stats (abs) min: 4.0 max: 4027.0 x̄: 632.68 x̃: 302 HURT stats (rel) min: 0.01% max: 32.75% x̄: 3.46% x̃: 0.59% 95% mean confidence interval for cycles value: -631.32 -60.09 95% mean confidence interval for cycles %-change: -2.06% -0.12% Cycles are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39023>	2025-12-18 17:55:29 +00:00
Mel Henning	0df735a619	nvk: Disable compression for image import/export Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36450>	2025-12-18 12:13:05 -05:00
Mohamed Ahmed	cabfdb4404	nvk: Enable compression Enables compression for select images. Additionally, we get large (64K), and huge (2M) pages as a bonus as the hardware can only do compression on these page sizes. However, due to nouveau limitations, this means that we are limited to enabling it on things pinned to VRAM. Fortunately, this works out for us as we can enable it for color, Z/S, and storage images, which are the main types to benefit from compression as they're write heavy. Unfortunately, this means that we need to handle the memory allocation in a delicate way, as the Vulkan API is a bit restrictive in this regard, so we have to use dedicated allocations for compression/larger pages. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36450>	2025-12-18 12:12:47 -05:00
Juan A. Suarez Romero	d656960596	broadcom/ci: set testgroup size for asan Set it to 500 tests: if even one test fails under asan, all the tests in the group are marked as failed too. Keeping the group smaller makes it possible to bisect later for the tests that actually expose the issue. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39022>	2025-12-18 16:42:30 +00:00
Juan A. Suarez Romero	cf7e2b9f6b	broadcom/ci: update expected list Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39022>	2025-12-18 16:42:30 +00:00
Mel Henning	b55b8da012	nak: Add a prepass instruction scheduler Totals: CodeSize: 5750619392 -> 5817868528 (+1.17%); split: -0.32%, +1.49% Number of GPRs: 16276896 -> 16342962 (+0.41%); split: -1.00%, +1.41% SLM Size: 8927212 -> 8739732 (-2.10%); split: -2.59%, +0.49% Static cycle count: 1497053946 -> 1412275595 (-5.66%); split: -6.00%, +0.33% Spills to memory: 14248182 -> 14157708 (-0.63%); split: -1.25%, +0.62% Fills from memory: 14248182 -> 14157708 (-0.63%); split: -1.25%, +0.62% Spills to reg: 9143000 -> 9042885 (-1.09%); split: -1.22%, +0.13% Fills from reg: 6892354 -> 6808724 (-1.21%); split: -1.33%, +0.12% Max warps/SM: 6482016 -> 6567500 (+1.32%); split: +1.40%, -0.08% Totals from 189431 (96.40% of 196502) affected shaders: CodeSize: 5739697280 -> 5806946416 (+1.17%); split: -0.32%, +1.50% Number of GPRs: 16114477 -> 16180543 (+0.41%); split: -1.01%, +1.42% SLM Size: 8927180 -> 8739700 (-2.10%); split: -2.59%, +0.49% Static cycle count: 1495006918 -> 1410228567 (-5.67%); split: -6.00%, +0.33% Spills to memory: 14248182 -> 14157708 (-0.63%); split: -1.25%, +0.62% Fills from memory: 14248182 -> 14157708 (-0.63%); split: -1.25%, +0.62% Spills to reg: 9141040 -> 9040925 (-1.10%); split: -1.23%, +0.13% Fills from reg: 6890401 -> 6806771 (-1.21%); split: -1.34%, +0.12% Max warps/SM: 6149140 -> 6234624 (+1.39%); split: +1.47%, -0.08% Reviewed-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33306>	2025-12-18 16:17:05 +00:00
Mel Henning	5caee114ec	nak: Reserve capacity in LiveSet::from_iter,extend Reviewed-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33306>	2025-12-18 16:17:05 +00:00
Mel Henning	f64d2c8557	nak: Factor out prev_multiple_of Reviewed-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33306>	2025-12-18 16:17:04 +00:00
Pierre-Eric Pelloux-Prayer	645fff5dae	ac/descriptors: account for num_storage_samples for gfx10 This fixes a page fault when nr_samples=4 but nr_storage_samples=2. Based on si_is_format_supported this is only supported for color formats and when has_eqaa_surface_allocator is true (< GFX11). The referenced commit below didn't introduce the issue but it exposed it by forcing the gfx blit path to be used. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13255 Fixes: `3424e16ece` ("radeonsi: add decision code to select when to use CB_RESOLVE for performance") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38925>	2025-12-18 10:45:49 +00:00
Pierre-Eric Pelloux-Prayer	7fc5267d08	hud: add new 'dev' pseudo-graph It displays the renderer string and the PCIe bus info. It's not a real graph because hud_graph is built to draw numbers and 'dev' is the only use case so far where we just want to draw a string. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38925>	2025-12-18 10:45:49 +00:00
Pierre-Eric Pelloux-Prayer	3b4b5761aa	hud: only increase y if the pane contains graphs This makes the layout of "fps,cpu" identical to "fps,stdout,cpu". Without this change, the ',' separator after 'stdout' would increase y and we would have a gap between the fps and cpu graphs. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38925>	2025-12-18 10:45:49 +00:00
Pierre-Eric Pelloux-Prayer	f521a6270b	mesa: consider Attrib.MinLayer in do_blit_framebuffer Otherwise a blit from a fbo with a GL_COLOR_ATTACHMENT0 using a GL_TEXTURE_2D view of a GL_TEXTURE_2D_ARRAY will always read from layer 0. See https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/1060 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13527 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38925>	2025-12-18 10:45:49 +00:00
Martin Roukala (né Peres)	13783fe2ef	ci: disable the valve-kws farm We are having problems establishing connections to the s3.freedesktop.org web server, so let's disable the farm until we can figure it out. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39017>	2025-12-18 12:35:39 +02:00
Lucas Stach	57dc4cf4fb	etnaviv: don't emit steering state when uniforms are unchanged The steering bits tell the GPU which caches to invalidate on the subsequent uniform state writes. There is no point in writing those steering bits when there are no uniforms to emit. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38998>	2025-12-18 09:05:39 +00:00
Boris Brezillon	d7d690b47f	panvk: Fix set_compute_sysval() BITSET_SET_RANGE() was passed sysval_fau_start() instead of sysval_fau_end() as a 3rd argument. Fixes: `ae76a6a045` ("panvk: Pack push constants") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14489 Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38980>	2025-12-18 08:14:14 +01:00
Kenneth Graunke	d83c699045	brw: Convert GS pulled inputs to use URB intrinsics We leave GS pushed inputs using load_per_vertex_input for now - they're relatively simple, and using load_attribute_payload doesn't work well since it's assumed to be convergent (for TES, FS inputs) while GS inputs are divergent. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:02 +00:00
Kenneth Graunke	eae3bd19d4	brw: Move GS URB Read Length limiting to brw_nir_lower_gs_inputs() We're going to be deciding on push vs. pull in the NIR lowering pass soon, so move the code to limit our register usage from brw's thread payload code to brw_nir_lower_gs_inputs(). Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:02 +00:00
Kenneth Graunke	8889802271	brw: Make max_push_bytes a parameter to URB lowering data This allows us to program something other than a stage-based constant. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:02 +00:00
Kenneth Graunke	f62f7d80e2	brw: Update try_load_push_input to handle dword-unit offsets too We don't need this case today, but it's trivial to handle. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38990>	2025-12-18 06:39:01 +00:00
Job Noorman	f601aa5ce7	ir3/bisect: fix off-by-one issues while bisecting Fixes two separate issues: - Getting stuck when ending up with a list of 2 ids; - Removing a potential bad id. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38993>	2025-12-18 04:43:16 +00:00
Marek Olšák	3c5c96fedb	radv: double pixel throughput in certain cases of PS without interpolated inputs This reduces the number of initialized VGPRs by 1 when no barycentric coordinates are used. I have verified with zink that this indeed increases performance for cases where sysvals like frag_coord and front_face are used without interpolated PS inputs. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38936>	2025-12-18 03:37:58 +00:00
Marek Olšák	8cf154d2eb	radeonsi: don't load sampler states for buffer and MS samplers They don't use them. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38915>	2025-12-18 03:20:13 +00:00
Marek Olšák	5acabdd1f8	radeonsi: double pixel throughput in certain cases of PS without inputs When no barycentric VGPRs are needed, we always enabled one of the pairs (e.g. PERSP_SAMPLE_ENA) because it's a HW requirement. However, the requirement says that LINE_STIPPLE_TEX_ENA can be enabled instead, which occupies only 1 VGPR. To get maximum pixel throughput, we can only have 2 initialized VGPRs at most. By reducing initialized VGPRs from 2 (with PERSP_SAMPLE_ENA) to 1 (with LINE_STIPPLE_TEX_ENA), we can have 1 additional initialized VGPR for free with maximum pixel throughput, such as POS_FIXED_PT for frag_coord.xy without MSAA. Only ACO gets this perf improvement because the change would be more complicated with LLVM. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38915>	2025-12-18 03:20:13 +00:00
Timothy Arceri	6592a18cd7	util/driconf: add workaround for Interstellar Rift This game sets the reset isolation bit which causes the GL context creation to fail as Mesa doesn't support the GLX_ARB_robustness_application_isolation extension. Here we override and clear the bit. The spec says: "The GLX_ARB_robustness_application_isolation and GLX_ARB_robustness_share_group_isolation extensions do not provide guarantees for graphics resets caused by applications which did not create their contexts with both the LOSE_CONTEXT_ON_RESET_ARB reset notification strategy and the GLX_CONTEXT_RESET_ISOLATION_BIT_ARB bit." And the game doesn't set LOSE_CONTEXT_ON_RESET_ARB so technically we could ignore the reset isolation bit even if Mesa did support the extension. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13336 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38668>	2025-12-17 23:35:25 +00:00
Timothy Arceri	67eeee43e0	driconf: add a way to override GLX_CONTEXT_RESET_ISOLATION_BIT_ARB This allows us to override and clear the reset isolation bit. It will be used in the following patch to override missing support for GLX_CONTEXT_RESET_ISOLATION_BIT_ARB. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38668>	2025-12-17 23:35:24 +00:00
Dylan Baker	f5351afbc8	docs: update calendar for 25.3.2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39010>	2025-12-17 22:11:17 +00:00
Dylan Baker	bb8d00e4b2	docs: Add checksums for 25.3.2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39010>	2025-12-17 22:11:17 +00:00
Dylan Baker	7e53a239aa	docs: add release notes for 25.3.2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39010>	2025-12-17 22:11:17 +00:00
Lucas Stach	075eefc422	etnaviv: blt: fix tile count calculation for in-place resolve An in-place resolve via the BLT engine is only supposed to fill the tiles of a single layer of a resource, so the size to calculate the number of tiles is the layer stride, same as done for the in-place resolve via the RS engine in `8df11f3fad` ("etnaviv: fix in-place resolve tile count.") CC: mesa-stable Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39005>	2025-12-17 21:55:13 +00:00
Emma Anholt	c00ebca5c4	ir3: Improve spilling of NIR vars to scratch. Previously, we would spill at the NIR level any temp array over 16 vec4s. This had two problems: 1) We wouldn't spill for the worst case scenario: a MAD accessing a dst array and 3 different src arrays (that all get fully unspilled, rather than just reloading the specific reg in the operand). This would fail to register allocate. We haven't seen this in practice. 2) We would spill vec4[17] and larger arrays that weren't necessary to get the shader to register allocate. This occurred on an FS in Stray that had a vec4[24] array and just 4 vec4s of register pressure other than the array. Instead, use NIR scratch spilling when the worst case set of vars to reference in an instruction would overflow GPR space. This makes the shader in Stray go from 11ms to .5ms, by eliminating all spilling and leaving the array in GPRs. On the other hand, if leaving the arrays unspilled in NIR means that we cause spilling in ir3, the fact that ir3 spills/reloads work on the whole array may cause the amount of spilling to increase.
However, we can see the effect is very small in terms of number of shaders affected in shader-db and an overwhelmingly positive effect on spills: MaxWaves: 22522470 -> 22520664 (-0.01%) Instrs: 396093281 -> 396122221 (+0.01%); split: -0.00%, +0.01% STPs: 218915 -> 182907 (-16.45%) LDPs: 155374 -> 153364 (-1.29%); split: -2.79%, +1.50% Totals from 496 (0.03% of 1561298) affected shaders: MaxWaves: 3792 -> 1986 (-47.63%) Instrs: 441224 -> 470164 (+6.56%); split: -0.00%, +6.57% CodeSize: 926164 -> 976734 (+5.46%); split: -0.05%, +5.52% NOPs: 58896 -> 52765 (-10.41%); split: -14.95%, +4.60% MOVs: 16314 -> 57901 (+254.92%) COVs: 3293 -> 5146 (+56.27%) Full: 12876 -> 23632 (+83.54%) (ss): 18613 -> 11573 (-37.82%); split: -47.53%, +9.71% (sy): 2539 -> 2505 (-1.34%); split: -10.75%, +9.41% (ss)-stall: 40682 -> 26413 (-35.07%); split: -47.90%, +12.80% (sy)-stall: 147862 -> 117004 (-20.87%); split: -37.65%, +16.69% STPs: 38566 -> 2558 (-93.37%) LDPs: 5060 -> 3050 (-39.72%); split: -85.77%, +45.93% Cat0: 65593 -> 59487 (-9.31%); split: -13.42%, +4.15% Cat1: 19667 -> 63105 (+220.87%) Cat2: 155958 -> 157879 (+1.23%); split: -0.05%, +1.28% Cat6: 105228 -> 94910 (-9.81%); split: -12.36%, +2.54% Cat7: 2480 -> 2485 (+0.20%); split: -0.08%, +0.28% Subgroup size: 31872 -> 31744 (-0.40%) The primary impacted application from shader-db is gfxbench aztec ruins. A quick test of it showed no significant performance improvement (n=3). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Emma Anholt	0d9428736b	ir3/ra: Make a helper to get RA register pressure limits. I'll be reusing this to let vars_to_scratch keep bigger arrays in register space. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Emma Anholt	d5cb38e457	ir3: Move the compute shader threadsize forcing earlier. With this, we can look at real_wavesize while running NIR passes and know if we have to be doubled because of the shader info coming in. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Emma Anholt	5a09abe890	nir: Introduce nir_lower_vars_to_scratch_global(). This lets the driver make a more informed decision about which vars to lower to scratch based on the vars available to spill. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Emma Anholt	059d301c79	nir: Drop the mode argument of nir_lower_vars_to_scratch(). It only makes sense for function temps, and that's the only way it's been used. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Yiwei Zhang	962bed2dd6	vulkan: update ALLOWED_ANDROID_VERSION for api level 36 Reviewed-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38988>	2025-12-17 19:22:47 +00:00
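
The leak described in ffbe6470a2 ("anv: fix a memory leak in slab allocator.") can be reproduced with a toy model. This is only a sketch of the mechanism the commit message describes, not the anv or pb_slab code: `ToySlab`, `Entry`, `Group`, `order_of` and `churn` are hypothetical names, and the recycling policy (drain `reclaim` into `free` on the next allocation) is an assumption made for illustration.

```python
from dataclasses import dataclass, field


def order_of(size: int) -> int:
    """Exponent of the smallest power of two that is >= size."""
    return max(size - 1, 0).bit_length()


@dataclass
class Entry:
    size: int          # size the caller asked for
    actual_size: int   # size actually reserved once alignment is applied


@dataclass
class Group:
    free: list = field(default_factory=list)
    reclaim: list = field(default_factory=list)


class ToySlab:
    """Hypothetical stand-in for a slab allocator with per-order groups."""

    def __init__(self, use_actual_size: bool):
        self.use_actual_size = use_actual_size
        self.groups = {}
        self.created = 0  # number of brand-new entries ever carved out

    def _group(self, size: int) -> Group:
        return self.groups.setdefault(order_of(size), Group())

    def alloc(self, requested: int, alignment: int) -> Entry:
        actual = max(requested, alignment)
        group = self._group(actual)
        # Recycle entries reclaimed into this group before carving new ones.
        group.free.extend(group.reclaim)
        group.reclaim.clear()
        if group.free:
            entry = group.free.pop()
            entry.size = requested
            return entry
        self.created += 1
        return Entry(size=requested, actual_size=actual)

    def free(self, entry: Entry) -> None:
        # Buggy variant keys on the requested size, fixed one on actual_size.
        key = entry.actual_size if self.use_actual_size else entry.size
        self._group(key).reclaim.append(entry)


def churn(slab: ToySlab, rounds: int = 1000) -> int:
    """Allocate and free the same small, highly aligned buffer repeatedly."""
    for _ in range(rounds):
        slab.free(slab.alloc(requested=4, alignment=65536))
    return slab.created
```

With `use_actual_size=False`, every freed entry lands in the reclaim list of the size-4 group, so the 65536 group never sees it again and carves a fresh entry on each round. With `use_actual_size=True`, the same entry is reused for every round.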


216355 commits
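
The bug fixed in d7d690b47f ("panvk: Fix set_compute_sysval()") is a range call whose end argument was given the start value. A minimal sketch of the effect, using a Python integer as the bitset: `set_range` is a hypothetical helper whose `end` is assumed to be inclusive, and the slot numbers `FAU_START` and `FAU_END` are made-up values, not taken from panvk.

```python
def set_range(bits: int, start: int, end: int) -> int:
    """Return bits with every position in [start, end] set (end inclusive)."""
    if end < start:
        raise ValueError("end must not precede start")
    width = end - start + 1
    return bits | (((1 << width) - 1) << start)


# Hypothetical sysval range; the real values come from the driver.
FAU_START, FAU_END = 8, 11

buggy = set_range(0, FAU_START, FAU_START)  # start passed as the 3rd argument
fixed = set_range(0, FAU_START, FAU_END)    # end passed as the 3rd argument
```

Under these assumptions the buggy call marks a single slot while the fixed call marks the whole range, so any sysval spanning more than one slot would be only partly flagged.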