Commit graph

3907 commits

Author SHA1 Message Date
Georg Lehmann
ec331cc48a nir: replace lower_ldexp with has_ldexp
I can be bothered to fix all the backends that don't set lower_ldexp,
and only two backends have ldexp anyway.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>
2026-03-20 08:15:08 +00:00
Karol Herbst
f3d091439a nak: make nak_mem_vectorize_cb create only aligned and supported vectors
While the idea of being aggressive sounds like a good one, in practise it
creates vectorized load/stores that are not optimal.

This makes it so that we only ever create aligned and supported vector
sizes that prevents those issues.

Totals:
CodeSize: 8662362848 -> 8662362240 (-0.00%); split: -0.00%, +0.00%
Number of GPRs: 47508046 -> 47508014 (-0.00%)
Static cycle count: 4713321839 -> 4713285952 (-0.00%); split: -0.00%, +0.00%
Spills to memory: 45073 -> 45061 (-0.03%)
Fills from memory: 45073 -> 45061 (-0.03%)
Max warps/SM: 50564816 -> 50564832 (+0.00%)

Totals from 689 (0.06% of 1163204) affected shaders:
CodeSize: 26314320 -> 26313712 (-0.00%); split: -0.02%, +0.02%
Number of GPRs: 60914 -> 60882 (-0.05%)
Static cycle count: 156504342 -> 156468455 (-0.02%); split: -0.05%, +0.02%
Spills to memory: 15453 -> 15441 (-0.08%)
Fills from memory: 15453 -> 15441 (-0.08%)
Max warps/SM: 18640 -> 18656 (+0.09%)

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40293>
2026-03-18 12:13:03 +00:00
Mary Guillemard
d00965651a nvk: Broacast viewport0 and scissor0 in case of FSR on Turing
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
On Turing, the hardware rely on the viewport index for FSR.
If not all viewports are defined, we will end up not rendering
anything when selecting the primitive shading rate.

This patch makes it that we now broadcast the viewport and scissor 0
likes the proprietary driver.

This fixes "dEQP-VK.mesh_shader.ext.builtin.primitive_shading_rate_*" on
Turing.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 2fb4aed9 ("nvk: Advertise VK_KHR_fragment_shading_rate")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40314>
2026-03-18 08:14:17 +00:00
Mary Guillemard
56e31d8145 nvk: Move viewport and scissor emit to their own function
We are going to need to reuse those functions to fix FSR support on
Turing.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40314>
2026-03-18 08:14:17 +00:00
Karol Herbst
21aac29da8 nak: vectorize f2f16 even more
Totals:
CodeSize: 8662212288 -> 8662208848 (-0.00%)
Static cycle count: 4713275320 -> 4713273530 (-0.00%)

Totals from 91 (0.01% of 1163204) affected shaders:
CodeSize: 1936288 -> 1932848 (-0.18%)
Static cycle count: 644443 -> 642653 (-0.28%)

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40298>
2026-03-18 05:22:06 +00:00
Karol Herbst
b5a2685cf8 nak: vectorize f2f16
Totals:
CodeSize: 8662332112 -> 8662212288 (-0.00%); split: -0.00%, +0.00%
Number of GPRs: 47508046 -> 47507734 (-0.00%); split: -0.00%, +0.00%
SLM Size: 1203000 -> 1202992 (-0.00%)
Static cycle count: 4713330409 -> 4713275320 (-0.00%); split: -0.00%, +0.00%
Spills to memory: 45073 -> 45059 (-0.03%)
Fills from memory: 45073 -> 45059 (-0.03%)
Max warps/SM: 50564816 -> 50564980 (+0.00%)

Totals from 1498 (0.13% of 1163204) affected shaders:
CodeSize: 20737136 -> 20617312 (-0.58%); split: -0.63%, +0.05%
Number of GPRs: 97659 -> 97347 (-0.32%); split: -0.33%, +0.01%
SLM Size: 13104 -> 13096 (-0.06%)
Static cycle count: 100260225 -> 100205136 (-0.05%); split: -0.17%, +0.11%
Spills to memory: 262 -> 248 (-5.34%)
Fills from memory: 262 -> 248 (-5.34%)
Max warps/SM: 50504 -> 50668 (+0.32%)

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40298>
2026-03-18 05:22:06 +00:00
Karol Herbst
f2fa7d0e9c nak: allow vector sources for f2f16 conversions
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40298>
2026-03-18 05:22:06 +00:00
Karol Herbst
458923803d nak: enable vectorize_vec2_16bit
This is intended for backends that do SIMD within a register, like we do.

Helps with register pressure. This will also prevent f2f from being
scalarized, which will help on Ampere+ as a later patch will use F2FP for
those.

Totals:
CodeSize: 8662362848 -> 8662332112 (-0.00%); split: -0.00%, +0.00%
Static cycle count: 4713321839 -> 4713330409 (+0.00%); split: -0.00%, +0.00%
Spills to reg: 149117 -> 149128 (+0.01%)
Fills from reg: 170680 -> 170693 (+0.01%)

Totals from 19 (0.00% of 1163204) affected shaders:
CodeSize: 732208 -> 701472 (-4.20%); split: -4.22%, +0.02%
Static cycle count: 1670226 -> 1678796 (+0.51%); split: -0.10%, +0.61%
Spills to reg: 517 -> 528 (+2.13%)
Fills from reg: 486 -> 499 (+2.67%)

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40298>
2026-03-18 05:22:05 +00:00
Faith Ekstrand
381bc06c4a nak: Report progress from nak_nir_rematerialize_load_const()
Fixes: 8fffcdb18b ("nak/nir: Re-materialize load_const instructions in use blocks")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40464>
2026-03-17 16:11:38 +00:00
Mary Guillemard
ef8fd44b5f nvk: Validate push constant offset in nvk_root_descriptor_table
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We requires this to be aligned to a 8 byte granuality.
This is something that came up with mesh shader enablement so let's
avoid this footgun with some assertion.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40407>
2026-03-13 18:04:42 +00:00
Mary Guillemard
37d73fa4f3 nvk/mme: Enable testing for Kepler
Test Kepler as it was commented out but everything is
running fine.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40407>
2026-03-13 18:04:42 +00:00
Mary Guillemard
4fa2f6e0b3 nvk: Put nvk_mme in the nouveau test suite
Not sure why it was missing but it should be part of it.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40407>
2026-03-13 18:04:42 +00:00
Mary Guillemard
32895657b4 nvk/mme: Add missing nullcheck in nvk_mme_test_state_state
Needed for some FSR macro changes I want to test.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 7d6cc15ab8 ("nvk/mme: Add a unit test framework for driver macros")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40407>
2026-03-13 18:04:42 +00:00
Mary Guillemard
73dba1e151 nir, nvk, nak: Add base to isbewr_nv and isberd_nv
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
On SM86+, we can use a 16-bit unsigned offset along side the register
for it.

This adds a new base indice that will be used for it, integration with
nir_opt_offsets and a lowering pass to get ride of the base on
unsupported generations.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>
2026-03-11 19:41:34 +00:00
Mary Guillemard
1a46233a07 nak/nvdisasm_tests: Test ISBERD and ISBEWR
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>
2026-03-11 19:41:34 +00:00
Mary Guillemard
63a9a5e921 nak: Implement ISBEWR and extend ISBERD implementation
ISBERD/ISBEWR allow raw manipulation of the various ISBE spaces
where attributes are stored.

This extends the implementation of ISBERD to support the additional
elements added in its intrinsic and implement ISBEWR intrinsic while
extending the ISBE space sharing detection pass.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>
2026-03-11 19:41:34 +00:00
Mary Guillemard
a1996f6985 nak: Legalize ISBERD
This instruction can only take GPRs.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>
2026-03-11 19:41:34 +00:00
Mary Guillemard
6a8d09972e nir: Add isbewr_nv intrinsic and extends isberd_nv
Adds a new intrinsic allowing to do raw write in the various ISBE spaces
where attributes are stored.

This also adapt isberd_nv to map to what we have since SM70+.

This will be used to support mesh shaders.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>
2026-03-11 19:41:33 +00:00
Karol Herbst
bd552b41cc nvk: skip lowering load_global_constant_bounded on turing inside lower_load_intrinsic
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40272>
2026-03-10 00:10:05 +00:00
Karol Herbst
f7ad45e5fc nak: support has_load_global_bounded on turing and newer
Totals:
CodeSize: 9401446416 -> 8663482432 (-7.85%); split: -7.85%, +0.00%
Number of GPRs: 47297665 -> 47508294 (+0.45%); split: -0.14%, +0.59%
SLM Size: 1202912 -> 1203000 (+0.01%); split: -0.09%, +0.10%
Static cycle count: 5984801035 -> 4714013561 (-21.23%); split: -21.24%, +0.00%
Spills to memory: 44482 -> 45073 (+1.33%); split: -1.68%, +3.01%
Fills from memory: 44482 -> 45073 (+1.33%); split: -1.68%, +3.01%
Spills to reg: 184822 -> 149129 (-19.31%); split: -21.54%, +2.23%
Fills from reg: 223885 -> 170692 (-23.76%); split: -25.49%, +1.73%
Max warps/SM: 50642520 -> 50564740 (-0.15%); split: +0.03%, -0.19%

Totals from 185510 (15.95% of 1163204) affected shaders:
CodeSize: 3910084048 -> 3172120064 (-18.87%); split: -18.88%, +0.01%
Number of GPRs: 10625243 -> 10835872 (+1.98%); split: -0.63%, +2.61%
SLM Size: 659568 -> 659656 (+0.01%); split: -0.17%, +0.19%
Static cycle count: 3920553863 -> 2649766389 (-32.41%); split: -32.42%, +0.01%
Spills to memory: 8498 -> 9089 (+6.95%); split: -8.81%, +15.77%
Fills from memory: 8498 -> 9089 (+6.95%); split: -8.81%, +15.77%
Spills to reg: 109049 -> 73356 (-32.73%); split: -36.51%, +3.77%
Fills from reg: 116031 -> 62838 (-45.84%); split: -49.18%, +3.34%
Max warps/SM: 6885584 -> 6807804 (-1.13%); split: +0.25%, -1.38%

This also helps significantly reduce shader compile times since it reduces
the number of basic blocks.  With DragonAge: The Veilguard, it reduces
shader compile times by around 20%.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40272>
2026-03-10 00:10:05 +00:00
Karol Herbst
7722bde53b nak: use ldg input predicate in nak_nir_lower_non_uniform_ldcx
Totals:
CodeSize: 9442133184 -> 9401446416 (-0.43%); split: -0.43%, +0.00%
Number of GPRs: 47300490 -> 47297665 (-0.01%); split: -0.01%, +0.00%
Static cycle count: 6120907718 -> 5984801035 (-2.22%); split: -2.22%, +0.00%
Spills to reg: 184810 -> 184822 (+0.01%); split: -0.01%, +0.02%
Fills from reg: 223860 -> 223885 (+0.01%); split: -0.01%, +0.02%
Max warps/SM: 50641540 -> 50642520 (+0.00%); split: +0.00%, -0.00%

Totals from 12079 (1.04% of 1163204) affected shaders:
CodeSize: 461892048 -> 421205280 (-8.81%); split: -8.81%, +0.00%
Number of GPRs: 1060493 -> 1057668 (-0.27%); split: -0.43%, +0.16%
Static cycle count: 922257513 -> 786150830 (-14.76%); split: -14.76%, +0.00%
Spills to reg: 14704 -> 14716 (+0.08%); split: -0.14%, +0.22%
Fills from reg: 24213 -> 24238 (+0.10%); split: -0.08%, +0.19%
Max warps/SM: 320540 -> 321520 (+0.31%); split: +0.39%, -0.08%

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40272>
2026-03-10 00:10:05 +00:00
Karol Herbst
9d90cbc314 nak: add input predicate to load_global_nv and OpLd
This is new in SM75 (Turing). Let's use it because it allows us to get rid
of the if/else around bound checked global loads.

There are some changes in fossils, but it seems that's mostly due to CFG
optimizations doing things a bit differently?

Totals:
CodeSize: 9442152688 -> 9442133184 (-0.00%); split: -0.00%, +0.00%
Static cycle count: 6120910991 -> 6120907718 (-0.00%); split: -0.00%, +0.00%
Spills to reg: 184789 -> 184810 (+0.01%)
Fills from reg: 223831 -> 223860 (+0.01%); split: -0.00%, +0.01%

Totals from 334 (0.03% of 1163204) affected shaders:
CodeSize: 22020752 -> 22001248 (-0.09%); split: -0.10%, +0.01%
Static cycle count: 26582978 -> 26579705 (-0.01%); split: -0.01%, +0.00%
Spills to reg: 3110 -> 3131 (+0.68%)
Fills from reg: 3401 -> 3430 (+0.85%); split: -0.03%, +0.88%

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40272>
2026-03-10 00:10:05 +00:00
Karol Herbst
d2bf824baf nak: replace legalize_ext_instr with explicit lowering
legalize_ext_instr wasn't doing anything besides lowering uniform sources
and panicing on a bunch of Source types.

Having a common helper looping over all sources doesn't make much sense,
because all the instructions are widly different in regards to UGPRs. The
panics will be hit while emitting the sources as well, so this helper
provided little help and wasn't flexible enough for what we need.

Furthermore some instructions like LDG also take an additional input
predicate that legalize_ext_instr can't handle.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40272>
2026-03-10 00:10:04 +00:00
Karol Herbst
95f19bd5eb nak: invalidate loop analysis with nak_nir_lower_load_store
We'll start to lower load_global_bounded there and that will invalidate
loop analysis, because the amount of instructions will change within a
block.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40272>
2026-03-10 00:10:04 +00:00
Mel Henning
1371c53e6a nvk: Expose VK_KHR_depth_clamp_zero_one
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Promoted from EXT

Reviewed-By: Thomas H.P. Andersen <phomes@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39812>
2026-03-08 17:16:26 -04:00
Mel Henning
8e2707950b nvk: Use the MME for cond rendering on Turing+
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We can avoid the stalls from subc switches by avoiding using the copy engine
during vkCmdBeginConditionalRenderingEXT. Implement this by loading the
cond render value using the MME, since the hardware doesn't have a
suitable 32-bit comparison itself.

This brings the Sascha Willems conditionalrender demo from
from 1661 to 8334 fps on my blackwell system with all meshes disabled.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40277>
2026-03-08 17:31:32 +00:00
Mel Henning
905557ab31 nvk: Use SET_GLOBAL_RENDER_ENABLE
This brings the Sascha Willems conditionalrender demo from
927 to 1661 fps on my blackwell system with all meshes disabled.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40277>
2026-03-08 17:31:32 +00:00
Karol Herbst
65c5c4e1a1 nvk: run nir_opt_large_constants before nir_lower_load_const_to_scalar
nir_opt_large_constants isn't able to deal with complex derefs and
nir_lower_load_const_to_scalar e.g. splits up vectors to scalars.

This prevented nir_opt_large_constants from extracting large constants
in shaders that e.g. use a array of vector constant table.

Totals:
CodeSize: 9460341008 -> 9443435056 (-0.18%); split: -0.20%, +0.02%
Number of GPRs: 47363466 -> 47300498 (-0.13%); split: -0.13%, +0.00%
SLM Size: 5409320 -> 1202912 (-77.76%)
Static cycle count: 6130972462 -> 6121193466 (-0.16%); split: -0.20%, +0.04%
Spills to reg: 184840 -> 184828 (-0.01%); split: -0.01%, +0.01%
Fills from reg: 223889 -> 223874 (-0.01%); split: -0.01%, +0.00%
Max warps/SM: 50637796 -> 50641540 (+0.01%); split: +0.01%, -0.00%

Totals from 32429 (2.79% of 1163204) affected shaders:
CodeSize: 824883920 -> 807977968 (-2.05%); split: -2.25%, +0.20%
Number of GPRs: 2413077 -> 2350109 (-2.61%); split: -2.61%, +0.00%
SLM Size: 4437016 -> 230608 (-94.80%)
Static cycle count: 1208715713 -> 1198936717 (-0.81%); split: -1.02%, +0.21%
Spills to reg: 11934 -> 11922 (-0.10%); split: -0.20%, +0.10%
Fills from reg: 14118 -> 14103 (-0.11%); split: -0.14%, +0.04%
Max warps/SM: 1035736 -> 1039480 (+0.36%); split: +0.37%, -0.01%

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14993
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Tested-by: Thomas H.P. Andersen <phomes@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40282>
2026-03-07 23:21:40 +00:00
Karol Herbst
faea742c3a nouveau/drm-shim: implement get_zcull_info
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Tested-by: Thomas H.P. Andersen <phomes@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40282>
2026-03-07 23:21:40 +00:00
Eric Engestrom
91b4341e61 nvk/ci: document recent flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40243>
2026-03-05 18:29:34 +00:00
Eric Engestrom
53ebb59e30 nvk/ci: update expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40243>
2026-03-05 18:29:29 +00:00
Mel Henning
102fa924c2 nvk: Remove unused cmd.tls_space_needed
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33861>
2026-02-25 22:42:07 +00:00
Mel Henning
c24963d8da nvk: Enable zcull for VK_ATTACHMENT_LOAD_OP_LOAD
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33861>
2026-02-25 22:42:06 +00:00
Mel Henning
5e04c965de nvk: Enable basic zcull support
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12596
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33861>
2026-02-25 22:42:06 +00:00
Mel Henning
0920e0afb5 nil: Add zcull support
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33861>
2026-02-25 22:42:06 +00:00
Mel Henning
88cc4df9a3 nouveau/headers: Preserve _ before 0-9 in to_camel
This makes some zcull enums a lot more readable - previously Z_4X8_2X2
would become Z4X82X2 but this gives us Z4x8_2x2 instead.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33861>
2026-02-25 22:42:06 +00:00
Mel Henning
76f9a51660 nouveau/winsys: Fetch zcull_info on device create
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33861>
2026-02-25 22:42:06 +00:00
Rhys Perry
f44de53586 nir: only set fp_math_ctrl if meaningful
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809>
2026-02-18 14:04:22 +00:00
Marek Olšák
61a96be494 nir/lower_non_uniform_access: add an option not to lower tex & image queries
AMD can do non-uniform queries. The RADV change will be in a separate commit.

NFC for drivers.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743>
2026-02-16 12:59:36 +00:00
Thomas H.P. Andersen
331af5e746 nvk: add app workaround layer
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This adopts the device internal app workaround layer from radv

The layer allows to fix up game input in the layer instead of
adding workarounds within the driver.

Initially this only includes the workaround for Metro exodus as
I have verified that it fixes a crash on NVK. Follow up commits
can add the other relevant workarounds when the fixes are verified
to be needed for NVK.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39870>
2026-02-14 08:33:11 +00:00
Thomas H.P. Andersen
0a6509e94c nvk: prepare for driver internal layers
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39870>
2026-02-14 08:33:11 +00:00
Mel Henning
cbec12627b nvk: VK_KHR_copy_memory_indirect
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39869>
2026-02-13 20:53:47 +00:00
Mel Henning
6824004a0b nouveau/headers: Add P_IMMD_WORD()
P_IMMD() is annoying because it always uses two words in unoptimized
builds but sometimes uses one word with optimizations on. This can make
it difficult to allocate the correct pushbuf size.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39869>
2026-02-13 20:53:47 +00:00
Mel Henning
47e7119796 nouveau/headers: Use UINT64_C in drf.h
These ULL literals are all meant to refer to 64-bit types. Use the
UINT64_C macro so they work on platforms with 128-bit long long.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39869>
2026-02-13 20:53:47 +00:00
Mel Henning
d133ff8313 nouveau/headers: Don't use 128-bit comparisons
These always fit in 64-bits so there's no need for the wide constant.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39869>
2026-02-13 20:53:47 +00:00
Mel Henning
407bdcdf27 nvk: Don't include u_math.h in generated headers
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39869>
2026-02-13 20:53:47 +00:00
Lorenzo Rossi
c7e2b84661 nvk,nak: Add nir_printf_fmt
This helps debugging, a LOT.

v2 (mhenning): Use nak_constant_offset_info,
               enable based on NDEBUG, and
               fix use after free in nir pass

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39869>
2026-02-13 20:53:47 +00:00
Mel Henning
0716055734 nak: Remove some unused fs_key parameters
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39869>
2026-02-13 20:53:47 +00:00
Mel Henning
013b21d52f nvk,nak: Store offsets in a const extern struct
Move some of this constant data out of fs_key and into a constant
struct. This reduces the size of the fs_key and gives us a spot to add
additional non-fs offsets like printf buffers.

Passing this through a const global may be a little odd but it has the
benefit that we don't need to hash the offsets as additinal state
relevant to a compilation, since they can never change without
a modification to the binary.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39869>
2026-02-13 20:53:47 +00:00
Mary Guillemard
0ea139523f nvk: Early return in draw commands when no draw will be performed
Follows what other drivers do, if we do not need to emit a draw, just
bail out early.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39783>
2026-02-12 17:44:56 +00:00