fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-28 03:28:10 +02:00

Author	SHA1	Message	Date
Job Noorman	c784af5ca0	ir3: always use byte offset for @load/store_global_ir3 Before a7xx, ldg/stg.a use an offset in units of their type size while on a7xx and later, the offset is always in bytes. Currently, @load/store_global_ir3 take their offset in dwords (32-bits). This has a few downsides: offsets need an extra shl during codegen on a7xx and addressing sub-dword-aligned addresses is only possible by doing 64-bit math on the base address. Improve the situation by always using a byte offset for @load/store_global_ir3 and adding the offset_shift index to support type units pre-a7xx. While we're at it, add the base index as well to support all ldg/stg.g features in @load/store_global_ir3. Supporting these renewed intrinsics consists of two parts: - ir3_nir_lower_io_offsets legalizes the offset_shift on a6xx: for ldg.a/stg.a, the offset has to be in units of the type size so extra shifts are inserted to accomplish this if necessary. On a7xx, offsets are always in bytes so nothing needs to be done. - The intrinsics are emitted as ldg/stg if the offset is a small enough constant and as ldg.a/stg.a otherwise. a6xx supports an extra shift for ldg.a/stg.a that only applies to the GPR offset (not the immediate base); NIR is pattern matched at this point to extract this if possible. All users of @load/store_global_ir3 are updated to generate the offset in units of bytes. ir3_nir_analyze_ubo_ranges is updated to take the new offset_shift into account. Totals from 2029 (1.15% of 176266) affected shaders: MaxWaves: 26728 -> 26660 (-0.25%); split: +0.01%, -0.26% Instrs: 1314089 -> 1278603 (-2.70%); split: -2.72%, +0.02% CodeSize: 2739108 -> 2633236 (-3.87%); split: -3.87%, +0.01% NOPs: 197537 -> 200843 (+1.67%); split: -1.62%, +3.30% MOVs: 43771 -> 44025 (+0.58%); split: -1.11%, +1.69% Full: 31849 -> 31948 (+0.31%); split: -0.03%, +0.34% (ss): 37965 -> 42027 (+10.70%); split: -3.47%, +14.17% (sy): 13752 -> 13566 (-1.35%); split: -4.04%, +2.68% (ss)-stall: 154238 -> 170353 (+10.45%); split: -1.72%, +12.16% (sy)-stall: 804442 -> 806518 (+0.26%); split: -4.65%, +4.91% Preamble Instrs: 326728 -> 293488 (-10.17%) Cat0: 217926 -> 220947 (+1.39%); split: -1.58%, +2.96% Cat1: 50182 -> 50446 (+0.53%); split: -0.97%, +1.49% Cat2: 460987 -> 452101 (-1.93%); split: -2.26%, +0.33% Cat3: 390696 -> 361271 (-7.53%) Cat7: 39148 -> 38688 (-1.18%); split: -1.24%, +0.06% Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41342>	2026-05-05 06:25:49 +02:00
Faith Ekstrand	84bbfaa7e5	pan/bi: Delete the old texel buffer intrinsics Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:16 +00:00
Faith Ekstrand	7d5cb2884c	pan/bi: Allow setting the table on lea_attr_pan Also allow us to set AUTO32 while we're at it. Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:16 +00:00
Faith Ekstrand	2369808cd1	pan,nir: Add Bifrost texturing intrinsics These are funky enough that they make more sense as intrinsics than texture opcodes. Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:16 +00:00
Faith Ekstrand	337aaa0ab9	pan,nir: Add cube face intrinsics Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:15 +00:00
Marek Olšák	e49f29f25e	nir: add frag_coord_xy to strengthen and simplify pixel_coord lowering Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>	2026-05-03 13:03:00 +00:00
Simon Perretta	57791c4a99	pco: track how many tg4/raw sample comps are needed Rather than always emitting and swizzling 16 components for raw samples, scale it by the number actually needed as defined by the selected tg4 channel/components. Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>	2026-04-28 12:04:03 +01:00
Simon Perretta	af1669d9e2	pco: reserve additional outputs for trilinear sampled coeffs Sampling coeffs with trilinear filtering will output 2x sets of data. Whether bilinear or trilinear filtering is in use can't be determined without checking state words, so unconditionally reserve 2x to avoid clobbering output regs. Fixes: `7df32ba09d` ("pco: initial texture/sampler compiler support") Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Frank Binns <frank.binns@imgtec.com> Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41051>	2026-04-27 11:32:29 +00:00
squidbus	a41f0e62bb	asahi,nir: Move asahi dynamic clipz pass to common. Acked-by: Alyssa Rosenzweig <alyssa@rosenz.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41088>	2026-04-27 11:00:59 +00:00
Patrick Lerda	9815901f86	r600: implement tes and tcs instanced gl_PrimitiveID support This change extends r600_lds_constant_buffer to implement a fully conformant gl_PrimitiveID at the tes and tcs stages. This change was tested on cayman and barts. Here are the tests fixed: spec/arb_tessellation_shader/execution/tcs-primitiveid-instanced: fail pass spec/arb_tessellation_shader/execution/tes-no-tcs-primitiveid-instanced: fail pass spec/arb_tessellation_shader/execution/tes-primitiveid-instanced: fail pass khr-gl4[4-6]/tessellation_shader/tessellation_shader_tessellation/gl_invocationid_patchverticesin_primitiveid: fail pass khr-gles31/core/tessellation_shader/tessellation_shader_tessellation/gl_invocationid_patchverticesin_primitiveid: fail pass khr-glesext/tessellation_shader/tessellation_shader_tessellation/gl_invocationid_patchverticesin_primitiveid: fail pass Signed-off-by: Patrick Lerda <patrick9876@free.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40297>	2026-04-20 13:21:55 +00:00
Samuel Pitoiset	0b016f4bff	nir: add new system values for descriptor heap RT traversal inputs Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39483>	2026-04-14 10:10:22 +00:00
Alyssa Rosenzweig	4356ad1bf5	nir: add pixel_coord_intel This is a 2x16 bitpacked version of load_pixel_coord which maps directly to the hardware value and is much easier for Jay to consume due to the sadness that is true 16-bit on Intel. Jay will lower to this internally. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Alyssa Rosenzweig	bd6d210386	nir: add shuffle_intel Jay will use this to lower & optimize subgroup shuffles. This is closer to how Intel hardware works but still much higher level than the hardware primitive. This gets us NIR optimizations on the multiply however. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Alyssa Rosenzweig	b840b178af	nir: add Intel RT write intrinsic This exposes the underlying render target write message directly, which Jay will use to lower RT writes in NIR. I'm still on the fence about what exactly this should look like but this is good enough for GLES3.0 (so, multiple render targets but not necessarily dual source blending). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Alyssa Rosenzweig	566047222e	nir: add frag_coord_w_rcp intrinsic This maps directly to what Intel's thread payload gives us, allowing us to optimize out frcp's in some cases. Jay will use this. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>	2026-04-10 18:21:21 +00:00
Kenneth Graunke	0b99c88337	nir, brw: lower scratch in NIR This will let us share a common scratch swizzling between brw and jay. Changes by Ken: - Use an immediate SIMD width when known so we don't need to re-lower - Switch to load_simd_width_intel because it may not match info->api_subgroup_size on Vulkan without VK_EXT_subgroup_size_control - Stop using DWord Scattered Write messages for scratch. These take an offset in DWords, and our offsets are now always in bytes. This also means that we no longer create MEMORY_OPCODE_* IR with inconsistent units of either bytes or dwords. Yikes. We use byte scattered messages now. fossil-db stats on Battlemage: Instrs: 500477504 -> 500450056 (-0.01%); split: -0.01%, +0.00% CodeSize: 7807432368 -> 7806786192 (-0.01%); split: -0.01%, +0.00% Cycle count: 62404008370 -> 62398437734 (-0.01%); split: -0.01%, +0.00% Fill count: 546690 -> 546695 (+0.00%); split: -0.00%, +0.00% Max live registers: 141257956 -> 141258100 (+0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 72350283 -> 72336544 (-0.02%) Totals from 99 (0.01% of 1581969) affected shaders: Instrs: 366593 -> 339145 (-7.49%); split: -7.58%, +0.09% CodeSize: 6425936 -> 5779760 (-10.06%); split: -10.06%, +0.00% Cycle count: 2412009876 -> 2406439240 (-0.23%); split: -0.26%, +0.03% Fill count: 19675 -> 19680 (+0.03%); split: -0.02%, +0.04% Max live registers: 17600 -> 17744 (+0.82%); split: -0.09%, +0.91% Non SSA regs after NIR: 37894 -> 24155 (-36.26%) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>	2026-04-09 21:02:16 +00:00
Samuel Pitoiset	a41724e923	nir: remove nir_intrinsic_global_addr_to_descriptor It's no longer emitted by vk_nir_lower_descriptor_heaps(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40826>	2026-04-08 09:46:01 +00:00
Samuel Pitoiset	74aa40f6ed	nir: remove resource/sampler heap ptrs sysvals They are no longer used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40768>	2026-04-07 18:55:49 +00:00
Lionel Landwerlin	22b16d54ab	nir: add heap variant of load_param_intel Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40729>	2026-04-01 12:56:43 +00:00
Samuel Pitoiset	9d059a60f5	nir: introduce nir_descriptor_type for Vulkan like descriptors This removes a Vulkan dependency in NIR core. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40670>	2026-03-31 07:16:20 +00:00
Konstantin Seurer	b127c11be9	spirv,nir: Preserve more information about the descriptor type Descriptor heap mappings need the information to selectively apply mappings (descriptor type masks). Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>	2026-03-30 06:51:25 +00:00
Faith Ekstrand	f117b81435	nir: Add intrinsics for descriptor heaps Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>	2026-03-30 06:51:22 +00:00
Faith Ekstrand	c29d8dd4ff	nir: Add sampler and resource heap system values Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40649>	2026-03-30 06:51:20 +00:00
Lorenzo Rossi	c0e0591999	pan/compiler: Replace frag_coord_zw_pan with var_special_pan Just a bit cleaner, and we can unify point size too. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40677>	2026-03-27 19:23:02 +00:00
Georg Lehmann	0d8e2354ed	nir: add fp_math_ctrl to convert_alu_types Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:50 +00:00
Georg Lehmann	35ca85176c	nir: add fp_math_ctrl to cmat alu ops Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:50 +00:00
Georg Lehmann	5d2be211ea	nir: add fp_math_ctrl to ddx/ddy Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:49 +00:00
Georg Lehmann	854911aeab	nir: add fp_math_ctrl as intrinsic index Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>	2026-03-26 13:15:49 +00:00
Marek Olšák	2283244975	nir: change export_amd intrinsics to use target instead of base Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>	2026-03-23 06:10:49 +00:00
Marek Olšák	b75a3112fd	nir: change export_amd intrinsics to use enabled_channels instead of write_mask Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>	2026-03-23 06:10:49 +00:00
Connor Abbott	ec37fed52b	tu, ir3, nir: Plumb through driver param for alpha-to-coverage We will need this when alpha-to-coverage is dynamic and we need to emulate it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>	2026-03-20 18:09:49 +00:00
Faith Ekstrand	3418525a82	pan/bi: Lower VS outputs in NIR Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:32 +00:00
Lorenzo Rossi	c730e41ed5	pan/bi: Add is_psiz_store flag in bi_instr This removes the previous hack that searched the psiz write by looking for 16-bit stores with the correct pseudo segment. We also add a new intrinsic that mimics global stores but tags psiz writes; this will be used later in the series. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:30 +00:00
Faith Ekstrand	de338dc908	pan,nir: Rework converted_mem_pan intrinsics First, rename them to make them a bit more clear. They act on global memory so they should be _global and they map to ld/st_cvt so _cvt is nice and obvious. Second, they don't need IO semantics as they're not IO. But they do need ACCESS so that we can better control things like CAN_REORDER. Third, add a src_type to store_global_cvt even though it won't be used just yet because we'll want it for lowering VS stores. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:29 +00:00
Faith Ekstrand	d2f430bea9	pan/bi: Add new FS input load intrinsics Unlike load[_interpolated]_input, which has to deal with all sorts of ABI nonsense between driver and compiler, these new intrinsics are dumber than bricks. They're literally just the HW ops as NIR intrinsics. These will allow us to do the lowering in NIR and put the driver in total control over what goes down what path. Among other things, a driver could choose to lower some things to ld_var and others to ld_var_buf. Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>	2026-03-19 11:25:28 +00:00
Caio Oliveira	a2cbdfbde3	nir: Add intrinsics for ShuffleUpINTEL and ShuffleDownINTEL Move lowering to nir_lower_subgroups. At some point Intel backend might want to skip that and lower at the backend IR boundary, but for now lowering always applies. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40376>	2026-03-17 17:21:52 +00:00
Mary Guillemard	73dba1e151	nir, nvk, nak: Add base to isbewr_nv and isberd_nv On SM86+, we can use a 16-bit unsigned offset alongside the register for it. This adds a new base index that will be used for it, integration with nir_opt_offsets and a lowering pass to get rid of the base on unsupported generations. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>	2026-03-11 19:41:34 +00:00
Mary Guillemard	6a8d09972e	nir: Add isbewr_nv intrinsic and extends isberd_nv Adds a new intrinsic allowing raw writes in the various ISBE spaces where attributes are stored. This also adapts isberd_nv to map to what we have since SM70+. This will be used to support mesh shaders. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>	2026-03-11 19:41:33 +00:00
Lionel Landwerlin	f508c6acbb	brw/nir: improve shader_indirect_data_intel handling Use is_scalar to know if we can do transpose loading. Also enable vectorization if 2 intrinsics share the same source (it means the only difference is the base). Fixes: `e14d6b535c` ("brw/nir: add new intrinsics to load data from the indirect address") Tested-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40308>	2026-03-10 18:24:04 +00:00
Georg Lehmann	452025f75e	nir: add free bits in nir_io_semantics for future use Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40299>	2026-03-10 07:46:22 +00:00
Georg Lehmann	a25f00eaed	nir: merge xfb and xfb2 into one 64bit intrinsic index Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40299>	2026-03-10 07:46:22 +00:00
Georg Lehmann	4ba581887e	nir: support intrinsic indices larger than 32 bits Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40299>	2026-03-10 07:46:21 +00:00
Karol Herbst	9d90cbc314	nak: add input predicate to load_global_nv and OpLd This is new in SM75 (Turing). Let's use it because it allows us to get rid of the if/else around bound checked global loads. There are some changes in fossils, but it seems that's mostly due to CFG optimizations doing things a bit differently? Totals: CodeSize: 9442152688 -> 9442133184 (-0.00%); split: -0.00%, +0.00% Static cycle count: 6120910991 -> 6120907718 (-0.00%); split: -0.00%, +0.00% Spills to reg: 184789 -> 184810 (+0.01%) Fills from reg: 223831 -> 223860 (+0.01%); split: -0.00%, +0.01% Totals from 334 (0.03% of 1163204) affected shaders: CodeSize: 22020752 -> 22001248 (-0.09%); split: -0.10%, +0.01% Static cycle count: 26582978 -> 26579705 (-0.01%); split: -0.01%, +0.00% Spills to reg: 3110 -> 3131 (+0.68%) Fills from reg: 3401 -> 3430 (+0.85%); split: -0.03%, +0.88% Reviewed-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40272>	2026-03-10 00:10:05 +00:00
Lionel Landwerlin	e14d6b535c	brw/nir: add new intrinsics to load data from the indirect address This address is delivered on Gfx12.5+ in compute/mesh/task shaders from the command stream instruction. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>	2026-03-06 06:34:43 +00:00
Lionel Landwerlin	7b1533414a	brw/nir: enable constant offsets for global_constant_uniform_block_intel Will be useful to retain the base offset added in `0e9453291c` ("brw: improve push constant loading using base offsets") once we move push constant data loading into NIR. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>	2026-03-06 06:34:43 +00:00
Lionel Landwerlin	7f19814414	brw/nir: handle inline_data_intel more like push_data_intel It's pretty much the same mechanism, except it's a different register location. With this change we gain indirect loading support. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:09 +00:00
Faith Ekstrand	e3dc3dccd6	pan/fb: Add a common FB load shader builder One of the advantages to this new FB load shader, apart from it being common, is that it's able to properly handle partial tile loads. Instead of doing the force_preload/clear dance that PanVK is currently doing, these shaders are clever enough to detect whether or not they're inside the Vulkan render area and clear the inside while loading the border pixels. In order for this to work, there are two new intrinsics which provide the framebuffer bounding box and the clear values. We need this in order to handle partial loads correctly. Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Acked-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39759>	2026-02-23 21:00:01 +00:00
Marek Olšák	a9df891bc6	nir: allow get_ssbo_size to return a 64-bit result to match get_ubo_size, and to support HW where SSBOs can have a 64-bit size. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743>	2026-02-16 12:59:36 +00:00
Marek Olšák	c151402f35	nir: add ACCESS to get_ubo_size so that we can set NON_UNIFORM Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743>	2026-02-16 12:59:36 +00:00
Marek Olšák	0a9bdcac79	ac: lower load_workgroup_ids for ACO in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>	2026-02-13 15:33:19 +00:00

709 commits