fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-25 01:48:18 +02:00

Author	SHA1	Message	Date
Kenneth Graunke	73cbb35442	brw: Move into a new src/intel/compiler/brw subdirectory This keeps the directory structure a bit more organized: - brw specific code - elk specific code - common NIR passes that could be used in both places It also means that you can now 'git grep' in the brw directory without finding a bunch of elk code, or having to "grep thing b*". Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37755>	2025-10-09 07:01:47 +00:00
Lionel Landwerlin	6dbcc81c85	brw: simplify texture surface/sampler handle sources We had twice surface/sampler sources for no good reason, just add a boolean to tell whether they are bindless or not. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37527>	2025-09-23 15:37:40 +00:00
Lionel Landwerlin	06cf911ab4	brw: lower shader opcode into tex_instr Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37527>	2025-09-23 15:37:40 +00:00
Lionel Landwerlin	bddfbe7fb1	brw/blorp: lower MCS fetching in NIR One advantage here of moving a bunch of stuff to NIR is that we can now have consistent payload types straight from the NIR conversion to BRW. This massively simplifies the BRW lowering code and avoids type errors that are quite common to make in the backend. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37527>	2025-09-23 15:37:40 +00:00
Alyssa Rosenzweig	36bd06ebab	intel: drop clamp_fragment_color handling This is all dead code since we weren't even setting the cap in iris/crocus! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37447>	2025-09-18 14:14:11 +00:00
Lionel Landwerlin	1f86a4ee37	brw: remove unused RT write code With `4fda724fd4` ("brw: Avoid invalid access when compacting out-of-bounds JIP/UIP") this stuff isn't needed anymore. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `fe38fb858c` ("brw: workaround broken indirect RT messages on Gfx11") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37326>	2025-09-16 07:49:07 +00:00
Caio Oliveira	df2b5fb03f	brw: Add brw_fb_write_inst Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:25:04 +00:00
Caio Oliveira	d06c0a370e	brw: Add brw_urb_inst Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:25:04 +00:00
Caio Oliveira	90967e7b16	brw: Add brw_load_payload_inst Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:25:03 +00:00
Caio Oliveira	09a26526cc	brw: Add brw_mem_inst Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:25:02 +00:00
Caio Oliveira	f0f1e63f99	brw: Add brw_tex_inst Incorporate some "control sources" directly into the instruction. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:25:02 +00:00
Caio Oliveira	0fcce2722f	brw: Add brw_send_inst Move all the SEND specific fields from brw_inst into brw_send_inst. This new instruction kind will contain all variants of SENDs plus the virtual opcodes that were already relying on those SEND fields. Use the `as_send()` helper to go from a brw_inst into the brw_send_inst when applicable. Some of the code was changed to use the brw_send_inst type directly. Until other kinds are added, all the instructions are allocated the same amount of space as brw_send_inst. This ensures that all brw_transform_inst() calls are still valid. This will change after a few patches so that BASE instructions can use less memory. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:25:01 +00:00
Caio Oliveira	339a4e8680	brw: Remove the extra function call when lowering samplers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:25:00 +00:00
Caio Oliveira	e194909b3f	brw: Add and use brw_transform_inst() The new function takes care of changing an instruction opcode and sources, which will allow later patches to tweak how allocations are done in those cases. Like the instruction allocation, this also takes a shader (or a builder, for it to get a shader). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:24:59 +00:00
Caio Oliveira	29c12bbebf	brw: Centralize brw_inst allocation Add and use brw_new_inst() and brw_clone_inst() and do not use stack allocated brw_insts. The builder was changed to not use the temporary ones either. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36730>	2025-09-12 00:24:56 +00:00
Lionel Landwerlin	23a4aef14a	Revert "brw: move texture offset packing to NIR" This reverts commit `4346210ae6`. Fixes: `4346210ae6` ("brw: move texture offset packing to NIR") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37050>	2025-08-29 06:29:14 +00:00
Lionel Landwerlin	3362b8dcb5	brw: use a scalar builder for the load_payload on transpose loads I noticed SIMD32 shaders have that kind of pattern : mov(32) g94<1>D 0D { align1 WE_all }; send(1) g15UD g94UD nullUD 0x6210d500 0x02010000 ugm MsgDesc: ( load, a32, d32, V16, transpose, L1STATE_L3MOCS dst_len = 1, src0_len = 1, src1_len = 0 bti ) BTI 2 base_offset 16 { align1 WE_all 1N I@5 $1 }; Why use a 32 wide register for a SEND that is only going to read the first lane? We can stick a single physical register and reduce register pressure. DG2 fossils-db results : Totals: Instrs: 157417515 -> 157417796 (+0.00%); split: -0.00%, +0.00% Cycle count: 15362185116 -> 15363086774 (+0.01%); split: -0.05%, +0.05% Max live registers: 29059141 -> 29051166 (-0.03%) Max dispatch width: 5071256 -> 5075720 (+0.09%); split: +0.33%, -0.24% Totals from 82132 (14.43% of 569221) affected shaders: Instrs: 26564632 -> 26564913 (+0.00%); split: -0.00%, +0.00% Cycle count: 4630907475 -> 4631809133 (+0.02%); split: -0.16%, +0.18% Max live registers: 5425037 -> 5417062 (-0.15%) Max dispatch width: 128384 -> 132848 (+3.48%); split: +12.92%, -9.45% LNL fossils-db results : Totals: Instrs: 141870413 -> 141870745 (+0.00%); split: -0.00%, +0.00% Cycle count: 20176018818 -> 20191262632 (+0.08%); split: -0.07%, +0.14% Max live registers: 44858167 -> 44838370 (-0.04%) Totals from 51859 (10.55% of 491590) affected shaders: Instrs: 16834547 -> 16834879 (+0.00%); split: -0.00%, +0.00% Cycle count: 5761980106 -> 5777223920 (+0.26%); split: -0.24%, +0.50% Max live registers: 5893878 -> 5874081 (-0.34%) Perf A/B testing only reported a 0.5% improvement on DG2 on one trace, no changes on BMG. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36958>	2025-08-26 12:03:22 +00:00
Lionel Landwerlin	fe38fb858c	brw: workaround broken indirect RT messages on Gfx11 Unfortunately we cannot use the indirect descriptor on Gfx11, it appears to just drop writes. Other platforms appear to be fine. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36883>	2025-08-20 15:01:50 +00:00
Caio Oliveira	cebac156c4	brw: Only access valid sources in lower_btd_logical_send() Only the SHADER_OPCODE_BTD_SPAWN_LOGICAL has sources, so only reach for them when handling that instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36817>	2025-08-19 13:54:43 +00:00
Lionel Landwerlin	c871a62a75	brw: move URB channel mask shifting to the lowering pass For example Xe2 uses the LSC and doesn't need the shifting, so let's just apply it where it's needed. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36757>	2025-08-13 12:01:49 +00:00
Kenneth Graunke	47fe9d28e7	brw: Enumerate SHADER_OPCODE_SEND sources and standardize how many This introduces enums for SHADER_OPCODE_SEND[_GATHER] sources, similar to what we've done for most of the newer logical opcodes. This allows us to use actual names for sources rather than remembering their order, or leaving ourselves comments like /* ex_desc */ all over. It will also make it easier to add or reorder sources in the future. While we're at it, we also standardize on the number of sources. Previously, we allowed SHADER_OPCODE_SEND to have either 3 (monosend) or 4 (split send) sources, but this is mostly for haphazard historical reasons. We now specify all sources every time, eliminating the need for careful inst->source checks before accessing the last source. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>	2025-08-08 22:12:08 +00:00
Kenneth Graunke	00d38b980d	brw: Properly resolve non-sendable sources in a few logical opcodes Sources decorated with source modifiers, immediates, or particular stride combinations may not be directly usable as SEND operands. We have to resolve them to an ordinary VGRF first. Most opcodes do this as part of broader payload construction, but these send directly because the messages are very simple. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>	2025-08-08 22:12:06 +00:00
Kenneth Graunke	90dbbc69bb	brw: Use BAD_FILE instead of ARF null for second send payload A number of places emit monolithic sends, where the second payload is empty. Some places were using a BAD_FILE register, while others were specifying the hardware ARF null register. Switch to BAD_FILE for consistency - this is usually what we do for "source isn't present". Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>	2025-08-08 22:12:04 +00:00
Lionel Landwerlin	4c65aef155	brw: implement ACCESS_COHERENT on Gfx12.5+ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36595>	2025-08-08 08:44:22 +00:00
Rohan Garg	c978394e00	intel/compiler: use the WA framework when emitting WA 14014595444 Fixes: `d276ad4` "intel/compiler: implement Wa_14014595444 for DG2" Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36262>	2025-08-06 13:34:28 +00:00
Qiang Yu	260bdad074	all: rename gl_shader_stage_is_rt to mesa_shader_stage_is_rt Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:41 +08:00
Paulo Zanoni	257e1515e3	brw: null-tile sends don't need to skip L3 on Xe2 and newer Despite the information in "Overview of Memory Access" (57046), the L3 seems to be smarter on Xe2+. See `4aa3b2d3ad` ("anv: LNL+ doesn't need the special flush for sparse"). The behavior is the same both with vm_bind and TR-TT. v2: Add some comments (Caio). Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>	2025-08-01 18:47:37 +00:00
Paulo Zanoni	80f01c03ba	brw: remove unnecessary casts to unsigned after calling LSC_CACHE() The macro already casts the values to unsigned. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>	2025-08-01 18:47:37 +00:00
Paulo Zanoni	4bb41156b9	brw: mark 'volatile' sends as uncached on LSC messages The residencyNonResidentStrict property requires that writes to unbound memory be ignored and reads return zero. We need this property, otherwise vkd3d will claim we don't support DX12. If a shader writes to a variable associated with an unbound memory region (i.e., mapped to a null tile), reads it back (in the same shader) and expects the value to be 0 instead of what was written, it has to use the 'volatile' access qualifier to the variable associated with the access, otherwise the compiler will be allowed to optimize things and use the non-zero value. This is explained in the "Accessing Unbound Regions" section of the Vulkan spec. Our hardware adds an extra problem on top of the above. BSpec page "Overview of Memory Access" (47630, 57046) says: "If a read from a Null tile gets a cache-hit in a virtually-addressed GPU cache, then the read may not return zeroes." So, when we detect this type of access, we have to turn off the caching. There's a proposed Vulkan CTS test that does exactly the above. No shaders on shader_db seem to be using 'volatile'. v2: - Reorder commit order - Rewrite commit message v3: - Rework the patch after Caio pointed out the interaction with 'coherent'. - Remove previous R-B tags due to the patch differences. v4: - Rework the patch and commit message again after further discussions. v5: - Check for atomic first so we don't regress DG2 atomic tests. Fixes future test: dEQP-VK.sparse_resources.buffer.ssbo.read_write.sparse_residency_non_resident_strict Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>	2025-08-01 18:47:37 +00:00
Paulo Zanoni	f7581e4a38	brw: consider 'volatile' memory access when doing CSE The GLSL spec says (among other things): "When a volatile variable is read, its value must be re-fetched from the underlying memory, even if the shader invocation performing the read had previously fetched its value from the same memory. When a volatile variable is written, its value must be written to the underlying memory, even if the compiler can conclusively determine that its value will be overwritten by a subsequent write." The SPIR-V spec says (among other things): "Accesses to volatile memory cannot be eliminated, duplicated, or combined with other accesses." So in this commit we make sure that both writes and reads marked as volatile can't be affected by CSE. v2: Reorder patches in the series. Credits-to: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1) Reviewed-by: Iván Briano <ivan.briano@intel.com> (v1) Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>	2025-08-01 18:47:36 +00:00
Antonio Ospite	ddf2aa3a4d	build: avoid redefining unreachable() which is standard in C23 In the C23 standard unreachable() is now a predefined function-like macro in <stddef.h> See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in And this causes build errors when building for C23: ----------------------------------------------------------------------- In file included from ../src/util/log.h:30, from ../src/util/log.c:30: ../src/util/macros.h:123:9: warning: "unreachable" redefined 123 | #define unreachable(str) \ | ^~~~~~~~~~~ In file included from ../src/util/macros.h:31: /usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition 456 | #define unreachable() (__builtin_unreachable ()) | ^~~~~~~~~~~ ----------------------------------------------------------------------- So don't redefine it with the same name, but use the name UNREACHABLE() to also signify it's a macro. Using a different name also makes sense because the behavior of the macro was extending the one of __builtin_unreachable() anyway, and it also had a different signature, accepting one argument, compared to the standard unreachable() with no arguments. This change improves the chances of building mesa with the C23 standard, which for instance is the default in recent AOSP versions. All the instances of the macro, including the definition, were updated with the following command line: git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \ while read file; \ do \ sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \ done && \ sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>	2025-07-31 17:49:42 +00:00
Lionel Landwerlin	343f3dd3c1	brw: fix non constant BTI accesses with offsets Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `e103afe7be` ("brw: run the nir_opt_offsets pass and set the maximum offset size") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35822>	2025-07-02 01:04:06 +03:00
Ian Romanick	b83f618fb2	brw: Fully write temporary destinations Consider an innocuous instruction like: and(1) v250:UD, g0.3<0,1,0>:UD, 4294967264u NoMask group0 If register allocation decides to spill v250, it will see this instruction and say, "Oh no! The other components of v250 aren't set, so I'd better add a fill before that instruction!" But it gets even worse than that... if register coalesce decided to merge two of these, the live range gets massively extended because the writes don't fully initialize the value. This causes the need to spill these registers in the first place. Changing that instruction to SIMD16 on Xe2 or SIMD8 on other platforms alleviates these issues. shader-db: Lunar Lake total instructions in shared programs: 17118324 -> 17113191 (-0.03%) instructions in affected programs: 93701 -> 88568 (-5.48%) helped: 42 / HURT: 6 total cycles in shared programs: 895422566 -> 895079488 (-0.04%) cycles in affected programs: 30111338 -> 29768260 (-1.14%) helped: 35 / HURT: 40 total spills in shared programs: 3588 -> 3304 (-7.92%) spills in affected programs: 285 -> 1 (-99.65%) helped: 10 / HURT: 0 total fills in shared programs: 2218 -> 1663 (-25.02%) fills in affected programs: 556 -> 1 (-99.82%) helped: 10 / HURT: 0 Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results. (Meteor Lake shown) total instructions in shared programs: 20059218 -> 20053563 (-0.03%) instructions in affected programs: 96938 -> 91283 (-5.83%) helped: 43 / HURT: 6 total cycles in shared programs: 884174588 -> 883536475 (-0.07%) cycles in affected programs: 22105268 -> 21467155 (-2.89%) helped: 35 / HURT: 27 total spills in shared programs: 5032 -> 4679 (-7.02%) spills in affected programs: 355 -> 2 (-99.44%) helped: 12 / HURT: 0 total fills in shared programs: 4782 -> 4113 (-13.99%) fills in affected programs: 671 -> 2 (-99.70%) helped: 12 / HURT: 0 Skylake total instructions in shared programs: 19097658 -> 19097665 (<.01%) instructions in affected programs: 14202 -> 14209 (0.05%) helped: 0 / HURT: 5 total cycles in shared programs: 862058109 -> 862058267 (<.01%) cycles in affected programs: 3450244 -> 3450402 (<.01%) helped: 7 / HURT: 11 fossil-db: Lunar Lake Totals: Cycle count: 31439652246 -> 31439652272 (+0.00%) Totals from 2 (0.00% of 707091) affected shaders: Cycle count: 2602 -> 2628 (+1.00%) No other Intel platforms had any fossil-db changes. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35721>	2025-06-26 17:59:47 +00:00
Rohan Garg	0186113640	brw: encode the offset into the message descriptor for Xe2 Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:24 +00:00
Rohan Garg	937d37f0b1	brw: introduce MEMORY_LOGICAL_ADDRESS_OFFSET to encode address offsets Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:24 +00:00
Sagar Ghuge	821c1bfa7e	intel/compiler: Fix stackIDs on Xe2+ For Xe2+, from Bspec 64643, bit field "StackID": The maximum number of StackIDs can be 2^12 - 1. Cc: mesa-stable Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34709>	2025-04-29 17:03:35 +00:00
Caio Oliveira	7ae638c0fe	brw: Add brw_builder::uniform() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34355>	2025-04-04 23:07:21 +00:00
Lionel Landwerlin	4346210ae6	brw: move texture offset packing to NIR That way we can deal with upcoming non constant values for VK_KHR_maintenance8. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>	2025-03-29 02:15:18 +00:00
Matt Turner	1ab5b4f7db	intel/compiler: Use FALLTHROUGH Reported by clang's `-Wimplicit-fallthrough`. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34014>	2025-03-13 20:11:09 +00:00
Caio Oliveira	32e562ae01	brw: Simplify brw_builder "insert before inst" constructor Since brw_inst now has the block it belongs and the block can reach the shader, the only necessary information to create a builder is the brw_inst itself. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>	2025-03-06 23:33:38 +00:00
Kenneth Graunke	88309a9818	brw: Rename shared function enums for clarity Our name for this enum was brw_message_target, but it's better known as shared function ID or SFID. Call it brw_sfid to make it easier to find. Now that brw only supports Gfx9+, we don't particularly care whether SFIDs were introduced on Gfx4, Gfx6, or Gfx7.5. Also, the LSC SFIDs were confusingly tagged "GFX12" but aren't available on Gfx12.0; they were introduced with Alchemist/Meteorlake. GFX6_SFID_DATAPORT_SAMPLER_CACHE in particular was confusing. It sounds like the SFID to use for the sampler on Gfx6+, however it has nothing to do with the sampler at all. BRW_SFID_SAMPLER remains the sampler SFID. On Haswell, we ran out of messages on the main data cache data port, and so they introduced two additional ones, for more messages. The modern Tigerlake PRMs simply call these DP_DC0, DP_DC1, and DP_DC2. I think the "sampler" name came from some idea about reorganizing messages that never materialized (instead, the LSC came as a much larger cleanup). Recently we've adopted the term "HDC" for the legacy data cluster, as opposed to "LSC" for the modern Load/Store Cache. To make clear which SFIDs target the legacy HDC dataports, we use BRW_SFID_HDC0/1/2. We were also citing the G45, Sandybridge, and Ivybridge PRMs for a compiler that supports none of those platforms. Cite modern docs. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33650>	2025-02-27 08:49:24 +00:00
Lionel Landwerlin	d0c980caa7	brw: avoid setting up the sampler header bits when unused Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33704>	2025-02-26 17:19:04 +00:00
Sagar Ghuge	6f7a76e9d9	intel/compiler: Zero out the header for texel fetch It looks like even if we pass the header not present in the sampler descriptor, it's not helping with the correct behavior of texelFetch. Experiment on real HW shows that if we just zero out the header and include it in the message, it helps with the correct behavior. I'm not sure if there is a valid HW workaround for this one. We can skip masking the sampler message header bits 4:0 but masking them out doesn't hurt in this case. Increasing the number of parameters impacts sampler performance. For example, a sample message using 5 parameters will not be able to sustain the same throughput as a sample message with only 4 valid parameters. We should look out for any perf impact with respect to texel fetch. This patch fixes ~3k tests involving texelFetch instruction on Xe3+ Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33562>	2025-02-26 00:23:49 +00:00
Caio Oliveira	ff44f4d278	intel/brw: Update outdated comments Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Caio Oliveira	cf3bb77224	intel/brw: Rename fs_visitor to brw_shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Caio Oliveira	352a63122f	intel/brw: Rename files brw_fs.cpp/h to brw_shader.cpp/h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Caio Oliveira	f8a979466b	intel/brw: Rename and move thread_payload types to own header Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Kenneth Graunke	ae60338142	brw: Lower MEMORY_FENCE and INTERLOCK in lower_logical_sends We teach lower_logical_sends to lower these to SHADER_OPCODE_SEND and drop all the corresponding generator and eu_emit code. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>	2025-02-08 01:07:22 +00:00
Kenneth Graunke	accef5e8f5	brw: Replace fs_inst::target field with logical FB read/write sources We can just specify this as a source to the logical FB read/write opcodes. Notably FB reads had no sources before; now they have one. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>	2025-02-08 01:07:22 +00:00
Kenneth Graunke	32dd722ff3	brw: Replace fs_inst::last_rt with a logical control source Rather than using a bit in the generic fs_inst data structure, we can simply set a source on our logical FB write messages. (We already do so for many other cases.) In the repclear shader, setting this wasn't actually having an effect, as we were setting it on a SHADER_OPCODE_SEND message which ignored it. (We had already correctly set the bit in the message descriptor.) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>	2025-02-08 01:07:22 +00:00

195 commits