3464 commits

Author SHA1 Message Date
Caio Oliveira
06b553f02c intel/elk: Remove compiler specific devinfo hash
This is more coarse-grained hash information for the compiler (vs. full
devinfo), used only by Iris and Anv, and relevant for more recent
platforms.  Remove it from elk.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27563>
2024-02-24 00:24:30 +00:00
Caio Oliveira
0083585fc5 intel/elk: Compile ELK library, tests and tools
For now it is not linked to any driver.  The tools were renamed to use
the elk prefix to avoid conflicting with the brw ones.  The run-test.py
script was also updated due to that change.

Before the new compiler can be linked together with the old (going to be
done for Iris and other tools), the symbol conflicts need to be fixed
first.  This will happen in a later commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27563>
2024-02-24 00:24:30 +00:00
Caio Oliveira
d44462c08d intel/elk: Fork Gfx8- compiler by copying existing code
Based on code from commit c3ceec6cd8.

Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27563>
2024-02-24 00:24:30 +00:00
Kenneth Graunke
c12300844d intel/fs: Don't rely on CSE for VARYING_PULL_CONSTANT_LOAD
In the past, we didn't have a good solution for combining scalar loads
with a variable index plus a constant offset.  To handle that, we took
our load offset and rounded it down to the nearest vec4, loaded an
entire vec4, and trusted in the backend CSE pass to detect loads from
the same address and remove redundant ones.

These days, nir_opt_load_store_vectorize() does a good job of taking
those scalar loads and combining them into vector loads for us, so we
no longer need to do this trick.  In fact, it can be better not to:
our offset need only be 4 byte (scalar) aligned, but we were making it
16 byte (vec4) aligned.  So if you wanted to load an unaligned vec2,
we might actually load two vec4's (___X | Y___) instead of doing a
single load at the starting offset.

This should also reduce the work the backend CSE pass has to do,
since we just emit a single VARYING_PULL_CONSTANT_LOAD instead of 4.
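
The alignment argument above can be checked with a little arithmetic. The
following is a minimal sketch, not code from the driver: both helper names
are hypothetical, and it assumes 16-byte vec4 blocks and 4-byte scalars as
described in the message.

```c
#include <assert.h>
#include <stdint.h>

#define VEC4_BYTES 16u

/* Hypothetical helper: number of vec4 loads needed when every load is
 * rounded down to a 16-byte boundary (the old scheme). */
static inline uint32_t
vec4_aligned_load_count(uint32_t offset, uint32_t size)
{
   assert(size > 0 && offset % 4 == 0);
   uint32_t first = offset / VEC4_BYTES;
   uint32_t last = (offset + size - 1) / VEC4_BYTES;
   return last - first + 1;
}

/* Hypothetical helper: number of vec4-sized loads needed when a load may
 * start at any 4-byte aligned offset (the new scheme). */
static inline uint32_t
scalar_aligned_load_count(uint32_t offset, uint32_t size)
{
   assert(size > 0 && offset % 4 == 0);
   return (size + VEC4_BYTES - 1) / VEC4_BYTES;
}
```

For a vec2 (8 bytes) at byte offset 12, the first helper reports two loads
(the ___X | Y___ case) while the second reports one.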

shader-db results on Alchemist:
- No changes in SEND count or spills/fills
- Instructions: helped 95, hurt 100, +/- 1-3 instructions
- Cycles: helped 3411 hurt 1868, -0.01% (-0.28% in affected)
- SIMD32: gained 5, lost 3

fossil-db results on Alchemist:
- Instrs: 161381427 -> 161384130 (+0.00%); split: -0.00%, +0.00%
- Cycles: 14258305873 -> 14145884365 (-0.79%); split: -0.95%, +0.16%
- SIMD32: Gained 42, lost 26

- Totals from 56285 (8.63% of 652236) affected shaders:
- Instrs: 13318308 -> 13321011 (+0.02%); split: -0.01%, +0.03%
- Cycles: 7464985282 -> 7352563774 (-1.51%); split: -1.82%, +0.31%

From this we can see that we aren't doing more loads than before
and the change is pretty inconsequential, but it requires less
optimizing to produce similar results.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27568>
2024-02-20 23:16:27 -08:00
Caio Oliveira
8ae528331c intel/compiler: Use "intel" prefix for walk_order enum
Will be used later in non-brw specific code in Iris.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27646>
2024-02-21 00:38:35 +00:00
Kenneth Graunke
1497f4e0c2 intel/fs: Don't include sync.nop in instruction count statistics
With the advent of software scoreboarding, we emit sync instructions
in various places to synchronize the execution pipelines.  This results
in assembly being littered with a bunch of sync.nop instructions.  That
means that when you reorder anything in the program, the scoreboarding
changes, and the number of sync.nops can vary wildly - even if the code
isn't really materially better or worse.  This makes it hard to use
tools like shader-db or fossil-db on post-Icelake platforms.

For now, exclude sync.nops from the instruction count statistic.  One
day we may want to consider improving the software scoreboarding pass
to emit fewer redundant sync.nop instructions, at which point tracking
this as a separate stat might be useful.  For now though, it's simply
cluttering and confusing our results.
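
As an illustration of the statistic described above, here is a minimal
sketch of an instruction counter that skips sync.nop. The types and names
(toy_inst, toy_opcode, toy_sync_func) are hypothetical stand-ins, not the
backend's real IR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction model for illustration only. */
enum toy_opcode { TOY_OP_ADD, TOY_OP_MOV, TOY_OP_SEND, TOY_OP_SYNC };
enum toy_sync_func { TOY_SYNC_NOP, TOY_SYNC_ALLRD, TOY_SYNC_ALLWR };

struct toy_inst {
   enum toy_opcode opcode;
   enum toy_sync_func sync_func; /* only meaningful for TOY_OP_SYNC */
};

static bool
is_sync_nop(const struct toy_inst *inst)
{
   return inst->opcode == TOY_OP_SYNC && inst->sync_func == TOY_SYNC_NOP;
}

/* Count instructions for statistics, ignoring sync.nop only.  Other sync
 * functions still count. */
static size_t
count_instructions(const struct toy_inst *insts, size_t n)
{
   assert(insts != NULL || n == 0);
   size_t count = 0;
   for (size_t i = 0; i < n; i++) {
      if (!is_sync_nop(&insts[i]))
         count++;
   }
   return count;
}
```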

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27701>
2024-02-20 22:26:09 +00:00
Lionel Landwerlin
0eb3c850c6 intel/clc: workaround LLVM17 opaque pointers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: b52e25d3a8 ("anv: rewrite internal shaders using OpenCL")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27637>
2024-02-20 14:41:43 +00:00
Lionel Landwerlin
62baa4df5f intel/clc: lower temp function/shader variables together
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: 4fd7495c69 ("intel/clc: add ability to output NIR")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27637>
2024-02-20 14:41:43 +00:00
Lionel Landwerlin
cf193af762 anv: fixup push descriptor shader analysis
There are a couple of mistakes here:

   - using a bitfield as an index to generate a bitfield...

   - in anv_nir_push_desc_ubo_fully_promoted(), confusing binding
     table access of the descriptor buffer with actual descriptors
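
The first mistake is a common class of bug. The sketch below is a generic
illustration of it, not the anv code itself; the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Correct: each entry is a binding index, so shift by the index. */
static uint32_t
used_mask_from_indices(const unsigned *indices, unsigned count)
{
   uint32_t mask = 0;
   for (unsigned i = 0; i < count; i++) {
      assert(indices[i] < 32);
      mask |= UINT32_C(1) << indices[i];
   }
   return mask;
}

/* Correct: the inputs are already masks, so combine them with OR.
 * The buggy variant would compute `1u << a`, treating a mask such as
 * 0x5 (bits 0 and 2) as if it were the index 5 and yielding 0x20. */
static uint32_t
merge_masks(uint32_t a, uint32_t b)
{
   return a | b;
}
```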

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ff91c5ca42 ("anv: add analysis for push descriptor uses and store it in shader cache")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27504>
2024-02-19 11:10:29 +00:00
Caio Oliveira
ae50ac46d1 intel: Remove brw_ prefix from process debug function
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27644>
2024-02-16 22:35:05 +00:00
Caio Oliveira
c773898f39 intel/compiler: Rename brw_gfx_ver_enum.h to intel_gfx_ver_enum.h
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27644>
2024-02-16 22:35:05 +00:00
Caio Oliveira
d8f9a05f32 intel/compiler: Rename the passes and files related to intel_nir.h
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27644>
2024-02-16 22:35:05 +00:00
Caio Oliveira
dc76cfc781 intel/compiler: Collect NIR-only passes in intel_nir.h
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27644>
2024-02-16 22:35:05 +00:00
Mark Janes
c4ce1ca847 intel/compiler: generate a hash function to use with the shader cache
Currently, Intel's shader cache incorporates PCI ID into shader cache
keys.

Many devices with different PCI IDs have identical shader compilation
functionality.  Using PCI ID as a component of the shader cache hash
means that a multi-platform shader cache will have redundant,
identical entries for similar platforms.

All Intel compiler functionality is selected based on device
configuration in `struct intel_device_info`.  intel_device_info.py
flags all fields accessed by intel/compiler.

This commit generates a hash function incorporating intel/compiler
device info fields.  Using this hash function in place of PCI ID will
produce a multiplatform cache with no duplicated content.
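
The idea can be sketched as follows. This is an illustration under stated
assumptions, not the generated function: the struct is a hypothetical toy
with only a few fields, and FNV-1a is swapped in as a simple stand-in hash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy device description for illustration only. */
struct toy_device_info {
   uint16_t pci_device_id;  /* may differ between equivalent devices */
   uint32_t ver;            /* assumed to be read by the compiler */
   bool has_64bit_float;    /* assumed to be read by the compiler */
   uint32_t max_eus;        /* assumed NOT to be read by the compiler */
};

/* 64-bit FNV-1a, used here only as a stand-in hash. */
static uint64_t
fnv1a_update(uint64_t h, const void *data, size_t size)
{
   const unsigned char *bytes = data;
   for (size_t i = 0; i < size; i++) {
      h ^= bytes[i];
      h *= UINT64_C(0x100000001b3);
   }
   return h;
}

/* Hash only the fields the compiler looks at, field by field, so struct
 * padding and unrelated fields never reach the cache key. */
static uint64_t
toy_compiler_hash(const struct toy_device_info *info)
{
   assert(info != NULL);
   uint64_t h = UINT64_C(0xcbf29ce484222325);
   h = fnv1a_update(h, &info->ver, sizeof(info->ver));
   uint8_t f64 = info->has_64bit_float ? 1 : 0;
   h = fnv1a_update(h, &f64, sizeof(f64));
   return h;
}
```

Two toy devices that differ only in pci_device_id or max_eus then share a
key, while a change in ver produces a different one.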

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26844>
2024-02-15 16:58:15 -08:00
Sagar Ghuge
f55f9272e4 intel/compiler: Fix disassembly of URB message descriptor on Xe2+
URB messages follow the LSC message descriptor, so we are already
disassembling the descriptor/extended descriptor and don't have to
duplicate it.

Without this change:
   urb MsgDesc: ( store, a32, d32, V4, L1UC_L3WB dst_len = 0, src0_len = 2, src1_len = 8 flat )  mlen 2 ex_mlen 8 rlen 0 { align1 1H $1 };

With this change:
   urb MsgDesc: ( store, a32, d32, V4, L1UC_L3WB dst_len = 0, src0_len = 2, src1_len = 8 flat )  base_offset 0  { align1 1H $1 };

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27498>
2024-02-15 19:46:55 +00:00
Caio Oliveira
0b751a2134 intel: Rename i965_{asm,disasm} tools to brw_{asm,disasm}
And move them inside the compiler since they (especially asm) rely on
a bunch of internal types.

Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27579>
2024-02-15 09:26:46 +00:00
Caio Oliveira
5992185c8d intel/compiler: Merge intel_disasm.[ch] into corresponding brw files
Rename the functions to match the existing ones.

Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27579>
2024-02-15 09:26:46 +00:00
Caio Oliveira
468a0ffe9c intel/compiler: Include brw_disasm_info.h where it's used
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27579>
2024-02-15 09:26:46 +00:00
Caio Oliveira
ff95f00883 intel/compiler: Move disassemble functions to own header file
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27579>
2024-02-15 09:26:46 +00:00
Caio Oliveira
5732c9d269 intel/compiler: Rename brw_cs_dispatch_info to intel_cs_dispatch_info
And move to the intel_shader_enums.h file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>
2024-02-14 22:31:23 -08:00
Caio Oliveira
c5b80de583 intel/compiler: Rename brw_vue_map to intel_vue_map
And move to the intel_shader_enums.h file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>
2024-02-14 22:31:23 -08:00
Caio Oliveira
7d85d2c7fd intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_*
And move to the intel_shader_enums.h file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>
2024-02-14 22:31:23 -08:00
Caio Oliveira
aeda865b6d intel/compiler: Rename BRW_TESS_* enums to INTEL_TESS_*
And move to the intel_shader_enums.h file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>
2024-02-14 22:31:23 -08:00
Caio Oliveira
26dd1f0bba intel/compiler: Rename BRW_WM_MSAA_* enums to INTEL_MSAA_*
And move to the intel_shader_enums.h file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>
2024-02-14 22:31:23 -08:00
Caio Oliveira
a88084f8be intel/compiler: Rename brw_image_param to isl_image_param
And move them to ISL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>
2024-02-14 22:31:23 -08:00
Jordan Justen
c6e855b64b intel/compiler: Verify SIMD16 is used for xe2 BTD/RT dispatch
Ref: HSD 14011192593

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27529>
2024-02-14 20:07:13 +00:00
Jordan Justen
820e04ead4 intel/compiler: Implement nir_intrinsic_load_topology_id_intel for xe2
Rework:
 * Sagar: Rework BRW_TOPOLOGY_ID_DSS, BRW_TOPOLOGY_ID_EU_THREAD_SIMD
   calculations

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27529>
2024-02-14 20:07:13 +00:00
Jordan Justen
b533bf7361 intel/compiler: Set branch shader required-width as 16 for xe2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27529>
2024-02-14 20:07:13 +00:00
Lionel Landwerlin
1e31fd5f42 meson: add option to install intel-clc
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
2024-02-13 00:06:45 +00:00
Lionel Landwerlin
e6b5196079 intel-clc: print text input
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
2024-02-13 00:06:45 +00:00
Lionel Landwerlin
4fd7495c69 intel/clc: add ability to output NIR
This will be used to generate a serialized NIR of functions for
internal shaders.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
2024-02-13 00:06:45 +00:00
Lionel Landwerlin
2bae1b6b66 intel-clc: move ISA generation to its own function
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
2024-02-13 00:06:45 +00:00
Lionel Landwerlin
2a1ff08376 intel/compiler: make default NIR compiler options visible
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
2024-02-13 00:06:45 +00:00
Lionel Landwerlin
2437556d83 intel/fs: rerun divergence prior to lowering non-uniform interpolate at sample
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 74a40cc4b6 ("intel/fs: move lower of non-uniform at_sample barycentric to NIR")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
2024-02-13 00:06:44 +00:00
Lionel Landwerlin
8f5a7f57df intel/fs: indent lowering code to make it more readable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
2024-02-13 00:06:44 +00:00
Sagar Ghuge
98b62434bd intel/compiler: Lower texture operation to combine LOD and AI
We have to push the lowering of texture operations a bit further in the
pipeline, since nir_lower_tex gets invoked twice and, if there is no LOD
source present, nir_lower_tex adds one. Once that's all done, we can
easily combine the LOD and array index into a single 32-bit value.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>
2024-02-12 21:25:48 +00:00
Sagar Ghuge
15129c7634 intel/compiler: Use nir_tex_src_backend1 to pack LOD and array index
Since this lowering is totally Intel specific, we don't have to
introduce a new texture source. We can use the nir_tex_src_backend1
source to pack LOD/LOD bias and array index into a single 32-bit value.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>
2024-02-12 21:25:48 +00:00
Sagar Ghuge
73a3257968 intel/compiler: Add texture operation lowering pass
This pass combines the LOD or LOD bias and array index into a single
32-bit value since Xe2+ sampler messages require us to do that.
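
A packing of this kind might look like the sketch below. The bit layout is
an assumption made for illustration (array index in the low 9 bits, the
float LOD's remaining bits above it); the message does not state the
layout, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ASSUMED layout, for illustration only: low 9 bits hold the array
 * index, the upper 23 bits hold the truncated float LOD. */
#define TOY_ARRAY_INDEX_BITS 9u
#define TOY_ARRAY_INDEX_MASK ((UINT32_C(1) << TOY_ARRAY_INDEX_BITS) - 1)

static uint32_t
toy_pack_lod_and_array_index(float lod, uint32_t array_index)
{
   assert(array_index <= TOY_ARRAY_INDEX_MASK);
   uint32_t lod_bits;
   /* Bit-cast the float without violating aliasing rules. */
   memcpy(&lod_bits, &lod, sizeof(lod_bits));
   /* Drop the low mantissa bits of the LOD and put the index there. */
   return (lod_bits & ~TOY_ARRAY_INDEX_MASK) | array_index;
}
```

Under this assumed layout the LOD loses its 9 least significant mantissa
bits, which is the price of sharing one dword with the array index.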

v2: (Alyssa)
- Use nir_iand_imm instead of nir_iand and nir_imm_int
- Use nir_trim_vector instead of nir_swizzle

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>
2024-02-12 21:25:48 +00:00
Jordan Justen
c3a0483f5b intel/compiler: Lower DPAS instructions on ARL except ARL-H
Ref: bspec 55414
Ref: 951e08fc18 ("intel/compiler: Disable DPAS instructions on MTL")
Suggested-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27352>
2024-02-06 21:23:19 +00:00
Ian Romanick
68da9e4dff intel/compiler/xe2: Set SIMD mode for sampler messages
Since SIMD8 no longer exists, the SIMD mode enums have different names
and different values.

v2 (Francisco Jerez): Rebase on 07b9bfacc7 ("intel/compiler: Move
logical-send lowering to a separate file").

v3: Update brw_disasm.c with SIMD descriptions.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:10 +00:00
Ian Romanick
84de7a88d3 intel/compiler/xe2: Emit texture instructions w/ combined LOD and array index
The extra assertions are just there to help validate
pack_lod_and_array_index (in nir_lower_tex.c).

v2: Split got_lod_or_bias into two variables. This simplifies some
changes that Sagar is working on. Suggested by Sagar.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:10 +00:00
Ian Romanick
78e7f7b377 intel/compiler/xe2: Use new sample_*_mlod messages
Note: a future commit will expand the sampler message type to the 6 bits
used on Xe2.

v2 (Francisco Jerez): Rebase on 07b9bfacc7 ("intel/compiler: Move
logical-send lowering to a separate file").

v3: Drop XE2_SAMPLER_MESSAGE_SAMPLE_BIAS_MLOD as it does not actually
exist. This resulted in some bigger changes in brw_disasm.c. Noticed
by Sagar.

v4: Now that XE2_SAMPLER_MESSAGE_SAMPLE_MLODc conflicts with
GFX7_SAMPLER_MESSAGE_SAMPLE_GATHER4_PO_C, the determination of
min_lod_is_first must include devinfo->ver or previous platforms will
break.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:09 +00:00
Sagar Ghuge
8690a6b546 intel/compiler/xe2: Handle 6-bit message type for Gfx20+
Message types are expanded to a 6-bit encoding now. The low 5 bits are
still the same field from the Sampler Message Descriptor. The most
significant bit is now bit 31 of the Sampler Message Descriptor. The
messages that have a 1 in bit 6 exist only to support programmable
offsets, and those require a message header. If a sampler type shows
only a 5-bit encoding, bit 6 is implied to be 0 and there is no
requirement for a header.
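
The split encoding can be sketched as below. Bit 31 for the most
significant bit comes from the text above; the position of the 5-bit
field (bits 16:12 here) is an assumption, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MSG_TYPE_LOW_SHIFT  12u             /* ASSUMED field position */
#define TOY_MSG_TYPE_LOW_MASK   UINT32_C(0x1f)  /* low 5 bits */
#define TOY_MSG_TYPE_HIGH_SHIFT 31u             /* from the message: bit 31 */

/* Store a 6-bit message type into a descriptor dword. */
static uint32_t
toy_encode_msg_type(uint32_t desc, uint32_t msg_type)
{
   assert(msg_type < 64);
   desc &= ~((TOY_MSG_TYPE_LOW_MASK << TOY_MSG_TYPE_LOW_SHIFT) |
             (UINT32_C(1) << TOY_MSG_TYPE_HIGH_SHIFT));
   desc |= (msg_type & TOY_MSG_TYPE_LOW_MASK) << TOY_MSG_TYPE_LOW_SHIFT;
   desc |= (msg_type >> 5) << TOY_MSG_TYPE_HIGH_SHIFT;
   return desc;
}

/* Recover the 6-bit message type from a descriptor dword. */
static uint32_t
toy_decode_msg_type(uint32_t desc)
{
   uint32_t low = (desc >> TOY_MSG_TYPE_LOW_SHIFT) & TOY_MSG_TYPE_LOW_MASK;
   uint32_t high = desc >> TOY_MSG_TYPE_HIGH_SHIFT;
   return low | (high << 5);
}

/* Types with the top bit set are the ones that require a header. */
static bool
toy_msg_type_needs_header(uint32_t msg_type)
{
   return ((msg_type >> 5) & 1) != 0;
}
```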

v2 (idr): Trivial formatting changes.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:09 +00:00
Ian Romanick
a9ed9cf88b intel/fs: Move opcode modification before the switch that emits srcs
This small refactor simplifies a later commit that will optionally emit
some opcodes before the switch (as is already done with the shadow
comparator).

v2 (Francisco Jerez): Rebase on 07b9bfacc7 ("intel/compiler: Move
logical-send lowering to a separate file").

v3 (Jordan): SHADER_OPCODE_TXL => SHADER_OPCODE_TXL_LZ (was
SHADER_OPCODE_TXF_LZ).

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:09 +00:00
Ian Romanick
7441af803f intel/compiler/xe2: Update get_sampler_lowered_simd_width
The Bspec also says, "The table below describes the SIMD modes which
are supported. SIMD32 and SIMD64 are used for media-type operations
only."  Perhaps this commit should just add

    if (devinfo->ver >= 20)
        return 16;

instead.

v2: Use reg_unit in get_sampler_lowered_simd_width. Suggested by Sagar.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:09 +00:00
Ian Romanick
118e0bdc1f intel/rt: Don't directly generate umul_32x16
The optimization pass will (eventually) turn the imul into a
umul_32x16. In many cases, the multiply will be converted to something
else.
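
For readers unfamiliar with the opcode, the sketch below models a 32x16
multiply as one that uses only the low 16 bits of its second operand. The
function names are hypothetical, and the condition under which the
optimizer picks the narrow form (the operand being known to fit in 16
bits) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32x16 unsigned multiply: only the low 16 bits of b are
 * used, and the result wraps modulo 2^32. */
static uint32_t
toy_umul_32x16(uint32_t a, uint32_t b)
{
   return a * (uint32_t)(uint16_t)b;
}

/* Emit the generic multiply semantics and use the narrow form only when
 * b fits in 16 bits, where both give the same result. */
static uint32_t
toy_lower_mul(uint32_t a, uint32_t b)
{
   uint32_t full = a * b;
   if (b <= UINT16_MAX) {
      uint32_t narrow = toy_umul_32x16(a, b);
      assert(narrow == full);
      return narrow;
   }
   return full;
}
```

Starting from the generic multiply leaves other rewrites free to apply
first, which is the point the message makes about the multiply often
being converted to something else.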

I also tried cloning a bunch of existing imul algebraic patterns for
[iu]mul_32x16. This produced the same result, but it was a lot more
churn.

All of the shaders affected were ray tracing shaders in Q2RTX. This is
the only ray tracing workload in my fossil-db.

DG2
Totals:
Instrs: 191995626 -> 191995079 (-0.00%); split: -0.00%, +0.00%
Cycles: 14003803561 -> 14003798040 (-0.00%); split: -0.00%, +0.00%
Spill count: 108320 -> 108288 (-0.03%)
Fill count: 200695 -> 200663 (-0.02%)
Scratch Memory Size: 8755200 -> 8754176 (-0.01%)

Totals from 7 (0.00% of 652118) affected shaders:
Instrs: 14998 -> 14451 (-3.65%); split: -3.94%, +0.29%
Cycles: 137222 -> 131701 (-4.02%); split: -4.10%, +0.07%
Spill count: 32 -> 0 (-inf%)
Fill count: 32 -> 0 (-inf%)
Scratch Memory Size: 19456 -> 18432 (-5.26%)

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27161>
2024-02-02 00:02:05 +00:00
Eric Engestrom
92c24191d4 tree-wide: use __normal_user() everywhere instead of writing the check manually
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27346>
2024-01-30 12:45:54 +00:00
Caio Oliveira
4af079960d intel/compiler: Enable lower_rotate_to_shuffle in subgroup lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27272>
2024-01-25 19:07:42 +00:00
Kenneth Graunke
2e38024fd8 intel: Use hardware generated compute shader local invocation IDs
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27167>
2024-01-25 08:43:04 +00:00
Kenneth Graunke
5e7f4ff97f intel: Add driver support for hardware generated local invocation IDs
This adds a few new fields in the brw_cs_prog_data struct and then
uses them to fill in the relevant COMPUTE_WALKER fields.

Although the Tile Layout field theoretically has different settings for
32/64/128bpe, it appears that the recommended programming is to always
pick either TileY 32bpe or Linear.  It's not very practical to look at
the surface formats involved, anyway.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27167>
2024-01-25 08:43:04 +00:00