Commit graph

13862 commits

Author SHA1 Message Date
Valentine Burley
09f86df938 intel/ci: Convert iris-kbl-piglit to deqp-runner suite
This was the last job using the piglit-runner.sh script.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>
2025-04-11 07:05:06 +00:00
Lionel Landwerlin
06ad9a25e5 brw: fix Wa_22013689345 emission
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
2 problems :
  - not detecting null destination correctly
  - applied too late using SHADER_OPCODE_MEMORY_FENCE, when lowering
    already happened

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34319>
2025-04-10 16:44:28 +00:00
Lionel Landwerlin
e321c438dc anv: fix self dependency computation
Some upcoming changes in the runtime will make it impossible to rely
on the pipeline or runtime information to know whether a fragment
shader has input attachments.

Instead we gather that information at compile time and store it in our
shader bind_map.

At runtime we check whether the fragment shader has input attachments
and whether those map to the runtime depth/stencil input attachments
to set the 3DSTATE_PS_EXTRA::PixelShaderKillsPixel.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d2f7b6d5a7 ("anv: implement VK_KHR_dynamic_rendering_local_read")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
2025-04-10 13:17:53 +00:00
Paulo Zanoni
fdbdfaed01 anv: add ANV_SYS_MEM_LIMIT for debugging system memory restrictions
If you suspect a workload is failing because it needs more memory, you
can set ANV_SYS_MEM_LIMIT=100 to give it all the memory available.
This could make, for example, certain games start working (it really
depends on how much RAM you have and how much the game wants).

If you suspect a workload is too resource hungry, you can try to limit
it with ANV_SYS_MEM_LIMIT=30 (or some other value) to see if it can
deal with the more restricted environment and behave accordingly.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>
2025-04-09 22:48:18 +00:00
Paulo Zanoni
ec4b2ce664 anv: restore the old behavior of up to 75% of RAM for the system heap
"We paid for sixteen gigs of RAM, so we gonna use the whole damn
   sixteen gigs of RAM!"
     - My Mom

First, some history:

The Anv 50%-or-75% rule was originally added in 2017 by 060a6434ec
("anv: Advertise larger heap sizes"). When i915.ko started reporting
memory sizes in its ioctls, it didn't impose any restrictions: 100% of
SRAM was reported as available, so the restriction was in Mesa. When
xe.ko was introduced, it only reported 50% of the SRAM as available
through its ioctls, so commit b571ae6e7a ("intel: Make memory heaps
consistent between KMDs") adapted the code to not take an extra 25% of
the 50% that was already cut, and restricted i915.ko to 50% instead of
the 50%-or-75%. In Kernel commit d2d5f6d57884 ("drm/xe: Increase the
XE_PL_TT watermark"), xe.ko changed to reporting 100% of SRAM through
its ioctls, so we adapted Mesa to do the right thing depending on
which Kernel version was running.

While this was all happening, we were discussing about which behavior
was actually the best: restrict everything to 50% in order to avoid
issues when many things are running in parallel, or keep the
restriction only at 75% in order to allow high demanding workloads to
make full use of the hardware.

The way I see, if parallel applications are causing the system to run
out of resources, the user always has the option to kill applications
and use one thing at a time. On the other hand, if a single
application needs more than 50% of the SRAM and we don't allow it in
our heaps, the application will never work (unless, of course, the
user patches Mesa). So in this commit we go back to allowing
high-demanding applications to work by restoring the 50%-or-75% rule.

This commit is especially useful in systems with integrated graphics,
like LNL, where the option to upgrade RAM is not present.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>
2025-04-09 22:48:18 +00:00
Paulo Zanoni
02e896bc49 anv/xe: detect the newer xe.ko memory reporting model and act accordingly
Kernel commit d2d5f6d57884 ("drm/xe: Increase the XE_PL_TT watermark")
changed how xe.ko reportes memory: its ioctls now report 100% of the
system RAM as available. Since our policy is to report 50% of the SRAM
as available for the heaps, add some code to check the amount reported
by xe.ko against the amount reported by the system, then act
accordingly.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>
2025-04-09 22:48:18 +00:00
Paulo Zanoni
3db8931d4a intel/i915: restrict the RAM size restrictions to Anv
Before commit b571ae6e7a ("intel: Make memory heaps consistent
between KMDs"), we had the following policy for reporting Sytem RAM
memory sizes:

- For OpenGL, we reported the total available RAM.
- For Vulkan, we reported the total available RAM as:
  - 50% of the total RAM if the total RAM was <= 4GB,
  - 75% otherwise
  - In addition, the Memory Budget (for VK_EXT_memory_budget) is 90%
    of the "free" memory, which can be an extra 10% off of the 50% or
    75%.

When xe.ko was added, one key difference was noted: while i915.ko
reported the "real" RAM memory sizes in its ioctls, xe.ko reported
only 50% of the system RAM as available. Because of that (and other
reasons, see this discussion on MR 28513), commit b571ae6e7a decided
to unify the behavior by changing the Anv i915.ko rule to "always 50%"
instead of "50% or 75%". This also changed the Iris rule to 50%
instead of 100%.

In my research, I couldn't find any reason why this restriction should
also apply to Iris, so here we revert back to handling these size
restrictions on Anv only.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>
2025-04-09 22:48:18 +00:00
Ian Romanick
cb69d019cf brw/nir: Use offset() for all uses of offs in emit_pixel_interpolater_alu_at_offset
This is necessary to appropriately uniformize the first component
access of a convergent vector. Without this, this is produced:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0:F, 0.5f
    add(32) %22:F, %18+2.0<0>:F, 0.5f

This is the correct code:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0<0>:F, 0.5f
    add(32) %22:F, %18+2.0<0>:F, 0.5f

Without 38b58e286f, the code generated was more incorrect, but happened
to work for this test case:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0<0>:F, 0.5f
    add(32) %22:F, %18+0.4<0>:F, 0.5f

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 38b58e286f ("brw/nir: Fix source handling of nir_intrinsic_load_barycentric_at_offset")
Closes: #12969
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34427>
2025-04-09 22:21:18 +00:00
Caleb Callaway
64b5ee3001 intel/tools: fix 32b build for EU stall tool
Fixes: 610ad8d3 ("intel/tools: create intel_monitor for sampling eu stalls")

Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34439>
2025-04-09 21:40:46 +00:00
Caio Oliveira
7457c4ecfd brw: Make brw_range use half-open ranges
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
6509f8139d brw: Use brw_range::last() to explicit get the last valid IP
This is a preparation to change what is stored in brw_range::end.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
596bbb2c95 brw: Use brw_range to store Vars ranges
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
0b4a3c0ff6 brw: Use brw_range to store VGRF ranges
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
e644b42e59 brw: Use brw_range when operating with live ranges
Makes the intention of some comparisons clearer by using the named
helper functions.  Add commentary when the straightforward range is not
the one used, e.g. VGRF interference.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
f56a5cf1eb brw: Use brw_range in IP ranges analysis
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
fb50461220 brw: Add brw_range struct
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:48 +00:00
Caio Oliveira
8d9155e34d brw: Clean up saturate propagation after non-defs version removal
Remove now unused analysis and no need to walk blocks in reverse
after the non-defs version of the pass was removed.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:48 +00:00
Caio Oliveira
cfc4067b0e brw: Add a few basic tests for register coalesce
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:48 +00:00
Tapani Pälli
0750c4c5f1 intel/dev: update mesa_defs.json from internal database
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34430>
2025-04-09 15:44:22 +00:00
Lionel Landwerlin
76096d04bb anv: relax restriction on variable count descriptors
VUID-VkDescriptorSetAllocateInfo-pSetLayouts-09380 says that :

   "If pSetLayouts[i] was created with an element of pBindingFlags
   that includes VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT,
   and VkDescriptorSetVariableDescriptorCountAllocateInfo is included
   in the pNext chain, and
   VkDescriptorSetVariableDescriptorCountAllocateInfo::descriptorSetCount
   is not zero, then
   VkDescriptorSetVariableDescriptorCountAllocateInfo::pDescriptorCounts[i]
   must be less than or equal to
   VkDescriptorSetLayoutBinding::descriptorCount for the corresponding
   binding used to create pSetLayouts[i]"

But applications like are not following the spec. RADV doesn't apply
that limit and allocates if there is enough space in the pool. Let's
just do the same.

Note that this issue got resolved with a vkd3d-proton change :

a7ac1a7d2f

But since this change is deleting more code than it adds, might as
well go with it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12185
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32305>
2025-04-09 16:29:21 +03:00
Lionel Landwerlin
19e4dda9a2 brw: fix shuffle with scalar/uniform index
The fixes commit isn't actually the source of the bug but likely the
biggest enabler because it creates scalar values that more easily end
up in the shuffle operations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1b24612c57 ("brw/nir: Treat load_*_uniform_block_intel as convergent")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12927
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12688
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12570
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12905
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12734
Reviewed-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34393>
2025-04-08 20:14:11 +00:00
Felix DeGrood
610ad8d378 intel/tools: create intel_monitor for sampling eu stalls
Created stand alone tool for sampling gfx data on regular
intervals. Tool has inner loop that performs sampling every N
useconds. Press any key to end sampling. Results will be dumped
when intel_monitor exits.

First application of intel_monitor will be to collect eu stall
data. Perhaps more applications can be added at a later date.

How to use:
 0. Set sysctl dev.xe.observation_paranoid=0
 1. Clean shader cache and launch gfx INTEL_DEBUG=shaders-lineno.
    Redirect stderr to asm.txt.
 2. When gfx app ready to monitor, begin capturing eustall data by
    launching `intel_monitor -e > eustall.csv` in separate console.
 3  When done collected, close intel_monitor by pressing any key.
 4. Correlate eustall data in eustall.csv with shader instructions in
    asm.txt by matching instruction offsets. Use data to determine which
    instructions are stalling and why.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142>
2025-04-08 19:39:53 +00:00
Felix DeGrood
2a828c35a1 intel/perf: add eu stall sampling support
Xe2+ GPUs have support for eu stall sampling perf debug feature.
This feature allows driver to collect count and reasons for why
EUs are stalled on GPU. Stall data is cross referenced with ip
address within individual shaders so it is possible to know which
instructions in which shaders are generating stalls. This should
be a very useful feature for debugging performance of slow shaders.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142>
2025-04-08 19:39:53 +00:00
Felix DeGrood
d6a379f7a7 intel/perf: remove unnused argument from xe_perf_stream_read_error
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142>
2025-04-08 19:39:53 +00:00
Felix DeGrood
a09ddc3b77 anv: add INTEL_DEBUG=shaders-lineno
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142>
2025-04-08 19:39:53 +00:00
Felix DeGrood
7a3de9e877 intel/brw: support for dumping shader line numbers
Add support for dumping shader asm containing instruction line numbers
matching offsets within instruction state pool buffer. Offsets
should match values collected from eu stall sampling. This is
required for match eu stall data with individual shader instructions.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142>
2025-04-08 19:39:53 +00:00
Renato Pereyra
7190949927 perfetto/android: align datasource names with tooling expectations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
A few Android tools are based on/assume the datasource names
gpu.renderstages and gpu.counters. It is less effort to align with that
naming for Android builds than to chase down those tools and fix them,
not to mention account for new tools that may be created in the future.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34330>
2025-04-08 18:29:10 +00:00
Faith Ekstrand
436f175187 intel/compiler: Use nir_split_conversions()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34266>
2025-04-07 17:45:21 -05:00
Caio Oliveira
bf9ad36f2d brw: Properly handle cooperative matrices created with constants
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Expand constant sources to cover the region read by DPAS, and also
use NULL register as accumulator when possible.

Reviewed-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34373>
2025-04-07 14:27:43 -07:00
Collabora's Gfx CI Team
fcf19bf335 Uprev ANGLE to 3818d37d5e94317f01810053b8f28c1f1e8b98e6
1b34d2a18a...3818d37d5e

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34378>
2025-04-07 18:16:00 +00:00
Ian Romanick
f33faa4648 brw/nir: Allow b2f(not(X)) optimization on Gfx12.5+
Since there are no type conversions, no restrictions are violated.

No shader-db or fossil-db changes on any Gfx12 or older Intel
platforms.

shader-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 16956077 -> 16944933 (-0.07%)
instructions in affected programs: 1957573 -> 1946429 (-0.57%)
helped: 4629 / HURT: 35

total cycles in shared programs: 915668518 -> 915684808 (<.01%)
cycles in affected programs: 341925598 -> 341941888 (<.01%)
helped: 3040 / HURT: 1305
helped stats (abs) min: 2 max: 23034 x̄: 205.36 x̃: 16
helped stats (rel) min: <.01% max: 41.21% x̄: 1.28% x̃: 0.48%
HURT stats (abs)   min: 2 max: 68820 x̄: 490.88 x̃: 22
HURT stats (rel)   min: <.01% max: 103.69% x̄: 2.29% x̃: 0.37%
95% mean confidence interval for cycles value: -50.28 57.78
95% mean confidence interval for cycles %-change: -0.35% -0.07%
Inconclusive result (value mean confidence interval includes 0).

LOST:   40
GAINED: 42

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 209828027 -> 209790349 (-0.02%); split: -0.03%, +0.01%
Cycle count: 30504938008 -> 30514045408 (+0.03%); split: -0.06%, +0.09%
Spill count: 512182 -> 512168 (-0.00%)
Fill count: 623432 -> 623426 (-0.00%); split: -0.00%, +0.00%
Max live registers: 65465029 -> 65464959 (-0.00%)

Totals from 57895 (8.19% of 706589) affected shaders:
Instrs: 50144907 -> 50107229 (-0.08%); split: -0.11%, +0.03%
Cycle count: 7549692606 -> 7558800006 (+0.12%); split: -0.25%, +0.37%
Spill count: 58834 -> 58820 (-0.02%)
Fill count: 102324 -> 102318 (-0.01%); split: -0.01%, +0.01%
Max live registers: 9129045 -> 9128975 (-0.00%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
Ian Romanick
853ead2073 brw/nir: Optimize b2f(not(X)) using logical operations instead of arithmetic
Funny story... this is how regular b2f was implemented before Curro
implmented the `MOV dst:F -src:D` method 9 years ago (see
3ee2daf23d).

Eliminating the type conversion in the arithmetic operation enables the
next commit.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
Ian Romanick
3d23496fd9 brw/copy: Copy prop -X into Y&1
This commit prevents code quality regressions in the next
commit. Without this, some fragment shaders in Batman: Arkham Origins
have code like:

    shr(8)          g51<1>UW        g1.28<1,8,0>UB  0x76543210V
    ...
    and(8)          g52<1>UD        ~g51<8,8,1>UW   0x0001UW
    ...
    add(8)          g56<1>D         -g52<8,8,1>D    1D

transformed to

    shr(8)          g51<1>UW        g1.28<1,8,0>UB  0x76543210V
    ...
    and(8)          g52<1>UD        ~g51<8,8,1>UW   0x0001UW
    ...
    mov(8)          g56<1>D         -g52<8,8,1>D
    ...
    and(8)          g57<1>UD        ~g56<8,8,1>D    0x00000001UD

Propagating through the negation allows the added MOV to be deleted.

shader-db:

All Intel platforms had simlar results. (Lunar Lake shown)
total instructions in shared programs: 16968020 -> 16968019 (<.01%)
instructions in affected programs: 281 -> 280 (-0.36%)
helped: 1 / HURT: 0

total cycles in shared programs: 914598850 -> 914598832 (<.01%)
cycles in affected programs: 5398 -> 5380 (-0.33%)
helped: 1 / HURT: 0

A single Blender vertex shader was affected.

fossil-db:

Lunar Lake, Tiger Lake, Ice Lake, and Skylake had similar results. (Lunar Lake shown)
Totals:
Instrs: 209894650 -> 209894651 (+0.00%)
Cycle count: 30545958586 -> 30545952860 (-0.00%)

Totals from 2 (0.00% of 706657) affected shaders:
Instrs: 3582 -> 3583 (+0.03%)
Cycle count: 1875100 -> 1869374 (-0.31%)

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Subgroup size: 9906400 -> 9906416 (+0.00%)

Totals from 2 (0.00% of 805770) affected shaders:
Subgroup size: 16 -> 32 (+100.00%)

Two compute shaders in Hogwarts Legacy were affected.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
Ian Romanick
e82464e6e0 brw/copy: Refactor source modifier type checking
This simplifies the next commit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
Ian Romanick
dee49f4206 brw/algebraic: Optimize derivative of convergent value
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is mostly defensive. If a convergent value ever ended up as a
source of a DDX or DDY, the eu_emit code will ignore the stride. This
will result in bad code being generated.

No shader-db or fossil-db changes on any Intel platform.

v2: DDX and DDY will always be float, but brw_imm_for_type only works
with integer types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Ken
Fixes: d5d7ae22ae ("brw/nir: Fix up handling of sources that might be convergent vectors")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33007>
2025-04-07 17:16:34 +00:00
Ian Romanick
5656682344 brw/nir: Eliminate default parameter to get_nir_src
The vast majority of the callers want channel = 0. During the
development process, using this default parameter value saved a lot of
pain in rebasing. However, it seems to be more trouble than it's worth.

Issue #12464 occurred because LNL was merged while this code was in
review. As a result, one caller of get_nir_src that wanted channel = -1
was not inspected closely, and it got the default channel = 0 instead.

To prevent this happening in the future (with possible branches still
yet to be merged, for example), remove the default parameter. This will
force the inspection of any callers that don't have an explicit channel
parameter. Hopefully that will prevent more problems.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33007>
2025-04-07 17:16:34 +00:00
Ian Romanick
38b58e286f brw/nir: Fix source handling of nir_intrinsic_load_barycentric_at_offset
The source of nir_intrinsic_load_barycentric_at_offset is a vector, so
-1 should be passed to get_nir_src. This is also done for texture
sampling intrinsics.

I skimmed the other user of get_nir_src, and I believe they are
correct. This one was just missed as LNL support landed an many, many
rebases of the original MR occurred.

v2: Fix another get_nir_src call. Suggested by Lionel.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: d5d7ae22ae ("brw/nir: Fix up handling of sources that might be convergent vectors")
Closes: #12464
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33007>
2025-04-07 17:16:34 +00:00
Caio Oliveira
6a55581d41 intel/executor: Fix check for open() failure
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: 71ae31dbd8 ("intel/executor: Allow selecting a device to use")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34400>
2025-04-06 19:43:51 -07:00
Caio Oliveira
9845693912 brw: Fix memory leak in EU validation tests
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: 62323a934b ("brw: Add BRW_TYPE_BF validation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34395>
2025-04-06 06:26:03 +00:00
Caio Oliveira
c33ee4adae brw: Fix invalid memory access in scoreboard test
Fixes: 03aca2d248 ("brw: Use new bld/exp style in scoreboard tests")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34394>
2025-04-05 22:58:23 -07:00
Caio Oliveira
7ae638c0fe brw: Add brw_builder::uniform()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34355>
2025-04-04 23:07:21 +00:00
Caio Oliveira
f33d93da11 brw: Remove HSW specific code from brw_compile_cs.cpp
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34355>
2025-04-04 23:07:21 +00:00
Sushma Venkatesh Reddy
8f90b10b63 intel/tools: Improve memory allocation failure handling in aubinator_error_decode_xe
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Ensure proper cleanup when memory allocation fails during HWCTX and VMA
parsing in `read_xe_data_file`. This ensures graceful error handling by
preventing potential memory leaks.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34371>
2025-04-04 22:09:27 +00:00
Caio Oliveira
03aca2d248 brw: Use new bld/exp style in scoreboard tests
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:53 +00:00
Caio Oliveira
7ee673c195 brw: Add parser of SWSB annotations to use in tests
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:53 +00:00
Caio Oliveira
81dd3e1527 brw: Return actual progress in brw_lower_scoreboard
This will be useful later for tests to be used in conjunction with the
EXPECT_PROGRESS / EXPECT_NO_PROGRESS helpers.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:53 +00:00
Caio Oliveira
3e727000dd brw: Stop setting SFID in scoreboard tests
They won't affect the scoreboard, and will get in the
way of a later change.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:53 +00:00
Caio Oliveira
bcea076aca brw: Use SIMD16 shaders in scoreboard tests for Xe2+
Some tests changed to avoid unintended overlap between operands which
would change the SWSB assigned.  In some cases also changed the Gfx12
matching test so they remain equal.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:52 +00:00
Caio Oliveira
cd486cda48 brw: Use control flow helpers in scoreboard tests
Also update WHILE to optionally take a predicate (default to NONE).  And
make the predicate in the IF optional (default to NORMAL).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:52 +00:00
Lionel Landwerlin
72bc74f0be anv: add shader-hash debug option
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Emits a dummy MI_STORE_DATA_IMM with the shader hash in front of :
   - 3DSTATE_VS
   - 3DSTATE_HS
   - 3DSTATE_DS
   - 3DSTATE_HS
   - 3DSTATE_PS
   - COMPUTE_WALKER / GPGPU_WALKER

Example :

0x00000000:  0x10000002:  MI_STORE_DATA_IMM
0x00000000:  0x10000002 : Dword 0
    DWord Length: 2
    Force Write Completion Check : false
    Store Qword: 0
    Use Global GTT: false
0x00000004:  0xffffe0c0 : Dword 1
    Core Mode Enable: 0
0x00000008:  0x0000effe : Dword 2
    Address: 0xeffeffffe0c0
0x0000000c:  0x126e815a : Dword 3  <------------ shader hash
0x00000010:  0x78100007 : Dword 4
    Immediate Data: 309231962
0x00000000:  0x78100007:  3DSTATE_VS
0x00000000:  0x78100007 : Dword 0
    DWord Length: 7
0x00000004:  0x00000000 : Dword 1
0x00000008:  0x00000000 : Dword 2
    Kernel Start Pointer: 0x00000000
0x0000000c:  0x00040000 : Dword 3
    Software Exception Enable: false
    Accesses UAV: false

It'll correlate with the value emitted in the pipeline stats from fossil replay :

  $ grep -i 126e815a /tmp/stats.csv
  fossilize.aab93c5c3f965151.1.foz,GRAPHICS,de1b925dec8a8083,507378,498283,303434,vertex,8,50,4,0,1826,0,0,0,8,17,0,0x00000000126e815a,15

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34332>
2025-04-04 15:18:28 +00:00