The user must have used INTEL_FORCE_PROBE to force the device to be
loaded, so they specifically opted in to enabling unsupported device
support.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31011>
The intention here was to include the common intel_gem.h to
get the intel_ioctl() signature.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31026>
Placing mtl-fw.json in src/intel/ci/mtl-fw.json works for the
mesa build, but it fails to fetch in drm-ci. Move it to the
.gitlab-ci directory so it is included in the artifacts used
for building the kernel/rootfs in drm-ci.
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30947>
This surfaced after recent improvements to SWSB handling; the previous
assembly code was gracefully lowering the $1 into $1.dst.
Fixes: 37674196221 ("intel: Add executor tool")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30960>
Literally inside an if-statement (about 26 lines before this hunk)
that checks for !nir_src_is_const(instr->src[1]).
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
This prevents some regressions later in the MR. Once load_const
operations are marked as is_scalar, they will cease to get the
automatic constant propagation that occurs in try_rebuild_source.
No shader-db or fossil-db changes on any Intel platform.
v2: Slightly relax source restrictions on
SHADER_OPCODE_UNALIGNED_OWORD_BLOCK_READ_LOGICAL. Add a comment
explaining the restriction.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
The is_partial_write check is too strict because it tests two separate
things. It tests whether or not the instruction always writes a value
(i.e., is it predicated), and it tests whether or not the instruction
writes a complete register. This latter check is problematic as it
prevents cmod propagation in SIMD1, and it prevents cmod propagation in
SIMD8 when the destination size is 16 bits.
This check is unnecessary. Cmod propagation already checks that the
region written and region read overlap. It also already checks that the
execution sizes of the instructions match. Further restriction based on
the specific parts of the register written only generates false
negatives.
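
As a rough illustration of the two conditions bundled into the check,
here is a simplified model. The struct, the field names, and the
32-byte register size are assumptions made for this sketch and are not
the actual compiler code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified instruction model; not the real fs_inst. */
#define SKETCH_REG_SIZE 32u /* assumed register size in bytes */

struct sketch_inst {
   bool predicated;        /* may skip the write on some channels */
   unsigned exec_size;     /* SIMD width: 1, 8, 16, ... */
   unsigned dst_type_size; /* destination type size in bytes */
};

/* Condition 1: the instruction does not always write a value. */
static bool
sketch_is_predicated_write(const struct sketch_inst *inst)
{
   return inst->predicated;
}

/* Condition 2: the write covers less than a complete register. */
static bool
sketch_writes_less_than_register(const struct sketch_inst *inst)
{
   assert(inst->exec_size > 0 && inst->dst_type_size > 0);
   return inst->exec_size * inst->dst_type_size < SKETCH_REG_SIZE;
}

/* The combined test: true if either condition holds. */
static bool
sketch_is_partial_write(const struct sketch_inst *inst)
{
   return sketch_is_predicated_write(inst) ||
          sketch_writes_less_than_register(inst);
}
```

In this model an unpredicated SIMD8 write of a 16-bit type covers
8 * 2 = 16 bytes, so it is flagged as partial even though every
channel is written.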
v2: Relax all of the calls to is_partial_write. Suggested by Caio.
No shader-db changes on any Intel platform.
fossil-db:
Meteor Lake
Totals:
Instrs: 151505520 -> 151502923 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17201385104 -> 17194901423 (-0.04%); split: -0.06%, +0.02%
Spill count: 80827 -> 80837 (+0.01%)
Fill count: 152693 -> 152692 (-0.00%); split: -0.01%, +0.01%
Totals from 346 (0.05% of 630198) affected shaders:
Instrs: 1257205 -> 1254608 (-0.21%); split: -0.21%, +0.00%
Cycle count: 5532845647 -> 5526361966 (-0.12%); split: -0.18%, +0.06%
Spill count: 32903 -> 32913 (+0.03%)
Fill count: 64338 -> 64337 (-0.00%); split: -0.03%, +0.03%
DG2
Totals:
Instrs: 151531440 -> 151528055 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17200238927 -> 17197996676 (-0.01%); split: -0.03%, +0.02%
Spill count: 81003 -> 80971 (-0.04%); split: -0.04%, +0.00%
Fill count: 152975 -> 152912 (-0.04%); split: -0.05%, +0.01%
Totals from 346 (0.05% of 630198) affected shaders:
Instrs: 1260363 -> 1256978 (-0.27%); split: -0.27%, +0.00%
Cycle count: 5532019670 -> 5529777419 (-0.04%); split: -0.09%, +0.05%
Spill count: 33046 -> 33014 (-0.10%); split: -0.11%, +0.01%
Fill count: 64581 -> 64518 (-0.10%); split: -0.13%, +0.03%
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149972324 -> 149972289 (-0.00%)
Cycle count: 15566495293 -> 15565151171 (-0.01%); split: -0.01%, +0.00%
Totals from 16 (0.00% of 629912) affected shaders:
Instrs: 351194 -> 351159 (-0.01%)
Cycle count: 3922227030 -> 3920882908 (-0.03%); split: -0.04%, +0.00%
Skylake
Totals:
Instrs: 140787999 -> 140787983 (-0.00%); split: -0.00%, +0.00%
Cycle count: 14665614947 -> 14665515855 (-0.00%); split: -0.00%, +0.00%
Spill count: 58500 -> 58501 (+0.00%)
Fill count: 102097 -> 102100 (+0.00%)
Totals from 16 (0.00% of 625685) affected shaders:
Instrs: 343560 -> 343544 (-0.00%); split: -0.01%, +0.01%
Cycle count: 3354997898 -> 3354898806 (-0.00%); split: -0.01%, +0.01%
Spill count: 16864 -> 16865 (+0.01%)
Fill count: 27479 -> 27482 (+0.01%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
Without this, the next commit triggers assertions.
v2: Unconditionally do the lowering after brw_nir_optimize. Suggested by
Caio.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
The specific pattern from the unit test was observed in ray tracing
trampoline shaders.
v2: Refactor the is_raw_move tests out to a utility function. Suggested
by Ken.
v3: Fix a regression caused by being too picky about source
modifiers. This was introduced somewhere between when I did initial
shader-db runs and v2.
v4: Fix typo in comment. Noticed by Caio.
shader-db:
All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19734086 -> 19733997 (<.01%)
instructions in affected programs: 135388 -> 135299 (-0.07%)
helped: 76 / HURT: 2
total cycles in shared programs: 916290451 -> 916264968 (<.01%)
cycles in affected programs: 41046002 -> 41020519 (-0.06%)
helped: 32 / HURT: 29
fossil-db:
Meteor Lake, DG2, and Skylake had similar results. (Meteor Lake shown)
Totals:
Instrs: 151531355 -> 151513669 (-0.01%); split: -0.01%, +0.00%
Cycle count: 17209372399 -> 17208178205 (-0.01%); split: -0.01%, +0.00%
Max live registers: 32016490 -> 32016493 (+0.00%)
Totals from 17361 (2.75% of 630198) affected shaders:
Instrs: 2642048 -> 2624362 (-0.67%); split: -0.67%, +0.00%
Cycle count: 79803066 -> 78608872 (-1.50%); split: -1.75%, +0.25%
Max live registers: 421668 -> 421671 (+0.00%)
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149995644 -> 149977326 (-0.01%); split: -0.01%, +0.00%
Cycle count: 15567293770 -> 15566524840 (-0.00%); split: -0.02%, +0.01%
Spill count: 61241 -> 61238 (-0.00%)
Fill count: 107304 -> 107301 (-0.00%)
Max live registers: 31993109 -> 31993112 (+0.00%)
Totals from 17813 (2.83% of 629912) affected shaders:
Instrs: 3738236 -> 3719918 (-0.49%); split: -0.49%, +0.00%
Cycle count: 4251157049 -> 4250388119 (-0.02%); split: -0.06%, +0.04%
Spill count: 28268 -> 28265 (-0.01%)
Fill count: 50377 -> 50374 (-0.01%)
Max live registers: 470648 -> 470651 (+0.00%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
In commit f900b763b1 we started to dirty MS when WM changes. However,
things later changed with eebb6cd236, so we now need to dirty it with
BLEND_STATE instead.
Fixes: eebb6cd236 ("anv: stop using 3DSTATE_WM::ForceThreadDispatchEnable")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30920>
Use the CLAMP macro to clamp the value and simplify the sampler count
encoding.
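
A minimal sketch of the idea, assuming the field encodes the sampler
count in groups of four with an upper bound of 16. That encoding, the
macro names, and the helper name are assumptions for illustration and
are not taken from this change.

```c
#include <assert.h>

/* Local stand-ins for the usual clamp and round-up helpers. */
#define SKETCH_CLAMP(x, lo, hi) ((x) > (lo) ? ((x) > (hi) ? (hi) : (x)) : (lo))
#define SKETCH_DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: clamp the count first, then encode it in units
 * of four samplers (assumed encoding). */
static unsigned
sketch_encode_sampler_count(unsigned sampler_count)
{
   unsigned clamped = SKETCH_CLAMP(sampler_count, 0u, 16u);
   assert(clamped <= 16u);
   return SKETCH_DIV_ROUND_UP(clamped, 4u);
}
```

Clamping up front removes the need for a separate branch to handle
counts above the maximum.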
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30922>
We could be trying to extract a D/UD from a Q/UQ, for example. We were
ignoring the top 32 bits, which is incorrect.
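
To show why the top half matters, here is a small sketch of pulling a
32-bit value out of a 64-bit one at a given byte offset. The helper
name is hypothetical and little-endian element ordering is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read the dword at byte offset 0 or 4 of a
 * 64-bit value, assuming little-endian element ordering. */
static uint32_t
sketch_extract_dword(uint64_t qword, unsigned byte_offset)
{
   assert(byte_offset == 0 || byte_offset == 4);
   return (uint32_t)(qword >> (byte_offset * 8));
}
```

Offset 4 selects the upper dword, which a 32-bit-only view of the
value would never see.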
Fixes: 580e1c592d ("intel/brw: Introduce a new SSA-based copy propagation pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30884>
This function never expands a type - it only narrows it. As such, we
don't need to ever sign extend to fill additional new bits. I think
this code was left over from earlier versions of my optimization pass
that were buggy and tried to handle cases they should not have.
Fixes: 580e1c592d ("intel/brw: Introduce a new SSA-based copy propagation pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30884>
When INTEL_DEBUG=ann is also set, the disassembler would annotate the
output with either a string or the string version of a NIR instruction.
This was done by keeping two pointers (but only using one at a time).
Change the code to print the instruction into a string instead of
keeping its pointer around (the string is pegged to the shader). That way,
only one pointer is needed for annotations. Because that serialization
is not free, only do that when the environment variable is set.
Since we are here, move the annotation string field to the end, moving
it to the least commonly used cacheline. Further packing might allow
the entire fs_inst to fit in two cachelines.
For release builds, don't even add the debug annotation to the struct.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30822>
The alignment required for the second union (has 64-bit size) causes
a hole between the first and second union. Move the remaining data
there.
In a 64-bit build, this shrinks brw_reg from 24 bytes to 16 bytes.
Consequently, it shrinks fs_inst from 200 bytes to 160 bytes, making it
use one less cacheline.
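
The effect can be reproduced with a pair of hypothetical layouts. These
structs are for illustration only and are not the real brw_reg; the
sizes assume an ABI where uint64_t and double are 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Before: the 8-byte alignment of 'second' leaves a 4-byte hole after
 * 'first', and the trailing fields force 4 bytes of tail padding. */
struct sketch_reg_before {
   union { uint32_t ud; float f; } first;     /* offset 0, 4 bytes */
   union { uint64_t u64; double df; } second; /* offset 8, 8 bytes */
   uint16_t offset;                           /* offset 16 */
   uint16_t stride;                           /* offset 18 */
};

/* After: the trailing fields are moved into the former hole. */
struct sketch_reg_after {
   union { uint32_t ud; float f; } first;     /* offset 0 */
   uint16_t offset;                           /* offset 4 */
   uint16_t stride;                           /* offset 6 */
   union { uint64_t u64; double df; } second; /* offset 8 */
};

/* Bytes saved per register by the reordering. */
static size_t
sketch_bytes_saved(void)
{
   return sizeof(struct sketch_reg_before) - sizeof(struct sketch_reg_after);
}
```

Assuming 64-byte cachelines, 200 bytes span four lines while 160 bytes
fit in three.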
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30822>
Introduce skqp testing on the ADL generation. Only one pre-merge job is
needed, with no fraction, so there is no need to set up a job for nightly runs.
Introduce the initial expectation files with fails, flakes, and skips.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26831>
Introduce testing coverage for ANGLE on the ANV driver on the ADL generation.
One job runs the pre-merge fraction, and another runs the full coverage on the
nightly runs. Introduce the initial expectation files with fails, flakes, and skips.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26831>
Introduce testing coverage for the ANV driver on the ADL generation. The
pre-merge fraction is sharded into 4 jobs, and the full coverage on the nightly
runs into 5 jobs. Introduce the initial expectation files with fails, flakes, and skips.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26831>
Introduce a new runner tag from a hidden job for ADL (Alder Lake Intel
generation), known as brya.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26831>
The LOAD/STORE opcodes take a vector size, while the LOAD/STORE_CMASK
opcodes take a channel mask. The two are mutually exclusive. So we
can just have the lsc_msg_desc() helper take one or the other in the
same parameter. This more closely matches the actual descriptor.
We couldn't do this until the previous commit, since we were previously
relying on the lsc_msg_desc() function to calculate a cmask out of the
number of vector components. But now we don't need it to do that.
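
A sketch of a descriptor helper that accepts either value through one
parameter. The enum, the function name, and the bit positions are
assumptions for illustration, not the real lsc_msg_desc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opcode values, for illustration only. */
enum sketch_lsc_op {
   SKETCH_OP_LOAD,
   SKETCH_OP_STORE,
   SKETCH_OP_LOAD_CMASK,
   SKETCH_OP_STORE_CMASK,
};

static bool
sketch_op_has_cmask(enum sketch_lsc_op op)
{
   return op == SKETCH_OP_LOAD_CMASK || op == SKETCH_OP_STORE_CMASK;
}

/* 'components_or_cmask' is an encoded vector size for LOAD/STORE and a
 * channel mask for the CMASK opcodes. Both land in the same place in
 * the descriptor (assumed here to start at bit 12). */
static uint32_t
sketch_msg_desc(enum sketch_lsc_op op, uint32_t components_or_cmask)
{
   uint32_t desc = (uint32_t)op & 0x3fu; /* assumed opcode field */

   if (sketch_op_has_cmask(op))
      assert(components_or_cmask <= 0xfu); /* assumed 4-bit mask */
   else
      assert(components_or_cmask <= 0x7u); /* assumed 3-bit size */

   return desc | (components_or_cmask << 12);
}
```

Because the two interpretations never apply to the same opcode, only
the range check differs between them.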
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30632>
The LOAD/STORE opcodes take a vector size (number of components), while
the LOAD/STORE_CMASK opcodes take a channel mask. For some reason, we
were passing a number of channels to lsc_msg_desc(), then using it to
construct a channel mask with all channels enabled, and always using the
CMASK message variants.
Considering we don't actually want to mask off any channels, we should
probably just use the regular LOAD/STORE opcodes, as they're more
flexible anyway.
One exception is that typed messages on Xe2 apparently only support
LOAD_CMASK/STORE_CMASK and not regular LOAD/STORE. So we keep using
those there. (Thanks to Sagar Ghuge for catching this!)
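
For reference, a mask with all channels enabled for a given number of
components can be built as below. The helper name and the limit of
four components are assumptions for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: set the low 'num_components' bits so that every
 * channel is enabled. */
static uint32_t
sketch_all_channels_mask(unsigned num_components)
{
   assert(num_components >= 1 && num_components <= 4);
   return (1u << num_components) - 1u;
}
```

Such a mask never disables a channel, so it carries no more
information than the component count itself.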
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30632>
When barriers are used in invalid shaders with non-uniform control flow,
we might get a hang. Forcing a 32-wide group can help by making it more
probable that the barrier instruction is executed by at least one channel
in each thread, so the hang is avoided. This shouldn't affect
Xe2+, where active-thread-only barriers are used anyway.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11497
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>
If p_atomic_cmpxchg doesn't set ray_query_shadow_bos[bucket] to the new_bo
allocated by this thread, it returns the bucket BO allocated by the other
thread and we use it. But due to a mistake, we also released that BO instead
of the candidate that this thread just allocated and will never use again.
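
The intended pattern can be sketched with C11 atomics. The types and
helper names below are hypothetical stand-ins, not the driver code, and
the live counter is not thread-safe; it exists only to make the leak
visible in a single-threaded check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object. */
struct sketch_bo { int id; };

static int sketch_live_bos; /* allocations not yet freed (demo only) */

static struct sketch_bo *
sketch_bo_alloc(int id)
{
   struct sketch_bo *bo = malloc(sizeof(*bo));
   assert(bo != NULL);
   bo->id = id;
   sketch_live_bos++;
   return bo;
}

static void
sketch_bo_free(struct sketch_bo *bo)
{
   free(bo);
   sketch_live_bos--;
}

/* Try to publish 'candidate' in an empty slot and return the BO that
 * ends up shared. */
static struct sketch_bo *
sketch_publish(_Atomic(struct sketch_bo *) *slot, struct sketch_bo *candidate)
{
   struct sketch_bo *expected = NULL;
   if (atomic_compare_exchange_strong(slot, &expected, candidate))
      return candidate; /* we won: our BO is now the shared one */

   /* We lost: 'expected' now holds the winner's BO. Free our unused
    * candidate, never the winner's BO that others are using. */
   sketch_bo_free(candidate);
   return expected;
}
```

Freeing 'expected' instead of 'candidate' in the losing path would
both leak the candidate and release a BO that is still in use.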
Fixes: 5d3e4193 ("anv: enable ray queries")
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>
Truncation is needed to overwrite correctly when the old file is
bigger than the one we want to dump (e.g. when the old one was edited
in place). Also, the creation permissions are way too broad.
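
A minimal POSIX sketch of both points. The helper names and the 0644
mode are assumptions for illustration; the actual flags and mode used
by the fix are not restated here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write 'size' bytes to 'path', discarding any
 * previous contents. O_TRUNC drops leftover bytes from a longer old
 * file; 0644 is an assumed, narrower creation mode. */
static bool
sketch_dump_file(const char *path, const void *data, size_t size)
{
   int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
   if (fd < 0)
      return false;

   const char *p = data;
   while (size > 0) {
      ssize_t n = write(fd, p, size);
      if (n <= 0) {
         close(fd);
         return false;
      }
      p += n;
      size -= (size_t)n;
   }
   return close(fd) == 0;
}

/* Hypothetical helper: file size in bytes, or -1 on error. */
static long
sketch_file_size(const char *path)
{
   struct stat st;
   return stat(path, &st) == 0 ? (long)st.st_size : -1;
}
```

Without O_TRUNC, writing a 3-byte dump over a 10-byte file would leave
the last 7 old bytes in place.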
Fixes: 4f41c44d ("intel/compiler: Add variable to dump binaries of all compiled shaders")
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>
A crash will happen if the client tries to use ray queries without
enabling the KHR_ray_query extension. Add an assert to catch
this sooner.
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30581>
The benchmarks we're tracking tend to prefer clearing depth buffers to
0.0f when the depth buffers are part of images with multiple aspects.
Otherwise, they tend to prefer clearing depth buffers to 1.0f.
Replace the ANV_HZ_FC_VAL constant with a function which implements this
heuristic.
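
A sketch of what such a function could look like. The aspect bits and
the function name are hypothetical; real code would work on the
driver's image type and Vulkan aspect flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical aspect bits, for illustration only. */
#define SKETCH_ASPECT_DEPTH   (1u << 0)
#define SKETCH_ASPECT_STENCIL (1u << 1)

/* Pick the preferred depth clear value from the image's aspects. */
static float
sketch_hiz_fast_clear_value(uint32_t image_aspects)
{
   assert(image_aspects & SKETCH_ASPECT_DEPTH);

   /* More than one bit set means the image has multiple aspects. */
   const int multiple_aspects = (image_aspects & (image_aspects - 1)) != 0;
   return multiple_aspects ? 0.0f : 1.0f;
}
```

A depth-only image yields 1.0f, while a combined depth/stencil image
yields 0.0f.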
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30767>