The code in question multiplies `uint32_t`s together and assigns them to
a `uint64_t`. It seems rather unlikely at there would be an overflow,
but we might as well do the cast.
CID: 1649587
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38289>
Our exposed limits say we shouldn't be able to, but let's add an assert
in case something changes, and to help Coverity out.
CID: 1662103
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37583>
When available, query and use explicit layout info otherwise fallback to
implicit layout with tiling query.
This fixes aligned layouts of multi-planar formats that were getting
misaligned when adding surfaces with implicit layouts.
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38039>
When EXT_descriptor_buffer is not enabled, we can assume we're in
legacy descriptor mode and not do any switching for secondaries.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38256>
We skip the stall emission for STATE_BASE_ADDRESS since this one can
be skipped on Gfx12.5+ and instead add a new sba tracepoint that has
valid timestamps.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0147908a89 ("anv: predicate emission of STATE_BASE_ADDRESS")
Reviewed-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38256>
Some new CTS tests have geometry shader looking like this :
void main()
{
gl_Position = gl_in[0].gl_Position;
EmitVertex();
EndPrimitive();
// <-- some storage buffer write
}
The generate shader has :
- a message to write the position
- a message to write to the storage buffer
- a final message to end the thread
This generates an empty EOT URB messages which is apparently not legal
(simulation complains, HW hangs) :
send(8) nullUD g126UD nullUD 0x04088007 0x00000000
urb MsgDesc: offset 0 SIMD8 write masked mlen 2 ex_mlen 0 rlen 0 { align1 1Q A@1 EOT };
Instead emit a write with actual data and the mask set at 0 to discard
the effect :
mov(8) g127<1>UD 0x00000000UD { align1 WE_all 1Q };
mov(8) g125<1>UD 0x00000000UD { align1 1Q };
send(8) nullUD g126UD g125UD 0x04088007 0x00000040
urb MsgDesc: offset 0 SIMD8 write masked mlen 2 ex_mlen 1 rlen 0 { align1 1Q A@1 EOT };
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38243>
pps initializes perf counter multiple times, once from
GpuDataSource::register_data_source and once from
GpuDataSource::OnSetup. This is fine, except we should replace
failing assert with skip on second call.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38224>
Note that currently autostrip is disabled globally with
Wa_14021490052 for some gfx versions and steppings.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37975>
These registers need to be whitelisted by kernel so that we can use
it to disable autostrip at will. This is about Wa_14024997852.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37975>
Coverity notices that in this case part of the decision to go down the
locked path invovles reading a flag, which is turn set inside the
protected code. Additional threads can decide they need to go down the
locked path, and then wait on the lock, even though the first thread
willse the device_registered to true, which would have otherwise
prevented them from going down the locked path.
What Coverity doesn't know, is that it is a violation of the Vulkan API
contract to call this function from two different threads, so in
practice that cann't happen. We'll move the setting of the
image_device_registered out of the locked area to see if that pacifies
Coverity, and if not then we'll just ignore it.
CID: 1662067
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37582>
No shader-db changes on any Intel platform.
fossil-db:
Skylake
Intel(R) HD Graphics 530 (SKL GT2)
Totals:
Cycle count: 57669758527 -> 57669757913 (-0.00%); split: -0.00%, +0.00%
Totals from 10 (0.00% of 1736875) affected shaders:
Cycle count: 274949 -> 274335 (-0.22%); split: -0.36%, +0.14%
This change is likely due to subtle differences of different registers
being allocated.
In addition, fossils/google-meet-clvk/BgBlur.1f58fdf742c27594.1.foz and
fossils/google-meet-clvk/Relight.1f58fdf742c27594.1.foz stopped failing
EU validation on Gfx9 platforms.
Closes: #14171
Fixes: e7b7d572b3 ("intel/fs/ra: Re-arrange interference setup")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38122>
I'm not really sure why Coverity doesn't tag the `delete[]` as a
potential leak since it also happens after ASSERT macros, like it did
with the call to `fclose()`.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37744>
Coverity points out that if the asserts fail, then the file won't be
closed, and therefore wont be deleted. We'd like to avoid littering the
temp directory with useless files.
This uses GTEST's `TEST_F` feature with a custom class to manager the
creation and destruction of the tmpfile.
CID: 1666502
CID: 1666525
CID: 1666579
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37744>
Slight differences due to different optimization order.
Totals from 135 (0.17% of 79839) affected shaders: (Navi48)
Instrs: 287852 -> 287527 (-0.11%); split: -0.15%, +0.03%
CodeSize: 1522972 -> 1521764 (-0.08%); split: -0.12%, +0.04%
Latency: 1806803 -> 1825754 (+1.05%); split: -0.08%, +1.12%
InvThroughput: 242693 -> 244703 (+0.83%); split: -0.02%, +0.84%
VClause: 4092 -> 4084 (-0.20%)
SClause: 7462 -> 7478 (+0.21%)
Copies: 20509 -> 20401 (-0.53%); split: -0.74%, +0.21%
Branches: 6395 -> 6386 (-0.14%)
PreSGPRs: 7334 -> 7337 (+0.04%); split: -0.03%, +0.07%
PreVGPRs: 6375 -> 6382 (+0.11%)
VALU: 151787 -> 151595 (-0.13%); split: -0.15%, +0.02%
SALU: 52967 -> 52910 (-0.11%); split: -0.23%, +0.12%
VMEM: 6704 -> 6696 (-0.12%)
SMEM: 12099 -> 12129 (+0.25%)
Tested on a small collection of 2518 shaders from Dredge with callgrind using RADV:
baseline:
nir_opt_algebraic was called 12917 times from radv_optimize_nir()
nir_opt_cse was called 15204 times from radv_optimize_nir()
relative time spent in radv_optimize_nir(): 31.48%
total instruction fetch cost: 28,642,638,021
with nir/algebraic: ad-hoc constant-fold ALU instructions
nir_opt_algebraic was called 12797 times from radv_optimize_nir()
nir_opt_cse was called 12963 times from radv_optimize_nir()
relative time spent in radv_optimize_nir(): 30.63%
total instruction fetch cost: 28,284,386,123
=> ~1.27% improvement in total compile times
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
In Gfx9+ the destination should be set to ARF null in all those cases, the
use of IP was a requirement of old versions only. The already zeroed
bits will encode ARF null, so no need to set.
Skipping the helper avoids setting unwanted bits (like hstride), which
in Gfx12+ are MBZ.
This patch adjust the expectations of the asm tests to remove the dst
type and dst stride fields -- will expect them all zeroed.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36454>
This is better than using the generic helper since will not set unwanted
bits (e.g. hstride) and it is already handling their case for Gfx12+
anyway.
There's an extra helper now for the case where src1 is not used. In
Gfx9-11 it needs to be set to ARF but with a matching type of src0.
Assembler was updated to follow the same approach.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36454>
Even though some platforms support int64 they don't support indirect
movs with 64-bit values. Effectively this is only supported for non-LP
Gfx9.
This fixes various tests in dEQP-VK.spirv_assembly.instruction.compute.untyped_pointers.*.push_constant.*64*
on BMG.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38125>
Use same approach as the other code checking for this vstride. Argument
could be made we want to reuse the same enum value for both the encoded
and decoded version, but for now follow the existing practice.
This will cause
dEQP-VK.spirv_assembly.instruction.compute.untyped_pointers.vulkan_memory_model.type_punning.load.push_constant.int64_to_uint64
and similar tests to fail validation on BMG. Later patch will fix that.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38125>
Replace direct FILE* operations (fputs/fprintf to stdout) with the
mesa_log_stream API for pipe control debug output.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38157>
The compressed and uncompressed Tile4 modifiers are supported on Xe2+.
The uncompressed TileY and Tile4 modifiers are easily supported on older
platforms.
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38095>
Intel & AMD Direct3D drivers modify their rounding behaviour for texturing to
match Direct3D expectations. Such behaviour is not conformant in Vulkan, and
Intel hardware lacks a reasonable way to get NVIDIA's behaviour (which uniquely
works for Vulkan & Direct3D). The second best choice is to use
Direct3D-compatible behaviour for Proton (via driconf) and our current
Vulkan-conformant behaviour everywhere else. Given the APIs diverge and there is
no Vulkan extension to control the behaviour explicitly, driconf'ing on the
engineName is the reasonable solution.
anv already has a anv_force_filter_addr_rounding driconf option to force
Direct3D behaviour for certain Direct3D titles. Here we simply apply it to all
D3D10+ titles, aligning us with the Windows driver.
Note that D3D9 does not have this behaviour. We therefore use standard Vulkan
behaviour for D3D9 to avoid breaking D3D9 titles, even though the engineName is
the same as D3D10+.
This is the same solution radv uses, they call it radv_disable_trunc_coord. We
could unify the driconf entries later.
See https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38098#note_3166306
for a more detailed analysis, as well as the linked references:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27337https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25911https://github.com/HansKristian-Work/vkd3d-proton/pull/1884
This fixes misrendering in piles of Direct3D games run on anv via Proton,
including Assassin's Creed Valhalla.
Cc: mesa-stable
Closes: #13886
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Co-authored-by: Calder Young <cgiacun@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38114>
Remove two unused fields, and move a lonely boolean a bit up to plug the
remaining hole.
Because I was looking around and it bothered me.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38116>