That way we can look at the SBT entries for debug purposes.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17908>
v2: Drop ANV_PIPE_UNTYPED_DATAPORT_CACHE_FLUSH_BIT from invalidate bits (Lionel)
Add utrace support
Expand on comment about PIPE_CONTROL::UntypedDataPortCache
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16905>
Set write back L1 cache policy in STATE_BASE_ADDRESS instruction for A64
messages.
v2: Also set the value in genX_state.c (Lionel)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16905>
For a RW L1 cache, both reads and writes are cached in the L1, at high
priority (MRU position). For a RO L1 cache, reads are cached at higher
priority and writes bypass the cache.
v1: (Ken)
- Set caching policy for buffer surfaces too
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16905>
We're missing a condition that is currently papered over by having
ANV_PIPE_HDC_PIPELINE_FLUSH_BIT in the invalidate bits.
v2: rework with simplication (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16905>
This improves clover from crashing to just failing, but I mainly
want it this to cleanup the nir code first
It's also important the shaders coming from the state tracker
for feedback get images lowered when they are draw shaders now.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10641>
Wa_14010455700 is dependent on the format and sample count, but our
code to track whether or not it had been applied was only dependent on
the format.
As a result, we failed to enable the workaround when an app used a D16
2xMSAA buffer, then a D16 1xMSAA buffer right afterwards.
Make the workaround tracking code sample-dependent to fix this.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17859>
Bspec 53421 says:
"A URB fence memory is typically performed prior the thread
exit message, so that the next thread dispatch that reads
that URB memory will see it."
Cc: 22.1 <mesa-stable>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16665>
Found by code inspection.
There's an assert later checking that we haven't overflown
this array, so this change probably doesn't matter for any
workload.
Cc: 22.1 <mesa-stable>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16665>
This allows the software scoreboarding pass, scheduler, and so on
to handle the individual instructions and handle them, rather than
trusting in the generator to do scoreboarding correctly when expanding
the virtual instruction to multiple actual instructions.
By using SHADER_OPCODE_READ_SR_REG, we also correctly handle the
software scoreboarding workaround when reading DMask/VMask.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17530>
The hwconfig api may change unexpectedly prior to public release of
new platforms. Also, public documentation of the hwconfig api
sometimes lags the release.
For these reasons, warnings about unhandled hwconfig keys are noisy,
likely to occur, and unhelpful to most users. This commit drops those
warnings, in favor of a separate internal process for tracking
hwconfig api changes.
Suggested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17846>
This is a missed format to properly support media interop for Android.
Currently only used when layering GL atop Vulkan on Android, but will
be used directly with Vulkan when the platform default renderer has
switched to skiavk in modern Android.
Test: CtsMediaTestCases and CtsVideoTestCases with angle on venus on anv
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17808>
When the compiler pads a data structure, the padded bytes will not be
initialized. Shader keys are compared with memcmp and unitialized
bytes within the structure breaks this mechanism.
Explicitly pad the structures with members, so the compiler is forced
to initialize them. Add a warning to indicate if a change to
alignment in any of the data structures requires additional padding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17749>
Fixes the following EU validation error:
ERROR: Header must be present for all URB messages.
The message header is ignored for URB fence messages, so I doubt that
this actually matters in practice. But we should probably mark it as
present, because you have to send something, and according to the
documentation, there is a message header, it's just ignored.
Fixes: e6a9501aa2 ("intel/fs: Add the URB fence message")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17624>
When this rule started causing issues, I looked it up in the
documentation, and found the rule for 64-bit destinations and
integer DWord multiplication, but there was no mention of floating
point destinations, as the text in brackets suggested. The actual
restriction text had been updated, so this led to some confusion
where I thought the conditions had been changed in newer docs.
However, what's actually going on is that there are two separate
conditions, each listed in separate rows of the table. One lists
64-bit destinations or integer DWord multiplication, and the other
mentions floating-point destinations. In both cases, the actual
restrictions are identical, so we handle them together in the code.
Try to update the comment to avoid future confusion.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17624>
Recently, we started using <1;1,0> register regions for consecutive
channels, rather than the <8;8,1> we've traditionally used, as the
<1;1,0> encoding can be compacted on XeHP. Since then, one of the
EU validator rules has been flagging tons of instructions as errors:
mov(16) g114<1>F g112<1,1,0>UD { align1 1H I@2 compacted };
ERROR: Register Regioning patterns where register data bit locations are changed between source and destination are not supported except for broadcast of a scalar.
Our code for this restriction checked three things:
#1: vstride != width * hstride ||
#2: src_stride != dst_stride ||
#3: subreg != dst_subreg
Destination regions are always linear (no replicated values, nor
any overlapping components), as they only have hstride. Rule #1 is
requiring that the source region be linear as well. Rules #2-3 are
straightforward: the subregister must match (for the first channel to
line up), and the source/destination strides must match (for any
subsequent channels to line up).
Unfortunately, rules #1-2 weren't working when horizontal stride was 0.
In that case, regions are linear if width == 1, and the stride between
consecutive channels is given by vertical stride instead.
So we adjust our src_stride calculation from
src_stride = hstride * type_size;
to:
src_stride = (hstride ? hstride : vstride) * type_size;
and adjust rule #1 to allow hstride == 0 as long as width == 1.
While here, we also update the text of the rule to match the latest
documentation, which apparently clarifies that it's the location of
the LSB of the channel which matters.
Fixes: 3f50dde8b3 ("intel/eu: Teach EU validator about FP/DP pipeline regioning restrictions.")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17624>
When the EU validator encountered an error, it would add an annotation
to the disassembly. Unfortunately, the code to insert an error assumed
that the next instruction would start at (offset + sizeof(brw_inst)),
which is not true if the instruction with an error is compacted.
This could lead to cascading disassembly errors, where we started trying
to decode the next instruction at the wrong offset, and getting lots of
scary looking output:
ERROR: Register Regioning patterns where [...]
(-f0.1.any16h) illegal(*** invalid execution size value 6 ) { align1 $7.src atomic };
(+f0.1.any16h) illegal.sat(*** invalid execution size value 6 ) { align1 $9.src AccWrEnable };
illegal(*** invalid execution size value 6 ) { align1 $11.src };
(+f0.1) illegal.sat(*** invalid execution size value 6 ) { align1 F@2 AccWrEnable };
(+f0.1) illegal.sat(*** invalid execution size value 6 ) { align1 F@2 AccWrEnable };
(+f0.1) illegal.sat(*** invalid execution size value 6 ) { align1 $15.src AccWrEnable };
illegal(*** invalid execution size value 6 ) { align1 $15.src };
(+f0.1) illegal.sat.g.f0.1(*** invalid execution size value 6 ) { align1 $13.src AccWrEnable };
Only the first instruction was actually wrong - the rest are just a
result of starting the disassembler at the wrong offset. Trash ensues!
To fix this, just pass the instruction size in a few layers so we can
record the next offset properly.
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17624>
Instead of having 2 VkMemoryType pointing to the same VkMemoryHeap, we
have each VkMemoryType with VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT (one
host visible, the other not) point to its own VkMemoryHeap. For the
local heap that is host visible, we'll use the
I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS flag at GEM BO creation.
When the smallbar uAPI is not available we fallback to a single heap
and do not use I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS.
v2: Handle probed_cpu_visible_size == probed_size (Matthew)
v3:
* Jordan: Use region info from devinfo
v4: Also make the vram host visible heap as local (Ken)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16739>
No shader-db changes on any Intel platform
Fossil-db results:
Tiger Lake
Instructions in all programs: 156926440 -> 156926470 (+0.0%)
Instructions hurt: 15
Cycles in all programs: 7513099349 -> 7513099402 (+0.0%)
Cycles hurt: 15
Ice Lake and Skylake had similar results. (Ice Lake shown)
Cycles in all programs: 9099036492 -> 9099036489 (-0.0%)
Cycles helped: 1
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>
The changes to fs_visitor::validate() helped track down a place where I
initially forgot to convert a message to the new sources layout. This
had caused a different validation failure in
dEQP-GLES31.functional.tessellation.tesscoord.triangles_equal_spacing,
but this were not detected until after SENDs were lowered.
Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 19951145 -> 19951133 (<.01%)
instructions in affected programs: 2429 -> 2417 (-0.49%)
helped: 8 / HURT: 0
total cycles in shared programs: 858904152 -> 858862331 (<.01%)
cycles in affected programs: 5702652 -> 5660831 (-0.73%)
helped: 2138 / HURT: 1255
Broadwell
total cycles in shared programs: 904869459 -> 904835501 (<.01%)
cycles in affected programs: 7686744 -> 7652786 (-0.44%)
helped: 2861 / HURT: 2050
Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
Instructions in all programs: 141442369 -> 141442032 (-0.0%)
Instructions helped: 337
Cycles in all programs: 9099270231 -> 9099036492 (-0.0%)
Cycles helped: 40661
Cycles hurt: 28606
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>
VK_KHR_format_feature_flags2 is core and implicitly enabled in 1.3.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17442>