The HW not returning the depth value we would like for
VK_EXT_sliced_view_of_3d, we can pull that value by reading the
RENDER_SURFACE_STATE struct directly.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23868>
On Alchemist, the FF_MODE2 documentation says that we must set the
FF_MODE2 timer values for GS and HS to 224. The hardware performance
tuning guide also recommends setting the TDS timer to 4.
On Tigerlake, i915 applies workarounds to set the GS timer to 224
(failing to do so can cause HS/DS unit hangs), and the TDS timer to 4
(for performance). It doesn't currently apply a HS timer there, and
I'm not sure if it's strictly necessary, but given that Alchemist
needed it, and the other two settings matched, let's assume that it
ought to match as well.
Unfortunately, there has been a bug in the i915 workarounds
infrastructure for non-masked context registers where writing one
field of the register zeroes out all the others. So, I believe the
Tigerlake TDS timer value of 4 isn't being applied correctly there,
though the register is also not readable on that platform which
makes it hard to verify. So, this may also speed up tessellation.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9233
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23839>
This enables L3 partial write merging for a number of cases that seem
to be getting accidentally disabled by the kernel, which was causing a
serious performance bottleneck on DG2 and MTL platforms. The
"Compressible Partial Write Merge Enable", "Coherent Partial Write
Merge Enable" and "Cross-Tile Partial Write Merge Enable" bits in
L3SQCREG5 were expected to be enabled by default (and confusingly,
they even read off as enabled if you ran 'intel_reg read 0xb158' on an
idle system), but they are getting clobbered during 3D context
initialization by an i915 workaround.
Enabling L3 partial write merging of compressible surfaces in
particular seems to increase rendering fillrate by over 3x in some
cases (e.g. the
"VulkanFillRate/FillRateGPU/resolution:1[0-3]/format:*/blend:0"
fillrate-bound microbenchmarks). Significant improvements can also be
reproduced in most real-world workloads we've tested so far,
e.g. Counter Strike GO improves by ~11%, Shadow Of the Tomb Raider
improves by ~5.5%, and AztecRuins-VK improves by ~6.5% on DG2-512 --
Thanks a lot to Caleb Callaway for these figures. No regressions have
been observed so far.
Even though this patch might strike as surprisingly simple for such a
large payoff, it's the result of Felix DeGrood and I trying to
root-cause the rendering performance gap of DG2 on Linux vs Windows on
and off during the last year, and some of the OA statistics captured
by Felix early this month were greatly helpful for me to connect the
last few dots, so Felix deserves a big chunk of the credit for this
work.
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23783>
Neither RENDER_SURFACE_STATE nor VDENC_SURFACE_CONTROL_BITS have a
Tiled Resource Mode field anymore. The RENDER_SURFACE_STATE field
was also overlapping with the L1 Cache Control settings field.
This also drops the assignment of that field in isl, because we were
just explicitly setting it to NONE (0) which is already the default
value genxml packing will give us. That saves us some ifdefs.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23449>
Most of them are length of each instruction and the rest are
some corrections on specific gens.
v1. Added a default value to DWordLength of each instruction.
( Lionel Landwerlin <lionel.g.landwerlin@intel.com> )
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22202>
There are same fields across gens but the existing xmls are not exactly same,
which needs to be fixed.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22202>
Actually the first bit is a bit of protected mask (or reserved)
and the next 6 bits are for MOCS but they are being handled together
currently in isl_device_setup_mocs. So we need to fix some MOCS fields
defined as 6 bits to 7 bits.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22202>
I wonder if the docs are correct for Gfx11 because this is the
generation that gave us the Bindless Sampler Heap of 4Gb. So it would
make sense that the border colors can also be placed anywhere in that
4Gb heap.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21600>
These values are taking from runtime interrogation of the media driver.
It would be nice to know if they are correct, but they work.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20782>
Setting default expected values as default in the xml.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21105>
Wa_14015360517 mentions situations where HW produces invalid
occlusion query results when "Pixel Shader Does not write to RT"
bit is set.
"When Pixel Shader Kills Pixel is set, SW must perform a dummy render
target write from the shader and not set this bit, so that Occlusion
Query is correct."
Another situation is when writing to UAV or to NULL render target.
Patch sets field as 'must be zero' to discourage possible use of it.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20849>
This is a global register which isn't settable by userspace contexts.
It also shouldn't appear in any of our aubinator decodes from error
states or aub dumps, as no userspace batch should be setting it.
So it's not very valuable to have here. Just makes us think we can
set it. Plus, a lot of the field definitions changed a bunch, and
would need updating.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20627>
It was not meant to be used(Iris have assert for it) and it was
removed from Pipe_Control instruction in gen12.5 and newer.
BSpec: 47112
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20444>
The sampler's decompressor seems to lack support for some types of
format re-interpretation. Use the more capable decompressor for these
cases. This will be needed to avoid regressing piglit's
arb_texture_view-rendering-formats in later commits.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19937>
On Gen11 and above, the 3DPRIMITIVE command takes an optional additional
three DWORDs of data as "extended parameters". These extended
parameters only exist in the packet if "Extended Parameters Present" is
set. Because our packing code doesn't handle variable-length commands
well, this commit adds a second version of the command which isn't real
but is just a copy of 3DPRIMITIVE with the additional dwords where the
"Extended Parameters Present" defaults to true and "DWord Length" is
adjusted by 3 as needed. The 3DPRIMITIVE command is then the gen4-9
version which still works fine but doesn't have the new parameters.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20295>
The main goal is to be able to generate genX_bits.h for those
structures so we can get generated field offsets.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20011>
We want to avoid those settings so that we do not have to emit a tile
fence to implement Wa_22013689345.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19322>
Implement Wa_1508744258:
Disable RHWO by setting 0x7010[14] by default except during resolve
pass.
Disable the RCC RHWO optimization at all times except when resolving
single sampled color surfaces. MCS partial resolves are done via
software (i.e., not via a HW bit) and so are not expected to need this
workaround.
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19360>
So we have less stuff in the global namespace
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
Instead of crewating output or validating in the process code, just
process, then let main handle the rest
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
This makes two relatively small changes, first it addes the encoding to
the xml delcaration, and switches the quote style. Second, it changes
the final newline. These seemed minor enough to not warrent patches to
make the old wrter do the same thing as the new writer.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
This removes a bunch of hand-written code, and allows for fewer corner
cases. The resulting code has some minor differences (no empty newlines
and the encoding is declared in the xml declaration)
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
When we move to using etree to print this it won't have them, so this
minimizes the diff further.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
ElementTree.write will do this, and we want to minimize the diff when we
switch from our own writer to the builtin one.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>