we need to apply depth bias for tris with point/line poly mode (according to
offset_point/offset_line), but we need to NOT apply depth bias for API-level
points/lines. weirdly the hw is sensitive to that last part.
fixes z-fighting with FreeCAD.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
we need to gather tex masks / lower mediump io before lowering textures for our
detection to work ... also we want driver-side i/o lowering soon for Marek's
thing anyway. do some code motion / pass reordering to make this doable.
in doing so, we get rid of agx_uncompiled_shader_info which is probably good.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
Blender needs more samplers to render the "wanderer" scene. While our binding
table is limited, we can get additional samplers by downshifting to the bindless
sampler heap, at a performance penalty. That mechanism is already in place for
merged VS/TCS, so we can reuse it for this.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
Initially I left it in place because I didn't want to disturb the table
beyond what was in nv50_formats.c. However, we've moved past that now
and these formats are permanently broken so it's easiest to just remove
them from the table rather than have them in the table and then have C
code which disables them.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28453>
This is much more readable than the macro mess we had before. Most
notably because we no longer have this mess of #defines for format
support flags and instead have a character per flag and every
supported flag is there every time. This makes it way easier to figure
out what we're actually claiming a format can do.
This conversion was tested with a patch that did a memcmp() of this
table against the old one to ensure that the two tables are bit-for-bit
identical. Later commits may modify the table in various ways but this
conversion to a CSV file is lossless.
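
The kind of check described above could look roughly like the following. This is only a sketch: ex_format_entry and ex_tables_identical are hypothetical names, and the real table entry layout is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry. It has no padding bytes, so a raw memcmp()
 * compares exactly the fields. */
struct ex_format_entry {
   uint32_t hw_format;
   uint32_t support_flags;
};

/* Returns true when both tables are bit-for-bit identical. */
static bool
ex_tables_identical(const struct ex_format_entry *a,
                    const struct ex_format_entry *b, size_t count)
{
   assert(a != NULL && b != NULL);
   return memcmp(a, b, count * sizeof(*a)) == 0;
}
```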
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28453>
In nir_lower_terminate_to_demote(), we were deleting the rest of the
block contents when we added a halt instruction but left any subsequent
CF nodes in the list. While this may be technically okay, that much
dead code makes the rest of NIR pretty grumpy. It's better to delete
everything to the end of the CF list, not just everything to the end of
the block.
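
As a toy model of the difference (this is not the NIR API; every ex_* name is hypothetical and the CF list is flattened into an array for brevity), deleting to the end of the CF list means dropping both the tail of the block and every node after it:

```c
#include <assert.h>

enum ex_cf_type { EX_CF_BLOCK, EX_CF_IF, EX_CF_LOOP };

struct ex_cf_node {
   enum ex_cf_type type;
   unsigned num_instrs; /* only meaningful for blocks */
};

struct ex_cf_list {
   struct ex_cf_node nodes[8];
   unsigned num_nodes;
};

/* halt_idx is the index of the halt instruction inside block block_idx. */
static void
ex_cut_after_halt(struct ex_cf_list *list, unsigned block_idx,
                  unsigned halt_idx)
{
   assert(block_idx < list->num_nodes);
   struct ex_cf_node *block = &list->nodes[block_idx];
   assert(block->type == EX_CF_BLOCK && halt_idx < block->num_instrs);

   /* Drop the instructions that follow the halt in this block... */
   block->num_instrs = halt_idx + 1;

   /* ...and every CF node that follows the block in the same list. */
   list->num_nodes = block_idx + 1;
}
```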
Fixes: 75861c64b8 ("nir: Add a lower_terminate_to_demote pass")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28456>
In the common case, fs_inst will have up to 4 sources (the HW
instructions have up to 3, and our representation of SENDs has 4).
Embed such an array into the fs_inst, and use it whenever applicable
instead of allocating a new array.
Also change the code to reuse the allocated src array when resizing to
a smaller length.
Between the changes above and the reduced amount of initializing
fs_regs, this reduces fossil-db time by around 2% for Borderlands 3
and Rise of the Tomb Raider, and around 1.5% for Total War Warhammer 3.
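
The idea can be sketched in plain C as below. This is an illustration under assumptions, not the fs_inst code: the ex_* names, the capacity field and the use of calloc()/free() are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define EX_BUILTIN_SRCS 4

struct ex_reg { unsigned nr; };

struct ex_inst {
   struct ex_reg *src;      /* points at builtin_src or at a heap array */
   unsigned sources;        /* number of sources in use */
   unsigned capacity;       /* number of slots available behind src */
   struct ex_reg builtin_src[EX_BUILTIN_SRCS];
};

static void
ex_inst_resize_sources(struct ex_inst *inst, unsigned n)
{
   if (n <= inst->capacity) {
      /* Fits in the current storage (including shrinking): reuse it and
       * zero any newly exposed slots. */
      if (n > inst->sources)
         memset(&inst->src[inst->sources], 0,
                (n - inst->sources) * sizeof(*inst->src));
      inst->sources = n;
      return;
   }

   /* Only the uncommon case of more sources than capacity allocates. */
   struct ex_reg *grown = calloc(n, sizeof(*grown));
   assert(grown != NULL);
   memcpy(grown, inst->src, inst->sources * sizeof(*grown));
   if (inst->src != inst->builtin_src)
      free(inst->src);
   inst->src = grown;
   inst->capacity = n;
   inst->sources = n;
}

static void
ex_inst_init(struct ex_inst *inst, unsigned sources)
{
   memset(inst, 0, sizeof(*inst));
   inst->src = inst->builtin_src;
   inst->capacity = EX_BUILTIN_SRCS;
   ex_inst_resize_sources(inst, sources);
}

static void
ex_inst_finish(struct ex_inst *inst)
{
   if (inst->src != inst->builtin_src)
      free(inst->src);
}
```

Note that in this sketch src may point into the struct itself, so an ex_inst cannot be copied with a plain struct assignment without fixing up the pointer.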
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28379>
It was by pure luck that all sources (and the result) of nir_dpas_intel
had the same number of components. It is possible to support matrix
sizes where the accumulator matrix and the result matrix are larger
(e.g., 16x8 * 8x16 = 16x16).
This breaks all of the assumptions of NIR's infrastructure for code
generating intrinsics. Fix this by making the accumulator matrix be the
first source. The accumulator and the result will always have the same
dimensions (due to rules of matrix multiplication) and the same type
(due to restrictions of the cooperative matrix extension). This forces
them to have the same number of components.
This doesn't fix all the potential problems. NIR expects that all
0-sized sources will have the same number of components. This just
ensures that the result has the correct number of components.
Fixes: 6b14da33ad ("intel/fs: nir: Add nir_intrinsic_dpas_intel")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
Was previously passing 1, 1, 0 as the regioning. This generated
incorrect disassembly because the encoding for a width of 1 is 0. Use
the enums to ensure the correct values are used.
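
To illustrate why the raw number is wrong, here is a small sketch with a hypothetical enum. Only "a width of 1 encodes as 0" comes from the description above; the log2 encoding of the remaining values and all ex_* names are assumptions.

```c
#include <assert.h>

/* Hypothetical width enum: the encoded value is assumed to be log2 of the
 * element count, so a width of 1 encodes as 0. */
enum ex_width {
   EX_WIDTH_1  = 0,
   EX_WIDTH_2  = 1,
   EX_WIDTH_4  = 2,
   EX_WIDTH_8  = 3,
   EX_WIDTH_16 = 4,
};

/* Converts an encoded width back to the element count a disassembler
 * would print. */
static unsigned
ex_decode_width(enum ex_width w)
{
   assert(w <= EX_WIDTH_16);
   return 1u << w;
}
```

Under these assumptions, passing a literal 1 where the encoded value is expected decodes as a width of 2, while EX_WIDTH_1 decodes as the intended width of 1.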
Fixes: 1c92dad5cb ("intel/disasm: Disassembly support for DPAS")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
If the destination was the accumulator but is no longer, having the flag
set is not correct. On Xe2 this also causes a validation error.
v2: Reword the comment to be more clear. Suggested by Jordan.
Fixes: efa4e4bc5f ("intel/fs: Introduce regioning lowering pass.")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
The actual TCS epilog selection code is kept unchanged for now;
we'll delete it when RadeonSI also gets rid of TCS epilogs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
TCS epilogs are not needed anymore because the TCS can implement
dynamic states by itself now.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
This allows the TCS to read the primitive mode and whether the
TES reads the tess factors from an SGPR arg, which lets it
decide at runtime how to store the tess factors.
For linked shaders, the conditions will be constant and
NIR optimizations can delete the dead CF.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>