Note that the multipolygon PS disptach modes supported by Xe2 aren't
enabled by default yet, but they can be enabled manually via
INTEL_SIMD_DEBUG=fs2x8,fs4x8,fs2x16.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
This is needed because the information stored on the ATTR file for
multipolygon fragment shaders isn't stored as a contiguous sequence in
the GRF, instead the ATTR source may be lowered by assign_urb_setup()
to use a <16;8,0> region, which reads 4 SIMD16 GRFs for a SIMD32
instruction, even though the result of fs_inst::size_read() is
expected to be 2 GRFs. Special case ATTR sources for multipolygon PS
shaders to calculate the number of physical GRFs that will actually be
read by the instruction after lowering, based on the number of
polygons processed by the instruction.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
This is based on a previous patch by Marcin Ślusarz addressing the
same issue, though it's largely rewritten, simplified and includes
additional fixes.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
This fixes a number of assumptions made by the multipolygon input
attribute handling code from assign_urb_setup() so it also works on
Xe2+, which has additional multipolygon dispatch modes (like SIMD4x8
and SIMD2x16) and uses a different more compact representation of the
plane parameters.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
The interpolation deltas of PS inputs now show up as a 12B vec3 (A0,
A1-A0, A2-A0) in the ATTR file, instead of the previously used 16B
format with an unused component.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
The X and Y barycentric vectors are no longer interleaved in SIMD8
chunks (yay), so this is mostly a matter of disabling the
lower_barycentrics() pass and switching to a simpler implementation of
fetch_barycentric_reg() that simply calls fetch_payload_reg() instead
of the SIMD8 shuffling we had to do in previous generations.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
This extends fetch_payload_reg() to support fetching vector registers
like barycentrics stored on the payload as a contiguous sequence of
SIMD-wide vectors. In the SIMD32 case, both halves of the SIMD16
vector registers specified as regs[0] and regs[1] are zipped to
construct a single SIMD32-wide vector.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
This includes the render target array index, viewport index, and
front/back facing fields, which are now replicated per pair of
subspans in order to support fixed-layout multi-polygon PS dispatch.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
Note from Caio: proper handling of brw_sample_mask_reg
will appear in later patches.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
The PS thread payload format has changed enough in Xe2 that it
probably doesn't make sense to share code with gfx6. See BSpec page
"PS Thread Payload for Normal Dispatch - 512 bit GRF" for the new
format.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
These are new variants of the existing brw_reg GRF constructors that
take registers numbers in the new 512b units. Mainly useful for
thread payload setup code to use register numbers in a format that
matches the BSpec.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
We only need it for indirect draws.
Improves performance on an i7-12700 and A770:
- Piglit's drawoverhead base case +150.639% +/- 2.86933% (n=15).
- gfxbench5 gl_driver2_off +19.7219% +/- 1.13778% (n=15)
- SPECviewperf2020 catiav5test1 +1.6831% +/- 0.552052% (n=10).
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26806>
Whenever we use a BO in a batch, we need to find its corresponding exec
list entry, either to a) record that it's been used, b) update whether
it's being written, c) check for cross-batch implicit dependencies.
bo->index exists to accelerate these lookups. If a BO is used multiple
times by a batch, bo->index is its location in the list. Because the
field is global, and a BO can in theory be used concurrently by multiple
contexts, we need to double-check whether it's still there. If not, we
fall back to a linear search of all BOs in the list, looking to see if
our index was simply wrong (but presumably right for another context).
However, there's one glaringly obvious case that we missed here. If
bo->index is -1, then it's wrong for /all/ contexts, and in fact implies
that said BO has never been added to any exec list, ever. This is quite
common in fact: a new BO, never been used before, say from the BO cache,
or streaming uploaders, gets used for the first time.
In this case we can simply conclude that it's not in the list and skip
the linear walk through all buffers referenced by the batch.
Improves performance on an i7-12700 and A770:
- SPECviewperf2020 catiav5test1: 72.9214% +/- 0.312735% (n=45)
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26806>
A value of -1 means that the buffer has never been used in an execbuf
buffer list in any of our contexts. While setting this isn't critical,
doing so will allow us to short-circuit some looping in the next patch.
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26806>
This function is only used once. By inlining it, we can more easily
compare the CCS plane import code with the clear color plane import
code.
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25768>
Inline iris_resource_finish_aux_import and reimplement it with the
helper functions for managing resource planes. Provides more testing for
the helper functions and simplifies the code.
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25768>
Instead of checking the plane count in order to finish importing the aux
info, just check the plane index. Planes are added in reverse, so we'll
have the complete number of planes once we reach plane zero.
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25768>
Update the image-path from x64 to x86_64 for consistence with other platform
This is also cause the python to be updated to python3.12
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26736>
There is no need repeatedly downloading rust and d3d when building the docker locally
Download glext.h from github
Remove src directory and .git directory once compiling finished
Split piglit and depq compiling out
Clean middle files of piglit and depq
ci/msvc: Improve fetch source of spirv_samples_source
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26736>
Currently only msvc2019 are used
This is did intentionally, so we have a history script install msvc2019 only within vs2022
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26736>
Now when choose different version of msvc, there is no need rebuild windows-msvc docker
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26736>