The xe.ko kernel driver sets the ROW_CHICKEN bit to disable Early EOT
on all revisions of Xe2, I believe as far back as 6.10. Although Xe2
doesn't have the variable registers per thread feature of Xe3, it still
has Large GRF mode that can be switched on and off, and there are issues
with combining the two features. Plus, Early EOT apparently wasn't
observed to help much with performance.
That means that EOT sends are no longer special, and we don't need to
restrict ourselves to r112-r127. Relax the validator so Jay can use this.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Co-authored-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41831>
The existing util_get_narrow_range_coeffs doesn't work for RGB, since
all channels in RGB will share the same scale and bias.
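To illustrate why one scale/bias pair is enough for RGB, here is a minimal sketch. It assumes normalized values, a full-to-narrow mapping, and the usual 16..235 video levels scaled to the bit depth. `rgb_narrow_range_coeffs` and its struct are hypothetical names, not the helper added by this change.

```c
#include <assert.h>

/* Hypothetical type for illustration only. */
struct narrow_range_coeffs {
   float scale;
   float bias;
};

/* narrow = full * scale + bias, with the same pair used for R, G and B.
 * Assumes the common 16..235 levels (at 8 bits) shifted up for deeper
 * formats. */
static struct narrow_range_coeffs
rgb_narrow_range_coeffs(unsigned bit_depth)
{
   assert(bit_depth >= 8 && bit_depth <= 16);

   const float max_code = (float)((1u << bit_depth) - 1);
   const float black = (float)(16u << (bit_depth - 8));
   const float range = (float)(219u << (bit_depth - 8));

   return (struct narrow_range_coeffs){
      .scale = range / max_code,
      .bias = black / max_code,
   };
}
```

YUV differs because chroma uses a different range than luma, which is why a per-channel helper does not fit the RGB case.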
Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41787>
Moves dispatch of multiple draws into `kk_draw`. This allows for any draw
pre-processing to operate on the full set of draws at once, reducing dispatch
calls and maximizing parallel work.
Draw data may also specify predicates that need to be applied to the draws.
This along with batched draw processing will be useful for implementing
features like conditional rendering later.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41799>
Set this to false for non-video queues, as the Nvidia driver does.
This prevents debug warnings that
VK_STRUCTURE_TYPE_QUEUE_FAMILY_QUERY_RESULT_STATUS_PROPERTIES_KHR
is not handled when we enable KHR_video_queue.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41752>
`MKVEC.v2i8` only has explicit lane selection for `src0` and `src1`.
`src2` is implicitly read as `.b01`, so having a byte swizzle on `src2`
results in an instruction that cannot be encoded.
This fixes a failure in OpenCL-CTS when running `test_relationals
shuffle_copy`:
```
Invalid swizzle:
r0 = MKVEC.v2i8 r0^.b0, r0^.b3, r0^.b0
invalid_instruction: Assertion `!"Invalid instruction"' failed.
```
Fixes: bc7053a ("pan/bi: Add a lowering pass for MKVEC and SWZ")
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41109>
This has already been fixed. We still want to always use a new context
with multi-instance VCNs to utilize all instances, so whether the kernel
bug is present or not won't change the decision there.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41796>
OpenCL-CTS `test_basic vload_private` fails with the following
assertion:
```
src/compiler/nir/nir_lower_explicit_io.c:1649: lower_explicit_io_deref:
Assertion `addr->bit_size == deref->def.bit_size' failed.
```
Use `nir_address_format_32bit_offset_as_64bit` when the shader has
64-bit pointers. The scratch offset is still 32-bit, but the NIR address
value now matches the 64-bit derefs being lowered.
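The selection can be pictured roughly as follows. The enum is a local stand-in that mirrors the NIR address format names so the sketch compiles on its own, and `scratch_address_format` is a hypothetical helper, not the code in this change.

```c
/* Local stand-in for the two NIR address formats involved. */
enum scratch_addr_format {
   ADDR_FORMAT_32BIT_OFFSET,
   ADDR_FORMAT_32BIT_OFFSET_AS_64BIT,
};

/* Pick the scratch address format from the shader's pointer size.
 * The offset itself stays 32-bit; only the NIR value is widened so it
 * matches the bit size of the derefs being lowered. */
static enum scratch_addr_format
scratch_address_format(unsigned ptr_bit_size)
{
   return ptr_bit_size == 64 ? ADDR_FORMAT_32BIT_OFFSET_AS_64BIT
                             : ADDR_FORMAT_32BIT_OFFSET;
}
```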
Fixes: 01e6a0555c ("pan/compiler: Rework scratch memory strategy")
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41769>
When setting texture state words for load op clears, simply inherit
them from the image view for a given attachment. The only piece of
information that needs updating is the offset relative to the view
index, because it is unknown at image creation time.
Fixes dEQP-GLES:
dEQP-GLES3.functional.texture.specification.texsubimage2d*
Fixes: e08916677 ("pvr: Add support for VK_ATTACHMENT_LOAD_OP_LOAD.")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41797>
OpenCL-CTS `test_relationals relational_any` fails on Panfrost with an
assertion:
```
src/panfrost/compiler/pan_nir_lower_bool_to_bitsize.c:296:
lower_alu_instr: Assertion `alu->def.bit_size > 1' failed.
```
The bool-to-bitsize pass handles 2-, 3- and 4-wide boolean reductions,
but not 8- and 16-wide ones, which fall through to the default case and
produce 1-bit bools.
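A simplified sketch of the failure mode, using a stand-in enum rather than the real NIR opcodes. It assumes the lowered boolean takes the bit size of the compared sources, which is a simplification of what the pass does.

```c
#include <assert.h>

/* Illustrative stand-ins for the boolean reduction opcodes. */
enum reduce_op {
   REDUCE_ALL_EQUAL2,
   REDUCE_ALL_EQUAL3,
   REDUCE_ALL_EQUAL4,
   REDUCE_ALL_EQUAL8,
   REDUCE_ALL_EQUAL16,
   OTHER_OP,
};

/* Returns the bit size the lowered result should have. */
static unsigned
lowered_result_bit_size(enum reduce_op op, unsigned src_bit_size,
                        unsigned def_bit_size)
{
   switch (op) {
   case REDUCE_ALL_EQUAL2:
   case REDUCE_ALL_EQUAL3:
   case REDUCE_ALL_EQUAL4:
   case REDUCE_ALL_EQUAL8:  /* without this case, falls to default */
   case REDUCE_ALL_EQUAL16: /* without this case, falls to default */
      return src_bit_size;
   default:
      /* Anything reaching here must not be a 1-bit boolean. */
      assert(def_bit_size > 1);
      return def_bit_size;
   }
}
```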
Fixes: 5de5987678 ("nir,panfrost: Move lower_bool_to_bitsize to panfrost")
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41770>
Threaded submit relies on DRM syncobj wait ioctls blocking until the
GPU signals completion. Under drm-shim there is no real GPU, so
SYNCOBJ_WAIT returns immediately, creating a race between the submit
thread and vkQueueWaitIdle that leads to use-after-free crashes.
Detect if we are running under drm-shim by checking the DRM version
description, and skip enabling threaded submit in that case.
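A minimal sketch of the check, assuming the description string contains "shim" when running under drm-shim (an assumption, not confirmed here). In a real driver the string would come from the `desc` field of the `drmVersion` returned by libdrm's `drmGetVersion()`. Both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: drm-shim identifies itself via "shim" in the description. */
static bool
drm_desc_is_shim(const char *desc)
{
   return desc != NULL && strstr(desc, "shim") != NULL;
}

/* Threaded submit needs syncobj waits that really block, so keep it
 * disabled when no real GPU is behind the device node. */
static bool
use_threaded_submit(const char *drm_version_desc)
{
   return !drm_desc_is_shim(drm_version_desc);
}
```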
Assisted-by: Cursor Agent (Opus 4.6)
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41779>
So far with drm-shim we were always emulating V3D 4.2.
Now we always emulate V3D 7.1, but we allow selecting 4.2 through an
envvar: `V3D_GPU_ID=(42|71)`
Borrowed from etnaviv.
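The selection logic can be sketched as below. The helper names are hypothetical, and falling back to 7.1 for unrecognized values is an assumption rather than something this change states.

```c
#include <stdlib.h>
#include <string.h>

/* Map a V3D_GPU_ID value to the emulated version (42 or 71).
 * Unset or unrecognized values select 71 (assumed fallback). */
static int
v3d_shim_version_from_value(const char *value)
{
   if (value != NULL && strcmp(value, "42") == 0)
      return 42;
   return 71;
}

/* Read the environment variable named in the commit message. */
static int
v3d_shim_version(void)
{
   return v3d_shim_version_from_value(getenv("V3D_GPU_ID"));
}
```

With this, running with `V3D_GPU_ID=42` set in the environment would select the 4.2 emulation, and leaving it unset would select 7.1.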
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41779>
When using drm-shim there is no primary node for the driver. This is
fine, and hence we only mark that we don't have a primary device.
This fixes using v3dv with drm-shim.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41779>
This allows using drm-shim for an emulated driver on an AMD GPU host.
Otherwise we need to set MESA_LOADER_DRIVER_OVERRIDE to the emulated
driver in order to make it work.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41779>
Also, fix texel rate for G1-Pro variant 1.
And mention G1-Ultra, G1-Premium and G1-Pro in the release notes.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41639>
Both RRA and RMV used the PCI bus slot index in the trace device_id
field. On a typical single-GPU system, this resulted in "Device ID =
0000" displayed in RRA and RMV when traces were opened.
Match the RGP dump, which reports device IDs correctly.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41788>