Commit graph

217571 commits

Author SHA1 Message Date
Rhys Perry
24fe4a5b58 aco/ra: copy precolor affinities to p_create_vector/p_split_vector
fossil-db (navi31):
Totals from 7 (0.01% of 84369) affected shaders:
Instrs: 2742 -> 2704 (-1.39%); split: -1.82%, +0.44%
CodeSize: 15300 -> 15052 (-1.62%); split: -1.93%, +0.31%
VGPRs: 516 -> 504 (-2.33%)
Latency: 12478 -> 12504 (+0.21%); split: -0.24%, +0.45%
InvThroughput: 2350 -> 2300 (-2.13%)
Copies: 350 -> 272 (-22.29%)
VALU: 1626 -> 1592 (-2.09%)
VOPD: 280 -> 236 (-15.71%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39387>
2026-01-20 10:53:18 +00:00
Pierre-Eric Pelloux-Prayer
f5f84e6739 radeonsi: add asserts to validate emit functions use of atoms
emit functions shouldn't dirty any atom.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:28 +00:00
Pierre-Eric Pelloux-Prayer
0efe11e84e radeonsi/sqtt: restore barrier_flags in si_sqtt_init_cs
si_sqtt_start / si_sqtt_stop use emit_barrier which clears barrier_flags.
Since these functions are used to build an auxiliary cs which will only
be emitted later (on sqtt enablement/disablement) it shouldn't clear
the global barrier_flags value.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:28 +00:00
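The save/restore idea described in 0efe11e84e above can be sketched roughly as follows. This is a minimal illustration, not the actual radeonsi code: `aux_ctx`, `aux_emit_barrier` and `aux_build_cs` are hypothetical stand-ins, and the real emit function does far more than clear a field.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver context; not the real si_context. */
struct aux_ctx {
   uint32_t barrier_flags;
};

/* Stand-in for an emit function that consumes (clears) the pending barriers. */
static void aux_emit_barrier(struct aux_ctx *ctx)
{
   ctx->barrier_flags = 0;
}

/* Build an auxiliary command stream that is only submitted later.
 * The caller's pending barriers must survive, so save and restore them. */
static void aux_build_cs(struct aux_ctx *ctx)
{
   const uint32_t saved = ctx->barrier_flags;

   aux_emit_barrier(ctx);      /* clears barrier_flags as a side effect */
   ctx->barrier_flags = saved; /* put the global value back */
}
```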
Pierre-Eric Pelloux-Prayer
3bc60e1bb0 radeonsi: add extra flags param to si_emit_barrier_direct
Most callers want to add new flags to barrier_flags, so add
a parameter.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:28 +00:00
Pierre-Eric Pelloux-Prayer
9175388740 radeonsi: add a si_clear_and_set_barrier_flags helper
Same as si_set_barrier_flags except it can be used to clear
some barriers first.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:27 +00:00
Pierre-Eric Pelloux-Prayer
db4b1cdb3b radeonsi: fix references to sctx->flags in documentation
It was renamed barrier_flags.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:26 +00:00
Pierre-Eric Pelloux-Prayer
c77828c8e9 radeonsi: add a si_set_barrier_flags helper
The pattern:

  sctx->barrier_flags |= ...;
  si_mark_atom_dirty(sctx, &sctx->atoms.s.barrier);

is used a lot, let's add an inline helper. This prevents
forgetting the call to si_mark_atom_dirty.

si_upload_bindless_descriptors is special because we're
already in the emit phase so we shouldn't dirty barrier
again.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:26 +00:00
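The helper pattern from c77828c8e9 above might look something like the sketch below. All types and names here (`sketch_context`, `sketch_atom`, `sketch_set_barrier_flags`) are hypothetical simplifications chosen for illustration; the real driver structures and the actual helper signature may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's state atom and context. */
struct sketch_atom {
   bool dirty;
};

struct sketch_context {
   uint32_t barrier_flags;
   struct sketch_atom barrier_atom;
};

/* Mark a state atom as needing re-emission. */
static inline void sketch_mark_atom_dirty(struct sketch_context *ctx,
                                          struct sketch_atom *atom)
{
   (void)ctx; /* unused in this simplified sketch */
   atom->dirty = true;
}

/* OR new flags into barrier_flags and dirty the barrier atom in one step,
 * so a caller cannot do the first without the second. */
static inline void sketch_set_barrier_flags(struct sketch_context *ctx,
                                            uint32_t flags)
{
   ctx->barrier_flags |= flags;
   sketch_mark_atom_dirty(ctx, &ctx->barrier_atom);
}
```

Code that already runs in the emit phase would keep writing the flags directly rather than calling such a helper, since it should not dirty the atom again.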
Christian Gmeiner
ef860bcaa1 pvr/ci: Add dEQP-VK testing for BXS-4-64 on TI AM68 SK
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39356>
2026-01-20 09:19:16 +00:00
Christian Gmeiner
2386770815 ci: Build imagination vulkan driver
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39356>
2026-01-20 09:19:16 +00:00
Christian Gmeiner
a0a87eb88e ci: Describe imagination farm
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39356>
2026-01-20 09:19:16 +00:00
Rob Clark
a9f05399ae tu: gen8 support
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:33 +00:00
Rob Clark
77e83d1449 tu: gen8 sampler support
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:33 +00:00
Rob Clark
5b40d98388 tu: Add helper to set render mode
Make it less awkward to deal with gen6/7 vs gen8 differences.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:33 +00:00
Rob Clark
eee7a6fb35 tu: gen8 descriptor support
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:32 +00:00
Rob Clark
039e21fde8 tu: Support acceleration_structure for wave64
Gen8 replaces wave128 with double dispatch wave64, and so will need
smaller subgroup sizes.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:32 +00:00
Rob Clark
380c79c923 ir3: Limit 64b atomic 16b offset quirk to a7xx
This was fixed in gen8.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:32 +00:00
Rob Clark
1dc7d0ade9 ir3: Skip shading_rate lowering when unneeded
Some newer gen8 devices (like a840/kaanapali, but not x2-85 which is
otherwise similar) flip the hw shading rate value around to match
vulkan/gl instead of DX.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:32 +00:00
Rob Clark
6f1faceb6a ir3: Avoid narrowing int conversions from GPR on SALU
Narrowing integer conversions on SALU with GPR src do not behave as one
would expect on gen8, so avoid them.  This does not apply to uGPR srcs
or float conversions.

See, for example:
dEQP-VK.glsl.builtin.function.integer.bitcount.int_highp_compute

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:31 +00:00
Rob Clark
7cb890fe1b freedreno/registers: Update gen8 VRS registers
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:31 +00:00
Rob Clark
74484da82f freedreno/registers: Update gen8 FDM regs
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:31 +00:00
Rob Clark
635410f749 freedreno/registers: Fix py array reg offsets
Otherwise the first reg in the array ends up with offset=0, which
signals the end of parsing the magic_raw regs.  Which is not the
desired outcome :-)

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:30 +00:00
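The failure mode described in 635410f749 above is the usual hazard of a sentinel-terminated table. The sketch below assumes a hypothetical `sketch_reg` entry type and a table that ends at the first entry whose offset is 0; it is not the actual generated register code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register table entry; offset == 0 terminates the table. */
struct sketch_reg {
   uint32_t offset;
   uint32_t value;
};

/* Count entries up to (not including) the first zero-offset sentinel. */
static size_t sketch_count_regs(const struct sketch_reg *regs)
{
   size_t n = 0;

   while (regs[n].offset != 0)
      n++;
   return n;
}
```

If the first real entry is wrongly emitted with offset 0, a walker like this stops immediately and sees an empty table, which is why the array offsets have to be correct.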
Rob Clark
f0ada848e5 freedreno/registers: Fix GRAS_LRZ_CB_CNTL
The field name needs to match between variants.  In this case, the
register is the same, just with different offset.  So use a bitset.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:30 +00:00
Rob Clark
49f2545de6 freedreno/registers: Fix gen8 TPL1_A2D_BLT_CNTL
START_OFFSET_TEXELS is removed.  Instead TPL1_A2D_SRC_TEXTURE_BASE can
take an unaligned address for IMG_BUFFER.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:30 +00:00
Rob Clark
f5f9fecfc3 freedreno/registers: Fix gen8 TPL1_MODE_CNTL
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:29 +00:00
Rob Clark
ff034b5aef freedreno/registers: Fix gen8 GRAS_SU_STEREO_CNTL
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:29 +00:00
Rob Clark
a2dc77323d freedreno/registers: Add subpass fence events
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:28 +00:00
Rob Clark
5546654104 freedreno/registers: Fix gen8 UV_PITCH
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:28 +00:00
Rob Clark
c9a0b1d6f1 freedreno/fdl: Fix gen8 sRGB buffers
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:28 +00:00
Rob Clark
bd00d86bd7 freedreno/fdl: Fix gen8 MUTABLEEN
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:28 +00:00
Rob Clark
eba06c5e5b tu: Convert foveat state to CRB
The GRAS regs are no longer consecutive in gen8.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:27 +00:00
Rob Clark
eb43e95d61 freedreno: Disable supports_double_threadsize for gen8
Gone is thread128.  Instead the hw can co-dispatch thread64.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:27 +00:00
Rob Clark
7958a19ee9 freedreno: Disable has_rt_workaround for gen8
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:26 +00:00
Faith Ekstrand
13926b3492 panfrost: Lower pixel-local storage to load/store_tile in NIR
Now that we have intrinsics which map directly to the hardware opcodes,
we can lower PLS inside the gallium driver instead of the back-end
compiler having to know anything about it.  This simplifies the back-end
and is less code, if you ignore the new copyright header.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
2026-01-19 21:33:14 +00:00
Faith Ekstrand
669ddc5241 pan/blend: Use the blend builder helpers instead of nir_lower_blend()
This is a little more manual (though it's actually less code) but it
gives us a lot more control and makes the whole flow nicer.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
2026-01-19 21:33:14 +00:00
Faith Ekstrand
2313bec66e nir: Expose the guts of nir_lower_blend as builder helpers
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
2026-01-19 21:33:14 +00:00
Faith Ekstrand
d2c2d798f8 nir/lower_blend: Optimize trivial logic op cases
There's no point in going to/from UNORM if we're just going to copy or
throw away the source.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
2026-01-19 21:33:14 +00:00
Faith Ekstrand
68d22b5a2a nir/lower_blend: Move the format to nir_lower_blend_rt
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
2026-01-19 21:33:14 +00:00
Faith Ekstrand
d6556a580f nir,pan: Add and implement a new store_tile_pan intrinsic
Like we just did with load_tile_pan, this maps directly to ST_TILE in
the hardware.  This is more versatile and lets us do more of our
lowering in NIR.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
2026-01-19 21:33:13 +00:00
Faith Ekstrand
11b6cd2f2c nir,pan: Rework the panfrost tile load intrinsic
Instead of making it explicitly about outputs, this switches it to
being a NIR version of LD_TILE.  It means we have to do a bit of work in
NIR and add a builder helper but the end result is something much more
versatile.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
2026-01-19 21:33:13 +00:00
Faith Ekstrand
592963e941 pan/bi: Implement pack_32_4x8 natively
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
2026-01-19 21:33:13 +00:00
Faith Ekstrand
4189865347 nir: panfrost tile loads are always divergent
Each lane refers to a different pixel.

Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
2026-01-19 21:33:13 +00:00
Yiwei Zhang
419a3e66f8 venus: allow vtest to properly wait for present
Previously, common wsi had a special submission to install an implicit
fence on wsi memory directly, which has been deprecated in favor of bonfire
implicit fencing (implicit fencing has since been turned into explicit
fencing within vulkan). The virtgpu backend is fine, but the vtest
backend has regressed since then, relying only on the renderer-side hw
driver doing implicit fencing.

With async present landed earlier, we can directly tell which submission
is done by common wsi, and can revive the idle waiting accordingly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39377>
2026-01-19 21:01:13 +00:00
Yiwei Zhang
d4e2184904 venus: refactor vn_QueueSubmit2
To prepare for fixing vtest wsi present.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39377>
2026-01-19 21:01:13 +00:00
Christian Gmeiner
bb83b67910 etnaviv/ci: Add gitlab-ci-inc.yml to file list
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39391>
2026-01-19 20:46:38 +00:00
José Roberto de Souza
48b43157f8 intel/perf: Add Gfx 12.5 mdap_metrics struct and set it
The Gfx 12.5 struct has only one major difference from gfx9: the OaCntr length.
On gfx9 it is 36 uint64_t long, while on gfx 12.5 it is 38 uint64_t long.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
2026-01-19 19:24:16 +00:00
José Roberto de Souza
a097a3d214 intel/perf: Change mdapi switch cases from ver to verx
We are missing handling for gfx12.5 so to add it we will need a switch case over
verx.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
2026-01-19 19:24:16 +00:00
José Roberto de Souza
2d75b3b873 intel/perf: Extend Xe2 mdap_metrics to Xe3
Looking at the reference code, there is no new struct for Xe3 so it should
use the same struct as Xe2.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
2026-01-19 19:24:15 +00:00
José Roberto de Souza
8e318e3246 intel/perf: Add Xe2 mdap_metrics struct and set it
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
2026-01-19 19:24:15 +00:00
José Roberto de Souza
0675a0da55 intel/perf: Nuke intel_perf_load_configuration() and related code
With no more users of intel_perf_load_configuration() it can be
removed with other i915 functions around it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
2026-01-19 19:24:15 +00:00
José Roberto de Souza
132bcbee74 anv/hasvk: Add intel_perf_get_configuration_id() and replace intel_perf_load_configuration() usage
We have no usage of the information returned by
intel_perf_load_configuration(). It is only used to add a copy of the
configuration so we have the metric id but we could instead get the
metric id from sysfs, that is added by mdapi.

Xe KMD doesn't have a uAPI to query the metrics configuration, so
using sysfs also fixes the integration of mdapi with Xe KMD.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
2026-01-19 19:24:15 +00:00