Add support for the new ioctl for KMD global counter collection. This
avoids needing hacks to parse the dtb and mmap the GPU's I/O space.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
With PERFCNTR_CONFIG, some other process may have already reserved some
counters, so not all will be available to fdperf. Prepare for this by
using num_counters in counter_group.
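
A minimal sketch of the idea, using hypothetical types rather than
fdperf's actual structures: the group records how many counters it was
actually granted, and every loop is bounded by that value instead of
the hardware maximum. MAX_COUNTERS, group_set_available() and
group_sum() are made-up names for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_COUNTERS 8 /* hypothetical hardware maximum per group */

/* Hypothetical stand-in for fdperf's counter_group. */
struct counter_group {
   unsigned num_counters;        /* counters actually granted to us */
   unsigned value[MAX_COUNTERS]; /* last sampled value per counter */
};

/* Record how many counters were granted, clamped to the array size. */
static void
group_set_available(struct counter_group *group, unsigned granted)
{
   group->num_counters = granted < MAX_COUNTERS ? granted : MAX_COUNTERS;
}

/* Sum only the counters we own, never the full hardware range. */
static unsigned long
group_sum(const struct counter_group *group)
{
   unsigned long sum = 0;

   assert(group->num_counters <= MAX_COUNTERS);
   for (unsigned i = 0; i < group->num_counters; i++)
      sum += group->value[i];

   return sum;
}
```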
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
Move this earlier so we have the counter config early enough to probe
kernel support for PERFCNTR_CONFIG with a valid config.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
Pull in updated UABI header with PERFCNTR_CONFIG ioctl. Sync with:
commit 44c460d2cc8b87c08360fe60f861660c8045ef90
Merge: 9bb8af2770b7 9a967125427e
Author: Dave Airlie <airlied@redhat.com>
Merge tag 'drm-msm-next-2026-05-30' of https://gitlab.freedesktop.org/drm/msm into drm-next
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
When we do f2fmp(fadd(f2f32(a), f2f32(b))) we can always optimize it to
fadd(a, b) and obtain the same result minus an intermediate rounding
step, same for fmul.
I verified this on the CPU using a custom script with the Berkeley
SoftFloat implementation; the results there are bit-for-bit identical
except for NaN representations.
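
Standard C has no portable 16-bit float type, so the sketch below is
not the verification script mentioned above. It checks the analogous
float/double pair instead of f16/f32: widen, operate, narrow, and
compare bit patterns against the native narrow operation. As far as I
know the usual condition for double rounding to be harmless is that the
wide format has at least 2p + 2 significand bits, which holds both for
f32 over f16 (24 vs. 11) and for double over float (53 vs. 24). The
code assumes IEEE 754 semantics (C Annex F), round-to-nearest-even and
no flush-to-zero; count_mismatches() and the helpers are made-up names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit pattern of a float, so that +0.0 and -0.0 compare as different. */
static uint32_t
float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}

static float
bits_float(uint32_t u)
{
   float f;
   memcpy(&f, &u, sizeof(f));
   return f;
}

/* xorshift32 pseudo-random generator; the state must be nonzero. */
static uint32_t
next_u32(uint32_t *state)
{
   uint32_t x = *state;
   x ^= x << 13;
   x ^= x >> 17;
   x ^= x << 5;
   *state = x;
   return x;
}

/* Count results where widen-op-narrow differs from the narrow op. */
static unsigned
count_mismatches(unsigned pairs)
{
   uint32_t state = 0x9e3779b9u; /* arbitrary nonzero seed */
   unsigned mismatches = 0;

   assert(sizeof(float) == sizeof(uint32_t));

   for (unsigned i = 0; i < pairs; i++) {
      uint32_t ua = next_u32(&state);
      uint32_t ub = next_u32(&state);

      /* On odd iterations give b the exponent of a, so the sum
       * exercises cancellation and carries instead of absorbing b.
       */
      if (i & 1)
         ub = (ub & 0x807fffffu) | (ua & 0x7f800000u);

      /* Skip Inf and NaN inputs (exponent field all ones). */
      if ((ua & 0x7f800000u) == 0x7f800000u ||
          (ub & 0x7f800000u) == 0x7f800000u)
         continue;

      float a = bits_float(ua), b = bits_float(ub);

      volatile float add_narrow = a + b;
      volatile float add_wide = (float)((double)a + (double)b);
      volatile float mul_narrow = a * b;
      volatile float mul_wide = (float)((double)a * (double)b);

      if (float_bits(add_narrow) != float_bits(add_wide))
         mismatches++;
      if (float_bits(mul_narrow) != float_bits(mul_wide))
         mismatches++;
   }

   return mismatches;
}
```

Inputs are restricted to finite values, so no result can be a NaN and
a plain bit comparison is sufficient.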
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41419>
This change splits the algorithm into two steps: first we have the
logical decision of which caches to bypass based on the needs of the
send operation, and then we have the code that picks the caching modes
based on which caches to bypass.
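
The shape of the split can be sketched as follows. All names and rules
here are hypothetical (the real brw enums, conditions and caching modes
differ); the point is only that step 1 is a purely logical decision and
step 2 is a fixed mapping.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; the real brw code differs. */
enum cache_bypass {
   BYPASS_L1 = 1 << 0,
   BYPASS_L3 = 1 << 1,
};

enum cache_mode {
   MODE_L1C_L3C,   /* cached in both levels */
   MODE_L1UC_L3C,  /* bypass L1 only */
   MODE_L1C_L3UC,  /* bypass L3 only */
   MODE_L1UC_L3UC, /* bypass both */
};

struct send_needs {
   bool is_atomic;     /* made-up rule: atomics must not use L1 */
   bool is_volatile;   /* made-up rule: volatile access skips L1 */
   bool l3_workaround; /* made-up workaround: skip L3 */
};

/* Step 1: decide which caches the operation has to bypass. */
static unsigned
caches_to_bypass(const struct send_needs *needs)
{
   unsigned bypass = 0;

   if (needs->is_atomic || needs->is_volatile)
      bypass |= BYPASS_L1;
   if (needs->l3_workaround)
      bypass |= BYPASS_L3;

   return bypass;
}

/* Step 2: map the bypass set to a caching mode. */
static enum cache_mode
pick_cache_mode(unsigned bypass)
{
   switch (bypass & (BYPASS_L1 | BYPASS_L3)) {
   case 0:         return MODE_L1C_L3C;
   case BYPASS_L1: return MODE_L1UC_L3C;
   case BYPASS_L3: return MODE_L1C_L3UC;
   default:        return MODE_L1UC_L3UC;
   }
}
```

With this layout a new workaround only adds a condition to step 1;
step 2 stays untouched.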
This should make it significantly easier for us to add new workarounds
without the risk of breaking existing cases.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41319>
Instead of having an if ladder followed by another if that overwrites
the previous result, have a single if ladder.
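
As an illustration with made-up conditions and modes (not the actual
code being changed), the overriding check becomes the first rung of the
ladder, which keeps the behavior identical:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical modes and conditions, for illustration only. */
enum mode { MODE_DEFAULT, MODE_A, MODE_B, MODE_OVERRIDE };

/* Before: a ladder picks a value, then a separate `if` overwrites it. */
static enum mode
pick_before(bool a, bool b, bool override)
{
   enum mode m;

   if (a)
      m = MODE_A;
   else if (b)
      m = MODE_B;
   else
      m = MODE_DEFAULT;

   if (override)
      m = MODE_OVERRIDE;

   return m;
}

/* After: the override is checked first, in one ladder. */
static enum mode
pick_after(bool a, bool b, bool override)
{
   if (override)
      return MODE_OVERRIDE;
   else if (a)
      return MODE_A;
   else if (b)
      return MODE_B;
   else
      return MODE_DEFAULT;
}
```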
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41319>
This is the next - but not final - step toward making this function more
organized: split cache_mode into atomic, load and store versions, then
pick the version at the end.
v2: Initialize {load,store}_cache_mode (Sagar).
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41319>
We have code to choose cache_mode before send->sfid is assigned, but
after it we have more code to choose cache_mode that relies on
send->sfid. Move everything to after the selection of send->sfid so
the code to pick cache_mode is all together. I plan to simplify this
further in the next commits; the goal of this patch is to make the next
diff easier to read.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41319>
GL_TEXTURE_IMMUTABLE_LEVELS is core state in OpenGL ES 3.0 (it comes with
immutable textures / glTexStorage), queryable through glGetTexParameter. The
getter only allowed it when ARB_texture_view or OES_texture_view is present, so
a GLES3 driver without texture views returns GL_INVALID_ENUM for a valid query.
Its sibling GL_TEXTURE_IMMUTABLE_FORMAT is correctly ungated.
Allow the query on any GLES3 context, matching the spec.
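
Reduced to a predicate, the change looks roughly like this. The struct
and function names are hypothetical stand-ins, not Mesa's gl_context or
its extension helpers:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified context; not Mesa's struct gl_context. */
struct fake_context {
   bool is_gles3;             /* OpenGL ES 3.0 or later */
   bool has_arb_texture_view; /* ARB_texture_view exposed */
   bool has_oes_texture_view; /* OES_texture_view exposed */
};

/* Old gating: only the texture-view extensions enable the query. */
static bool
immutable_levels_query_old(const struct fake_context *ctx)
{
   return ctx->has_arb_texture_view || ctx->has_oes_texture_view;
}

/* New gating: any GLES3 context may query it as well. */
static bool
immutable_levels_query_new(const struct fake_context *ctx)
{
   return ctx->is_gles3 ||
          ctx->has_arb_texture_view || ctx->has_oes_texture_view;
}
```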
Fixes 8 dEQP-GLES3.functional.state_query.texture.*_immutable_levels_* cases on
etnaviv (which exposes neither texture-view extension).
Fixes: 214fd4e40d ("mesa/main: fix texture view enum checks")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41910>
`VkDeviceDeviceMemoryReportCreateInfoEXT::pUserData` was marked
`optional` sometime between v1.4.335 and v1.4.337, which changes codegen
in a backwards-incompatible way. VkDeviceDeviceMemoryReportCreateInfoEXT
should not really be sent to the host anyway (a guest-provided callback
can never be called from the host), but older existing guest images are
already sending this struct, so we need to preserve compatibility.
Bug: b/519606352
Test: GfxstreamEnd2EndTests
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42007>
Remove the unused tgsi_dump include, the stale nir_to_tgsi comment, the
redundant freshly-allocated null check, and the dead VS/FS delete branches.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41925>
r300_create_vs_state always allocates vs->first and builds the initial shader
variant before the state can be bound, so the !vs->first path in
r300_pick_vertex_shader is unreachable.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41925>
Replace the TGSI-based r300_draw_init_vertex_shader with a NIR
implementation, removing the nir_to_tgsi call from r300_state.c.
The three transformations previously done via tgsi_transform are now
done directly on a cloned NIR shader:
- Insert missing primary color output if secondary color is present.
- Insert all missing color/bcolor outputs if any back-face color is used.
- Add a WPOS output (copy of gl_Position) in the next available generic
  slot, by duplicating each store_deref to gl_Position to the new output
  at the same point.
ntr_fixup_varying_slots is applied to the clone before any
transforms, keeping VS outputs aligned with FS inputs.
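
The first two transformations amount to a small rule over the set of
written outputs. The sketch below expresses that rule on a plain
bitmask with hypothetical OUT_* bits; the real code operates on NIR
variables and varying slots, and the WPOS copy is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical output bits; the real code works on NIR varying slots. */
enum {
   OUT_COL0 = 1 << 0, /* primary front color */
   OUT_COL1 = 1 << 1, /* secondary front color */
   OUT_BFC0 = 1 << 2, /* primary back-face color */
   OUT_BFC1 = 1 << 3, /* secondary back-face color */
};

/* Return the outputs that have to be inserted for a written set. */
static uint32_t
missing_color_outputs(uint32_t written)
{
   uint32_t needed = written;

   /* Secondary color present: the primary color must exist too. */
   if (written & OUT_COL1)
      needed |= OUT_COL0;

   /* Any back-face color used: all color/bcolor outputs must exist. */
   if (written & (OUT_BFC0 | OUT_BFC1))
      needed |= OUT_COL0 | OUT_COL1 | OUT_BFC0 | OUT_BFC1;

   return needed & ~written;
}
```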
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Assisted-by: Claude Sonnet 4.6
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41925>
Even though we use an enum to implement it internally, there's no real
benefit to that enum being exposed to users. This makes it look more
like any other container type with an opaque implementation.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41941>
This is analogous to `Vec::push_mut()`, which was stabilized in Rust
1.95.0. Since we can't use that Rust version yet, we internally
implement it as `push()` followed by `last_mut().unwrap()`.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41941>
Bifrost/Valhall descriptor pointers are incorrectly assigned.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 11fcb23f74 ("pan/desc: Add a struct for valhall/bifrost to the union in pan_tiler_context")
Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41126>
This fixes state leakage when using the RADEON_DEBUG=notcl debug option.
This manifested as heavy desktop corruption when running GL clients with
this flag, since the R300_VAP_TCL_BYPASS state would leak into other HWTCL
users such as Xorg/glamor or Wayland compositors.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41996>
Using bitfields results in nondeterministic bit patterns in the unused
bits. Since ir3_shader_output is stored in the cache, this makes it
difficult to verify cache equality between different builds.
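
The problem arises because assigning individual bitfields leaves the
remaining bits of the storage unit at whatever value they had before.
One possible way to get a fully defined bit pattern is explicit
packing, sketched below with a hypothetical layout (not the real
ir3_shader_output fields, and not necessarily the approach taken by
this change, which may simply use full-width fields instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, not the real ir3_shader_output fields. */
#define OUT_SLOT_SHIFT  0
#define OUT_SLOT_MASK   0xffu
#define OUT_REGID_SHIFT 8
#define OUT_REGID_MASK  0xffu
#define OUT_HALF_SHIFT  16

/* Pack into one word; bits 17..31 are always written as zero. */
static uint32_t
pack_output(unsigned slot, unsigned regid, bool half)
{
   return ((slot & OUT_SLOT_MASK) << OUT_SLOT_SHIFT) |
          ((regid & OUT_REGID_MASK) << OUT_REGID_SHIFT) |
          ((uint32_t)half << OUT_HALF_SHIFT);
}

static unsigned
output_slot(uint32_t packed)
{
   return (packed >> OUT_SLOT_SHIFT) & OUT_SLOT_MASK;
}

static unsigned
output_regid(uint32_t packed)
{
   return (packed >> OUT_REGID_SHIFT) & OUT_REGID_MASK;
}

static bool
output_half(uint32_t packed)
{
   return (packed >> OUT_HALF_SHIFT) & 1u;
}
```

Two values built from the same logical fields are then byte-for-byte
equal, which is what a cache comparison across builds needs.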
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41999>
Similar to RADV, this restarts the render pass with resolve attachments.
Not ideal for tiling, but we don't even use native resolve for
built-in modes due to Metal format limitations.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41888>
Multiview often involves a loop over view indexes, and our output
handling assumes that everything is constant-indexed. Unrolling
the loops takes care of this. (brw already does this.)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41872>