Commit graph

222023 commits

Author SHA1 Message Date
Pierre-Eric Pelloux-Prayer
2267c14803 ac/info: add gfx12.1 identification
Not the full support yet, just the id part so the family/gfx_level
fields are set to the proper values.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41264>
2026-05-04 09:38:31 +02:00
Pierre-Eric Pelloux-Prayer
20b0349b05 radeonsi: clamp cp prefetch size
Limit the size instead of asserting that the size (which comes
from the shader bo) is smaller.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15184
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41264>
2026-05-04 09:38:28 +02:00
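As a rough illustration of the "clamp instead of assert" change described in 20b0349b05, the following minimal sketch limits a requested size to a maximum. The limit value and the function name are hypothetical and are not taken from radeonsi.

```c
#include <stdint.h>

/* Hypothetical hardware limit, chosen only for illustration. */
#define EXAMPLE_MAX_PREFETCH_SIZE (64u * 1024u)

/* Clamp the requested prefetch size to the limit.
 * The previous style would have been:
 *    assert(requested <= EXAMPLE_MAX_PREFETCH_SIZE);
 * which aborts debug builds when the buffer is larger than expected.
 */
static inline uint32_t
example_clamp_prefetch_size(uint64_t requested)
{
   return requested < EXAMPLE_MAX_PREFETCH_SIZE ? (uint32_t)requested
                                                 : EXAMPLE_MAX_PREFETCH_SIZE;
}
```

Clamping keeps the prefetch valid for any buffer size, at the cost of prefetching only the first part of an oversized buffer.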
Pavel Ondračka
c1f1b704d9 dri: add big-endian 8888 entries to driImageFormatToSizedInternalGLFormat
So that dma-buf-imported EGLImages on big-endian hosts resolve to a
sized GL internal format in st_bind_egl_image() instead of falling
back to unsized GL_RGBA/GL_RGB.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41132>
2026-05-04 09:01:43 +02:00
Pavel Ondračka
8f56c51d51 dri: add big-endian 8888 entries to dri2_format_table
So that dri2_get_mapping_by_fourcc() resolves the byte-reversed fourccs
(DRM_FORMAT_BGRA/BGRX/RGBA/RGBX8888) used for the native 8888 visual
on big-endian hosts.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41132>
2026-05-04 09:01:31 +02:00
Pavel Ondračka
1e97d3ed94 dri3: add big-endian 8888 fourccs to dri3_cpp_for_fourcc
Otherwise dri3_alloc_render_buffer() fails on big-endian hosts because
BGRA/BGRX/RGBA/RGBX8888 return cpp=0.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41132>
2026-05-04 09:01:21 +02:00
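The failure mode in 1e97d3ed94 (an unknown fourcc yielding cpp=0) can be sketched with a small lookup function. This is an assumption-laden illustration, not the actual dri3_cpp_for_fourcc(): the EX_ names are hypothetical, and the fourcc values are re-derived locally using the same character packing as drm_fourcc.h so the block builds without libdrm headers.

```c
#include <stdint.h>

/* Same packing as fourcc_code() in drm_fourcc.h, redefined locally. */
#define EX_FOURCC(a, b, c, d)                                   \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) |                      \
    ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define EX_FORMAT_XRGB8888 EX_FOURCC('X', 'R', '2', '4')
#define EX_FORMAT_ARGB8888 EX_FOURCC('A', 'R', '2', '4')
/* Byte-reversed variants used for the native 8888 visual on big-endian. */
#define EX_FORMAT_BGRX8888 EX_FOURCC('B', 'X', '2', '4')
#define EX_FORMAT_BGRA8888 EX_FOURCC('B', 'A', '2', '4')
#define EX_FORMAT_RGBX8888 EX_FOURCC('R', 'X', '2', '4')
#define EX_FORMAT_RGBA8888 EX_FOURCC('R', 'A', '2', '4')

/* Return bytes per pixel for a fourcc, or 0 when the format is unknown.
 * A caller that allocates buffers from this value fails on 0, which is
 * why every format that can actually occur needs an entry.
 */
static inline int
example_cpp_for_fourcc(uint32_t fourcc)
{
   switch (fourcc) {
   case EX_FORMAT_XRGB8888:
   case EX_FORMAT_ARGB8888:
   case EX_FORMAT_BGRX8888:
   case EX_FORMAT_BGRA8888:
   case EX_FORMAT_RGBX8888:
   case EX_FORMAT_RGBA8888:
      return 4;
   default:
      return 0;
   }
}
```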
Samuel Pitoiset
9361a5b865 docs: describe the contributions workflow for RADV
This workflow has been discussed a lot with the team for the past
few years. Let's just clarify it for real in the documentation.

Co-written-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41239>
2026-05-04 06:35:14 +00:00
Georg Lehmann
38e691fc0a nir/opt_varyings: do no_signed_zero linking even for non removable stores
E.g. position in VS.

Foz-DB Navi48:
Totals from 948 (0.79% of 120695) affected shaders:
MaxWaves: 26816 -> 26828 (+0.04%)
Instrs: 799692 -> 796993 (-0.34%); split: -0.34%, +0.01%
CodeSize: 3855744 -> 3846816 (-0.23%); split: -0.24%, +0.01%
VGPRs: 50256 -> 50220 (-0.07%)
Latency: 2209359 -> 2207667 (-0.08%); split: -0.09%, +0.01%
InvThroughput: 305260 -> 303519 (-0.57%); split: -0.57%, +0.00%
VClause: 11640 -> 11643 (+0.03%); split: -0.01%, +0.03%
SClause: 21152 -> 21149 (-0.01%)
Copies: 51658 -> 51675 (+0.03%); split: -0.11%, +0.14%
Branches: 18656 -> 18655 (-0.01%)
PreVGPRs: 37999 -> 37984 (-0.04%)
VALU: 469752 -> 467406 (-0.50%); split: -0.50%, +0.00%
SALU: 105433 -> 105323 (-0.10%); split: -0.11%, +0.00%

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>
2026-05-03 19:55:10 +00:00
Georg Lehmann
fac4edbcba nir/opt_varyings: back propagate signed zero information to outputs
Foz-DB Navi48:
Totals from 809 (0.67% of 120695) affected shaders:
MaxWaves: 21804 -> 21808 (+0.02%)
Instrs: 863131 -> 861310 (-0.21%); split: -0.22%, +0.01%
CodeSize: 4535500 -> 4523232 (-0.27%); split: -0.30%, +0.03%
VGPRs: 47304 -> 47280 (-0.05%)
SpillSGPRs: 170 -> 82 (-51.76%)
Latency: 6791484 -> 6786880 (-0.07%); split: -0.07%, +0.00%
InvThroughput: 906281 -> 905301 (-0.11%); split: -0.11%, +0.00%
VClause: 16910 -> 16917 (+0.04%); split: -0.01%, +0.05%
SClause: 21856 -> 21827 (-0.13%); split: -0.14%, +0.01%
Copies: 61890 -> 61436 (-0.73%); split: -0.80%, +0.06%
Branches: 19725 -> 19640 (-0.43%)
PreSGPRs: 38011 -> 37851 (-0.42%)
PreVGPRs: 36482 -> 36454 (-0.08%)
VALU: 465316 -> 464323 (-0.21%); split: -0.22%, +0.00%
SALU: 143757 -> 143395 (-0.25%); split: -0.33%, +0.08%
VMEM: 36827 -> 36806 (-0.06%)
SMEM: 37769 -> 37768 (-0.00%)

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>
2026-05-03 19:55:10 +00:00
Georg Lehmann
b2bc57551a nir/instr_set: allow cse with fp_math_ctrl mismatches for intrinsics
Just like for ALU.

No Foz-DB changes.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>
2026-05-03 19:55:10 +00:00
Icenowy Zheng
a0a88e329d dri: try to enable GL_ARB_compatibility when supported GL core version is 3.1
OpenGL 3.1 is a transitional version in the progression of dropping
legacy features. It does not feature a "Compatibility Profile"; instead,
only the GL_ARB_compatibility extension is defined for it.

Programs that query GL_CONTEXT_PROFILE_MASK at runtime and call the
compatibility codepath when this query doesn't exist or the query
returns GL_CONTEXT_COMPATIBILITY_PROFILE_BIT will work on OpenGL
implementations with a version < 3.1 or a version > 3.1, but not on
implementations targeting OpenGL 3.1 and lacking GL_ARB_compatibility.
As most programmers now have hardware and drivers targeting versions >
3.1 installed, such errors are hard to catch.

So make a best effort to enable GL_ARB_compatibility on drivers exposing
exactly OpenGL 3.1 to satisfy such programs. It's still possible to use
MESA_GL_VERSION_OVERRIDE=3.1FC to acquire a context without
GL_ARB_compatibility on such drivers.

Fixes the overview functionality of kwin_wayland on panfrost with
Mali-G57 (which exposes OpenGL 3.1 on current Mesa), although the
problematic profile detection code is in Qt instead of KWin.

Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41298>
2026-05-03 13:55:49 +00:00
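The detection pattern that a0a88e329d works around can be modeled without a GL context. The sketch below is a simplified model under stated assumptions: the struct and function names are hypothetical, forward-compatible contexts are ignored, and the code is not the Qt or Mesa logic. It only shows why a context that is exactly 3.1 and lacks GL_ARB_compatibility is the one case where the application's guess and reality disagree.

```c
#include <stdbool.h>

/* Value from the OpenGL registry. */
#define EX_GL_CONTEXT_COMPATIBILITY_PROFILE_BIT 0x00000002u

/* Hypothetical summary of what a context reports. */
struct example_gl_context {
   int major, minor;
   bool has_profile_mask_query; /* GL_CONTEXT_PROFILE_MASK exists (GL >= 3.2) */
   unsigned profile_mask;       /* only meaningful when the query exists */
   bool has_arb_compatibility;  /* GL_ARB_compatibility is advertised */
};

/* Application-side heuristic described in the commit message:
 * no profile query, or compatibility bit set, means "use legacy path".
 */
static inline bool
example_app_picks_compat_path(const struct example_gl_context *c)
{
   if (!c->has_profile_mask_query)
      return true;
   return (c->profile_mask & EX_GL_CONTEXT_COMPATIBILITY_PROFILE_BIT) != 0;
}

/* Whether legacy features are really usable in this simplified model. */
static inline bool
example_legacy_features_available(const struct example_gl_context *c)
{
   if (c->major < 3 || (c->major == 3 && c->minor == 0))
      return true;                     /* nothing removed yet */
   if (c->major == 3 && c->minor == 1)
      return c->has_arb_compatibility; /* only via the extension */
   return (c->profile_mask & EX_GL_CONTEXT_COMPATIBILITY_PROFILE_BIT) != 0;
}
```

For a 3.1 context without the extension, the first function returns true while the second returns false, which is the mismatch the commit avoids by exposing the extension where possible.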
Marek Olšák
f583f6e717 nir: use nir_build_frag_coord everywhere
nir_build_frag_coord generates the correct sysval loads based on NIR
options. nir_load_frag_coord shouldn't be used directly because drivers
don't have to support it.

v2: RADV can't use it because nir->options isn't set, so use load_pixel_coord.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:01 +00:00
Marek Olšák
b63a9a8b39 nir: add direct lowered frag_coord building to replace lowering passes
Instead of lowering frag_coord 4 times during compilation,
just use this.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Marek Olšák
9c5ad16819 nir/opt_frag_coord_to_pixel_coord: handle frag_coord_xy
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Marek Olšák
076b0aaf1d nir/lower_wpos_ytransform: handle frag_coord_xy
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Marek Olšák
e49f29f25e nir: add frag_coord_xy
to strengthen and simplify pixel_coord lowering

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Dave Airlie
9cb688af88 lavapipe: treat NULL pColorAttachmentLocations as no handles
this fixes a crash seen in:
dEQP-VK.renderpasses.dynamic_rendering.partial_secondary_cmd_buff.local_read.interaction_with_color_write_enable

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41309>
2026-05-02 14:46:15 +10:00
squidbus
be75ece095 kk: Workaround for GPU capture under Rosetta 2.
GPU capture breaks if heap sizes are not aligned to at least 16K. Ensuring that
they are is not expected to impact memory usage, since the actual internal
memory allocation already seems to be aligned to 16K; the issue is only with
how the heap reports its size versus the allocation size that capture uses.

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41218>
2026-05-02 02:52:06 +00:00
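Rounding a heap size up to a 16K multiple, as be75ece095 describes, is a standard power-of-two align-up. A minimal sketch follows; the constant and function names are hypothetical and not taken from the kk driver.

```c
#include <stdint.h>

/* 16 KiB, the minimum alignment mentioned in the commit message. */
#define EXAMPLE_HEAP_ALIGNMENT (16u * 1024u)

/* Round size up to the next multiple of EXAMPLE_HEAP_ALIGNMENT.
 * Relies on the alignment being a power of two.
 */
static inline uint64_t
example_align_heap_size(uint64_t size)
{
   const uint64_t a = EXAMPLE_HEAP_ALIGNMENT;
   return (size + a - 1) & ~(a - 1);
}
```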
squidbus
640b4cb96c kk: Enable VK_(EXT/KHR)_global_priority and VK_EXT_global_priority_query
Same as NVK, only currently exposes medium priority as default.

Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41286>
2026-05-02 02:35:28 +00:00
squidbus
f74a5dd0cf kk: Enable VK_EXT_buffer_device_address
Legacy alias of VK_KHR_buffer_device_address, for any applications
that still use it.

Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41286>
2026-05-02 02:35:28 +00:00
squidbus
4dff9d4329 kk: Enable VK_EXT_extended_dynamic_state3
Supports DepthClampEnable and DepthClipNegativeOneToOne, and allows
applications to omit pipeline create structures fully covered by
enabled dynamic state.

Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41286>
2026-05-02 02:35:28 +00:00
Mike Blumenkrantz
95954b0981 vk/cmd_queue: always ceil() param lens
this avoids rounding errors with pSampleMask for 64bit masks

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41312>
2026-05-01 22:38:22 +00:00
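For context on 95954b0981: a Vulkan pSampleMask array holds ceil(rasterizationSamples / 32) 32-bit words, so a length computed with truncating division can come out too small. The sketch below contrasts the two computations; the function names are hypothetical, and this is not the generated vk_cmd_queue code.

```c
#include <stdint.h>

/* Number of 32-bit words needed to hold `samples` mask bits, rounded up. */
static inline uint32_t
example_sample_mask_words(uint32_t samples)
{
   return (samples + 31u) / 32u;
}

/* Truncating variant, shown only for comparison: it under-counts whenever
 * `samples` is not a multiple of 32.
 */
static inline uint32_t
example_sample_mask_words_truncating(uint32_t samples)
{
   return samples / 32u;
}
```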
Mike Blumenkrantz
547dc7a131 lavapipe: allow fbfetch with shader objects
required by DRLR

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41312>
2026-05-01 22:38:22 +00:00
Mike Blumenkrantz
1da8528bbc lavapipe: rework immutable samplers
samplers can be destroyed whenever, which makes it problematic to store
the pointers into descriptor layouts for embedded samplers. instead,
directly store the descriptor info into the layout, since this is all
constant data which is unaffected by object lifetimes

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41312>
2026-05-01 22:38:22 +00:00
Mike Blumenkrantz
965beb520c lavapipe: use the right type for DGC mesh draws
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41312>
2026-05-01 22:38:21 +00:00
Mike Blumenkrantz
43051547b6 util/format: support 256-bit formats in util_format_get_tilesize()
Fixes: eb64ce4386 ("util: Add a helper for querying sparse tile sizes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41312>
2026-05-01 22:38:21 +00:00
Mike Blumenkrantz
ea57814003 lavapipe: null out local var to avoid uninit warning
harmless warning, but annoying

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41312>
2026-05-01 22:38:20 +00:00
Mike Blumenkrantz
f4461b66b6 lavapipe: fix pushconst data updating
in a sequence like:
* CmdPushConstants
* CmdBindPipeline (doesn't use push constants)
* CmdDispatch
* CmdBindPipeline (uses push constants)
* CmdDispatch

the previous code would never update pushconsts and the second dispatch
would have no valid data

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41312>
2026-05-01 22:38:20 +00:00
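The command sequence in f4461b66b6 can be replayed against a toy state tracker. The sketch below is only one plausible model, with hypothetical names throughout, and it does not reproduce lavapipe's code or claim to identify its exact bug. It shows the general rule that pending push constant data must stay marked dirty until a dispatch with a pipeline that consumes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EX_PUSH_SIZE 128

/* Hypothetical command-buffer state. */
struct example_state {
   uint8_t push_data[EX_PUSH_SIZE]; /* latest CmdPushConstants payload */
   bool push_dirty;                 /* payload not yet handed to a shader */
   bool pipeline_uses_push;         /* property of the bound pipeline */
   uint8_t uploaded[EX_PUSH_SIZE];  /* what the shader would read */
   bool uploaded_valid;
};

static inline void
example_push_constants(struct example_state *s, const void *data, size_t size)
{
   assert(size <= EX_PUSH_SIZE);
   memcpy(s->push_data, data, size);
   s->push_dirty = true;
}

static inline void
example_bind_pipeline(struct example_state *s, bool uses_push)
{
   s->pipeline_uses_push = uses_push;
}

static inline void
example_dispatch(struct example_state *s)
{
   /* Consume the dirty flag only when the bound pipeline reads push
    * constants; otherwise keep it set for a later pipeline.
    */
   if (s->pipeline_uses_push && s->push_dirty) {
      memcpy(s->uploaded, s->push_data, EX_PUSH_SIZE);
      s->uploaded_valid = true;
      s->push_dirty = false;
   }
}
```

Replaying the five commands from the commit message against this model leaves the data pending after the first dispatch and uploads it on the second.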
Mike Blumenkrantz
87764963f2 lavapipe: fix indirect memory copies
this was using the wrong size for the copy

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41312>
2026-05-01 22:38:20 +00:00
Mel Henning
216c5c6dde nvk: Re-enable zcull save/restore
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15221
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41275>
2026-05-01 21:10:21 +00:00
Mel Henning
0a3eb2b9fb nvk: Don't LOAD_ZCULL w/ VK_RENDERING_RESUMING_BIT
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41275>
2026-05-01 21:10:21 +00:00
Mel Henning
8d054c390e nvk: Zero zcull data in layout transition
We should never have been doing this at bind time. Instead, layout
transitions out of UNDEFINED are in the spec specifically so the
driver has a point where it can do initialization, so do our init there
instead.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41275>
2026-05-01 21:10:21 +00:00
Mel Henning
cb29829f72 nvk: Allocate a zcull save region in fewer cases
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41275>
2026-05-01 21:10:20 +00:00
Mel Henning
d4eb5c76c9 nvk: Split out nvk_cmd_fill_memory
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41275>
2026-05-01 21:10:20 +00:00
Yiwei Zhang
8b7525dd5d android_stub: purge unused log utils
This commit:
- drops the unused log headers
- updates the header roll script
- drops the unused __android_log_vprint stub

Acked-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41254>
2026-05-01 20:23:23 +00:00
Yiwei Zhang
26c870f173 broadcom: remove unused Android log utils
These are leftovers from
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40434

Acked-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41254>
2026-05-01 20:23:23 +00:00
Yiwei Zhang
2065c589c0 intel: use stable NDK __android_log_print helper
The NDK api __android_log_print has been available since api level 3,
which is preferred since NDK api is more stable.

Acked-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41254>
2026-05-01 20:23:23 +00:00
Yiwei Zhang
8ea8e87378 util/os_misc: use stable NDK __android_log_write helper
The NDK api __android_log_write has been available since api level 3,
which is preferred since NDK api is more stable. Meanwhile, use write
instead of print to avoid extra internal copy/truncate involved in the
print helper.

Acked-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41254>
2026-05-01 20:23:23 +00:00
Calder Young
ebe835e94c intel_hang_replay: Don't force scratch page on Xe KMD unless explicitly requested
Add a --scratch flag instead of always forcing the scratch page on. This
allows the hang replay tool to be used to debug page faults.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
Calder Young
04bfdb287b anv: Disable scratch page by default on Xe KMD
Page faults will now cause the device to be lost instead of being ignored.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
Calder Young
4120ae4963 brw: Avoid vectorizing loads in NIR if it could extend into a different page
Took inspiration from RADV to make nir_opt_load_store_vectorize robust against
page faults, by checking the align_offset and align_mul to see if any extra
components could be overlapping into a different page.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
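The alignment reasoning mentioned in 4120ae4963 can be sketched as a standalone predicate. Assume a load address is known only as `address % align_mul == align_offset`, with `align_mul` a power of two, and that pages are 4 KiB. The names below are hypothetical, and this is a conservative model rather than the actual brw or RADV callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE 4096u

/* Return true when a load of size_bytes might extend into the next page.
 *
 * The address lies at a known offset inside an aligned window of
 * min(align_mul, page size) bytes. Such a window never straddles a page,
 * so the load is safe whenever it fits inside the window.
 */
static inline bool
example_load_may_cross_page(uint32_t align_mul, uint32_t align_offset,
                            uint32_t size_bytes)
{
   assert(align_mul != 0 && (align_mul & (align_mul - 1)) == 0);
   assert(align_offset < align_mul);

   uint32_t window = align_mul < EXAMPLE_PAGE_SIZE ? align_mul
                                                   : EXAMPLE_PAGE_SIZE;
   uint32_t offset_in_window = align_offset % window;

   return offset_in_window + size_bytes > window;
}
```

A vectorizer using such a check would refuse to widen a load when the predicate is true for the widened size, since the extra components could touch an unmapped page.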
Calder Young
3ac6233655 brw: Avoid rounding every convergent block load up to a full register
To simplify things, our backend rounds convergent block loads up to a full
register. This causes page faults with the scratch page disabled since the
address is not always aligned to a register size. Loading smaller blocks is
slightly more difficult because the SEND instruction can only write back a
multiple of full registers, even if the actual data is smaller.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
Calder Young
8ce98fedc4 anv: Make sure robust UBO access does not fault
We can just conditionally replace the address with an address to a zero
initialized cacheline if the read is going to go out of bounds.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
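The address substitution described in 8ce98fedc4 amounts to a bounds check followed by a select. The following CPU-side sketch uses hypothetical names and a byte-granular check; the granularity and form of the real shader-side check are not stated in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pick the address a robust UBO read should use.
 *
 * base/range describe the bound buffer, offset/read_size the access, and
 * zero_addr points at zero-initialized memory that is always mapped.
 * Out-of-bounds reads are redirected to zero_addr so they return zeros
 * instead of faulting.
 */
static inline uint64_t
example_robust_ubo_address(uint64_t base, uint64_t range, uint64_t offset,
                           uint64_t read_size, uint64_t zero_addr)
{
   /* Written to avoid overflow in offset + read_size. */
   bool in_bounds = offset <= range && read_size <= range - offset;
   return in_bounds ? base + offset : zero_addr;
}
```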
Calder Young
64b5823d33 blorp: Work around sampler overfetch for buffer copies
First, the surface dimensions are used to determine the range of valid
pages that the data in the buffer overlaps, then rows are removed from
the surface until it does not overfetch into any neighboring pages. If
any rows were removed, an extra BTI is set up with a texel buffer that
views the contents of all the rows that were removed, and the shader is
compiled with a branch to sample the last rows through the texel buffer
instead of the main surface.

Using the texel buffer allows it to access the last rows without dealing
with overfetch or weird alignment hacks, and restricting texel buffer
usage to just the part of the surface that can't be accessed safely
ensures that we don't significantly impact performance for any buffer to
image copy that is unlucky enough to be close to a page boundary.

Co-authored-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
Calder Young
fd7c094f7b isl: Add and use isl_tiling_get_intratile_range_el/sa
Consolidates the logic for calculating the intratile extent of a slice of a
surface to avoid duplicating code in the next patch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
Calder Young
f5c848ef57 isl: Add function to calculate the amount of overfetch for an unpadded surface
Adds a function to calculate the total size of a 2D linear sampling engine
surface, including overfetch, for a buffer to image copy.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
Calder Young
3cd9b14c80 isl: Optimize the sampler cache to overlap as few 64B cachelines as possible
Since we now have an ISL_SURF_USAGE_NO_OVERFETCH_PADDING_BIT flag to turn extra
padding calculations on and off, we can align the row pitch of linear surfaces
that are accessed through the sampler to minimize the number of L3 cachelines
that each sampler cacheline overlaps for added efficiency.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
Calder Young
8d13628f7f isl: Add additional alignment/padding requirements to prevent overfetch
Bspec 58779 describes various cases where additional padding is required on the
bottom and right sides of a sampling engine surface to avoid page faults.

Since we don't want to mess up the other drivers that also use ISL, there's now
a requires_padding boolean in isl_dev that can be used to enable/disable the
extra padding calculations per device and driver.

The extra padding can also be disabled per-surface by adding the usage flag
ISL_SURF_USAGE_NO_OVERFETCH_PADDING_BIT, like when a specific row pitch is
needed.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
Calder Young
aee9602fea isl: Add usage flag to force SurfaceArray to false
When sampling BUFFER, 1D, or 2D surfaces, with no MSAA, no mipmap levels,
linear tiling, and SurfaceArray set to false, the surface padding
requirements are relaxed and it's much easier to use the sampler to do
buffer-to-image copies in BLORP. We can't have it like this by default
though because we need SurfaceArray true for robustness.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:41 +00:00
Calder Young
bd88042f57 anv: Add padding to the shader heap to manage EU prefetch
Like the command streamer, the EUs will also blindly prefetch up to 3.5KiB
ahead of a shader. We can manage this in the shader heap by adding the
required padding when we allocate the buffers to back a shader allocation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:40 +00:00
Calder Young
5fb78a26db anv: Store batch buffers in a null-initialized VMA heap
The command streamer will blindly prefetch up to 4KiB ahead of a batch buffer
depending on the engine. To avoid page faults with the scratch page disabled,
we can create a special VMA heap for batch buffers that has pages initialized
with the null tile bit by default.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40149>
2026-05-01 19:51:40 +00:00