Commit graph

201327 commits

Author SHA1 Message Date
Marek Olšák
ec1a00f507 r300: don't lower sin/cos in finalize_nir
finalize_nir requires that calling it multiple times on the same shader
doesn't break it.

RV530 shader-db:
total instructions in shared programs: 132915 -> 132851 (-0.05%)
instructions in affected programs: 2016 -> 1952 (-3.17%)
helped: 16
HURT: 0
total temps in shared programs: 18238 -> 18232 (-0.03%)
temps in affected programs: 42 -> 36 (-14.29%)
helped: 6
HURT: 0
total cycles in shared programs: 197510 -> 197446 (-0.03%)
cycles in affected programs: 2102 -> 2038 (-3.04%)
helped: 16
HURT: 0

Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32160>
2024-11-25 10:59:37 +00:00
Pavel Ondračka
d406dbbde9 r300: run nir_opt_algebraic in the backend
No effect in shader-db right now, but without it the next commit
leads to small regression in instruction numbers (0.03%) instead
of the small win we have now (-0.05%).

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32160>
2024-11-25 10:59:37 +00:00
Rhys Perry
63b0692eac aco: don't use uniform continues if exec might be empty
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:59 +00:00
Rhys Perry
aa0ede751d aco/tests: add tests for empty exec masks
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:59 +00:00
Rhys Perry
f35e229fae aco: skip code if exec is empty
This is safer and potentially faster.

fossil-db (navi21):
Totals from 690 (0.87% of 79395) affected shaders:
Instrs: 4534778 -> 4535916 (+0.03%)
CodeSize: 25268516 -> 25272080 (+0.01%); split: -0.00%, +0.01%
Latency: 48482721 -> 48513907 (+0.06%); split: -0.00%, +0.07%
InvThroughput: 13213965 -> 13217828 (+0.03%); split: -0.00%, +0.03%
Copies: 432307 -> 432295 (-0.00%); split: -0.05%, +0.04%
Branches: 187305 -> 188249 (+0.50%)
VALU: 2904490 -> 2904508 (+0.00%); split: -0.00%, +0.00%
SALU: 674962 -> 675133 (+0.03%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:59 +00:00
Rhys Perry
f00c3a14c0 aco: require WQM after demote in control flow
fossil-db (navi21):
Totals from 424 (0.53% of 79395) affected shaders:
Instrs: 404496 -> 404752 (+0.06%); split: -0.07%, +0.13%
CodeSize: 2150608 -> 2151616 (+0.05%); split: -0.05%, +0.09%
Latency: 9124298 -> 9115957 (-0.09%); split: -0.12%, +0.03%
InvThroughput: 1883570 -> 1883468 (-0.01%); split: -0.01%, +0.00%
VClause: 6832 -> 6830 (-0.03%)
SClause: 13801 -> 13778 (-0.17%); split: -0.17%, +0.01%
Copies: 26758 -> 26673 (-0.32%); split: -0.44%, +0.12%
Branches: 9819 -> 9567 (-2.57%)
PreSGPRs: 17902 -> 17934 (+0.18%)
SALU: 45407 -> 45906 (+1.10%); split: -0.01%, +1.11%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:59 +00:00
Rhys Perry
8a175b02bc aco: use repair pass for LCSSA workaround
This makes instruction selection simpler and fixes potential issues with
allocated_vec or the optimizer moving SGPR uses out of the loop.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:59 +00:00
Rhys Perry
5de990f5a9 aco: add SSA repair pass
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:58 +00:00
Roman Stratiienko
83b4b829fd v3dv/android: Suppress AHB-related log spam
The VK_STRUCTURE_TYPE_IMPORT_ANDROID_HARDWARE_BUFFER_INFO_ANDROID is handled
by the common code.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32314>
2024-11-25 08:08:25 +00:00
Samuel Pitoiset
ba77b2d65d radv: fix printing with RADV_DEBUG=psocachestats
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32254>
2024-11-25 07:36:49 +00:00
Samuel Pitoiset
6c967c9bbe radv: fix dumping the trap handler shader disassembly
This has been broken in the recent RADV_DEBUG=shaders refactoring.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32254>
2024-11-25 07:36:49 +00:00
Samuel Pitoiset
5c3a757ba6 radv: add a pipeline helper to skip shaders cache
It's common for the three type of pipelines.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32254>
2024-11-25 07:36:48 +00:00
Samuel Pitoiset
3f646d43dd radv: fix dumping debug/perftest options when there are holes
Also fix the wrong assertion.

Fixes: 8c1e2ac03b ("radv: Refactor RADV_DEBUG=shaders to be a combination of other options.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32276>
2024-11-25 07:01:06 +00:00
Boris Brezillon
e0f48568c7 panfrost: Advertise support for AFBC(32x8,sparse,split)
Some MTK display controller drivers support only this AFBC modifier.
Give it a chance to use AFBC for scanout resources.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
2024-11-25 00:26:36 -05:00
Boris Brezillon
4af57952b1 panfrost: Add support for AFBC(split)
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
2024-11-25 00:26:26 -05:00
Boris Brezillon
762a0f4133 panfrost: Add the concept of render block
When dealing with AFBC render targets using wide blocks, the GPU needs to
keep rendering tiles that are a multiple of 16x16. This is described
as AFBC render block size, and adds extra constraints:

- render target buffers need to be aligned on 16 pixels in the vertical
  direction, even if the AFBC super block size is 4 or 8 pixels.
- if the effective tile size is smaller than the render block size, we
  should force a clean write and discard+ignore the CRC

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
2024-11-25 00:26:14 -05:00
Boris Brezillon
303acdef07 panfrost: Add a helper to expose the maximum effective tile size
On all previous GPUs, the effective tile size was limited to 16x16, but
it got increased on v10. Add an helper to query this maximum effective
tile size.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
2024-11-25 00:26:02 -05:00
Louis-Francis Ratté-Boulianne
a3c8258908 panfrost: Select the effective tile size as part of pan_fb_info
This allows using the tile size to make decisions not related to the
framebuffer descriptor. Mainly, for the near future, to decide
whether some tiling hierarchy levels should be disabled.

The color buffer allocation size is also calculated at the same time
as it's using common data underneath.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
2024-11-25 00:25:58 -05:00
Louis-Francis Ratté-Boulianne
eead8b6efd panfrost: Split up allocation and packing of tiler descriptor
This is mostly useful so that we can set the hierarchy level mask
using information from the `pan_fb_info` structure that isn't filled
yet when the tiler descriptor is first allocated.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
2024-11-25 00:25:51 -05:00
Boris Brezillon
ca84b1e9b5 panfrost: Increase AFBC body alignment requirement on v6+
AFBC body is required to be aligned on 128 bytes on v6+ hardware.

Cc: mesa-stable
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31948>
2024-11-25 00:25:05 -05:00
Timur Kristóf
45c523104a ac/nir/ngg: Implement optional primitive compaction.
It's an experimental feature that we may enable later.
Instead of exporting NULL primitives, perform a compaction
on primitives after culling.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32290>
2024-11-25 01:56:20 +01:00
Timur Kristóf
492d8f3778 ac/nir/ngg: Workgroup scan over two bools.
Implement two workgroup scans over two boolean values in parallel,
so that they can be done with very minimal ALU overhead.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32290>
2024-11-25 01:56:08 +01:00
Timur Kristóf
78f77e161c ac/nir/ngg: Pass wg_repack_result as pointer instead of returning it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32290>
2024-11-25 01:55:30 +01:00
Patrick Lerda
ac78692be4 r600: evergreen stencil/depth mipmap blit workaround
In certain cases, the hardware fails to properly process a mipmap level
of these special stencil and depth formats. This happens at width=16.
This change adds a software workaround.

Modifying the corresponding mipmap nblk_x, and the other related
values, could make the tests below to work. Anyway, this method
generates regressions.

This change was tested on palm and cayman and fixes the following tests:
spec/arb_framebuffer_object/framebuffer-blit-levels read stencil: fail pass
spec/arb_depth_buffer_float/fbo-clear-formats stencil/gl_depth32f_stencil8: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31957>
2024-11-24 21:01:54 +00:00
Patrick Lerda
81889f4d5c r600: ensure that the last vertex is always processed on evergreen
This situation is happening, for instance, when the hardware is
using the type FMT_8_8_8_8 (4 bytes) while the software was
requesting a 3 bytes type. The width should be adjusted to the
expected hardware size; otherwise, the last vertex is lost.

Note: The rv770 didn't behave like this. This is definitely
a hardware change between these gpus.

This change was tested on palm and cayman. Here are the tests fixed:
spec/!opengl 2.0/gl-2.0-vertexattribpointer-size-3: fail pass
deqp-gles2/functional/draw/random/62: fail pass
deqp-gles2/functional/vertex_arrays/single_attribute/strides/buffer_0_32_byte3_vec4_dynamic_draw_quads_1: fail pass
deqp-gles2/functional/vertex_arrays/single_attribute/strides/buffer_0_32_short3_vec4_dynamic_draw_quads_1: fail pass
deqp-gles2/functional/vertex_arrays/single_attribute/strides/buffer_0_32_short3_vec4_dynamic_draw_quads_256: fail pass
deqp-gles3/functional/draw/random/117: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/strides/byte/buffer_stride32_components3_quads1: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/strides/short/buffer_stride32_components3_quads1: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/strides/short/buffer_stride32_components3_quads256: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32184>
2024-11-24 20:23:38 +00:00
Patrick Lerda
275535774c r600: restructure r600_create_vertex_fetch_shader() to remove memcpy()
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32184>
2024-11-24 20:23:38 +00:00
Patrick Lerda
4d24995adb r600: fix the evergreen sampler when the minification and the magnification are not identical
This change fixes the evergreen nonconformity issue on non-mipmap
textures when the minification and the magnification are not in
the same state.

This modification disables 5278436d67 when the minification and
the magnification are different. This fixes the nonconformity
without new regressions. Anyway, I was unable to reproduce
the issue described by 5278436d67 on palm and cayman.

This change was tested on cayman and palm. It fixes 84 deqp-gles2
tests and 128 deqp-gles3 tests:
deqp-gles2/functional/texture/filtering/2d/linear_nearest_*
deqp-gles2/functional/texture/filtering/2d/nearest_linear_*
deqp-gles2/functional/texture/filtering/cube/linear_nearest_*
deqp-gles2/functional/texture/filtering/cube/nearest_linear_*
deqp-gles2/functional/texture/vertex/2d/filtering/linear_nearest_*
deqp-gles2/functional/texture/vertex/2d/filtering/nearest_linear_*
deqp-gles2/functional/texture/vertex/cube/filtering/linear_nearest_*
deqp-gles2/functional/texture/vertex/cube/filtering/nearest_linear_*
deqp-gles3/functional/texture/filtering/2d/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/2d/combinations/nearest_linear_*
deqp-gles3/functional/texture/filtering/2d_array/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/2d_array/combinations/nearest_linear_*
deqp-gles3/functional/texture/filtering/3d/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/3d/combinations/nearest_linear_*
deqp-gles3/functional/texture/filtering/cube/combinations/linear_nearest_*
deqp-gles3/functional/texture/filtering/cube/combinations/nearest_linear_*
deqp-gles3/functional/texture/vertex/2d/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/2d/filtering/nearest_linear_*
deqp-gles3/functional/texture/vertex/2d_array/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/2d_array/filtering/nearest_linear_*
deqp-gles3/functional/texture/vertex/3d/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/3d/filtering/nearest_linear_*
deqp-gles3/functional/texture/vertex/cube/filtering/linear_nearest_*
deqp-gles3/functional/texture/vertex/cube/filtering/nearest_linear_*

Fixes: 5278436d67 ("r600: force LOD range to be only one value when mip.min filter is NONE")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32185>
2024-11-24 20:07:42 +00:00
Gert Wollny
42be38a8fb radeon/evergreen: ensure equal sizes for depth-stencil npot textures
On evergreen depth-stencil textures are allocated as two objects, and
when using the eg_surface_init_1d_miptrees code path the size evaluation
uses the generalized surf_minify function. Here when allocating the
depth texture the alignment takes the depth bpe value into account, and
uses bpe=1 for the stencil texture. As a result the texture pair may
consist of textures with two different nblk_x sizes and this seems to
be a problem with some textures, namely npot and small (width < 32), but
not for mipmapped textures. In the problematic cases, if the so allocated
depth texture is larger than the stencil texture, then the kernel may reject
sent data with an error message like:

 evergreen_cs_track_validate_stencil:622 stencil read bo too
  small (layer size 131072, offset 524288, max layer 1, bo size 606208)

- because apparently the expected layer size is evaluated from the depth
texture size, but the actual bo size is evaluated based on the true texture
size values. If, on the other hand, the stencil texture is larger than the
depth texture, then the data is send with a wrong alignment, and certain
dEQP-GLES31 tests fail.

In order to obtain equal texture sizes in the problematic cases magnify
the depth texture alignment requirement by its bpe, so that the relative
alignment is the same for depth and stencil texture.

Fixes:
  dEQP-GLES31.functional.stencil_texturing.format
    .depth32f_stencil8_2d
    .depth32f_stencil8_2d_array
    .depth24_stencil8_2d
    .depth24_stencil8_2d_array
    .stencil_index8_2d
    .stencil_index8_2d_array
    .depth32f_stencil8_draw
    .depth24_stencil8_draw

  dEQP-GLES31.functional.texture.border_clamp.formats
    .stencil_index8.nearest_size_npot
    .depth24_stencil8_sample_stencil.nearest_size_npot
    .depth32f_stencil8_sample_stencil.nearest_size_npot

  dEQP-GLES31.functional.texture.border_clamp.per_axis_wrap_mode.texture_2d
    .uint_stencil.nearest.s_clamp_to_edge_t_clamp_to_border_npot
    .uint_stencil.nearest.s_repeat_t_clamp_to_border_npot
    .uint_stencil.nearest.s_mirrored_repeat_t_clamp_to_border_npot

  piglits:
    arb_framebuffer_object-depth-stencil-blit *stencil*
    framebuffer-blit-levels draw stencil
    arb_texture_stencil8/
       texwrap formats offset/gl_stencil_index8, npot
       texwrap formats/gl_stencil_index8, npot
    ext_framebuffer_multisample
       accuracy all_samples stencil_resolve small depthstencil
       unaligned-blit * stencil downsample
    ext_texture_array/fbo-depth-array *stencil
    egl_khr_gl_renderbuffer_image-clear-shared-image gl_depth_component24

v2: use util_is_power_of_two_or_zero (Marek)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32169>
2024-11-24 20:43:57 +01:00
Benjamin Lee
7eda433095 nir: document order requirement for nir_lower_viewport_transform
This requirement is currently satisfied by the usage in panfrost and
lima.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32084>
2024-11-24 17:25:14 +00:00
Benjamin Lee
11b6e47618 nir: clamp small W in nir_lower_viewport_transform
Because we are doing perspective division before clipping, small
gl_Position.w values will give Inf for positions and interpolated
varyings. Before this change, primitives containing a vertex with w=0
were invisible.

This is only used in panfrost and lima.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32084>
2024-11-24 17:25:14 +00:00
Tapani Pälli
19b6991160 anv/android: always create 2 graphics and compute capable queues
Android hwui requires 2 queues.

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32279>
2024-11-24 16:39:33 +00:00
Alyssa Rosenzweig
430fa29953 asahi: refmt
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
0755b6d3d5 asahi: add XML for cdm stream link with return
I don't know of any case of Apple's driver using this, but it seems to work. The
stream link bit is identical to VDM so that was easy, the tricky part was the
return but I bruteforced the encoding space and this is the (only) thing that
worked. So add the XML.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
ebdca6344e asahi/genxml: define missing macros
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
e01dc7a588 asahi/genxml: optimize out masking with shr
noticed in the agx asm.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
6a1a3dac21 asahi/genxml: fix 128-bit in CL path
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
a34b3ecb75 asahi/genxml: fix 0 encoding for groups
this was breaking launch word merging.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
65cc99a916 libagx: don't export vertex_id_for_top
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
810971532f libagx: fix return type
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
6dea5f49f2 hk: dce
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
33b41e029a hk: add cmd buffer to hk_cs
convenient.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
7609a974a3 hk: use common wg size
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:09 +00:00
Alyssa Rosenzweig
7c921fa0d7 hk: do not increment GS queries for passthru GS
fixes vkd3d-proton test_query_pipeline_statistics

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:08 +00:00
Alyssa Rosenzweig
6b8d4cca7e hk: be robust against invalid MSAA inputs
fixes vkd3d-proton test_multisample_rendering

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:08 +00:00
Alyssa Rosenzweig
1f7598c202 hk: implement EXT_depth_bias_control
fixes Z24 depth bias with Proton

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:08 +00:00
Alyssa Rosenzweig
94160615ef hk: handle mismatching colour vs z/s dimensions
hw doesn't really care. fixes test_mismatching_rtv_dsv_size

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:08 +00:00
Alyssa Rosenzweig
628a119d82 hk: expose missing eds3 feature
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:08 +00:00
Alyssa Rosenzweig
0caf6e0440 hk: generalize internal launch
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:08 +00:00
Alyssa Rosenzweig
aae0c1d5a8 asahi,hk: reenable rgb32 buffer textures
Apparently Direct3D has this. Boo :'(

This reverts commit 049808630e.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:08 +00:00
Alyssa Rosenzweig
02d4f49bcd agx: gather workgroup size
deduplicate this.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32320>
2024-11-24 13:06:08 +00:00