- Split encode and group mappings to allow the former to be re-used.
- Add custom zero value mapping for bitset enums.
- Enable optional enum mapping for ref mods (previously just op mods).
- Commonize nop/nop.end.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33998>
The conversion from bit value to register file type is now already done
by brw_eu_inst_3src_a1_dst_reg_file in the FFC macro, so doing it
again produced incorrect results.
Fixes: e7179232 ("intel/brw: Move encoding of Gfx11 3-src inside the inst helpers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13141
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35960>
In some cases the hardware supports a format only in a more limited
way, for example formats with NPoT pixel sizes. A driver might
normally prefer that mesa/st use R8G8B8X8 rather than R8G8B8. But if
the user wants to import R8G8B8 (via dma-buf, etc.), that is still
possible, and in this case zero copy is more important.
So add a PIPE_BIND_x flag as a hint to the driver when checking if
a format is supported.
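
As a rough illustration of the idea (not the actual gallium interface), the
sketch below models a format query that only accepts an NPoT format when the
caller passes an import hint. The enum, the flag names and
format_supported() are hypothetical stand-ins; BIND_IMPORT_HINT merely plays
the role of the new PIPE_BIND_x flag.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, not the real pipe_format / PIPE_BIND_* values. */
enum fmt { FMT_R8G8B8, FMT_R8G8B8X8 };

#define BIND_SAMPLER_VIEW (1u << 0)
#define BIND_IMPORT_HINT  (1u << 1) /* plays the role of the new hint flag */

/* Toy driver query: the 24-bit format is only reported as supported when
 * the caller signals an import, where zero copy matters more than the
 * driver's usual preference for the padded 32-bit format. */
static bool format_supported(enum fmt f, unsigned bind)
{
   if (f == FMT_R8G8B8)
      return (bind & BIND_IMPORT_HINT) != 0;

   return true;
}
```

With this model, a plain sampler-view query for R8G8B8 is rejected, so the
state tracker falls back to R8G8B8X8, while the same query with the hint set
succeeds.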
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35982>
This enables the rendering of RGB/BGR 24-bit format buffers directly
onto the framebuffer. For RGB888, support already exists for vertex and
texture formats, so render buffer format support has been added. For
BGR888, support for vertex, texture, and render buffer formats has been
added. The internal format chosen for both RGB888 and BGR888 is GL_RGB8.
Change-Id: I0557389dba05d3b44d7b935f02683df17e41fbd2
Signed-off-by: Petar G. Georgiev <quic_petarg@quicinc.com>
Signed-off-by: Lakshman Chandu Kondreddy <quic_lkondred@quicinc.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35982>
The memory footprint of the table has gotten quite out of hand (>3GB in
Control DX12). This patch brings that number down to around 3MB.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35959>
The fmul+fadd -> fma rules in nir_opt_algebraic are marked imprecise,
because they are a contraction. However, they respect signed zero/Inf/NaN rules.
As such, it is legal to do this fusion with shader float controls as long as the
exact bit is not set (mapping to SPIR-V NoContract).
Unfortunately, NIR's imprecise rules do not distinguish between contraction
issues versus float special case issues, forcing nir_search to skip all
imprecise rules when any shader float control modes are used. This notably
affects DXVK, which sets shader float controls to get D3D11 float behaviour and
hence loses FMA fusing.
Therefore, we plumb in the exact bit to express NoContract independent of the
float controls, and weaken the requirement for fma fusion to allowable
contraction. For fma splitting, it's a similar issue, as inexact GLSL fma in
SPIR-V is just a multiply add that we're allowed to contract rather than the
real deal.
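
To see why contraction has to be opt-out, here is a small standalone sketch
(plain C, not NIR code) in which the fused and unfused results differ. The
function name and the "exact" parameter are illustrative assumptions, and the
fused path is emulated with a double intermediate rather than a hardware FMA;
for these inputs the double arithmetic is exact, so it matches what a real
single-rounding FMA would return.

```c
#include <stdbool.h>

/* Evaluates a * b + c in float.
 *
 * exact == true:  the product is rounded to float before the add, as a
 *                 NoContract expression requires.
 * exact == false: contraction is allowed, so the product is kept unrounded
 *                 (emulated here with a double intermediate; a float * float
 *                 product always fits in a double exactly).
 */
static float eval_mul_add(float a, float b, float c, bool exact)
{
   if (exact) {
      /* volatile forces the intermediate rounding even if the compiler
       * would otherwise contract the expression itself. */
      volatile float prod = a * b;
      return prod + c;
   }

   return (float)((double)a * (double)b + (double)c);
}
```

For a = b = 1 + 2^-13 and c = -(1 + 2^-12), the rounded product is exactly
1 + 2^-12, so the unfused result is 0, while the fused result keeps the
2^-26 term that rounding would have discarded.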
Drivers that use their own FMA fusing passes (notably, Intel and AMD) are
unaffected, but DXVK-capable drivers using fuse_ffma should like this. Results
on hk shown:
Totals from 2194 (4.06% of 54019) affected shaders:
MaxWaves: 2174272 -> 2175936 (+0.08%); split: +0.08%, -0.01%
Instrs: 1173283 -> 1131494 (-3.56%); split: -3.57%, +0.01%
CodeSize: 8568168 -> 8381724 (-2.18%); split: -2.18%, +0.01%
Spills: 1094 -> 747 (-31.72%)
Fills: 988 -> 681 (-31.07%)
Scratch: 4444 -> 3820 (-14.04%)
ALU: 953032 -> 913149 (-4.18%); split: -4.19%, +0.01%
FSCIB: 953032 -> 913149 (-4.18%); split: -4.19%, +0.01%
IC: 215398 -> 215274 (-0.06%)
GPRs: 139865 -> 139032 (-0.60%); split: -1.56%, +0.96%
Uniforms: 414886 -> 414466 (-0.10%); split: -0.14%, +0.04%
Preamble instrs: 646398 -> 644017 (-0.37%); split: -0.43%, +0.07%
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35989>
Similar to nir_lower_alu_width(), the callback can return the
desired number of components for a phi, or 0 for no lowering.
The previous behavior of nir_lower_phis_to_scalar() with lower_all=true
can be elicited via nir_lower_all_phis_to_scalar() while the previous
behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar()
with NULL callback.
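
The callback contract described above can be sketched with stand-in types.
Everything below (struct phi_info, phi_width_cb, the helper names) is
hypothetical and only models the "return the desired width, or 0 for no
lowering" convention; it is not the real NIR API, and the NULL-callback
heuristic is deliberately not modeled.

```c
#include <stdint.h>

/* Hypothetical stand-in for a phi; not the real nir_phi_instr. */
struct phi_info {
   uint8_t num_components;
   uint8_t bit_size;
};

/* Returns the desired number of components, or 0 for no lowering. */
typedef uint8_t (*phi_width_cb)(const struct phi_info *phi, const void *data);

/* Example policy: split 16-bit phis into vec2s, scalarize 64-bit phis,
 * leave everything else alone. */
static uint8_t example_width_cb(const struct phi_info *phi, const void *data)
{
   (void)data;
   if (phi->bit_size == 16 && phi->num_components > 2)
      return 2;
   if (phi->bit_size == 64 && phi->num_components > 1)
      return 1;
   return 0;
}

/* Policy that scalarizes every phi, similar in spirit to lower_all=true. */
static uint8_t scalarize_all_cb(const struct phi_info *phi, const void *data)
{
   (void)phi;
   (void)data;
   return 1;
}

/* Number of phis that replace the original one under a given policy.
 * cb must be non-NULL in this sketch. */
static unsigned lowered_phi_count(const struct phi_info *phi, phi_width_cb cb,
                                  const void *data)
{
   uint8_t width = cb(phi, data);

   if (width == 0 || width >= phi->num_components)
      return 1; /* left untouched */

   return (phi->num_components + width - 1u) / width;
}
```

Under the example policy a 16-bit vec4 phi becomes two vec2 phis, a 64-bit
vec3 phi becomes three scalar phis, and a 32-bit vec4 phi is left as is.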
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
This check causes unnecessary overhead and can be replaced by simply
checking whether a phi_src is from a loop continue block.
Except for rare edge cases, the result will be the same.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
With the check for the cap gone, no driver will expose
ARB_shader_clock. Unfortunately the CI doesn't catch this
because it doesn't provide expectations for whether a test should
pass or be skipped. In this case
spec@arb_shader_clock@execution@clock
spec@arb_shader_clock@execution@clock2x32
went from pass to skip (tested on r600; on radeonsi one can
also see that the ARB_shader_clock extension is no longer
available).
Adding the test for the cap back in fixes this.
Fixes: 2ce201707e (Add support for EXT_shader_clock)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35992>
Add a new field userq_num_hqds to drm_amdgpu_info_hw_ip to expose the
number of available hardware queue descriptors (HQDs) for user queues.
This allows userspace to query the maximum number of user queues that
can be created for a particular IP block.
Link to the driver-side patch:
https://lists.freedesktop.org/archives/amd-gfx/2025-June/126686.html
v2: we should also put userq_num_hqds into radeon_info and
print it where other fields are printed. (Marek Olšák)
v3: rename num_userqs to num_queue_slots
and add print log in ac_print_gpu_info. (Marek Olšák)
v4: rename userq_num_hqds to userq_num_slots in hw_ip_info,
and update the hw information (Marek Olšák)
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35850>