This helps by reducing the number of branches with their corresponding
delay slots, at the expense of additional register pressure. It also helps
a lot with SFU stalls, probably because removing control-flow blocks
gives us more QPU scheduling flexibility to hide them.
Shader-db results below correspond to the "closed shaders" set. The
full set is heavily dominated by the massive (positive) impact this
change has on Skia's shaders, so the closed set is probably more
representative of the real impact:
total instructions in shared programs: 11887255 -> 11854898 (-0.27%)
instructions in affected programs: 538170 -> 505813 (-6.01%)
helped: 1653
HURT: 43
Instructions are helped.
total threads in shared programs: 385924 -> 385872 (-0.01%)
threads in affected programs: 236 -> 184 (-22.03%)
helped: 22
HURT: 48
Inconclusive result (%-change mean confidence interval includes 0).
total uniforms in shared programs: 3552808 -> 3547894 (-0.14%)
uniforms in affected programs: 157486 -> 152572 (-3.12%)
helped: 1673
HURT: 35
Uniforms are helped.
total max-temps in shared programs: 2062403 -> 2064720 (0.11%)
max-temps in affected programs: 18209 -> 20526 (12.72%)
helped: 168
HURT: 369
Max-temps are HURT.
total spills in shared programs: 1937 -> 1994 (2.94%)
spills in affected programs: 79 -> 136 (72.15%)
helped: 0
HURT: 1
total fills in shared programs: 2652 -> 2717 (2.45%)
fills in affected programs: 115 -> 180 (56.52%)
helped: 0
HURT: 1
total sfu-stalls in shared programs: 19349 -> 18010 (-6.92%)
sfu-stalls in affected programs: 2321 -> 982 (-57.69%)
helped: 674
HURT: 74
Sfu-stalls are helped.
total inst-and-stalls in shared programs: 11906604 -> 11872908 (-0.28%)
inst-and-stalls in affected programs: 541339 -> 507643 (-6.22%)
helped: 1656
HURT: 43
Inst-and-stalls are helped.
total nops in shared programs: 245740 -> 238085 (-3.12%)
nops in affected programs: 19282 -> 11627 (-39.70%)
helped: 1335
HURT: 76
Nops are helped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22922>
Similar to SpvDecorationRestrict, looks like it's also incorrectly
generated by glslang.
This will allow RADV/CI to leave MESA_SPIRV_LOG_LEVEL at its default
(i.e. only warnings).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Martin Roukala <martin.roukala@mupuf.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22917>
Resolves ambient occlusion rendering in Replicant.
Resolves grass and ocean animations in Automata, and maybe more.
Both of these games have shaders that expect trig values to work across
large ranges with good precision.
Closes: #7656
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22894>
Ensure the render target values are in the proper range.
This fixes `spec@!opengl 3.0@render-integer`.
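For illustration only, a minimal sketch of this kind of range clamping.
It assumes "proper range" means the representable range of an integer
render target with a given number of bits per channel; the helper names
are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a signed 32-bit value to the range of a signed integer channel
 * that is `bits` wide (assumed to be in 1..32). */
static int32_t clamp_to_sint_bits(int32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    const int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    const int64_t min = -max - 1;

    if (value > max)
        return (int32_t)max;
    if (value < min)
        return (int32_t)min;
    return value;
}

/* Clamp an unsigned 32-bit value to the range of an unsigned integer
 * channel that is `bits` wide (assumed to be in 1..32). */
static uint32_t clamp_to_uint_bits(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    const uint64_t max = ((uint64_t)1 << bits) - 1;

    return value > max ? (uint32_t)max : value;
}
```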
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22733>
Some of the new features require at least V3D 4.2. And actually, 4.2 is
the version used by the Raspberry Pi 4 hardware.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22733>
Instead of hardcoding conditionals to select which hardware-based
version of a function to call, wrap the calls in a macro.
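As a rough sketch of the idea, and not the macro used in the driver:
the names below (struct devinfo, emit_state, DISPATCH_X) are
hypothetical, and the sketch assumes the per-version functions share a
name that differs only in a v33_/v42_ prefix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device description; only the version field matters here. */
struct devinfo {
    int ver; /* e.g. 33 for V3D 3.3, 42 for V3D 4.2 */
};

/* Hypothetical per-version implementations of the same operation. */
static int v33_emit_state(int x) { return x + 33; }
static int v42_emit_state(int x) { return x + 42; }

/* Pick the per-version variant of `func` from the device version, so
 * callers do not repeat the conditional at every call site. */
#define DISPATCH_X(devinfo, func) \
    ((devinfo)->ver >= 42 ? v42_##func : v33_##func)

static int emit_state(const struct devinfo *devinfo, int x)
{
    assert(devinfo != NULL);
    return DISPATCH_X(devinfo, emit_state)(x);
}
```

A call site then reads DISPATCH_X(devinfo, emit_state)(x), with no
explicit version check.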
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22733>
Since we are disabling mesh, which has issues with gpl, enable gpl by
default now, leaving the renamed environment variable as a way to
disable it for debug purposes.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22910>
We are seeing frequent hangs in other workloads when something using
mesh shaders runs at the same time, so gate the feature behind an
environment variable until we figure out what's going on.
v2: (Sagar)
- Give the mesh enabled variable a more descriptive name
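A minimal sketch of gating a feature behind an environment variable,
for illustration. The helper env_flag() and the variable name used in
the usage note are hypothetical; the driver has its own option helpers,
which are not reproduced here.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Return the boolean value of environment variable `name`, or
 * `default_value` when it is unset or empty. "0" and "false" disable
 * the feature; any other value enables it. */
static bool env_flag(const char *name, bool default_value)
{
    const char *s = getenv(name);

    if (s == NULL || *s == '\0')
        return default_value;
    return strcmp(s, "0") != 0 && strcmp(s, "false") != 0;
}
```

With this shape, env_flag("EXAMPLE_MESH_ENABLE", false) keeps the
feature off unless the (hypothetical) variable is set by the user.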
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22910>
The 'struct drm_amdgpu_cs_chunk_fence' is processed as
'struct drm_amdgpu_cs_chunk_data' which is a union.
This change ensures the proper alignment for this structure
to be processed as 'struct drm_amdgpu_cs_chunk_data'.
The presence of __u64 as one member of
'struct drm_amdgpu_cs_chunk_data' makes the
whole structure expected to be 64-bit aligned.
This is a minor issue detected by the gcc sanitizer (ubsan), for
instance in the libdrm library:
../amdgpu/amdgpu_cs.c:937:26: runtime error: member access within misaligned address 0x63100001484c for type 'struct drm_amdgpu_cs_chunk_data', which requires 8 byte alignment
0x63100001484c: note: pointer points here
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
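The alignment requirement can be sketched as follows. The structures
are simplified, hypothetical stand-ins (not the kernel definitions),
and alignas is only one possible way to request the alignment; the
point is that the 64-bit member raises the alignment of the union, so a
fence structure that is later accessed through the union has to be
placed accordingly.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structures (hypothetical layout). */
struct chunk_fence {
    uint32_t handle;
    uint32_t offset;
};

union chunk_data {
    struct chunk_fence fence;
    uint64_t as_u64; /* the 64-bit member raises the union's alignment */
};

/* Without help, `fence` only needs 4-byte alignment and typically
 * lands at offset 4, which is misaligned for the union. */
struct request_unaligned {
    uint32_t tag;
    struct chunk_fence fence;
};

/* Requesting the union's alignment keeps accesses through
 * `union chunk_data` properly aligned. */
struct request_aligned {
    uint32_t tag;
    alignas(union chunk_data) struct chunk_fence fence;
};

static_assert(alignof(union chunk_data) >= alignof(uint64_t),
              "union must be at least as aligned as its 64-bit member");
static_assert(offsetof(struct request_aligned, fence) %
              alignof(union chunk_data) == 0,
              "fence must sit on a union-aligned offset");

/* True if `p` may be accessed as a `union chunk_data`. */
static bool aligned_for_chunk_data(const void *p)
{
    return (uintptr_t)p % alignof(union chunk_data) == 0;
}
```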
Fixes: ae7e4d7619 ("amd: rename ring_type --> amd_ip_type and match the kernel enum values")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22920>
Similar to what was done for buffer allocation, but now for userptr bos.
There are no changes for i915 modes, but Xe may use different values in
the future.
While at it, also set bo->real.heap to IRIS_HEAP_SYSTEM_MEMORY
explicitly. It was already set implicitly, as IRIS_HEAP_SYSTEM_MEMORY
is the value 0 of the enum.
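To illustrate the implicit versus explicit initialization, a small
sketch with hypothetical names (enum heap, struct bo and
bo_create_userptr are stand-ins, not the iris types). It assumes the
object comes from zeroed memory, which is why the first enumerator was
already the effective value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical heap enum; the first enumerator has the value 0. */
enum heap {
    HEAP_SYSTEM_MEMORY,
    HEAP_DEVICE_LOCAL,
};

struct bo {
    enum heap heap;
    size_t size;
};

static struct bo *bo_create_userptr(size_t size)
{
    /* calloc() zeroes the object, so bo->heap already compares equal
     * to HEAP_SYSTEM_MEMORY at this point. */
    struct bo *bo = calloc(1, sizeof(*bo));

    if (bo == NULL)
        return NULL;

    /* Setting it explicitly documents the intent and keeps working if
     * the enum is ever reordered. */
    bo->heap = HEAP_SYSTEM_MEMORY;
    bo->size = size;
    return bo;
}
```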
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22240>
The i915 and Xe KMDs can have different mmap modes, so extract the code
that handles them into a function.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22240>
if a sampler is never used (no derefs) then its binding will never be
applied here, leaving it with binding=0. this will clobber the real binding=0
sampler in driver backends, leading to errors, so try to iterate using
the same criteria as above and apply bindings in the same way
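a rough sketch of the idea, with hypothetical types and a simple
sequential binding policy assumed purely for illustration (the real
pass and its criteria are not reproduced here): bindings are assigned
by walking every sampler variable, not only the ones that have derefs,
so an unused sampler can't keep a stale binding=0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified shader variable. */
struct var {
    bool is_sampler;
    bool has_derefs; /* whether the shader actually uses it */
    unsigned binding;
};

/* Assign sequential bindings to every sampler, used or not, so that no
 * two samplers end up sharing binding 0. Returns the number of
 * bindings handed out. */
static unsigned assign_sampler_bindings(struct var *vars, size_t count)
{
    unsigned next = 0;

    assert(vars != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (!vars[i].is_sampler)
            continue;
        /* has_derefs is deliberately not checked here. */
        vars[i].binding = next++;
    }
    return next;
}
```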
fixes #8974
cc: mesa-stable
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22902>