Implement offset lowering by computing the appropriate LOD from
gradients and adjusting coordinates accordingly.
Passes dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.* on GC7000.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35753>
Implement offset lowering by using the explicit LOD value with nearest-integer
rounding (floor(lod + 0.5)) and reusing the coordinate calculation helper.
Passes dEQP-GLES3.functional.shaders.texture_functions.texturelodoffset.* on GC7000.
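A minimal sketch of the rounding and coordinate adjustment described above, assuming the integer texel offset is normalized by the size of the selected mip level. All helper names, the clamping policy, and the max_level parameter are hypothetical and are not taken from the etnaviv sources.

```c
/* Round an explicit LOD to the nearest mip level, i.e. floor(lod + 0.5).
 * The LOD is clamped to [0, max_level] first (an assumption of this sketch),
 * so the truncating float-to-int cast matches floor() for the non-negative
 * values that remain. */
static int
round_lod_to_level(float lod, int max_level)
{
   if (lod < 0.0f)
      lod = 0.0f;
   if (lod > (float)max_level)
      lod = (float)max_level;
   return (int)(lod + 0.5f);
}

/* Size of one dimension at a given mip level, never smaller than 1 texel.
 * Assumes 0 <= level < 31 so the shift is well defined. */
static int
mip_size(int base_size, int level)
{
   int size = base_size >> level;
   return size > 0 ? size : 1;
}

/* Fold an integer texel offset into a normalized coordinate by dividing
 * the offset by the size of the selected mip level. */
static float
apply_texel_offset(float coord, int offset, int base_size, int level)
{
   return coord + (float)offset / (float)mip_size(base_size, level);
}
```

For a 256-texel-wide texture sampled at LOD 1.5, the level rounds to 2 (64 texels wide), so an offset of one texel shifts the coordinate by 1/64.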
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35753>
Implement offset lowering by calculating the implicit LOD from coordinate derivatives (ddx/ddy)
and applying floating-point adjustments that match the binary blob's behaviour.
Adds helper functions for coordinate calculation and LOD clamping that will be
reused by subsequent offset lowering passes.
Passes dEQP-GLES3.functional.shaders.texture_functions.textureoffset.* without explicit bias on GC7000.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35753>
Will be used by etnaviv too.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35753>
Passes dEQP-GLES3.functional.shaders.texture_functions.textureoffset.* with explicit bias on GC7000.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35753>
LDS sizes and offsets from LLVM are no longer used.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
This will enable the removal of a large amount of code.
shader->config.lds_size is now always computed the same way as for ACO,
except for compute shaders.
We have to add a new 8-bit user SGPR bitfield called
GS_STATE_GS_OUT_LDS_OFFSET_256B, which contains the offset
that was previously set by the relocation.
Since the offset must be a multiple of 256, we have to add padding
to the LDS size computation to make sure the alignment to 256 for the ESGS
LDS size doesn't cause us to exceed the maximum LDS size.
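The alignment constraint can be sketched as follows. This assumes the GS output region is placed directly after the ESGS region in LDS; the function names, parameters, and the max_lds_size limit are hypothetical and only illustrate why the size computation needs padding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round a byte size up to the next multiple of 256. */
static uint32_t
align_up_256(uint32_t size)
{
   return (size + 255u) & ~255u;
}

/* Hypothetical sketch: compute the value for an 8-bit field that stores the
 * GS output LDS offset in units of 256 bytes. Returns false if the offset
 * does not fit in 8 bits or if the aligned layout exceeds max_lds_size. */
static bool
encode_gs_out_lds_offset(uint32_t esgs_lds_size, uint32_t gs_out_lds_size,
                         uint32_t max_lds_size, uint8_t *field)
{
   /* Aligning the ESGS size up to 256 adds up to 255 bytes of padding. */
   uint32_t gs_out_offset = align_up_256(esgs_lds_size);
   uint32_t units = gs_out_offset / 256u;

   if (units > 0xffu)
      return false;
   if (gs_out_offset + gs_out_lds_size > max_lds_size)
      return false;

   *field = (uint8_t)units;
   return true;
}
```

With a 64 KiB limit, an ESGS size of 300 bytes plus a GS output size of 65236 bytes sums to exactly 65536, but the aligned layout needs 512 + 65236 bytes and no longer fits. Accounting for the padding up front avoids that case.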
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
It has been implemented and works for PS outputs already.
The lowering callback needs 2 variants because we can't access
pipe_screen from it. The callback is rewritten to be more general.
We also need to do nir_clear_mediump_io_flag for any outputs we don't
lower because the mediump flag might prevent optimizations if it's not
cleared.
v2: fix si_nir_optim
Acked-by: Timur Kristóf <timur.kristof@gmail.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
outputs_written_before_ps is used to determine kill_outputs, which removes
param exports, but non-zero GS streams are xfb-only and not exported.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
The result of that function was overwritten by other code, so just remove it.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
It eliminates no-op (>= 0) clip/cull distance output components by setting
no_sysval_output = true.
We have to gather clip/cull distances manually to get reduced clip/cull
masks.
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
If there are more than 6 planes without gl_ClipVertex and gl_ClipDistance,
add "gl_ClipVertex = gl_Position;" to support up to 8 planes.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
so that it can be different between shader variants
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
Remove redundant lines in si_perfetto.cpp.
Convert the byte offset offset_B of the buffer's gpu_address into a uint64_t
pointer offset when si_buffer_map is called to read the timestamp.
Destroy sctx->trace by calling u_trace_fini in si_utrace_fini, which
is called from si_destroy_context.
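The byte-to-element offset conversion can be illustrated with the sketch below. The helper name is hypothetical and the mapping is represented by a plain pointer; it assumes offset_B is a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 64-bit timestamp stored at byte offset
 * offset_B inside a mapped buffer. Indexing a uint64_t pointer advances in
 * 8-byte steps, so the byte offset must be divided by sizeof(uint64_t). */
static uint64_t
read_timestamp(const void *mapped, uint64_t offset_B)
{
   const uint64_t *ts = (const uint64_t *)mapped;
   return ts[offset_B / sizeof(uint64_t)];
}
```

Using offset_B directly as the index would read eight times too far into the buffer.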
Signed-off-by: Julia Zhang <julia.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36066>
This increases primitive throughput when packing reduces the number
of pos exports due to holes in clip and cull distance arrays that could be
punched out by nir_opt_clip_cull_const. This applies to all chips.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
This increases primitive throughput for all hw with NGG if the shader
culls and the removal of cull distances reduces the number of position
exports.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
It will be different between NGG and legacy because NGG with culling
will not export cull distances.
The number of position exports could also be gathered from final NIR
to reduce logic duplication.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
It will be different between NGG and legacy because NGG with culling
will not export cull distances.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
This is very useful for increasing raw primitive throughput for GS
(mostly just RDNA 2), increasing raw primitive throughput with clip
and cull distance outputs when they actually cull anything (RDNA 1-4),
and reducing attribute store bandwidth usage (RDNA 3-4).
It will also replace fixed-func culling against cull distances when
culling in the shader is enabled, which will increase primitive throughput
even further.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
It receives all shaders and decides how to link them.
When culling is enabled for GS, we will need ES, GS, and FS in this
function at the same time.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>