When slot has no param offset, we should not emit store output for
them on gfx11.
Fixes: abe2e99e9e ("ac/nir/ngg: gs support 16bit outputs")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20350>
Follow the spec, all outputs even not this stream need to be
reset after emit vertex.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20350>
Because we called nir_lower_io_to_temporaries which ensure the
store output and emit vertex in the same block.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20350>
This is for VARYING_SLOT_VARx_16BIT slots varying streamout.
OpenGL ES will store 16bit medium precision varying to these slots.
Vulkan is not allowed to streamout varying less than 32bit.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20157>
GFX11 can have more than 4 SEs. I think it would be better to allocate
an array but that's for later.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20337>
A bit count of es_accepted works for both when ngg is and isn't
dynamically enabled. Unlike the other sequence, this should only be a
single SALU instruction.
fossil-db (gfx1100, nggc):
Totals from 41388 (30.75% of 134574) affected shaders:
Instrs: 25783544 -> 25432959 (-1.36%); split: -1.36%, +0.00%
CodeSize: 127281160 -> 125878820 (-1.10%); split: -1.10%, +0.00%
Latency: 92849566 -> 92723047 (-0.14%); split: -0.14%, +0.00%
InvThroughput: 9542194 -> 9485012 (-0.60%); split: -0.60%, +0.00%
Copies: 2031074 -> 1928796 (-5.04%); split: -5.04%, +0.00%
Branches: 642407 -> 642409 (+0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20321>
No shader-db or fossil-db changes on any Intel platform.
v2: Add missed i2b1 in ir3_nir_opt_preamble.c.
v3: Add missed i2b1 in ac_nir_lower_ngg.c.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
For legacy (non-NGG) GS to lower outputs to memory stores and add
shader query when required.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20158>
To be shared by NGG GS and legacy GS. Legacy GS need this when
GFX10 which mix use NGG and legacy GS. For example when streamout
is enabled, it uses legacy GS, otherwise uses NGG GS. So legacy
GS also need to update query emulation which is a sum of NGG and
legacy GS results.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20158>
Before this commit each stream will emit a query block, now
we merge them to a single block.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20074>
More precise type info, can be used for 16bit output streamout
to convert 16bit int/uint/float to 32bit one later.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19697>
Max slot number for 16bit output is 16, so no need to use
64 array size for them.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19697>
pipe_stream_output_info is built from nir_xfb_info, why not just use
nir_xfb_info directly.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20015>
Device memory scope is necessary because we need to ensure there is
always a waitcnt_vscnt instruction in order to avoid a race condition
between payload stores and their loads after mesh shaders launch.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19967>
radeonsi use packed location base while radv use un-packed location.
So we adjust instance_rate_inputs in each driver to hide the difference.
Note the attribute slot number is less than 16, so we can shift
instance_rate_inputs in radv by VERT_ATTRIB_GENERIC0 which is 16.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19868>
Add negative value is possible to wrap around. I haven't seen this
"nuw" causes any problem yet, but let's remove it for safe.
Fixes: 60ac5dda82 ("ac: Add NIR lowering for NGG GS.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19718>
We should not use "nuw" here as negative add positive may wrap
around (negative is 0xffffff??).
This problem can be observed with LLVM15 (I can't see when LLVM14):
%.neg = mul nsw i32 %31, -4
%163 = add nuw nsw i32 %.neg, 16
%164 = lshr i32 257, %.neg
%165 = lshr i32 %164, %163
LLVM just assume %.neg is possitive, so pre-shift 0x01010101 by 16.
This get wrong value because we can't get back the shifted bits with
a negative shift right.
Fixes: 75dbb40439 ("ac/nir: Remove byte permute from prefix sum of the repack sequence.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19718>
Outputs are always considered 32-bits.
Found by inspection.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19612>
XFB queries need to be enabled with NGG streamout and VS/TES.
Previously, the NGG lowering code relied on has_prim_query for XFB.
This fixes failures with RADV_PERFTEST=ngg_streamout on GFX10.3 with
the vkd3d-proton testsuite. Vulkan CTS is missing TES tests with XFB
queries apparently.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19493>
We need to pass the family to register parsing functions.
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19419>
radeonsi assume IO has been scalarized, this simplify the code
and radeonsi implementation.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19455>
Given that we no longer need the slot_to_register mapping, it's
useless to use this function.
This also fixes a bunch of failures with
dEQP-VK.transform_feedback.*omit_write* on RADV because in Vulkan
the spec requires XFB query counters to be incremented even if XFB
outputs aren't written to.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19437>