For last VGT stage to export parameter outputs.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691>
Used by last VGT stage to export position related outputs.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691>
Mesh shader primitive export is left unchanged because it needs
extra changes for per primitive output export when export
primitive.
Mesh shader will use second channel of primitive export.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691>
The face culling algorithm should have been disabled for
conservative overestimation because it already
(mistakenly) removed some close-to-zero area triangles.
Now that the driver disables it in that case,
let's always remove zero-area triangles.
This only costs +2 SALU instructions.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20987>
This is so that we can tell whether the current kernel
has the PCIe bandwidth info available or not.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20842>
The si_create_shadowing_ib_preamble() function can be reused from radv also.
Hence it is moved.
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18301>
This will be used by SQTT to implement a workaround on GFX11
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20529>
This allows ACO to combine the addition into the store without checking
for wraparound.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20296>
Instead let each driver call it.
radeonsi ignores the error because it doesn't require correct
pci-bus info to work properly.
radv keeps the existing behavior and fails if the pci-bus infos
is missing.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20645>
And add a validity flag because there's no way to
tell if they're valid, unless for the caller of
drmGetDevice2.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20645>
Gather outputs in advance to save both output data and type. Output data
is used for streamout and gfx11 param export. Output type is used for
streamout latter.
The output info will also be used for nir vertex export in the future.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Singed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20350>
When slot has no param offset, we should not emit store output for
them on gfx11.
Fixes: abe2e99e9e ("ac/nir/ngg: gs support 16bit outputs")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20350>
Follow the spec, all outputs even not this stream need to be
reset after emit vertex.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20350>
Because we called nir_lower_io_to_temporaries which ensure the
store output and emit vertex in the same block.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20350>
This is for VARYING_SLOT_VARx_16BIT slots varying streamout.
OpenGL ES will store 16bit medium precision varying to these slots.
Vulkan is not allowed to streamout varying less than 32bit.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20157>
GFX11 can have more than 4 SEs. I think it would be better to allocate
an array but that's for later.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20337>
A bit count of es_accepted works for both when ngg is and isn't
dynamically enabled. Unlike the other sequence, this should only be a
single SALU instruction.
fossil-db (gfx1100, nggc):
Totals from 41388 (30.75% of 134574) affected shaders:
Instrs: 25783544 -> 25432959 (-1.36%); split: -1.36%, +0.00%
CodeSize: 127281160 -> 125878820 (-1.10%); split: -1.10%, +0.00%
Latency: 92849566 -> 92723047 (-0.14%); split: -0.14%, +0.00%
InvThroughput: 9542194 -> 9485012 (-0.60%); split: -0.60%, +0.00%
Copies: 2031074 -> 1928796 (-5.04%); split: -5.04%, +0.00%
Branches: 642407 -> 642409 (+0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20321>
No shader-db or fossil-db changes on any Intel platform.
v2: Add missed i2b1 in ir3_nir_opt_preamble.c.
v3: Add missed i2b1 in ac_nir_lower_ngg.c.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
For legacy (non-NGG) GS to lower outputs to memory stores and add
shader query when required.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20158>
To be shared by NGG GS and legacy GS. Legacy GS need this when
GFX10 which mix use NGG and legacy GS. For example when streamout
is enabled, it uses legacy GS, otherwise uses NGG GS. So legacy
GS also need to update query emulation which is a sum of NGG and
legacy GS results.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20158>