Commit graph

7384 commits

Author SHA1 Message Date
Rhys Perry
aebffc241d aco: don't use nir_block_is_unreachable()
nir_cf_reinsert() can re-create the block, invalidating dominance
metadata.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9808>
2021-06-04 14:14:00 +00:00
Mauro Rossi
e4e4b6bc16 android: aco: add aco_optimizer_postRA.cpp to Makefile.sources
Fixes the following building error:

external/mesa/src/amd/compiler/aco_interface.cpp:155: error: undefined reference to 'aco::optimize_postRA(aco::Program*)'

Fixes: 0e4747d3fb ("aco: Introduce a new, post-RA optimizer.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11177>
2021-06-04 09:31:41 +02:00
Mauro Rossi
60e134e83e android: ac: add include src/util path
Fixes the following building error:

external/mesa/src/amd/common/ac_nir_lower_ngg.c:27:10: fatal error: 'u_math.h' file not found
         ^~~~~~~~~~
1 error generated.

Fixes: 3d589b8b46 ("ac: Add new NIR pass to lower NGG VS/TES.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11177>
2021-06-04 09:31:36 +02:00
Mauro Rossi
2dea82fc07 android: ac: add ac_nir_lower_ngg.c to Makefile.sources
Fixes the following building errors:

external/mesa/src/amd/vulkan/radv_shader.c:868: error: undefined reference to 'ac_nir_lower_ngg_gs'
external/mesa/src/amd/vulkan/radv_shader.c:851: error: undefined reference to 'ac_nir_lower_ngg_nogs'
external/mesa/src/amd/compiler/aco_interface.cpp:155: error: undefined reference to 'aco::optimize_postRA(aco::Program*)'

Fixes: 3d589b8b46 ("ac: Add new NIR pass to lower NGG VS/TES.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11177>
2021-06-04 09:31:28 +02:00
Samuel Pitoiset
aff92f50c6 ac: add ac_thread_trace::data
Instead of passing two different structs to ac_dump_rgp_capture().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11156>
2021-06-03 15:39:34 +00:00
Samuel Pitoiset
416496a0c4 ac/rgp: fix ac_fill_sqtt_asic_info() name
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11156>
2021-06-03 15:39:34 +00:00
Samuel Pitoiset
ea3f72c9d9 ac: rename ac_dump_thread_trace() to ac_dump_rgp_capture()
RGP captures can contain both SQTT and SPM data. While we are at it,
move it to ac_rgp.h and adjust a message.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11156>
2021-06-03 15:39:34 +00:00
Chia-I Wu
7ebd658e28 radv: use vk_default_allocator
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11117>
2021-06-03 08:13:26 +00:00
Samuel Pitoiset
380ac28891 ac: import performance counters from RadeonSI
Performance counters will be used by RADV for VK_KHR_performance_query
and also for adding SPM support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11140>
2021-06-03 07:15:21 +00:00
Rhys Perry
903f814b78 aco: don't create 4 and 5 dword NSA instructions on GFX10
"stability issues", apparently: https://reviews.llvm.org/D103348

fossil-db (Navi10):
Totals from 4512 (3.01% of 149839) affected shaders:
VGPRs: 221516 -> 223308 (+0.81%); split: -0.07%, +0.88%
CodeSize: 23000080 -> 23070672 (+0.31%); split: -0.08%, +0.39%
MaxWaves: 107718 -> 107496 (-0.21%); split: +0.11%, -0.32%
Instrs: 4321890 -> 4362822 (+0.95%); split: -0.00%, +0.95%
Latency: 71495710 -> 71581476 (+0.12%); split: -0.07%, +0.19%
InvThroughput: 11858568 -> 11938960 (+0.68%); split: -0.00%, +0.68%
VClause: 76575 -> 76585 (+0.01%); split: -0.05%, +0.07%
SClause: 168771 -> 168709 (-0.04%); split: -0.06%, +0.02%
Copies: 182305 -> 221948 (+21.75%); split: -0.00%, +21.75%
PreVGPRs: 194657 -> 195635 (+0.50%); split: -0.00%, +0.50%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c353895c92 ("aco: use non-sequential addressing")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10898>
2021-06-03 03:49:07 +00:00
Rhys Perry
bb52484df5 aco/tests: improve reporting of failed code checks
Instead of just reporting the failed statements, print where they
originated. This is useful for tests which have a number of similar
checks.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10898>
2021-06-03 03:49:07 +00:00
Rhys Perry
9bf30c4a5c aco/tests: add tests for form_hard_clauses()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10898>
2021-06-03 03:49:07 +00:00
Rhys Perry
81162265b1 aco: do not clause NSA instructions
According to LLVM, this has "unpredictable results on GFX10.1".

https://reviews.llvm.org/D102211

fossil-db (Navi10):
Totals from 26690 (17.81% of 149839) affected shaders:
CodeSize: 167935160 -> 167706280 (-0.14%); split: -0.14%, +0.00%
Instrs: 31801427 -> 31744142 (-0.18%); split: -0.18%, +0.00%
Latency: 732672435 -> 732622463 (-0.01%)
InvThroughput: 163361435 -> 163357838 (-0.00%); split: -0.00%, +0.00%
VClause: 546131 -> 546903 (+0.14%); split: -0.00%, +0.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c353895c92 ("aco: use non-sequential addressing")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10898>
2021-06-03 03:49:07 +00:00
Andres Gomez
15e41b576b ci: fix the vkd3d-proton runner
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11087>
2021-06-02 16:30:57 +00:00
Mike Blumenkrantz
89bac36f09 radv: explicitly load a desc set layout struct member during set allocate
accessing this variable repeatedly like this is a contended hotpath somehow,
so instead just create a const for reference

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11124>
2021-06-02 13:31:43 +00:00
Mike Blumenkrantz
79742d41c0 radv: declare index_va in a single call for indexed draw packet emit
this is an extreme hotpath, so having a single calculation in a const
variable is slightly better for compiler microoptimizing

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11124>
2021-06-02 13:31:43 +00:00
Tomeu Vizoso
67af3b6bba ci/lava: Switch LAVA jobs to x86 runners
So we don't need to provision aarch64 servers, which are these days
rarer than x8_64.

In the switch to the new runner tags, switch to one which contains the
device type, so we can dimension the runner jobs taking into account the
number of DUTs available.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11108>
2021-06-02 08:14:51 +02:00
Samuel Pitoiset
a70c3e5c8a ac/rgp: bump the SQTT file minor version to 5
To match latest RGP spec. Captures generated by RADV still work
with latest RGP (v1.10).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11063>
2021-06-01 06:09:49 +00:00
Samuel Pitoiset
c3a4ca2908 ac/rgp: mark SQTT_FILE_CHUNK_TYPE_ISA_DATABASE as deprecated
This is now deprecated and reserved for future uses.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11063>
2021-06-01 06:09:49 +00:00
Rhys Perry
0f8fef1261 radv: make attrib_end variable in radv_flush_vertex_descriptors 32-bit
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 1e9dc0474e ("radv: make radv_pipeline::attrib_ends 32bit")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11089>
2021-05-31 18:28:12 +00:00
Timur Kristóf
aabe9d2f6e aco: Eliminate SALU comparison when SCC can be used instead.
For example:

s0, scc = s_and_u32 ...
scc = s_cmp_eq_u32 s0, 0
p_cbranch_sccz

is turned into:

s0, scc = s_and_u32 ...
p_cbranch_sccnz

Fossil DB results on Sienna Cichlid:

Totals from 85267 (56.91% of 149839) affected shaders:
CodeSize: 202539256 -> 202237268 (-0.15%)
Instrs: 38964493 -> 38888996 (-0.19%)
Latency: 750062328 -> 749913450 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 167408952 -> 167405157 (-0.00%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
a93092d0ed aco: Use s_cbranch_vccz/nz in post-RA optimization.
A simple post-RA optimization which takes advantage of the
s_cbranch_vccz and s_cbranch_vccnz instructions.

It works on the following pattern:

vcc = v_cmp ...
scc = s_and vcc, exec
p_cbranch scc

The result looks like this:

vcc = v_cmp ...
p_cbranch vcc

Fossil DB results on Sienna Cichlid:

Totals from 4814 (3.21% of 149839) affected shaders:
CodeSize: 15371176 -> 15345964 (-0.16%)
Instrs: 3028557 -> 3022254 (-0.21%)
Latency: 21872753 -> 21823476 (-0.23%); split: -0.23%, +0.00%
InvThroughput: 4470282 -> 4468691 (-0.04%); split: -0.04%, +0.00%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
0e4747d3fb aco: Introduce a new, post-RA optimizer.
This commit adds the skeleton of a new ACO post-RA optimizer,
which is intended to be a simple pass called after RA, and
is meant to do code changes which can only be done
after RA.

It is currently empty, the actual optimizations will be added
in their own commits. It only has a DCE pass, which deletes
some dead code generated by the spiller.

Fossil DB results on Sienna Cichlid:

Totals from 375 (0.25% of 149839) affected shaders:
CodeSize: 2933056 -> 2907192 (-0.88%)
Instrs: 534154 -> 530706 (-0.65%)
Latency: 12088064 -> 12084907 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 4433454 -> 4432421 (-0.02%); split: -0.02%, +0.00%
Copies: 81649 -> 78203 (-4.22%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
6f3c472f2e aco: New writeout overloads for the test framework.
These will be used by future tests.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
8d37aa91d6 aco: Add Operand(Temp, PhysReg) constructor.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
4491b94d58 aco: Don't DCE instructions that write non-temps, eg. exec.
No Fossil DB changes.
This commit makes DCE usable after RA.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Samuel Pitoiset
ea5f1fa279 radv: fix generating hang reports if mutable descriptors are used
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11031>
2021-05-28 10:53:02 +00:00
Samuel Pitoiset
380742b9f3 radv: fix missing default state for DB_DFSM_CONTROL
Fixes: 69ae02151d ("radv: remove DFSM")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11028>
2021-05-27 11:48:29 +00:00
Samuel Pitoiset
b9ff51f750 radv: move all game workarounds to drirc
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10978>
2021-05-26 18:48:04 +00:00
Samuel Pitoiset
8aa735e856 radv: add few new drirc options
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10978>
2021-05-26 18:48:04 +00:00
Samuel Pitoiset
69ae02151d radv: remove DFSM
DFSM has never been enabled by default because it was slower.
RadeonSI is also dropping support for this because they discovered
that's actually not efficient in practice.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10968>
2021-05-26 17:22:14 +00:00
Samuel Pitoiset
f8f963f800 radv: stop reporting ACO from the device name
ACO is the default compiler for almost a year from now, so it should
be fine to replace RADV/ACO by just RADV. LLVM is still added
when RADV_DEBUG=llvm is used for convenience.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10972>
2021-05-26 15:58:54 +02:00
Rhys Perry
b5f2af86cf radv: fix formatting of radv_dri_options
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10980>
2021-05-26 13:29:47 +00:00
Rhys Perry
665f11e829 radv: add radv_absolute_depth_bias
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10980>
2021-05-26 13:29:47 +00:00
Mike Blumenkrantz
ceb7225057 radv: set maxVertexInputAttributeOffset to UINT32_MAX
this is what amdvlk uses

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10827>
2021-05-26 12:24:39 +00:00
Mike Blumenkrantz
1e9dc0474e radv: make radv_pipeline::attrib_ends 32bit
this is needed to support larger vertex attribute offsets

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10827>
2021-05-26 12:24:39 +00:00
Rhys Perry
7d23ea20a0 radv: don't allocate DCC predicate if the image doesn't use DCC
Fixes replay of RenderDoc captures created before a7c0cf500b.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10983>
2021-05-26 12:06:33 +00:00
Samuel Pitoiset
729ebe4b17 aco: fix emitting discard when the program just ends
For fragment shaders that only contain a discard, the exec mask has
to be zero'd and everything discarded.

It seems unnecessary to emit an export here because if the FS has no
exports, the compiler already emits a null export at the end.

Fixes incorrect hair rendering in Detroit: Become Human.

fossil-db (Sienna Cichlid):
Totals from 3 (0.00% of 149839) affected shaders:
CodeSize: 2896 -> 2872 (-0.83%)
Instrs: 556 -> 553 (-0.54%)
Latency: 29266 -> 29214 (-0.18%)
InvThroughput: 3374 -> 3372 (-0.06%)

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10955>
2021-05-26 10:32:59 +00:00
Samuel Pitoiset
9984ebf173 radv: use radv_dcc_enabled() for the FB mip flush workaround
This has no effects because radv_image_has_CB_metadata() still
accounts for DCC which is incorrect. This should be changed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10667>
2021-05-26 06:59:35 +00:00
Samuel Pitoiset
4631a52f8d radv: do not decompress DCC for partial resolves if stores are supported
It seems unnecessary.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10667>
2021-05-26 06:59:35 +00:00
Samuel Pitoiset
7af5a0c1b9 radv: only init DCC if compressed in the HW resolve path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10667>
2021-05-26 06:59:35 +00:00
Samuel Pitoiset
ff38e3aadd radv: only mark DCC as compressed when drawing if layout allows it
Just having DCC enabled on the base level doesn't mean we are
using compressed rendering.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10667>
2021-05-26 06:59:35 +00:00
Samuel Pitoiset
75d7c752af radv: remove redundant call to radv_dcc_enabled()
radv_layout_dcc_compressed() is now per level.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10667>
2021-05-26 06:59:35 +00:00
Samuel Pitoiset
bdb9634151 radv: pass an image range to radv_layout_dcc_compressed()
With DCC and mipmaps, some mips can't be compressed and it makes
sense to check this here.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10667>
2021-05-26 06:59:35 +00:00
Andres Gomez
07b86e64a5 ci: add VKD3D-Proton testsuite job for radv's Navy Flounder
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10870>
2021-05-25 17:03:25 +00:00
Andres Gomez
537c9460fa ci: add radv's trace job for Navy Flounder
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10870>
2021-05-25 17:03:25 +00:00
Andres Gomez
a71ffa4592 ci: uprev DXVK to 1.8.1
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10870>
2021-05-25 17:03:25 +00:00
Andres Gomez
fa8ca10e27 ci: remove radv's trace job for Polaris10
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10870>
2021-05-25 17:03:25 +00:00
Andres Gomez
f0f812dbe7 ci: update radv's trace job tag for Raven
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10870>
2021-05-25 17:03:25 +00:00
Marek Olšák
13acbaecd8 radeonsi: rewrite the prefix sum computation for shader culling
Instead of storing the vertex mask per wave into LDS and then computing
the prefix sum, store 8-bit bitcounts (vertex counts) of the vertex masks
into LDS. This allows us to compute the sum using v_sad_u8, which computes
a sum of 4 i8vec4 components in one instruction.

Each i8vec4 of vertex counts is loaded in parallel threads (one dword
per thread) instead of all being loaded in thread 0, and readlane copies
them to SGPRs instead of readfirstlane.

LDS is no longer initialized before culling. Instead, the counts for
inactive waves are masked with AND later.

Incorrect old comments are also fixed.

This change removes 80 bytes from the code size, and it allows increasing
the workgroup size from 128 to 256. (which is the main motivation for this)

Now changing the workgroup size with wave64 has no effect on the code size.
Switching to wave32 with 8 waves even generates slightly smaller code than
wave64 with 4 waves.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
2021-05-25 16:15:44 +00:00