This delays the waitcnt for has_attr_ring_wait_bug by a few instructions.
fossil-db (gfx1201):
Totals from 9 (0.00% of 208640) affected shaders:
Instrs: 19352 -> 19506 (+0.80%)
CodeSize: 101180 -> 101716 (+0.53%)
Latency: 660221 -> 678782 (+2.81%); split: -0.00%, +2.81%
InvThroughput: 95106 -> 97398 (+2.41%)
fossil-db (navi33):
Totals from 58834 (28.20% of 208626) affected shaders:
Instrs: 22424304 -> 22424571 (+0.00%)
CodeSize: 110198112 -> 110199184 (+0.00%)
Latency: 115894319 -> 126491124 (+9.14%); split: -0.00%, +9.14%
InvThroughput: 19424631 -> 19754358 (+1.70%); split: -0.00%, +1.70%
I don't think the stats are very accurate. This seems to often move the
s_waitcnt down into a divergent branch, but the wait still happens later
if the branch isn't taken, so the wait is counted twice.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>
This shouldn't fix anything, because event_vmem_bvh was never used here.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>
First, use 64bit values everywhere since shader_info::outputs_written
is a 64bit field.
Second, alpha to coverage should only be considered for draw buffer 0
as stated in the GL spec (quoting Version 4.6 (Core Profile), 17.3.1
Alpha To Coverage):
"All alpha values in this section refer only to the alpha component
of the fragment shader output linked to color number zero, index
zero (see section 15.2.3)."
Third, the write message setup in brw_compile_fs.cpp no longer took
into account that alpha-to-coverage can be disabled.
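As a rough illustration of the first point (not the actual brw code),
the sketch below shows how a test against a 64bit mask goes wrong once
the mask passes through a 32bit temporary. The helper names and the
slot number are hypothetical; only the 64bit width of
shader_info::outputs_written comes from this change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: tests a slot after the mask was narrowed to
 * 32 bits, so every bit at position 32 or above is silently lost. */
static bool
slot_written_truncated(uint64_t outputs_written, unsigned slot)
{
   uint32_t mask = (uint32_t)outputs_written;

   return slot < 32 && (mask & (UINT32_C(1) << slot)) != 0;
}

/* Hypothetical helper: keeps the full 64bit width, so it agrees with
 * the mask for every slot from 0 to 63. */
static bool
slot_written(uint64_t outputs_written, unsigned slot)
{
   return slot < 64 && (outputs_written & (UINT64_C(1) << slot)) != 0;
}
```

With bit 40 set in the mask, slot_written() reports the slot as
written while slot_written_truncated() does not.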
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 294644643e ("brw: avoid requiring a valid render target for empty fragment shaders")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15625
Tested-by: Christoph Neuhauser <christoph.neuhauser@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42115>
This helper was only meant to be called once the driver knows it
doesn't have any render target set up, to figure out whether an empty
one needs to be created.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Tested-by: Christoph Neuhauser <christoph.neuhauser@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42115>
We are generally fine with undefs. However, booleans are only allowed to
contain 0/1 but are stored in 16b registers. Undefs may cause such
registers to be uninitialized and contain values other than 0/1. This
especially happens with undef booleans in phi srcs, which are explicitly
left uninitialized. In general, such non-0/1 values don't cause problems
because we mostly use booleans by comparing them to 0. However, they do
cause problems in special cases like `inot(x)` which we lower to `sub(1,
x)` which only works if true==1.
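A minimal sketch of the failure mode, assuming a 16b register modelled
as uint16_t. The function names are hypothetical and this is not the
ir3 lowering code itself:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the lowering inot(x) -> sub(1, x) on a 16b
 * register; the result wraps modulo 2^16. */
static uint16_t
lowered_inot(uint16_t x)
{
   return (uint16_t)(1u - x);
}

/* Hypothetical model of how booleans are mostly consumed: by comparing
 * them to 0. */
static bool
reads_as_true(uint16_t x)
{
   return x != 0;
}
```

For x == 0 and x == 1 the lowering yields 1 and 0 as expected. For an
uninitialized register holding 0xffff, which reads as true, it yields
2, which also reads as true, so the negation is lost.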
Fixes misrenderings in "Kingdoms of Amalur: Reckoning".
Totals from 12 (0.01% of 176258) affected shaders:
Instrs: 14590 -> 14615 (+0.17%)
CodeSize: 29796 -> 29808 (+0.04%)
NOPs: 3091 -> 3098 (+0.23%); split: -0.03%, +0.26%
MOVs: 735 -> 748 (+1.77%)
(sy)-stall: 4509 -> 4508 (-0.02%)
Cat0: 3471 -> 3483 (+0.35%); split: -0.03%, +0.37%
Cat1: 1257 -> 1270 (+1.03%)
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42051>
Some backends may want to lower some, but not all, undefs.
For example, in ir3, we are generally fine with undefs. However,
booleans are only allowed to contain 0/1 but are stored in 16b registers.
Undefs may cause such registers to be uninitialized and contain values
other than 0/1.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42051>
In anv_shader_lower_nir() we call anv_ensure_fp64_shader() only if
device->fp64_nir is set, but the function that sets device->fp64_nir
is anv_ensure_fp64_shader()! That means the only way for
device->fp64_nir to be set is through blorp: if the app does not issue
the blorp shader that uses fp64, then we won't have it.
Change the check to be like the one we have in blorp_compile_fs_brw().
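The circular dependency can be sketched as follows. All names are
hypothetical stand-ins rather than the anv code, and needs_fp64 is only
an assumed placeholder for whatever condition the real check uses:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device and its lazily loaded library. */
struct demo_device {
   const void *fp64_nir;
};

static const int demo_fp64_library = 0;

/* The only place that ever sets fp64_nir. */
static void
demo_ensure_fp64_shader(struct demo_device *dev)
{
   if (dev->fp64_nir == NULL)
      dev->fp64_nir = &demo_fp64_library;
}

/* Buggy pattern: the guard tests the very field the callee is supposed
 * to initialize, so the call never happens on a fresh device. */
static void
demo_lower_buggy(struct demo_device *dev, bool needs_fp64)
{
   (void)needs_fp64;
   if (dev->fp64_nir != NULL)
      demo_ensure_fp64_shader(dev);
}

/* Fixed pattern: the guard depends on what the shader needs, not on
 * whether the library is already loaded. */
static void
demo_lower_fixed(struct demo_device *dev, bool needs_fp64)
{
   if (needs_fp64)
      demo_ensure_fp64_shader(dev);
}
```

Starting from a device with fp64_nir == NULL, demo_lower_buggy() leaves
it NULL even when fp64 is needed, while demo_lower_fixed() loads it.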
Fixes: 7d3b62e13d ("anv: only load fp64 software shader when needed")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42100>
The field was never populated (initialized to 0 with a TODO) yet
printed in dump_shader_info(..) and etna_dump_shader(..), giving
the misleading impression that loop counts were tracked. Remove
the field and its consumers.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41560>
This comes from the compiler bindings, and lately we get:
error[E0659]: `PIPE_FORMAT_R8_UINT` is ambiguous
    --> ../src/nouveau/compiler/nak/from_nir.rs:2193:13
     |
2193 |             PIPE_FORMAT_R8_UINT => MemType::U8,
     |             ^^^^^^^^^^^^^^^^^^^ ambiguous name
     |
     = note: ambiguous because of multiple glob imports of a name in the same module
note: `PIPE_FORMAT_R8_UINT` could refer to the constant imported here
    --> ../src/nouveau/compiler/nak/from_nir.rs:12:5
     |
12   | use nak_bindings::*;
     |     ^^^^^^^^^^^^^^^
     = help: consider adding an explicit import of `PIPE_FORMAT_R8_UINT` to disambiguate
note: `PIPE_FORMAT_R8_UINT` could also refer to the constant imported here
    --> ../src/nouveau/compiler/nak/from_nir.rs:14:5
     |
14   | use compiler::bindings::*;
     |     ^^^^^^^^^^^^^^^^^^^^^
     = help: consider adding an explicit import of `PIPE_FORMAT_R8_UINT` to disambiguate
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42109>
In particular, if a queue is going to block, a trace may end with
buffered end traces before the queue gets its next job. So flush
before blocking. (Also on the producer side, since it is easy.)
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42013>