When moving constants, if switching to a floating-point representation
doesn't break anything, we'd rather have an fmov than an imov,
permitting inlining the constant in many circumstances.
total quadwords in shared programs: 3408 -> 3366 (-1.23%)
quadwords in affected programs: 1188 -> 1146 (-3.54%)
helped: 41
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 9.65% x̃: 11.11%
95% mean confidence interval for quadwords value: -1.07 -0.98
95% mean confidence interval for quadwords %-change: -11.38% -7.93%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Storing constants as float doesn't make sense when we have integer
instructions; better to switch to be integer natively and coerce to/from
float rather than the opposite.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We can statically determine from the disassembly if helper invocations
will be needed, so we can validate the corresponding bit in the
cmdstream and thus avoid printing the bit itself in the decode.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We check for texture ops which calculate derivatives (either explicitly
via dFd* or implicitly) and mark the shader as requiring helper
invocations.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Rather than passing through the transformed gl_Position, we can use the
hardware-level varying for this, which will correctly handle
gl_FragCoord.w
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We need a special path for special varyings so we parse them correctly
instead of throwing an error when they inevitably point to bad memory.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The hardware doesn't care, and a lot of Panfrost code relies on an
oversized buffer. The important part is that (stride *
padded_num_vertices) is no greater than size, which we'll need to check
once we validate instancing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We don't need to dump the contents necessary, but having the stub with
the address is useful.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
I don't know who thought this mask was a good idea but unfortunately it
must have been me.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
If we permit more $whatever through than the shader needs, that's a bit
of a waste, but it isn't an error.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We don't actually care about the *contents* of the index buffer, but we
would rather like to ensure it is present and of the correct size.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We can infer these stats in many cases from the disassembly, so we
should try to sanity check where we can. We may need to be fuzzy about
analysis, since analysis gives us a bound but we don't mind if it's not
used fully by the shader.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We could do better by forcing the checks to *equal* zero (right now, an
indeterminate answer will pass the checks), but this is a start to guard
against some egregious cases.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
There are a number of conditions we need to test for to statically check
for TILE_RANGE_FAULTs, but once these checks are in order, we can print
as-is.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
These tags need to match up with what's actually described by the MFBD,
so check this. Once this is checked, since the type and contents of the
FBD are obvious from printing above, there's no need to explicitly mark
off the framebuffer line.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
For shaders using exclusively direct attribute/varyings, we can work
this out statically. For shaders with indirect access, we just set an
upper bound of 16 (the max attributes/varyings we support) and the
actual count will be reported regardless.
We proceed similarly for textures/samplers, as well as for UBOs. While
UBOs can be *indexed* indirectly, the *UBO itself* -- which is what we
count in the shader descriptor (rather than the UBO descriptors) -- is
statically determinable.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This one is a little tricky, but the idea is that:
r16-r23 are always uniforms
r8-r15 are sometimes work, sometimes uniforms...
...but as work, they are always written before use
...and as uniforms, they are never written before use
So we use that heuristic to determine the count to feed the machine.
We'll record work register use in the next commit.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Panfrost is the only user of the macro; we are better off expanding than
having random stuff in nir.h.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
A pair of special flags can turn the texture/sampler handle fields into
register selects. This means code like:
texture(uTextures[hr28.w], ...)
can be compiled to something like:
texture ..., fsampler[hr28.w], texture[hr28.w]
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This data structure is shared in other parts of the texture word, so
let's streamline printing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This allows nodes to be unsigned and prevents a class of weird
signedness bugs identified by Coverity.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This path shouldn't be possible for in-spec shaders, but let's be
defensive. (Because security, right? Mostly because Coverity.)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We can smush this into one-line per record as per usual. We still need
more validation and cleaning this up, especially around instancing. But
for LINEAR records, it works okay already.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This consolidates texture format and dimensionality into something simple:
tiled rgba8_unorm.rgb1: 512x512
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>