The number of bits in the branch offset field depends on the
generation. Decoding with the a5xx encoding (largest size) while
compiling for a3xx or a4xx would print negative values as large
positive values.
I guess in practice huge negative branch offsets aren't likely (and if
they do occur, the shader is probably too big to grok by reading the
assembly). So just print using the smallest bitfield size.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Clover now checks PIPE_SHADER_CAP_SUPPORTED_IRS for native support instead.
This change indirectly enables NIR support for compute shaders
on radeonsi.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
PIPE_SHADER_CAP_PREFERRED_IR was conflicting with PIPE_SHADER_IR_NIR
for compute shaders, so we let clover pick the one it wants to use.
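As a rough sketch of the idea, assuming the supported-IRs cap is a
bitmask with one bit per IR: the frontend tests the bit for the IR it
wants instead of asking the driver for a single preferred IR. The enum
values and both function names below are stand-ins invented for this
example; they are not the gallium definitions or clover's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PIPE_SHADER_IR_* values. */
enum shader_ir { IR_TGSI = 0, IR_NATIVE = 1, IR_NIR = 2 };

/* supported_irs is assumed to have bit (1 << ir) set for every IR the
 * driver can consume for a given shader stage. */
static bool ir_supported(uint32_t supported_irs, enum shader_ir ir)
{
    return (supported_irs & (UINT32_C(1) << ir)) != 0;
}

/* The frontend picks what it wants: native code if the driver takes
 * it, otherwise fall back to TGSI (an assumed fallback). */
static enum shader_ir pick_ir(uint32_t supported_irs)
{
    if (ir_supported(supported_irs, IR_NATIVE))
        return IR_NATIVE;
    return IR_TGSI;
}
```

In this sketch a driver that advertises both native and NIR support
no longer has to name one of them as "preferred"; each frontend makes
its own choice from the mask.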
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Include the Mesa version and detail about the platform.
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
It is present from libva 2.1 (VAAPI 1.1.0 or higher).
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
When NIR is enabled, TGSI must not be used. When NIR is disabled, TGSI
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
v2: * Check whether the node src and dst registers are NULL before using
them.
* fix a typo in the commit message.
Two cases are handled with this patch:
1. If copy propagation tries to eliminate a move from a relative
array access then it could optimize
MOV R1, ARRAY[RELADDR_1]
MOV R2, ARRAY[RELADDR_2]
OP2 R3, R1, R2
into
OP2 R3, ARRAY[RELADDR_1], ARRAY[RELADDR_2]
which is forbidden, because there is only one address register available.
2. When MULADD(x,a,MUL(x,c)) is handled
MUL TMP, R1, ARRAY[RELADDR_1]
MULADD R3, R1, ARRAY[RELADDR_2], TMP
by folding this into
ADD TMP, ARRAY[RELADDR_2], ARRAY[RELADDR_1]
MUL R3, R1, TMP
which is also forbidden.
Test for these cases and reject the optimization if a forbidden combination
of relative access would be created.
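The check described above could look roughly like the following
sketch. The struct and function names are hypothetical and much
simpler than the real sb data structures, and the sketch assumes that
several accesses through the same address value are acceptable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of an instruction source operand. */
struct src_operand {
    bool is_relative; /* true for ARRAY[RELADDR_n] style accesses */
    int  addr_id;     /* which address value is used, if relative */
};

/* Returns true if the operands would need more than one distinct
 * address value, which a single address register cannot provide. */
static bool needs_multiple_addr_regs(const struct src_operand *srcs,
                                     size_t n)
{
    bool have_addr = false;
    int addr = 0;

    for (size_t i = 0; i < n; i++) {
        if (!srcs[i].is_relative)
            continue;
        if (have_addr && srcs[i].addr_id != addr)
            return true;
        have_addr = true;
        addr = srcs[i].addr_id;
    }
    return false;
}
```

An optimization pass would run such a test on the operand list it is
about to create and keep the original MOVs whenever it returns true.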
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
On r600 we use the scratch mem with read/read_ind, in that case
sb should track the rw_gpr as a dst instead of a src.
This stops the whole shader being optimised out.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Spills have to happen after the VLIW bundle currently
processed, so defer emitting the spill op.
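A minimal sketch of the deferral pattern, with made-up types and
function names (this is not the scheduler's code): spill ops requested
while a bundle is being built are queued and only emitted once the
bundle itself has been emitted.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_OPS = 16 };

/* Hypothetical output stream; ops are plain ints for illustration. */
struct op_stream {
    int    emitted[MAX_OPS]; /* ops in final program order */
    size_t n_emitted;
    int    pending[MAX_OPS]; /* spills waiting for the current bundle */
    size_t n_pending;
};

static void emit_op(struct op_stream *s, int op)
{
    assert(s->n_emitted < MAX_OPS);
    s->emitted[s->n_emitted++] = op;
}

/* Called while the bundle is still being processed: only queue. */
static void request_spill(struct op_stream *s, int spill_op)
{
    assert(s->n_pending < MAX_OPS);
    s->pending[s->n_pending++] = spill_op;
}

/* Emit the bundle first, then flush the deferred spills after it. */
static void finish_bundle(struct op_stream *s, int bundle_op)
{
    emit_op(s, bundle_op);
    for (size_t i = 0; i < s->n_pending; i++)
        emit_op(s, s->pending[i]);
    s->n_pending = 0;
}
```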
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This fixes:
KHR-GL45.texture_gather.swizzle
on cayman and redwood.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add a build option to control building some of the misc tools we
have. Also set the executables to install; presumably you want
that if you're asking for the build.
v2: set 'install:' to the with_tools value, not true (Jordan)
handle 'all' in the comma list (Dylan)
Add freedreno's tools (Dylan)
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
TGSI_INTERPOLATE_CONSTANT and TGSI_INTERPOLATE_LOC_CENTER have the
value zero so there's no change in behavior. It seems funny to
declare these fs input registers with constant interpolation. But
it looks like ureg_DECL_input_layout() is not called anywhere and
ureg_DECL_input() is only called from
util_make_geometry_passthrough_shader().
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
And add a default switch case to silence a compiler warning.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
And put static qualifier on const arrays.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This ports the texture gather integer workaround from radeonsi.
This fixes:
KHR-GL45.texture_gather.plain-gather-uint/int*
v2: add rect support, fix 2d array shadow
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (on irc)
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is taken from Glenn Kennard's scratch series, but separated
out as a cleanup by me.
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The hw gives us coverage for the pixel, not for individual fragment
shader invocations, in case execution isn't per pixel (evergreen (eg),
unlike cayman (cm), actually cannot do "real" minSampleShading: it's
either per-pixel or per-fragment, but it doesn't really make a
difference here).
Also, with msaa disabled, the hw still gives us a mask corresponding to
the number of samples, where GL requires this to be 1.
Fix this up by masking the sampleMaskIn bits with the bit corresponding to
the sampleID, if we know this shader is always executed at per-sample
granularity. (In case of a per-sample frequency shader and msaa disabled,
the sampleID will always be 0, so this works just fine there.)
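The fixup for the per-sample case amounts to a single AND, sketched
below in plain C. The function name is hypothetical, and the real
change emits shader instructions rather than running on the CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce the per-pixel coverage reported by the hw to the single bit
 * that belongs to the sample this invocation is shading. */
static uint32_t per_sample_mask_in(uint32_t hw_coverage,
                                   unsigned sample_id)
{
    assert(sample_id < 32);
    return hw_coverage & (UINT32_C(1) << sample_id);
}
```

For a fully covered 4x pixel (coverage 0xF) the invocation for sample
2 sees 0x4. With msaa disabled the sampleID is always 0, so the same
coverage collapses to 0x1, matching the single bit GL requires.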
Fixing this for the minSampleShading case will need a shader key
(radeonsi uses the prolog part for this). For eg, we could get away
with a single bit; cm would need more bits depending on the
sample/invocation ratio, or could read the bits from a uniform. The
alternative would be to always use a sample mask uniform, which is
probably not a good idea, as it would make the ordinary common msaa
case slower for no good reason.
This fixes some parts of piglit arb_sample_shading-samplemask (with fixed
test), in particular those which use a sampleID, still failing others
as expected.
Reviewed-by: Dave Airlie <airlied@redhat.com>
For some reason, we were iterating through the code twice (first just for
instructions needing barycentrics, then for instructions and input dcls).
Move things around slightly so this is no longer necessary.
There also was an unneeded enabling of the fixed_pt_position_gpr - this is
only needed if the per-sample interpolation comes from an input, not from an
instruction, since in the latter case the sample id to sample from comes from
a tgsi src and isn't sampleID (just move the assert where it belongs).
Otherwise there should be no functional change.
Reviewed-by: Dave Airlie <airlied@redhat.com>
It seems these were missed when the struct pipe_context * argument was
added to hud_graph::query_new_value.
Fixes: 3132afdf4c "gallium/hud: pass pipe_context explicitly to most functions"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>