Avoid building all those store 0 / store undef instruction pairs that
end up getting removed anyway.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Also, prepare for using tgsi_array_info.
This also opens the door for properly handling allocation failures, but I'm
leaving that for a separate change.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Doing the write-back of the temporary vector in radeon_llvm_emit_store makes
no sense.
This also allows us to get rid of get_alloca_for_array.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We can use the pointer stored in the temps array directly.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We can use the pointer stored in the temps array directly.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
In the alloca'd array case, no longer create redundant and unused allocas
for the individual elements; create getelementptrs instead.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
It's fairly rare that the BB layout puts BBs after the exit block, which
is likely the reason these issues lingered for so long.
This fixes a fraction of issues with the giant pixmark piano shader.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
All three fields of the vbuffer_attrs[] array are assigned in the following
loop. The remaining elements of the array are not used.
Tested with full Piglit run, Heaven 4.0, etc.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
We don't need to loop over the max number of color buffers, just the
current number (which is usually one).
Tested with full Piglit run, Heaven 4.0, etc.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Fixes regression with team_fortress_2 trace.
This change has been in our in-house tree for some time.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Since clears are more or less just normal draws, there isn't that much
benefit in having hand-rolled clear path. Add support to use u_blitter
instead if gen specific backend doesn't implement ctx->clear().
Signed-off-by: Rob Clark <robdclark@gmail.com>
Not (currently) state that is overwridden by u_blitter itself, but
drivers with custom blit/clear which are reusing part of the u_blitter
infrastructure will use it.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Apply the median and matrix filter before the compostioning
we apply the deinterlacing first to avoid the extra overhead
in processing the past and the future surfaces in deinterlacing.
v2: apply the filters on all the surfaces (Christian)
v3: use get_sampler_view_planes() instead of
get_sampler_view_components() and iterate over
VL_MAX_SURFACES (Christian)
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
These are the same for a3xx and later. (a2xx could probably use them
too, but due to limited hw support and ancient downstream kernels, it
isn't so easy to test.)
Signed-off-by: Rob Clark <robdclark@gmail.com>
For various tex fetch instructions, coord's get fixed up in different
ways. But modifying the array returned from get_src() has side-effects
if the same SSA src is used again.. the later instruction will see the
previous fixups.
Fix this, and const'ify things to prevent this sort of mistake in the
future.
Noticed by Varad when adding support for txf_ms.
Signed-off-by: Rob Clark <robdclark@gmail.com>