Commit graph

39979 commits

Author SHA1 Message Date
Erico Nunes
65e6c42d27 lima/ppir: fix branch codegen register encode
The branch instruction has 6 bits per register operand which allows it
to specify a component in the register.
Fix codegen so that it outputs the right component, otherwise it always
outputs the x component.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-07-23 08:49:19 +00:00
Erico Nunes
a255b49593 lima/ppir: fix debug logs in regalloc
The macros already prepend "ppir: ", remove them from the actual strings
so it doesn't appear duplicated.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-07-23 08:24:19 +00:00
Erico Nunes
9254059dd8 lima/ppir: fix alignment on regalloc spilling loads
The spilling code spills entire vec4 registers regardless of the
components used by the spilled uses.
The inserted stores code force the 4 components, but these loads were
using a variable number of components, causing bugs on loading the
spilled registers.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-07-23 08:24:19 +00:00
Ilia Mirkin
affb2da0f8 gallium: remove boolean from state tracker APIs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-22 22:13:51 -04:00
Ilia Mirkin
0e30c6b8a7 gallium: switch boolean -> bool at the interface definitions
This is a relatively minimal change to adjust all the gallium interfaces
to use bool instead of boolean. I tried to avoid making unrelated
changes inside of drivers to flip boolean -> bool to reduce the risk of
regressions (the compiler will much more easily allow "dirty" values
inside a char-based boolean than a C99 _Bool).

This has been build-tested on amd64 with:

Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d
                 vc4 i915 svga virgl swr panfrost iris lima kmsro
Gallium st:      mesa xa xvmc xvmc vdpau va

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 22:13:51 -04:00
Marek Olšák
cb9eb1834d radeonsi: fix warning: ‘ret’ may be used uninitialized
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-07-22 20:57:44 -04:00
Marek Olšák
850619117e tgsi: fix warning: ‘interp’ may be used uninitialized
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-07-22 20:57:44 -04:00
Marek Olšák
f257ef2bbb gallivm: fix warning: ‘op’ may be used uninitialized
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-07-22 20:57:44 -04:00
Kenneth Graunke
7cdde962c5 iris: Support storage images that have matching typed formats for reads
Even if we don't directly support typed reads on a format, we can often
translate them to a reasonable matching format.  Advertise those too.
2019-07-22 17:30:13 -07:00
Kenneth Graunke
2f1c7fae9e iris: Stop advertising MSAA storage images by mistake
st_extensions.c sets const->MaxImageSamples (GL_MAX_IMAGE_SAMPLES) by
looping over [16, 15, .. 1x] MSAA modes, and RGBA/BGRA/ARGB/ABGR 8888
color formats, calling pipe->is_format_supported() for each, with
the usage set to PIPE_BIND_SHADER_IMAGE.  If any are supported, it
selects that number of samples.

We were checking if sample_count <= 1, which meant that we were getting
a value of 1x MSAA, rather than the expected 0x (feature doesn't exist).

But, only on Icelake because Gen11 adds support for typed read messages
for R8G8B8A8_UNORM.  The lack of typed read messages for these formats
was tricking the check on Gen9 to say no correctly.  This caused some
Icelake conformance failures, because we don't implement this feature.

Just check for sample_count == 0 instead.
2019-07-22 17:30:13 -07:00
Alyssa Rosenzweig
f1dcaa0df6 panfrost: Set initialized in more cases
Indirect linear writes were not being marked as initialized, causing the
back blit to be dropped, breaking the listed tests.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig
9e3dc703ff panfrost/ci: Update expectations
We've fixed some shader tests.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig
21510c253c panfrost/midgard: Implement register spilling
Now that we run RA in a loop, before each iteration after a failed
allocation we choose a spill node and spill it to Thread Local Storage
using st_int4/ld_int4 instructions (for spills and fills respectively).

This allows us to compile complex shaders that normally would not fit
within the 16 work register limits, although it comes at a fairly steep
performance penalty.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Iago Toral Quiroga
dacaf7ec06 v3d: fill logicop_func in the fragment shader key when precompiling shaders
Since logicop_func 0 is PIPE_LOGIOP_CLEAR, we were trigger lowerinng
of logic ops on precompiled shaders, which we don't want to do. Also, this
had the side effect of making shader-db crash, as during this lowering we
would try to read the color format swizzle information from the fragment shader
key that we don't populate in precompiled shaders because right now we only
need it when logic operations are enabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-22 08:05:59 +02:00
Chia-I Wu
d31d25f634 virgl: fix a sync issue in virgl_buffer_transfer_extend
In virgl_buffer_transfer_extend, when no flush is needed, it tries
to extend a previously queued transfer instead if it can find one.
Comparing to virgl_resource_transfer_prepare, it fails to check if
the resource is busy.

The existence of a previously queued transfer normally implies that
the resource is not busy, maybe except for when the transfer is
PIPE_TRANSFER_UNSYNCHRONIZED.  Rather than burdening us with a
lengthy comment, and potential concerns over breaking it as the
transfer code evolves, this commit makes the valid_buffer_range
check the only condition to take the fast path.

In real world, we hit the fast path almost only because of the
valid_buffer_range check.  In micro benchmarks, the condition should
always be true, otherwise the benchmarks are not very representative
of meaningful workloads.  I think this fix is justified.

The recent change to PIPE_TRANSFER_MAP_DIRECTLY usage disables the
fast path.  This commit re-enables it as well.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-07-19 18:04:42 -07:00
Chia-I Wu
324c20304e virgl: rework virgl_transfer_queue_extend
Do not take a transfer and do the memcpy.  Add a _buffer suffix to
the function name to make it clear that it is only for buffers.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-07-19 18:04:37 -07:00
Chia-I Wu
2b8ad88078 virgl: fix virgl_buffer_transfer_extend
Without setting hw_res, virgl_transfer_queue_extend never finds a
match and always returns NULL.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-07-19 18:04:34 -07:00
Marek Olšák
bcabf75ab7 radeonsi: initialize scissor registers etc. without clear state
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:56 -04:00
Marek Olšák
47f41af06c radeonsi: return success from vi_dcc_clear_level to simplify callers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:54 -04:00
Marek Olšák
7a764b963a radeonsi: fix compute-based culling regression in 1ce52c1e37
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:50 -04:00
Marek Olšák
c741bed6e8 radeonsi/gfx10: fix VGT_PRIMITIVE_TYPE programming
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
a0d330bedb radeonsi/gfx10: enable Wave32 for vertex, geometry, and tessellation shaders
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
1d82240f55 radeonsi/gfx10: add debug options to enable/disable Wave32
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
8f72f137ad radeonsi/gfx10: add as_ngg variant for TES as ES to select Wave32/64
Legacy GS has to use Wave64, so TES before GS has to use Wave64 too.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
88efb63caf radeonsi/gfx10: implement Wave32
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
54e6900ede radeonsi/gfx10: use 32-bit wavemasks for Wave32
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
81091a5183 ac: create the LLVM builder in ac_llvm_context_init
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
eb54b8c222 ac: create the LLVM module for Wave32 or Wave64 in ac_llvm_context_init
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
921c1d24d5 ac/rtld: add support for Wave32
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
9e467d111b ac: initial Wave32 support in LLVM build helpers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
c35e926a81 radeonsi: assume that selector != NULL for compute shaders
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:48 -04:00
Marek Olšák
bf0f0697a1 radeonsi: remove what appears to be legacy compute code
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:47 -04:00
Marek Olšák
be67a275b5 radeonsi: remove si_program::use_code_object_v2
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:45 -04:00
Marek Olšák
fd92e65feb radeonsi: add si_shader_selector into si_compute
Now we can assume that shader->selector is always set.
This will simplify some code.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:43 -04:00
Marek Olšák
e2c8ff009e radeonsi: set threadgroup size to 0 for threadgroups with only 1 wave
This has no effect on Wave64.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:39 -04:00
Marek Olšák
a8a526c5cb radeonsi/gfx10: set as_ngg for GS prolog
as_ngg is required by Wave32.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
d3a80f2dda radeonsi/gfx10: remove the disable_ngg option
because legacy VS hangs.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
0f30223cf4 radeonsi/gfx10: combine hw edgeflags with user edgeflags for correct behavior
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
bfaca7259c radeonsi/gfx10: deduplicate code for esvert_lds_size
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
a6722285c2 radeonsi/gfx10: simplify a streamout loop in gfx10_emit_ngg_epilogue
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
2683347ba0 radeonsi/gfx10: don't use MALLOC for outputs
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
1b4354dab9 radeonsi/gfx10: clean up ESGS ring size computation
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
37db9d2865 radeonsi/gfx10: fix unnecessary LDS overallocation for NGG GS
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
985a59e0d1 radeonsi/gfx10: don't compile the GS copy shader if it's 100% not needed
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
7f0ada3f3e radeonsi/gfx10: set GE_CTNL.PACKET_TO_ONE_PA for NGG
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
e08463ac22 radeonsi/gfx10: update a tunable max_es_verts_base for NGG
We have to fix the computation so as not to break quads.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
79d56e6a4a radeonsi/gfx10: implement ARB_post_depth_coverage
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
a57f0f8a6b radeonsi: fix leaked compute shader NIR
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:37 -04:00
Marek Olšák
98377d3450 radeonsi: save the enable_nir option in the shader cache correctly
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:35 -04:00
Marek Olšák
d227b91d2e radeonsi/gfx10: enable SDMA
no changes since gfx9 for buffers

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00