Commit graph

113303 commits

Author SHA1 Message Date
Eric Engestrom
367bb55c17 util: drop unused vsprintf() wrapper
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-19 22:39:38 +01:00
Eric Engestrom
e7db1806af util: drop unused strchr() wrapper
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-19 22:39:38 +01:00
Eric Engestrom
84e85035cf util: drop unused strstr() wrapper
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-19 22:39:38 +01:00
Jason Ekstrand
6301f80b84 nir: Only rematerialize comparisons with all SSA sources
Otherwise, you may end up moving a register read and that could result
in an incorrect shader.  This commit fixes a rendering issue in Elite:
Dangerous.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111152
Fixes: 3ee2e84c60 "nir: Rematerialize compare instructions"
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-07-19 19:45:36 +00:00
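The restriction described in 6301f80b84 can be pictured as a simple predicate: a comparison is only a candidate for rematerialization when none of its sources is a register read. The sketch below is plain Python, not the NIR pass itself; the `Src` and `Compare` types and the `can_rematerialize` name are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Src:
    name: str
    is_ssa: bool   # False models a source that reads a (non-SSA) register

@dataclass
class Compare:
    op: str
    srcs: list

def can_rematerialize(cmp):
    """Only allow duplicating a comparison when every source is an SSA value."""
    return all(src.is_ssa for src in cmp.srcs)
```

A comparison with a register source is left where it is, since moving it could move the register read past a write.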
Daniel Schürmann
e352b4d650 spirv: Fix order of barriers in SpvOpControlBarrier
Semantically, the memory barrier has to come first to wait
for the completion of pending memory requests.
Afterwards, the workgroups can be synchronized.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-19 10:37:37 -07:00
Caio Marcelo de Oliveira Filho
4061a3f6c9 nir: use a switch when printing intrinsic indices
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-07-19 10:04:52 -07:00
Rhys Perry
e8644122ed nir/algebraic: mark a few comparison simplifications as precise
No vkpipeline-db changes found.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-07-19 16:33:01 +00:00
Rhys Perry
79801b9d7d nir/algebraic: optimize contradictory iand operands
Some of these were found in a few GTAV, Rise of the Tomb Raider and
Shadow of the Tomb Raider shaders.

Results from vkpipeline-db run with ACO:
Totals from affected shaders:
SGPRS: 376 -> 376 (0.00 %)
VGPRS: 220 -> 220 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 13492 -> 11560 (-14.32 %) bytes
LDS: 6 -> 6 (0.00 %) blocks
Max Waves: 69 -> 69 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

v2: use False instead of 0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-07-19 16:33:01 +00:00
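As a rough illustration of the idea behind 79801b9d7d: an `iand` of two comparisons that can never hold at the same time always evaluates to false, so it can be replaced by a constant. The sketch below is plain Python and does not reproduce the actual rules added to the algebraic pass; the comparison pairs and the helper name are assumptions chosen for illustration.

```python
import itertools
import operator

# Hypothetical example pairs: comparisons that contradict each other
# for every choice of operands, so their conjunction is always False.
CONTRADICTORY_PAIRS = [
    (operator.lt, operator.ge),   # a < b  and  a >= b
    (operator.eq, operator.ne),   # a == b and  a != b
]

def iand_is_always_false(cmp1, cmp2, values):
    """Brute-force check that cmp1(a, b) and cmp2(a, b) never both hold."""
    return not any(cmp1(a, b) and cmp2(a, b)
                   for a, b in itertools.product(values, repeat=2))

if __name__ == "__main__":
    sample = range(-4, 5)
    for c1, c2 in CONTRADICTORY_PAIRS:
        print(c1.__name__, c2.__name__, iand_is_always_false(c1, c2, sample))
```

The brute-force check only covers a small sample of integers; it is meant to show why such a replacement is safe, not to prove it.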
Erico Nunes
32ced14bad lima/ppir: handle all node types in ppir_node_replace_child
ppir_node_replace_child is used by the const lowering routine in ppir.
All types need to be handled here, otherwise the src node is not updated
properly when one of the lowered nodes is a const, which results in, for
example, regalloc not assigning registers correctly.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-07-19 16:01:45 +00:00
Erico Nunes
2292f0c4b5 lima/ppir: branch regalloc fixes
The branch instruction has sources which must be handled in src handling
paths so that regalloc assigns registers to them properly.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-07-19 16:01:45 +00:00
Yevhenii Kolesnikov
32b72cbca5 main: Destroy static hash table
format_array_format_table has a static lifetime - it will be destroyed
by an atexit handler.

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-19 11:22:55 +03:00
Dave Airlie
248161123c radv: reset the window scissor with no clear state.
If we don't have clear state (which gfx10 doesn't currently), the
window scissor is not reset for us, so reset it explicitly. AMDVLK
will leave it set to something else.

Marek also has this fix for radeonsi pending.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 11:00:44 +10:00
Dave Airlie
2ac2b98780 radv: fix crash in shader tracing.
Enabling tracing, and then having a vmfault, can lead to a segfault
before we print out the traces, if a meta shader is executing
and we don't have the NIR for it.

Just pass the stage and give back a default.

Fixes: 9b9ccee4d6 ("radv: take LDS into account for compute shader occupancy stats")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 11:00:25 +10:00
Timothy Arceri
80c2c17e1e iris: change last_vue_stage() to look at uncompiled shaders
This allows us to find the last vue stage before we have compiled
the shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Timothy Arceri
30038dd5ec nir/lower_clip: add support for geometry shaders
This will be used to enable compat profile support for geometry
shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Timothy Arceri
4b08bb4770 nir/lower_clip: add lower_clip_outputs() helper
This will be reused in the following patch to add support for clip
vertex lowering in geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Timothy Arceri
a59926b3ca nir/lower_clip: add create_clipdist_vars() helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Timothy Arceri
e38b930876 nir/lower_clip: add a find_clipvertex_and_position_outputs() helper
This will allow code sharing in a following patch that adds support
for lowering in geometry shaders. It also allows us to exit early
if there is no lowering to do which allows a small code tidy up.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Alyssa Rosenzweig
0395b58c92 panfrost: Set rt_count
This doesn't quite work yet, but it illustrates how MRT is implemented
in the MFBD: rt_count is set appropriately based on the number of render
targets, while additional render target descriptors are appended on with
an index variable in them (not quite decoded since there are some aspects
we don't understand there, but conceptually this should be right).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig
871ad7789f panfrost: Trace invisible BOs
Helps make the decode a little more readable (names instead of
addresses).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig
17752bae8e panfrost/decode: Preserve empty tiler heap symmetry
If tiler_heap_end == tiler_heap_start, ensure it's printed the same
rather than one erroring out as hex.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig
e797caa0dd panfrost: Zero polygon list body size for clears
There are no polygons, so you can't have any size to the polygon list,
although there is a minimal header.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig
f475b79980 panfrost/mfbd: Unify depth-only with masked FBO path
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig
629c7366a7 panfrost: Simplify set_framebuffer_state
Most of the ad hoc logic is already in Gallium.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig
227c395c00 panfrost: Check for NULL surface in places
Fixes a bunch of NULL dereferences, although it does cause GPU faults of
course.

This is caused by color buffers masked out in MRT, which we'll
eventually have to solve the right way... one thing at a time.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig
79b13b4376 panfrost: Expose 4 render targets
Hidden behind deqp flag as usual.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig
d56f92502e panfrost: Shrink tiler heap
128MB is excessive and 16MB is still plenty. Saves 112MB/context on
kernels without growable/heap support.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 15:25:16 -07:00
Caio Marcelo de Oliveira Filho
b6d4753568 nir/large_constants: De-duplicate constants
If a function has a constant and is called more than once, after
inlining we may end up with different variables representing the same
constant.  This commit looks into the data and de-duplicates them.

The first pass now will collect the constant data in a per variable
buffer, then de-duplication happens (by sorting then linear walk), and
the second pass will use the data in var->data.location.

One side-effect of the current implementation is that constants will
be reordered.  If this turns out to be a problem, it can be fixed.

An alternative strategy considered was to perform this on a
per-function basis and then merge the results; the problem is that we
would have to fix up the offsets during the merge.  Given the data we
have, the current patch is good enough.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-18 12:24:24 -07:00
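The "sorting then linear walk" de-duplication described in b6d4753568 can be sketched as follows. This is a minimal Python model, not the NIR pass: variables are represented by hypothetical names mapped to their constant bytes, alignment is ignored, and `deduplicate_constants` is an illustrative name.

```python
def deduplicate_constants(var_data):
    """Assign a location to each variable, sharing one location per distinct blob.

    var_data maps a (hypothetical) variable name to its constant bytes.
    Returns (locations, blob) where locations maps each name to an offset
    into the combined constant blob.
    """
    # Sort by contents so that identical constants become neighbours.
    ordered = sorted(var_data.items(), key=lambda item: (item[1], item[0]))

    locations = {}
    blob = bytearray()
    prev_data = None
    prev_offset = 0
    for name, data in ordered:
        if data != prev_data:
            # First time this content is seen: append it to the blob.
            prev_offset = len(blob)
            blob += data
            prev_data = data
        # Duplicates reuse the offset of the previous identical entry.
        locations[name] = prev_offset
    return locations, bytes(blob)
```

Because the entries are sorted by content before offsets are assigned, the constants end up in a different order than they were declared, which matches the reordering side-effect the commit message mentions.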
Caio Marcelo de Oliveira Filho
d9b67ad079 nir/large_constants: Use ralloc for var_infos
This will be used later on to allocate constant data for each
variable (and then deduplicate).  Also drop initializing found_read,
as it is already implicitly false in the literal.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-18 12:24:24 -07:00
Eric Anholt
0d8a4c67cf freedreno: Convert nir_lower_tg4_to_tex to the NIR lowering helper.
Cuts a bunch of boilerplate.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-07-18 11:28:56 -07:00
Eric Anholt
56f4ede73d freedreno: Convert load_barycentric_at_sample to the NIR lowering helper.
Cuts out a ton of boilerplate.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-07-18 11:28:56 -07:00
Eric Anholt
61098baf42 freedreno: Convert load_barycentric_at_offset to the NIR lowering helper.
Cuts out a ton of boilerplate.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-07-18 11:28:56 -07:00
Eric Anholt
cdc359c58e v3d: Use nir_shader_lower_instructions() for txf_ms lowering.
Cuts out a bunch of boilerplate.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-07-18 11:28:56 -07:00
Eric Anholt
251c64a53d nir: Allow internal changes to the instr in nir_shader_lower_instructions().
v3d's NIR txf_ms lowering wants to swizzle around the input coordinates in
NIR, but doesn't generate a new txf_ms instruction as replacement.  It's
pretty easy to allow that in nir_shader_lower_instructions, and it may be
common in lowering passes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-18 11:28:56 -07:00
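The behaviour described in 251c64a53d, where a lowering callback may edit an instruction in place instead of returning a replacement, can be modelled with a small sentinel value. The sketch below is a Python toy, not the NIR API: `Instr`, `PROGRESS`, `lower_instructions` and `swizzle_coords` are all hypothetical names that only mirror the idea in the commit message.

```python
from dataclasses import dataclass, field

# Sentinel meaning "the callback modified the instruction in place".
# The name is hypothetical; it only mirrors the idea in the commit message.
PROGRESS = object()

@dataclass
class Instr:
    op: str
    srcs: list = field(default_factory=list)

def lower_instructions(instrs, should_lower, lower):
    """Run `lower` on every instruction accepted by `should_lower`.

    `lower` may return None (no change), PROGRESS (instruction edited in
    place), or a list of replacement instructions.
    Returns (new_instruction_list, progress_flag).
    """
    out, progress = [], False
    for instr in instrs:
        result = lower(instr) if should_lower(instr) else None
        if result is None:
            out.append(instr)
        elif result is PROGRESS:
            out.append(instr)      # keep the same instruction, now modified
            progress = True
        else:
            out.extend(result)     # replace with the new instructions
            progress = True
    return out, progress

def swizzle_coords(instr):
    # Toy in-place lowering: reverse the coordinate sources.
    instr.srcs.reverse()
    return PROGRESS
```

With this shape, a pass that only rearranges sources still reports progress without having to build a new instruction.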
Eric Anholt
c0640035fb vc4: Convert vc4_nir_lower_txf_ms to nir_shader_lower_instructions().
Cuts out a bunch of boilerplate.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-07-18 11:28:56 -07:00
Eric Anholt
40e7609603 v3d: Fix assertion failures in debug builds.
nir_lower_io leaves around deref_var instructions after lowering away
deref intrinsics.  This ends up breaking validation after v3d_nir_lower_io
removes variables not actually being stored by the shader's
store_output()s.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-07-18 11:28:56 -07:00
Alyssa Rosenzweig
1bced0fad2 panfrost: Handle Z24 textures
Just use the Z32 code.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig
f29c084960 panfrost/ci: Update expectations
We just fixed some stencil tests.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig
fad76470d5 panfrost: Make scissor test more robust
See v3d implementation.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig
5c554e235d panfrost: Use correct NO_DITHER field on MFBD
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig
676b9339dd panfrost: Implement Z32F(_S8) support
Z32F uses a dedicated float path. Z32F_S8 uses separate stencil planes
in the hardware, lowered via u_transfer_helper.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig
479185a1cd panfrost/decode: Don't disassemble NULL shaders
It is legal to load a shader from a NULL address, particularly when the
TILER job is used strictly for effects on the Z/S buffer with 0x0 color
mask. Don't crash the decoder in this case.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig
65d89097b8 panfrost: Copy stencil front to back if back disabled
When backside stenciling is disabled, backfacing primitives just do the
same thing as frontfacing primitives.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-18 10:42:43 -07:00
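The rule stated in 65d89097b8 amounts to a small fallback when choosing the back-face stencil state. The sketch below is a Python model under assumed names (`StencilFace`, `effective_stencil` and the field names are hypothetical), not the panfrost driver code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StencilFace:
    enabled: bool
    func: str = "always"
    ref: int = 0
    mask: int = 0xFF

def effective_stencil(front, back):
    """Return the (front, back) state to program.

    If back-face stenciling is disabled, back-facing primitives behave
    like front-facing ones, so the front state is reused for the back.
    """
    if not back.enabled:
        back = front
    return front, back
```

When the back face has its own enabled state, both faces are passed through unchanged.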
Jan Zielinski
6f7306c029 swr/rast: Refactor memory API between rasterizer core and swr
This commit cleans up API between the core of the rasterizer and swr.
Some formatting changes are also done.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-07-18 16:17:00 +02:00
Andreas Baierl
4627a0c4eb lima/ppir: Add gl_PointCoord handling
Treat gl_PointCoord as a system value and
add the necessary bits for correct codegen.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 13:20:39 +00:00
Andreas Baierl
3523233027 gallium: Add PIPE_CAP_TGSI_FS_POINT_IS_SYSVAL
This adds an option to treat gl_PointCoord as a system value.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 13:20:39 +00:00
Andreas Baierl
3349a60f6f nir/tgsi: Extend tgsi_to_nir.c to support gl_PointCoord as a system value.
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 13:20:39 +00:00
Andreas Baierl
f5804f1768 nir: Add gl_PointCoord system value
gl_PointCoord handling needs some special bits set in lima/ppir code
generation. Treating gl_PointCoord as a system value makes it easier
to distinguish from a regular varying.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 13:20:39 +00:00
Andreas Baierl
24af57407c glsl: Optionally declare gl_PointCoord as a system value
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 13:20:39 +00:00
Connor Abbott
b178fdf486 lima/gp: Fix problem with complex moves
When writing the scheduler, we forgot that you can't read the complex
unit in certain sources because it gets overwritten to 0 or 1. Fixing
this turned out to be possible without giving up and reducing
GPIR_VALUE_REG_NUM to 10, although it was difficult in a way I didn't
expect. There can be at most 4 next-max nodes that can't have moves
scheduled in the complex slot, so it actually isn't a problem for
getting the number of next-max nodes at 5 or lower. However, it is a
problem for stores. If a given node is a next-max node whose move cannot
go in the complex slot *and* is used by a store that we decide to
schedule, we have to reserve one of the non-complex slots for a move
instead of all the slots, or we can wind up in a situation where only
the complex slot is free and we fail the move. This means that we have
to add another term to the reservation logic, for stores whose children
cannot be in the complex slot.

Acked-by: Qiang Yu <yuq825@gmail.com>
2019-07-18 14:33:23 +02:00