Commit graph

19871 commits

Author SHA1 Message Date
Eduardo Lima Mitev
1d8111ebac i915g: Remove a few unused variables
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-28 08:59:50 +02:00
Marek Olšák
d500c9b060 Revert "radeonsi: get the raster config from AMDGPU on SI"
This reverts commit fc99cb3c9e.

"The performance went down from 64.7 to 51.4 fps in Valley and from 30.8 to
25.1 fps in Heaven on Radeon HD 7970. Other games seem to have also a 10-25%
performance decrease."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102429

It looks like we can't use the raster config values from the kernel.
2017-08-27 22:27:23 +02:00
Christian Gmeiner
67fc3e37a7 etnaviv: use correct param for etna_compatible_rs_format(..)
Found by code inspection.

Fixes: c9e8b49b88 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-26 17:20:39 +02:00
Ilia Mirkin
f623e1742f a2xx: fix DST_ALPHA blending for non-alpha formats
If we're rendering to a format without alpha, convert DST_ALPHA blend to
a ONE so that factors are properly computed. This same workaround is
done on a3xx+ as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-25 00:18:34 -04:00
Ilia Mirkin
f3bde890cd a2xx: set constant blend color
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-25 00:18:33 -04:00
Timothy Arceri
f40908f2d1 radeonsi: set IF_THRESHOLD to 4
In 74e39de932 it was set to 3 and it was reported that 4 caused
tesseract to start spilling VGPRs. This no longer seems to be the
case.

Totals:
SGPRS: 2787844 -> 2787764 (-0.00 %)
VGPRS: 1713121 -> 1712717 (-0.02 %)
Spilled SGPRs: 7532 -> 7532 (0.00 %)
Spilled VGPRs: 49 -> 33 (-32.65 %)
Private memory VGPRs: 2060 -> 2060 (0.00 %)
Scratch size: 2200 -> 2180 (-0.91 %) dwords per thread
Code Size: 79265520 -> 79248360 (-0.02 %) bytes
LDS: 436 -> 436 (0.00 %) blocks
Max Waves: 670535 -> 670608 (0.01 %)
Wait states: 0 -> 0 (0.00 %)

Before:
 VGPR SPILLING APPS   Shaders SpillVGPR  PrivVGPR ScratchSize
 EffectsCaveDemo          301         0       256       264
 ReflectionsSubwayDemo    264         0       256       264
 VehicleGame              295         0       128       132
 bioshock-infinite       1140         0       448       516
 dirt-showdown            453        33         0        28
 gang-beasts              364         0       500       496
 kerbal-space-program    1228         0       472       480
 tomb-raider-ultra       1199        16         0        20

After:
 VGPR SPILLING APPS   Shaders SpillVGPR  PrivVGPR ScratchSize
 EffectsCaveDemo          301         0       256       264
 ReflectionsSubwayDemo    264         0       256       264
 VehicleGame              295         0       128       132
 bioshock-infinite       1140         0       448       516
 dirt-showdown            453        33         0        28
 gang-beasts              364         0       500       496
 kerbal-space-program    1228         0       472       480

The only change in VGPR spills is the elimination of all spills
in Tomb Raider at Ultra settings. Closer examination shows that
the shaders go over the limit because they contain three
expressions a mul, rcp and ubo load. The ubo load is actually
used elsewhere and is therefore stored in a temp already in IR
such as tgsi but glsl ir counts it agaist the if cost.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-08-25 14:09:32 +10:00
Timothy Arceri
ea2515d780 glsl: pass shader source keys to the disk cache
We don't actually write them to disk here. That will happen in the
following commit.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-25 13:20:29 +10:00
Marek Olšák
fc99cb3c9e radeonsi: get the raster config from AMDGPU on SI
Not sure yet if we wanna do this on CIK and VI too.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-24 23:54:55 +02:00
Marek Olšák
28d5c30179 radeonsi: clean up setting GRBM_GFX_INDEX
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-24 23:54:55 +02:00
Marek Olšák
0b50f0915b radeonsi: move PA_SC_RASTER_CONFIG emission into a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-24 23:54:55 +02:00
Ilia Mirkin
96be442b77 nv50/ir: properly set sType for TXF ops to U32
All of the coordinates and LOD args are integers for TXF. This mostly
doesn't matter, except for converting into a levelZero=true operation by
removing an explicit zero LOD. For the comparison against zero to work
properly, the sType of the instruction has to be set correctly.

Fixes: KHR-GL45.robust_buffer_access_behavior.texel_fetch
Reported-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-24 08:41:57 -04:00
Tim Rowley
f0602dc920 swr: limit pipe_draw_info->restart_index usage
Only copy this value when in restart drawing mode.

Eliminates valgrind errors when running trivial programs.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-23 11:37:50 -05:00
Samuel Pitoiset
7fb4b6f270 radeonsi: fix wrong assertion in si_init_bindless_descriptors()
Bad mistake, sorry.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-23 17:13:44 +02:00
Leo Liu
89f75c9483 radeon/video: Return false explicitly for HEVC if not the case
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-23 10:51:14 -04:00
Nicolai Hähnle
83c5d12d9d gallium/radeon: fix saving multi-part command streams
Use the correct type to fix pointer arithmetic.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:09 +02:00
Nicolai Hähnle
6fdd7ba32e radeonsi: update comment describing indices into sctx->descriptors
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:01 +02:00
Samuel Pitoiset
94cc01105e radeonsi: do not assert when reserving bindless slot 0
When assertions were disabled, the compiler removed
the call to util_idalloc_alloc() and the first allocated
bindless slot was 0 which is invalid per the spec.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-23 13:38:56 +02:00
Samuel Pitoiset
f4ec41ecc4 radeonsi: rename some bindless-related helper functions
I think it makes more sense.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:37:07 +02:00
Samuel Pitoiset
9141d13214 radeonsi: minor cleanups in si_make_{texture,image}_handle_resident()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:37:05 +02:00
Leo Liu
398a299f7b radeon/vcn: enable P016 mode support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-22 15:13:34 -04:00
Leo Liu
df6c087a38 radeon/vcn: correct target buffer pitch calculation
since the way should be as same as UVD

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-22 15:12:19 -04:00
Marek Olšák
497506ad93 gallium: remove TGSI opcode SCS
use COS+SIN instead.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2017-08-22 16:42:17 +02:00
Marek Olšák
cdaaf66566 gallium: remove TGSI opcode BREAKC
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:33:48 +02:00
Marek Olšák
985e6b5ef9 gallium: remove TGSI opcode XPD
use MUL+MAD+MOV instead.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
3e2ff8fade gallium: remove TGSI opcode DPH
use DP4 or DP3 + ADD.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
86e6f7a73b gallium: remove TGSI opcode DP2A
use DP3 instead.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
0bb367830a gallium: remove TGSI_OPCODE_CALLNZ
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
068c3ad2cb gallium: remove TGSI FENCE opcodes
use MEMBAR instead

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
44716655e6 gallium: remove TGSI opcodes PUSHA, POPA, SAD, TXQ_LZ
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
8dadb07790 radeonsi: emit VGT_REUSE_OFF in the right place
clip_regs aren't marked dirty when writes_viewport_index is changed.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
a6fed63f27 radeonsi: add support for TGSI opcodes DCEIL, DFLR, DROUND, DSSG, DTRUNC
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
addd48194a radeonsi: use a faster version of PK2H
+ 4 piglit regressions, but it's correct accorcing to the GL spec and
performance is more important than piglit.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
dc2ac03669 radeonsi: don't decompress Z/S if there is no HTILE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
e96259fabe gallium/radeon: add helpers for whether HTILE is enabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
7dec48b81e radeonsi/gfx9: don't flush L2 metadata for DB if not needed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
aa64e24cb1 radeonsi/gfx9: don't flush L2 metadata for CB if not needed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
5b62eb237c radeonsi/gfx9: don't flush TC L2 between rendering and texturing if not needed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
287b0a28f4 radeonsi/gfx9: use correct TC flush flags when invalidating CB & DB
Now we can finally stop flushing L2 data.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
54c2c771bd radeonsi/gfx9: don't use GS scenario A for VS writing ViewportIndex
Vulkan doesn't do it anymore.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
776fcccabf gallium/radeon: clean up EOP_DATA_SEL magic numbers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
a57f588fa9 radeonsi/gfx9: set 'not a query' for r600_gfx_write_event_eop correctly
0 is PIPE_QUERY_OCCLUSION_COUNTER, which is not what we want.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
a65afda768 radeonsi/gfx9: prevent shader-db crashes
- don't precompile LS and ES (they don't exist on GFX9), compile as VS instead
- don't precompile HS and GS (we don't have LS and ES parts)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
fdef2f0fd1 radeonsi/gfx9: properly handle imported textures with unexpected swizzle mode
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
113278ee79 radeonsi: remove Constant Engine support
We have come to the conclusion that it doesn't improve performance.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
166823bfd2 radeonsi/gfx9: add a temporary workaround for a tessellation driver bug
The workaround will do for now. The root cause is still unknown.

This fixes new piglit: 16in-1out

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Samuel Pitoiset
39a35eb0c1 radeonsi: try to re-use previously deleted bindless descriptor slots
Currently, when the array is full it is resized but it can grow
over and over because we don't try to re-use descriptor slots.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:37 +02:00
Samuel Pitoiset
c2dfa9b111 radeonsi: use slot indexes for bindless handles
Using VRAM address as bindless handles is not a good idea because
we have to use LLVMIntToPTr and the LLVM CSE pass can't optimize
because it has no information about the pointer.

Instead, use slots indexes like the existing descriptors. Note
that we use fixed 16-dword slots for both samplers and images.
This doesn't really matter because no real apps use image handles.

This improves performance with DOW3 by +7%.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:29 +02:00
Samuel Pitoiset
50349f404d radeonsi: add si_emit_global_shader_pointers() helper
To share common code between rw buffers and bindless descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:24 +02:00
Samuel Pitoiset
a5ff4a8e2e radeonsi: only initialize dirty_mask when CE is used
Looks like it's useless to initialize that field when CE is
unused. This will also allow to declare more than 64 elements
for the array of bindless descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:23 +02:00
Samuel Pitoiset
a29ef75565 radeonsi: make some si_descriptors fields 32-bit
The number of bindless descriptors is dynamic and we definitely
have to support more than 256 slots.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:21 +02:00