Commit graph

27608 commits

Author SHA1 Message Date
Tim Rowley
4b4547a721 swr: [rasterizer] Reduce max in-flight draws to 96 (by default) 2016-03-25 14:45:39 -05:00
Tim Rowley
9111d63228 swr: [rasterizer] Fix run-time check asserts
One innocuous (uninitialized variable), and one not so innocuous
(stack corruption).
2016-03-25 14:45:39 -05:00
Tim Rowley
257db3610a swr: [rasterizer jitter] signed immediate builder 2016-03-25 14:45:39 -05:00
Tim Rowley
b958aea78a swr: [rasterizer common] changes for cygwin 2016-03-25 14:45:39 -05:00
Tim Rowley
e1222ade00 swr: [rasterizer] code styling and update copyrights 2016-03-25 14:45:14 -05:00
Tim Rowley
c75314ec67 swr: [rasterizer core] Guard against enqueuing work to invalid hot tiles 2016-03-25 14:43:15 -05:00
Tim Rowley
fee56fda6f swr: [rasterizer] Stop setting viewport size to larger than hottile array
Guard against enqueuing work to invalid tiles
2016-03-25 14:43:14 -05:00
Tim Rowley
e374d2d24b swr: [rasterizer] Discard work + misc fixes 2016-03-25 14:43:14 -05:00
Tim Rowley
542d7dec7b swr: [rasterizer] remove use of BYTE type 2016-03-25 14:43:14 -05:00
Tim Rowley
be4c558d01 swr: [rasterizer core] Fix crash that can occur when switching contexts 2016-03-25 14:43:14 -05:00
Tim Rowley
51a11658d9 swr: [rasterizer] remove unused knob 2016-03-25 14:43:14 -05:00
Tim Rowley
61beaa2279 swr: [rasterizer core] subcontext rework 2016-03-25 14:43:14 -05:00
Tim Rowley
0c18900cfb swr: [rasterizer common] add _simd_s[rl]lv_epi32 2016-03-25 14:43:14 -05:00
Tim Rowley
bef222db22 swr: [rasterizer core] Alleviate potential stack overflow for 32bit builds
Move large stack allocations in the GS and clipper into thread local storage.
2016-03-25 14:43:14 -05:00
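The move from stack to thread local storage described in bef222db22 can be pictured with a minimal sketch. Everything named here (kScratchFloats, GetScratch) is hypothetical and does not come from the swr sources; the buffer size is an arbitrary assumption.

```cpp
#include <array>
#include <cstddef>

// Hypothetical scratch size (256 KiB of floats); not taken from swr.
constexpr std::size_t kScratchFloats = 64 * 1024;

// Before: a large automatic array costs stack space on every call.
//     float scratch[kScratchFloats];

// After: each thread owns one buffer that lives outside the stack.
inline float* GetScratch() {
    thread_local std::array<float, kScratchFloats> scratch{};
    return scratch.data();
}
```

Each thread gets the same pointer back on every call, so the buffer is reused across calls rather than reallocated.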
Tim Rowley
3132f731f8 swr: [rasterizer] remove use of UCHAR and UINT64 types 2016-03-25 14:43:14 -05:00
Tim Rowley
643857f596 swr: [rasterizer] remove use of FLOAT type 2016-03-25 14:43:14 -05:00
Tim Rowley
3252fe3705 swr: [rasterizer] Fix Coverity issues reported by Mesa developers. 2016-03-25 14:43:14 -05:00
Tim Rowley
45d52673c2 swr: [rasterizer] add debug/perf category to knobs 2016-03-25 14:43:13 -05:00
Tim Rowley
1da9c8a970 swr: [rasterizer core] don't assume linux is 64-bit 2016-03-25 14:43:13 -05:00
Tim Rowley
49678803f7 swr: [rasterizer common] remove old unused win32 types 2016-03-25 14:43:13 -05:00
Tim Rowley
aca5513184 swr: [rasterizer jitter] vpermps support 2016-03-25 14:43:13 -05:00
Tim Rowley
bfb954189e swr: [rasterizer] Add rdtsc buckets support for shaders
Pass pointer to core buckets mgr back to sim layer.

Add support for RDTSC_START/RDTSC_STOP macros in the builder.

Each unique shader now has a unique bucket associated with it,
enabling more detailed reporting at the shader level. Currently
due to some llvm issue with thread local storage, 64bit runs require
single threaded mode.
2016-03-25 14:43:13 -05:00
Tim Rowley
abd4aa68cc swr: [rasterizer core] backend reorganization 2016-03-25 14:43:13 -05:00
Tim Rowley
13303f3320 swr: [rasterizer core] store blend output in temporary instead of PS output.
Fixes additive blend problem with MSAA
2016-03-25 14:26:17 -05:00
Tim Rowley
3f4fba3772 swr: [rasterizer core] Move InitializeHotTiles and corresponding clear code out of threads.cpp. 2016-03-25 14:26:17 -05:00
Tim Rowley
bdd690dc36 swr: [rasterizer jitter] Cleanup use of types inside of Builder.
Also, cached the simd width since we don't have to keep querying
the JitManager for it.
2016-03-25 14:26:17 -05:00
Tim Rowley
7ead4959a5 swr: [rasterizer jitter] Fix type mismatch on select args for SCATTERPS 2016-03-25 14:26:17 -05:00
Tim Rowley
136988b42b swr: [rasterizer core] fix rasterizing multisampling with scissor enabled
We were not evaluating the scissor edge equations at sample positions.
2016-03-25 14:26:17 -05:00
Tim Rowley
45f0ce168c swr: [rasterizer core] RingBuffer class for DC/DS
Use head/tail ring buffer indices for thread synchronization.

1. SwrWaitForIdle loops until ring is empty. (head == tail)
2. GetDrawContext waits until ring is not full. (full: (head - tail) == Ring Size)
3. Draw enqueues by incrementing head.
4. Last worker thread to move past a DC dequeues by incrementing tail.

Todo: To reduce contention we can cache the tail in the API thread. For
example, if you know you have 64 free entries in the ring then you don't
need to keep checking the tail until you used those 64 entries.
2016-03-25 14:26:17 -05:00
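The four rules in 45f0ce168c can be sketched as a small class. This is an illustration under assumptions, not the swr RingBuffer: the name RingSketch and all of its methods are hypothetical, and the waiting loops and the "last worker" bookkeeping are left out.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Hypothetical sketch of head/tail ring indices; names are not from swr.
template <typename T>
class RingSketch {
public:
    explicit RingSketch(uint64_t size) : entries_(size), size_(size) {}

    // Rule 1: the ring is idle when head == tail.
    bool IsEmpty() const { return head_.load() == tail_.load(); }

    // Rule 2: the ring is full when (head - tail) == ring size.
    bool IsFull() const { return head_.load() - tail_.load() == size_; }

    // Slot the API thread fills next; the caller checks !IsFull() first.
    T& NextEntry() { return entries_[head_.load() % size_]; }

    // Oldest slot still in flight; the caller checks !IsEmpty() first.
    T& OldestEntry() { return entries_[tail_.load() % size_]; }

    // Rule 3: a draw enqueues by incrementing head.
    void Enqueue() { head_.fetch_add(1); }

    // Rule 4: the last worker past an entry dequeues by incrementing tail.
    void Dequeue() { tail_.fetch_add(1); }

private:
    std::vector<T> entries_;
    const uint64_t size_;
    std::atomic<uint64_t> head_{0};
    std::atomic<uint64_t> tail_{0};
};
```

Because head and tail only ever increase, the fill level is always head - tail, and the slot index is the counter modulo the ring size.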
Tim Rowley
dd0f9eed8c swr: [rasterizer] switch assert uses to SWR_ASSERT 2016-03-25 14:26:16 -05:00
Tim Rowley
45a4afa634 swr: [rasterizer core] Split all RECT_LIST draws into 1 RECT per draw
Needed until proper RECT_LIST PrimAssembly code is written.
2016-03-25 14:26:16 -05:00
Tim Rowley
3a25185990 swr: [rasterizer] Add string knob type 2016-03-25 14:26:16 -05:00
Sonny Jiang
f87ed903fb radeon/vce: disable two pipe mode for Polaris11
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-03-24 23:08:04 -04:00
Sonny Jiang
0c5477465f radeon/vce: add Polaris11 VCE firmware support
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
2016-03-24 23:07:53 -04:00
Sonny Jiang
42e442d888 radeonsi: add support for Polaris (v2)
v2: Polaris chips should be defined after Stoney

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> (v1)
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1)
Signed-off-by: Leo Liu <leo.liu@amd.com> (v2 diff)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v2 diff)
2016-03-24 23:07:32 -04:00
Sonny Jiang
f5e24b19e8 winsys/amdgpu: addrlib - add Polaris support (v2)
v2: fix indentation as noted by Michel

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-24 23:06:39 -04:00
Rob Clark
61c7d20e4f ttn: remove stray global from header
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-24 16:04:54 -04:00
Samuel Pitoiset
b9c70fcdad nv50/ir: silence unhandled TGSI_PROPERTY_NEXT_SHADER info
radeonsi uses this property to make the best decision about which
shader to compile, but this is not currently used by our codegen.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-24 18:53:24 +01:00
Nicolai Hähnle
7880b81d39 radeonsi: silence a coverity warning
The following Coverity warning

5378     	tmpl.fetch_args = atomic_fetch_args;
5379     	tmpl.emit = atomic_emit;
>>>     CID 1357115:  Uninitialized variables  (UNINIT)
>>>     Using uninitialized value "tmpl". Field "tmpl.intr_name" is uninitialized.
5380     	bld_base->op_actions[TGSI_OPCODE_ATOMUADD] = tmpl;
5381     	bld_base->op_actions[TGSI_OPCODE_ATOMUADD].intr_name = "add";

... is a false positive, but what the hell. This change should "fix" it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-24 12:23:14 -05:00
Nicolai Hähnle
c4931ae174 radeonsi: fix out-of-bounds indexing of shader images
Results are undefined but may not crash. Without this change, out-of-bounds
indexing can lead to VM faults and GPU hangs.

Constant buffers, samplers, and possibly others will eventually need similar
treatment to support GL_ARB_robust_buffer_access_behavior.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-23 11:49:53 -05:00
Nicolai Hähnle
a8f5d11426 radeonsi: cache flush/invalidation for missing PIPE_BARRIER_*_BUFFER bits (v2)
This fixes arb_shader_image_load_store-host-mem-barrier.

v2: flush TC L2 for index buffers on <= CIK (Marek)

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-23 11:48:19 -05:00
Nicolai Hähnle
b15b1faefd gallium: add PIPE_BARRIER_STREAMOUT_BUFFER
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-23 11:48:02 -05:00
Marek Olšák
b8ec205515 radeonsi: fix 2D array MSAA failures since image support landed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-03-23 12:14:15 +01:00
Dave Airlie
53afbc980a tgsi: drop unused set_exec/kill_mask interfaces.
These don't get used and haven't been in git history from what I can
see, so drop them.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-22 13:07:05 +10:00
xavier
fce0b55ccb r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications.
Previously it was doing this transformation for a Trine 3 shader:
     MUL     R6.x.12,    R13.x.23, 0.5|3f000000
-    MULADD     R4.x.12,    -R6.x.12, 2|40000000, 1|3f800000
+    MULADD     R4.x.12,    -R13.x.23, -1|bf800000, 1|3f800000

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94412
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-22 07:43:13 +10:00
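The sign error in the fce0b55ccb transformation can be checked with plain arithmetic. The sketch below assumes MULADD a, b, c computes a * b + c; the function names are hypothetical, and Expected is one reading of what a correct fold would produce, not the code the fix generates.

```cpp
// Original pair: MUL R6, R13, 0.5 followed by MULADD R4, -R6, 2, 1.
inline float Original(float r13) {
    float r6 = r13 * 0.5f;
    return -r6 * 2.0f + 1.0f;    // evaluates to 1 - r13
}

// Output of the faulty fold: MULADD R4, -R13, -1, 1.
// The negation stays on R13 and is also pushed into the constant.
inline float Misfolded(float r13) {
    return -r13 * -1.0f + 1.0f;  // evaluates to 1 + r13
}

// Assumed correct fold: 0.5 * 2 collapses to 1 and the negation is kept once.
inline float Expected(float r13) {
    return -r13 * 1.0f + 1.0f;   // evaluates to 1 - r13
}
```

For r13 = 3 the original sequence yields -2 while the misfolded one yields 4, so the fold changed the shader's result.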
Samuel Pitoiset
9efd8b590f nvc0: make sure to delete samplers used by compute shaders
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-21 22:04:18 +01:00
Edward O'Callaghan
5219eb15e1 radeonsi: Set PIPE_SHADER_CAP_MAX_SHADER_IMAGES
This enables ARB_shader_image_load_store and ARB_shader_image_size.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
[allow the same number of images for all shader stages and require LLVM 3.9]

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:26 -05:00
Nicolai Hähnle
6f942ac5ee radeonsi: disable early Z if the fragment shader writes to memory
Empirically, both the EXEC_ON_* flags and LATE_Z are necessary.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Nicolai Hähnle
79762e877c tgsi/scan: add writes_memory to flag presence of stores or atomics
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00
Nicolai Hähnle
e9d935ed0e radeonsi: force the DCC enable bit off in image descriptors for writing (v2)
This avoids a lockup at least on Tonga.

v2: only force DCC off on VI+ (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:25 -05:00