27747 commits

Author SHA1 Message Date
Axel Davy
59a692916c gallium: Add a cap for offset_units_unscaled
D3D9 has a different behaviour for depth bias.

For OGL/D3D1X, the depth bias unit is the
minimal resolvable value for the depth buffer,
which depends on the format (and has different
behaviour for float depth buffers).

For D3D9, the depth bias unit is 1.0f.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-25 10:16:15 +02:00
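
The difference described in 59a692916c can be illustrated with a minimal sketch. It is not the Gallium implementation: the helper names (unorm_depth_unit, depth_bias_offset) and the units_unscaled flag are hypothetical, and the minimal resolvable value for an n-bit normalized depth buffer is assumed here to be 2^-n. Float depth buffers, which the commit message says behave differently, are not covered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: assumed minimal resolvable step of an n-bit
 * normalized depth buffer (2^-n). Real drivers may use another value. */
static double unorm_depth_unit(unsigned depth_bits)
{
   assert(depth_bits > 0 && depth_bits < 32);
   return 1.0 / (double)(1u << depth_bits);
}

/* Constant depth offset produced by 'offset_units' bias units.
 * units_unscaled == true  : D3D9-style, one unit is 1.0f.
 * units_unscaled == false : OGL/D3D1X-style, one unit is the minimal
 *                           resolvable value of the depth format. */
static double depth_bias_offset(double offset_units, unsigned depth_bits,
                                bool units_unscaled)
{
   if (units_unscaled)
      return offset_units;
   return offset_units * unorm_depth_unit(depth_bits);
}
```

With a 24-bit depth buffer, two bias units give an offset of 2.0 in the unscaled case and 2.0 / 16777216 in the scaled case, which is why a state tracker needs a way to ask for the unscaled behaviour.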
Marek Olšák
28d0d0c5b4 radeonsi: fix fractional odd tessellation spacing for Polaris
ported from Vulkan (and no source explains why this is needed)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 17:36:43 +02:00
Marek Olšák
0d638f4b3d radeonsi: set some VGT context registers on SI-CI
the kernel sets them, but other UMDs can change them

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 16:24:53 +02:00
Marek Olšák
8f3ef4e8b8 radeonsi: optimize rendering to linear color buffers
loosely ported from Vulkan

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 16:24:53 +02:00
Marek Olšák
e4b22c9fa1 radeonsi: set almost optimal settings in SC_MODE_CNTL_1
ported from Vulkan

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 16:24:53 +02:00
Marek Olšák
603c073ec2 gallium/radeon: let drivers specify SC_MODE_CNTL_1 fields
radeonsi will set more fields

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 16:24:53 +02:00
Marek Olšák
ae0d2d15cc gallium/radeon: disable complicated point clipping against user clip planes
Nothing in the GL spec says that we should expand points to triangles.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 16:24:53 +02:00
Marek Olšák
1e8adb0ee4 radeonsi: fix a compute shader hang with big threadgroups on SI & CI
ported from Vulkan

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 16:24:53 +02:00
Ilia Mirkin
b433cb51e5 nvc0: when mapping directly, provide accurate xfer info + start
We were ignoring the incoming box parameters, and were providing totally
bogus stride/layer stride, and other bits, for when a non-full-surface
map was requested.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
2016-06-24 09:53:13 -04:00
Nicolai Hähnle
0da890e62c radeonsi: drop the DRAW_PREAMBLE packet on Polaris
It will be removed from the Polaris firmware.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 13:28:46 +02:00
Nicolai Hähnle
2aa0485902 radeonsi: use DRAW_(INDEX_)INDIRECT_MULTI on Polaris
The non-MULTI variants will be removed in Polaris firmware.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 13:28:32 +02:00
Nicolai Hähnle
bc4b7ebbfd winsys/radeon: add guard pages when R600_DEBUG=check_vm is enabled
This should help flush out GPU VM faults.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:36:03 +02:00
Nicolai Hähnle
49c0b4a0db winsys/amdgpu: add guard pages when R600_DEBUG=check_vm is enabled
This should help flush out GPU VM faults.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:36:03 +02:00
Nicolai Hähnle
dbac88a839 radeonsi: report a failure to parse dmesg instead of asserting
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:36:03 +02:00
Nicolai Hähnle
d46a9db840 radeon: check VM faults from DMA flush
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:36:03 +02:00
Nicolai Hähnle
80dd7870fe radeonsi: move gfx fence wait out of si_check_vm_faults
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:36:03 +02:00
Nicolai Hähnle
ad8438403b radeonsi: extract IB and bo list saving into separate functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:36:02 +02:00
Marek Olšák
05e741c6d6 radeonsi: set LLVM denormal flags
- make sure FP32 denormals will stay disabled in LLVM in the future
  (the current default is disabled)
- tell LLVM that FP64 denormals are enabled

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-06-24 12:31:03 +02:00
Marek Olšák
0e1fefa722 radeonsi: emit 1/sqrt for RSQ
We don't need the clamped version and we don't have to use any intrinsic.

Stats on Tonga:

15382 shaders in 9128 tests
Totals:
SGPRS: 1230560 -> 1230560 (0.00 %)
VGPRS: 469577 -> 462504 (-1.51 %)
Code Size: 22089908 -> 21730052 (-1.63 %) bytes
LDS: 598 -> 598 (0.00 %) blocks
Scratch: 283648 -> 281600 (-0.72 %) bytes per wave
Max Waves: 125664 -> 126969 (1.04 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 547280 -> 547280 (0.00 %)
VGPRS: 269132 -> 262059 (-2.63 %)
Code Size: 15709604 -> 15349748 (-2.29 %) bytes
LDS: 198 -> 198 (0.00 %) blocks
Scratch: 74752 -> 72704 (-2.74 %) bytes per wave
Max Waves: 47840 -> 49145 (2.73 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-06-24 12:31:03 +02:00
Jan Vesely
54c4d525da r600g: Enable FMA on chips that support it
v2: Merge with PIPE_SHADER_CAP_DOUBLES
    Add CHIP_HEMLOCK

v3: only set the instruction on EG and CM

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:30:59 +02:00
Marek Olšák
cbb5adb908 gallium/u_queue: allow the execute function to differ per job
so that independent types of jobs can use the same queue.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák
4a06786efd gallium/u_queue: reduce the number of mutexes by 2
by converting semaphores to condvars and using the main mutex

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák
2fba0aaa70 gallium/u_queue: add an option to name threads
for debugging

v2: correct the snprintf use

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák
404d0d50d8 gallium/u_queue: add an option to have multiple worker threads
independent jobs don't have to be stuck on only one thread

v2: use CALLOC & FREE

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák
4358f6dd13 gallium/u_queue: rewrite util_queue_fence to allow multiple waiters
Checking "signalled" is first done without a mutex, then with a mutex.
Also, checking without waiting doesn't lock the mutex. This is racy, but
should be safe.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
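
The check-then-lock pattern described in 4358f6dd13 might look roughly like the sketch below. This is an assumption-laden illustration, not the util_queue_fence code: the struct and function names are hypothetical, POSIX threads are used directly, and the flag is a C11 atomic so that the unlocked read is well defined (the commit itself describes its unlocked check as racy but safe).

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical fence that any number of threads may wait on. */
struct fence {
   pthread_mutex_t mutex;
   pthread_cond_t cond;
   atomic_bool signalled;
};

static void fence_init(struct fence *f)
{
   pthread_mutex_init(&f->mutex, NULL);
   pthread_cond_init(&f->cond, NULL);
   atomic_init(&f->signalled, false);
}

static void fence_destroy(struct fence *f)
{
   pthread_cond_destroy(&f->cond);
   pthread_mutex_destroy(&f->mutex);
}

/* Non-blocking query: never takes the mutex. */
static bool fence_is_signalled(struct fence *f)
{
   return atomic_load(&f->signalled);
}

static void fence_signal(struct fence *f)
{
   pthread_mutex_lock(&f->mutex);
   atomic_store(&f->signalled, true);
   pthread_cond_broadcast(&f->cond);   /* wake every waiter */
   pthread_mutex_unlock(&f->mutex);
}

static void fence_wait(struct fence *f)
{
   /* First check without the mutex: cheap exit if already signalled. */
   if (atomic_load(&f->signalled))
      return;

   /* Second check with the mutex held, so a signal cannot be missed. */
   pthread_mutex_lock(&f->mutex);
   while (!atomic_load(&f->signalled))
      pthread_cond_wait(&f->cond, &f->mutex);
   pthread_mutex_unlock(&f->mutex);
}
```

Using a broadcast rather than a single wake-up is what lets several waiters share one fence in this sketch.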
Marek Olšák
d8367e91f2 gallium/u_queue: use a ring instead of a stack
and allow specifying its size in util_queue_init.

v2: use CALLOC & FREE

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
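
To show why a ring differs from a stack here (d8367e91f2), the following sketch implements a fixed-size ring whose capacity is chosen at init time, so jobs come out in submission order. All names are hypothetical, jobs are plain ints for brevity, and calloc/free stand in for the CALLOC and FREE macros mentioned in the commit message; no locking is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical fixed-capacity FIFO ring of jobs. */
struct ring_queue {
   int *slots;
   unsigned capacity;
   unsigned head;    /* index of the oldest queued job */
   unsigned count;   /* number of queued jobs */
};

static bool ring_init(struct ring_queue *q, unsigned capacity)
{
   assert(capacity > 0);
   q->slots = calloc(capacity, sizeof(*q->slots));
   q->capacity = capacity;
   q->head = 0;
   q->count = 0;
   return q->slots != NULL;
}

static void ring_destroy(struct ring_queue *q)
{
   free(q->slots);
   q->slots = NULL;
}

/* Returns false when the ring is full. */
static bool ring_push(struct ring_queue *q, int job)
{
   if (q->count == q->capacity)
      return false;
   q->slots[(q->head + q->count) % q->capacity] = job;
   q->count++;
   return true;
}

/* Returns false when the ring is empty; otherwise yields the oldest job. */
static bool ring_pop(struct ring_queue *q, int *job)
{
   if (q->count == 0)
      return false;
   *job = q->slots[q->head];
   q->head = (q->head + 1) % q->capacity;
   q->count--;
   return true;
}
```

A stack would hand the most recently pushed job to the worker first; the ring preserves first-in, first-out order while still using a fixed allocation.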
Giuseppe Bilotta
60a27ad122 Remove wrongly repeated words in comments
Clean up misrepetitions ('if if', 'the the' etc) found throughout the
comments. This has been done manually, after grepping
case-insensitively for duplicate if, is, the, then, do, for, an,
plus a few other typos corrected in passing.

v2:
    * proper commit message and non-joke title;
    * replace two 'as is' followed by 'is' with 'as-is'.
v3:
    * 'a integer' => 'an integer' and similar (originally spotted by
      Jason Ekstrand, I fixed a few other similar ones while at it)

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-06-23 13:55:03 -07:00
Brian Paul
5d07998317 svga: update some comments in svga_buffer_handle()
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 13:02:28 -06:00
Brian Paul
fe76212873 svga: add a const qualifier in svga_buffer_upload_piecewise()
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 13:02:28 -06:00
Brian Paul
e82fa96d19 svga: minor code refactor for svga_buffer_upload_command()
Put the HBS code into a separate function.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 13:02:28 -06:00
Brian Paul
db721da5a3 svga: minor code simplification in svga_context_finish()
Signed-off-by: Brian Paul <brianp@vmware.com>

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 13:02:28 -06:00
Tim Rowley
a16d274032 swr: [rasterizer core] fix dependency bug
Never be dependent on "draw 0", instead have a bool that makes the draw
dependent on the previous draw or not dependent at all.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:51:11 -05:00
Tim Rowley
73a9154bde swr: [rasterizer core] use wrap-around safe compares for dependency checking
Move drawIDs from 64-bit to 32-bit to increase perf.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:51:06 -05:00
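
Commit 73a9154bde does not show the comparison it uses, so the following is only a sketch of one common wrap-around safe compare for 32-bit sequence numbers. The function name is hypothetical. It assumes the two IDs are never more than 2^31 apart and relies on the usual two's-complement conversion from uint32_t to int32_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if draw ID 'a' was issued before draw ID 'b', even when the
 * 32-bit counter has wrapped between them. The unsigned subtraction wraps
 * modulo 2^32; reinterpreting the result as signed gives the distance. */
static bool draw_id_less_than(uint32_t a, uint32_t b)
{
   return (int32_t)(a - b) < 0;
}
```

A plain a < b would report that draw 0 precedes draw 0xFFFFFFFF right after the counter wraps; the signed-difference form still orders 0xFFFFFFFF before 0.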
Tim Rowley
dd189536dc swr: [rasterizer jitter] add support for component packing for 'odd' formats
Add early-out if no components are enabled. Add asserts.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:51:00 -05:00
Tim Rowley
35935ca4f2 swr: [rasterizer core] track whether GS outputs viewport array index
So we can skip the index gather in PA.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:55 -05:00
Tim Rowley
2d80295a6e swr: [rasterizer core] GS viewport array index attribute
Only adds the attribute mapping to the jitter; no implementation yet.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:47 -05:00
Tim Rowley
c7cd33b605 swr: [rasterizer core] conservative rasterization frontend support
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:41 -05:00
Tim Rowley
c867c22d85 swr: [rasterizer core] stop single threaded crash exit crash
Function static destructors were getting called by exit
handlers before context teardown.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:36 -05:00
Tim Rowley
0f025eb478 swr: [rasterizer jitter] small fetch jit cleanup
Handle SGV stores separate from the stream fetch code.

Because of this change, there is a potential to jit an extra unused store.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:30 -05:00
Tim Rowley
eca877f27b swr: [rasterizer core] remove old comment
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:25 -05:00
Tim Rowley
d3d97f8395 swr: [rasterizer jitter] cleanup supporting different llvm versions
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:19 -05:00
Tim Rowley
42215e6116 swr: [rasterizer jitter] uninitialized component fix in fetch jit
Was trying to store an extra uninitialized component.
Only affects component packing, which isn't enabled (yet).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:12 -05:00
Tim Rowley
b6d2c96851 swr: [rasterizer] add support for building avx512 version
Currently, most code paths between AVX2 and AVX512 are identical
(see changes to knobs.h).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:05 -05:00
Tim Rowley
695af2a7e2 swr: [rasterizer common] fix include for Intel compiler
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:49:59 -05:00
Tim Rowley
95f21a9766 swr: [rasterizer common] workaround clang for windows __cpuid() bug
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:49:46 -05:00
Tim Rowley
9ca741c645 swr: push/pop DEBUG macro around llvm includes
llvm redefines DEBUG; adding push/pop prevents an undefined reference
to debug_refcnt_state in llvm-3.7+.

v2: add undef DEBUG

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 09:58:08 -05:00
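
The push/pop approach named in 9ca741c645 can be sketched with the push_macro and pop_macro pragmas, which are compiler extensions supported by GCC, Clang and MSVC rather than standard C. The example is self-contained: instead of including real llvm headers it uses a stand-in block that redefines DEBUG, and both function names are hypothetical.

```c
#include <assert.h>

/* Stands in for the project's own DEBUG definition. */
#define DEBUG 1

/* Save the project's DEBUG, then clear it before the foreign header. */
#pragma push_macro("DEBUG")
#undef DEBUG

/* Stand-in for a third-party header that defines DEBUG for its own use. */
#define DEBUG(x) ((x) * 2)
static int third_party_helper(int v)
{
   return DEBUG(v);
}

/* Restore the project's DEBUG, replacing the third-party definition. */
#pragma pop_macro("DEBUG")

static int project_debug_value(void)
{
   return DEBUG;   /* the original object-like macro again */
}
```

In this sketch third_party_helper(3) evaluates the stand-in macro and project_debug_value() sees the saved definition, so code after the include is compiled as if the foreign header had never touched DEBUG.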
Brian Paul
4f5d513755 svga: rename svga_surface_copy() to svga_resource_copy_region()
To be consistent with the pipe_context function name.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 07:31:20 -06:00
Brian Paul
743ff588f2 svga: don't copy blit_info into local var
There's no reason for doing so.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 07:31:20 -06:00
Brian Paul
e0dc3c5f19 gallium/util: fix some 4-space indentation in blitter code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 07:31:20 -06:00
Charmaine Lee
2aa9ff0cda svga: fix texture array update regression
With commit fb9fe35, we start using transfer_inline_write
for memcpy TexSubImage path, but that triggers a regression with
texture array in the svga driver.

With this patch, the direct map code will update the texture array
correctly.

Fixes VMware bug 1679293.

Tested with MTT piglit, glretrace, conform.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-23 07:31:20 -06:00