Commit graph

27730 commits

Author SHA1 Message Date
Marek Olšák
05e741c6d6 radeonsi: set LLVM denormal flags
- make sure FP32 denormals will stay disabled in LLVM in the future
  (the current default is disabled)
- tell LLVM that FP64 denormals are enabled

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-06-24 12:31:03 +02:00
Marek Olšák
0e1fefa722 radeonsi: emit 1/sqrt for RSQ
We don't need the clamped version and we don't have to use any intrinsic.

Stats on Tonga:

15382 shaders in 9128 tests
Totals:
SGPRS: 1230560 -> 1230560 (0.00 %)
VGPRS: 469577 -> 462504 (-1.51 %)
Code Size: 22089908 -> 21730052 (-1.63 %) bytes
LDS: 598 -> 598 (0.00 %) blocks
Scratch: 283648 -> 281600 (-0.72 %) bytes per wave
Max Waves: 125664 -> 126969 (1.04 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 547280 -> 547280 (0.00 %)
VGPRS: 269132 -> 262059 (-2.63 %)
Code Size: 15709604 -> 15349748 (-2.29 %) bytes
LDS: 198 -> 198 (0.00 %) blocks
Scratch: 74752 -> 72704 (-2.74 %) bytes per wave
Max Waves: 47840 -> 49145 (2.73 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-06-24 12:31:03 +02:00
Jan Vesely
54c4d525da r600g: Enable FMA on chips that support it
v2: Merge with PIPE_SHADER_CAP_DOUBLES
    Add CHIP_HEMLOCK

v3: only set the instruction on EG and CM

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-06-24 12:30:59 +02:00
Marek Olšák
cbb5adb908 gallium/u_queue: allow the execute function to differ per job
so that independent types of jobs can use the same queue.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
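The idea of a per-job execute function can be sketched roughly as follows. This is not the u_queue code: every name here (`sketch_job`, `sketch_run_jobs`, the two sample handlers) is hypothetical and only illustrates how a callback stored in each job lets unrelated job types share one queue.

```c
#include <stddef.h>

/* Hypothetical job record: the callback travels with the job instead of
 * being fixed once for the whole queue. */
typedef void (*sketch_execute_func)(void *job_data);

struct sketch_job {
   void *data;
   sketch_execute_func execute;
};

/* Run a batch of jobs in order; each job selects its own handler. */
static void
sketch_run_jobs(const struct sketch_job *jobs, size_t count)
{
   for (size_t i = 0; i < count; i++)
      jobs[i].execute(jobs[i].data);
}

/* Two unrelated job kinds, used only for illustration. */
static void sketch_increment(void *data) { *(int *)data += 1; }
static void sketch_double(void *data)    { *(int *)data *= 2; }
```

Storing the function pointer in the job costs one pointer per entry, which is presumably acceptable for the flexibility the commit describes.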
Marek Olšák
4a06786efd gallium/u_queue: reduce the number of mutexes by 2
by converting semaphores to condvars and using the main mutex

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák
2fba0aaa70 gallium/u_queue: add an option to name threads
for debugging

v2: correct the snprintf use

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
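A bounded thread name can be built with snprintf along these lines. The "name:index" format, the 16-byte limit (the Linux limit for thread names, including the terminator) and the helper name are assumptions made for this sketch, not taken from the commit.

```c
#include <stdio.h>
#include <string.h>

/* Assumed limit: Linux thread names hold 15 characters plus the NUL. */
#define SKETCH_THREAD_NAME_MAX 16

/* Format "<queue_name>:<index>" into out, truncating if needed.
 * snprintf takes the full buffer size and always NUL-terminates.
 * Returns the length of the name actually stored. */
static size_t
sketch_format_thread_name(char out[SKETCH_THREAD_NAME_MAX],
                          const char *queue_name, int index)
{
   snprintf(out, SKETCH_THREAD_NAME_MAX, "%s:%i", queue_name, index);
   return strlen(out);
}
```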
Marek Olšák
404d0d50d8 gallium/u_queue: add an option to have multiple worker threads
independent jobs don't have to be stuck on only one thread

v2: use CALLOC & FREE

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
Marek Olšák
4358f6dd13 gallium/u_queue: rewrite util_queue_fence to allow multiple waiters
Checking "signalled" is first done without a mutex, then with a mutex.
Also, checking without waiting doesn't lock the mutex. This is racy, but
should be safe.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
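The check-then-lock pattern described above might look like the sketch below. It is an illustration under assumptions, not the util_queue_fence implementation: the `sketch_fence` names are hypothetical, and the flag is a C11 `atomic_bool` so that the unlocked read is well defined, whereas the commit describes a plain racy read.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_fence {
   pthread_mutex_t mutex;
   pthread_cond_t cond;
   atomic_bool signalled;
};

static void
sketch_fence_init(struct sketch_fence *f)
{
   pthread_mutex_init(&f->mutex, NULL);
   pthread_cond_init(&f->cond, NULL);
   atomic_init(&f->signalled, false);
}

static void
sketch_fence_signal(struct sketch_fence *f)
{
   pthread_mutex_lock(&f->mutex);
   atomic_store(&f->signalled, true);
   pthread_cond_broadcast(&f->cond);   /* wake every waiter, not just one */
   pthread_mutex_unlock(&f->mutex);
}

/* Poll without waiting: never touches the mutex. */
static bool
sketch_fence_is_signalled(struct sketch_fence *f)
{
   return atomic_load(&f->signalled);
}

static void
sketch_fence_wait(struct sketch_fence *f)
{
   /* Fast path: check without the mutex. */
   if (atomic_load(&f->signalled))
      return;

   /* Slow path: check again under the mutex before sleeping. */
   pthread_mutex_lock(&f->mutex);
   while (!atomic_load(&f->signalled))
      pthread_cond_wait(&f->cond, &f->mutex);
   pthread_mutex_unlock(&f->mutex);
}
```

Using a broadcast rather than a single wake-up is what allows several threads to wait on the same fence.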
Marek Olšák
d8367e91f2 gallium/u_queue: use a ring instead of a stack
and allow specifying its size in util_queue_init.

v2: use CALLOC & FREE

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-24 12:24:40 +02:00
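A ring with a caller-chosen size could be sketched as below. The `sketch_ring` type and its functions are hypothetical, store plain ints instead of jobs, and use calloc/free directly rather than Mesa's CALLOC and FREE wrappers; locking is left out to keep the example short.

```c
#include <stdbool.h>
#include <stdlib.h>

struct sketch_ring {
   int *items;
   unsigned max_items;   /* capacity chosen at init time */
   unsigned num_items;
   unsigned read_idx;
   unsigned write_idx;
};

static bool
sketch_ring_init(struct sketch_ring *r, unsigned max_items)
{
   r->items = calloc(max_items, sizeof(*r->items));
   r->max_items = max_items;
   r->num_items = r->read_idx = r->write_idx = 0;
   return r->items != NULL;
}

static void
sketch_ring_destroy(struct sketch_ring *r)
{
   free(r->items);
   r->items = NULL;
}

/* Append at the write index; fails when the ring is full. */
static bool
sketch_ring_push(struct sketch_ring *r, int item)
{
   if (r->num_items == r->max_items)
      return false;
   r->items[r->write_idx] = item;
   r->write_idx = (r->write_idx + 1) % r->max_items;
   r->num_items++;
   return true;
}

/* Remove from the read index, so the oldest item comes out first. */
static bool
sketch_ring_pop(struct sketch_ring *r, int *item)
{
   if (r->num_items == 0)
      return false;
   *item = r->items[r->read_idx];
   r->read_idx = (r->read_idx + 1) % r->max_items;
   r->num_items--;
   return true;
}
```

Unlike a stack, the ring hands items back in submission order, which is usually what a job queue wants.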
Giuseppe Bilotta
60a27ad122 Remove wrongly repeated words in comments
Clean up misrepetitions ('if if', 'the the' etc) found throughout the
comments. This has been done manually, after grepping
case-insensitively for duplicate if, is, the, then, do, for, an,
plus a few other typos corrected in fly-by

v2:
    * proper commit message and non-joke title;
    * replace two 'as is' followed by 'is' to 'as-is'.
v3:
    * 'a integer' => 'an integer' and similar (originally spotted by
      Jason Ekstrand, I fixed a few other similar ones while at it)

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-06-23 13:55:03 -07:00
Brian Paul
5d07998317 svga: update some comments in svga_buffer_handle()
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 13:02:28 -06:00
Brian Paul
fe76212873 svga: add a const qualifier in svga_buffer_upload_piecewise()
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 13:02:28 -06:00
Brian Paul
e82fa96d19 svga: minor code refactor for svga_buffer_upload_command()
Put the HBS code into a separate function.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 13:02:28 -06:00
Brian Paul
db721da5a3 svga: minor code simplification in svga_context_finish()
Signed-off-by: Brian Paul <brianp@vmware.com>

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 13:02:28 -06:00
Tim Rowley
a16d274032 swr: [rasterizer core] fix dependency bug
Never be dependent on "draw 0", instead have a bool that makes the draw
dependent on the previous draw or not dependent at all.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:51:11 -05:00
Tim Rowley
73a9154bde swr: [rasterizer core] use wrap-around safe compares for dependency checking
Move drawIDs from 64-bit to 32-bit to increase perf.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:51:06 -05:00
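A wrap-around safe ordering test for 32-bit IDs is commonly written as a signed difference. The helper below is a generic sketch of that technique with a hypothetical name; it is not taken from the swr sources and assumes the two IDs are never more than 2^31 apart.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if draw ID a was issued before draw ID b.
 * The unsigned subtraction wraps modulo 2^32; reinterpreting the result
 * as signed gives the right ordering even after the counter overflows.
 * The uint32_t to int32_t conversion is implementation-defined in C but
 * behaves as two's complement on mainstream compilers. */
static bool
sketch_id_before(uint32_t a, uint32_t b)
{
   return (int32_t)(a - b) < 0;
}
```

A plain `a < b` would report that ID 0 comes before ID 0xFFFFFFFF, which is wrong right after the counter wraps.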
Tim Rowley
dd189536dc swr: [rasterizer jitter] add support for component packing for 'odd' formats
Add early-out if no components are enabled. Add asserts.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:51:00 -05:00
Tim Rowley
35935ca4f2 swr: [rasterizer core] track whether GS outputs viewport array index
So we can skip the index gather in PA.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:55 -05:00
Tim Rowley
2d80295a6e swr: [rasterizer core] GS viewport array index attribute
Only adds the attribute mapping to the jitter; no implementation yet.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:47 -05:00
Tim Rowley
c7cd33b605 swr: [rasterizer core] conservative rasterization frontend support
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:41 -05:00
Tim Rowley
c867c22d85 swr: [rasterizer core] stop single threaded crash exit crash
Function static destructors were getting called by exit
handlers before context teardown.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:36 -05:00
Tim Rowley
0f025eb478 swr: [rasterizer jitter] small fetch jit cleanup
Handle SGV stores separate from the stream fetch code.

Because of this change, there is a potential to jit an extra unused store.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:30 -05:00
Tim Rowley
eca877f27b swr: [rasterizer core] remove old comment
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:25 -05:00
Tim Rowley
d3d97f8395 swr: [rasterizer jitter] cleanup supporting different llvm versions
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:19 -05:00
Tim Rowley
42215e6116 swr: [rasterizer jitter] uninitialized component fix in fetch jit
Was trying to store an extra uninitialized component.
Only affects component packing, which isn't enabled (yet).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:12 -05:00
Tim Rowley
b6d2c96851 swr: [rasterizer] add support for building avx512 version
Currently, most code paths between AVX2 and AVX512 are identical
(see changes to knobs.h).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:50:05 -05:00
Tim Rowley
695af2a7e2 swr: [rasterizer common] fix include for Intel compiler
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:49:59 -05:00
Tim Rowley
95f21a9766 swr: [rasterizer common] workaround clang for windows __cpuid() bug
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 10:49:46 -05:00
Tim Rowley
9ca741c645 swr: push/pop DEBUG macro around llvm includes
llvm redefines DEBUG; adding push/pop prevents an undefined reference
to debug_refcnt_state in llvm-3.7+.

v2: add undef DEBUG

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-23 09:58:08 -05:00
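The push/undef/pop pattern can be sketched as follows, using the `push_macro` and `pop_macro` pragmas supported by GCC, Clang and MSVC. The stand-in definition takes the place of the real LLVM include, and the value 1 and the variable name are illustrative assumptions.

```c
/* The project's own DEBUG definition (value chosen for illustration). */
#define DEBUG 1

#pragma push_macro("DEBUG")
#undef DEBUG
/* A third-party header that defines its own DEBUG would be included here;
 * this stand-in definition plays that role. */
#define DEBUG(x) ((void)0)
#pragma pop_macro("DEBUG")

/* After the pop, DEBUG expands to the project's value again. */
static const int sketch_debug_value = DEBUG;
```

Without the `#undef`, the third-party definition would collide with the saved one and trigger a redefinition warning.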
Brian Paul
4f5d513755 svga: rename svga_surface_copy() to svga_resource_copy_region()
To be consistent with the pipe_context function name.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 07:31:20 -06:00
Brian Paul
743ff588f2 svga: don't copy blit_info into local var
There's no reason for doing so.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 07:31:20 -06:00
Brian Paul
e0dc3c5f19 gallium/util: fix some 4-space indentation in blitter code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-06-23 07:31:20 -06:00
Charmaine Lee
2aa9ff0cda svga: fix texture array update regression
With commit fb9fe35, we start using transfer_inline_write
for memcpy TexSubImage path, but that triggers a regression with
texture array in the svga driver.

With this patch, the direct map code will update the texture array
correctly.

Fixes VMware bug 1679293.

Tested with MTT piglit, glretrace, conform.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-23 07:31:20 -06:00
Charmaine Lee
d4a77254cb svga: fix index/vertex buffer surface reference at draw
Currently with the SetVertexBuffers optimization, we avoid emitting
redundant DXSetVertexBuffers commands. However, these buffer surfaces
will still need to be referenced, otherwise, in the case of Linux,
the subsequent surface discard map will map to the existing mob instead
of a new one, causing rendering artifacts.

With this patch, we'll call resource_rebind() to reference the resources
even if we are avoiding the actual set command. This fixes the
rendering artifacts in the window title area running with unity in
Ubuntu 14.04

Tested with piglit, glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2016-06-23 07:31:20 -06:00
Charmaine Lee
2b81e31d44 svga: fix vertex buffer references in the hw state
This patch fixes three issues with vertex buffer references:
(1) Instead of copying the vertex buffer resource handles to the hw state
    in the context structure, use pipe_resource_reference to properly
    reference the vertex buffer resources in the context.

(2) Make sure to unbind those unused vertex buffer resources.

(3) Force a rebind of the vertex buffer resources at the first draw of each
    command buffer to make sure the vertex buffer resources are paged in.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-23 07:31:20 -06:00
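The difference between copying a handle and taking a reference can be illustrated with a small reference-counting helper. This is a simplified, hypothetical analogue of what a call such as pipe_resource_reference does, not Gallium's code: the `sketch_resource` type is invented, the count is not atomic, and destruction is only recorded in a flag.

```c
#include <stddef.h>

struct sketch_resource {
   int refcount;
   int destroyed;   /* set instead of freeing, to keep the sketch simple */
};

/* Point *dst at src, adjusting the reference counts on both sides.
 * Passing src == NULL releases whatever *dst held (an "unbind"). */
static void
sketch_resource_reference(struct sketch_resource **dst,
                          struct sketch_resource *src)
{
   struct sketch_resource *old = *dst;

   if (old == src)
      return;
   if (src)
      src->refcount++;
   if (old && --old->refcount == 0)
      old->destroyed = 1;
   *dst = src;
}
```

A bare pointer copy would skip both count updates, so the resource could be destroyed while the hw state still points at it.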
Charmaine Lee
a1d74f5528 svga: fix index buffer reference in the hw state
Instead of copying the index buffer resource handle to the hw state in
the context structure, use pipe_resource_reference to properly reference
the index buffer resource in the context.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-23 07:31:19 -06:00
Ilia Mirkin
1f4bca798d nv50,nvc0: fix start_instance in manual push path
The start instance is applied as an offset into the buffer directly,
ignoring the divisor, not as an instance id offset that respects the
divisor.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-21 21:50:16 -04:00
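The rule stated above, that the start instance offsets the buffer directly while only the instance ID is divided by the divisor, can be written out as a small helper. The function name is hypothetical and the sketch covers only per-instance attributes with a non-zero divisor; it is not the nv50/nvc0 push code.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the per-instance element to fetch for a given instance.
 * start_instance is added after the division, so it is not scaled
 * by the divisor. */
static uint32_t
sketch_instance_element(uint32_t start_instance, uint32_t instance_id,
                        uint32_t divisor)
{
   assert(divisor > 0);
   return start_instance + instance_id / divisor;
}
```

For example, with start_instance 3, instance_id 4 and divisor 2 this gives element 5, whereas dividing the sum (3 + 4) / 2 would give 3.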
Ilia Mirkin
5b0d64886d translate: fix start_instance parameter in sse version
The generic version gets this right already, but this was using an
incorrect formula in SSE.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-21 21:50:16 -04:00
Marek Olšák
b16d21270f radeonsi: add a debug flag for unsafe math LLVM optimizations
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-21 13:52:05 +02:00
Marek Olšák
70a25478fe radeonsi: use u_blitter for mipmap generation
This reduces the time spent in glGenerateMipmap by half.

v2: don't decompress the levels to be overwritten

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-21 13:52:05 +02:00
Marek Olšák
5fed1122e8 gallium/u_blitter: implement mipmap generation
for pipe_context::generate_mipmap

first move some of the blit code from util_blitter_blit_generic
to a separate function, then use it from util_blitter_generate_mipmap

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-21 13:52:05 +02:00
Vedran Miletić
82e0bbd01a clover: Fix build against clang SVN >= r273191
setLangDefaults() now requires PreprocessorOptions as an argument.

Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-06-21 10:08:57 +09:00
Rob Clark
64180de1bf gallium: make image_view const
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-20 12:36:20 -04:00
Rob Clark
ef534b9389 gallium: make constant_buffer const
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-20 12:36:20 -04:00
Rob Clark
e1c1c40cbc gallium: make shader_buffers const
Be consistent with the rest of the "set_xyz" state interfaces.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-20 12:36:20 -04:00
Nicolai Hähnle
1167905c41 radeonsi: use trapezoid distribution for tess on Fiji and Polaris
This yields a small performance improvement in Unigine Heaven.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-20 18:29:55 +02:00
Nicolai Hähnle
650137a9c8 radeonsi/sid: add Fiji+ tessellation distribution mode
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-20 18:29:15 +02:00
Nicolai Hähnle
32fd92e028 radeonsi: emit PA_SC_RASTER_CONFIG_1 only once
It is the same for all SEs.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-20 18:28:34 +02:00
Nicolai Hähnle
c95175581e radeonsi: fix calculation of valid RB mask per SE
The old calculation treated too many RBs as disabled.

Cc: 11.0 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-20 18:28:31 +02:00
Nicolai Hähnle
6c2e636982 radeonsi: raise SI_PM4_MAX_DW
The old limit, introduced in commit afa752d3f0,
was exceeded by 4 SE configurations which hit si_write_harvested_raster_configs.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-20 18:28:17 +02:00