Commit graph

92185 commits

Author SHA1 Message Date
Neha Bhende
1b415a5b28 svga: add function svga_linear_to_srgb()
This function will return compatible svga srgb format for corresponding
linear format

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-28 21:03:06 -06:00
Neha Bhende
6e06e281c6 glx: add missing sRGB attribute check in fbconfigs_compatible()
This patch will allow driver to choose srgb capable FBconfig
if GLX_FRAMEBUFFER_SRGB_CAPABLE_ARB attribute is 1

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-28 21:03:06 -06:00
Thomas Hellstrom
ca59fd1706 svga: Add a more elaborate format compatibility determination v2
dri3 is a bit sloppy about its format compatibility requirements, so add
a possibility to import xrgb surfaces as argb textures and vice versa.

At the same time, make the svga_texture_from_handle() function a bit more
readable and fix the error path where we leaked a winsys surface.

v2: Addressed review comments by Brian.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-28 21:03:06 -06:00
Tim Rowley
18d5c452d0 swr/rast: add memory api to SwrGetInterface()
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:57:09 -05:00
Tim Rowley
a46539af11 swr/rast: use gather instruction for odd format fetch
Small fetch performance optimization - use gather instruction
for odd format fetch instead of slow emulated code.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:57:02 -05:00
Tim Rowley
eff909de7d swr/rast: enable SIMD16 8x2 tile backend
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:56 -05:00
Tim Rowley
5fde2ae533 swr/rast: add SwrInit() to init backend/memory tables
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:50 -05:00
Tim Rowley
e8d58049f6 swr/rast: increment depth/stencil tile pointer in SIMD16 BE
Misplaced #endif preventing depth and stencil hot tile pointers
from incrementing in SIMD16 8x2 configuration of BackendPixelRate.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:42 -05:00
Tim Rowley
d4c1486737 swr/rast: add SwrGetInterface() function to return api
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:34 -05:00
Tim Rowley
dabd0499a6 swr/rast: enable per-warp scratch space for CS
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:28 -05:00
Tim Rowley
0424e6249a swr/rast: reduce simd{16}vertex stack for VS output
Frontend - reduce simdvertex/simd16vertex stack usage for VS output in
ProcessDraw, fixes stack overflow in some of the deeper call stacks under
SIMD16.

1. Move the vertex store out of PA_FACTORY, and off the stack
2. Allocate the vertex store out of the aligned heap (pointer is
   temporarily stored in TLS, but will be migrated to thread pool
   along with other frontend temporary buffers).
3. Grow the vertex store as necessary for the number of verts per
   primitive, in chunks of 8/4 simdvertex/simd16vertex

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:17 -05:00
Tim Rowley
536baf507e swr/rast: remove default argument from SwrSync()
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:11 -05:00
Tim Rowley
145bf5aa5b swr/rast: remove unused variables in the SIMD16 FE
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:57 -05:00
Tim Rowley
20f3a30219 swr/rast: move construction of const above goto
Fixes gcc error for SIMD16 FE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:50 -05:00
Tim Rowley
feefd3ef4e swr/rast: name threads to aid debugging
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:40 -05:00
Tim Rowley
9b907599b6 swr/rast: disable buffer overrun warning for Assemble()
Disabling buffer overrun warning for Assemble(uint32_t slot,
simdvector *verts) due to what looks like a MSVC compiler bug
when compiling the SIMD16 FE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:33 -05:00
Tim Rowley
d523b82498 swr/rast: clean up clipper comments
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:26 -05:00
Tim Rowley
8c0e0bf141 swr/rast: add SIMDAPI decorators in binner/clipper
Fixes MSVC errors with SIMD16 FE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:20 -05:00
Tim Rowley
42d804b2a3 swr/rast: add additional jit utility functions
Not used yet.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:02 -05:00
Tim Rowley
a373f1f27a swr/rast: more flexible max attribute slots
Ability to allocate space for an arbitrary number (at compile time)
of positions in the vertex layout.

Removes KNOB_NUM_ATTRIBUTES from knobs.h, replaces the VTX slot
number #defines with the SWR_VTX_SLOTS enum (which contains
replacement for NUM_ATTRIBUTES: SWR_VTX_NUM_SLOTS)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:53:39 -05:00
Kenneth Graunke
54d42cd976 i965: Drop BRW_NEW_CONTEXT from 3DSTATE_DS/GS on Gen7-7.5.
We already have BRW_NEW_BATCH, which completely covers all the cases
that BRW_NEW_CONTEXT would handle.  Drop it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
1d0e974406 i965: Drop _NEW_TRANSFORM from 3DSTATE_DS/GS on Gen7-7.5.
There's no reason for this as far as I can tell.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
a1f12574b0 i965: Set point rasterization rule to UPPER_RIGHT on Gen6-7.5.
Gen4-5 and Gen8+ already set this, but Gen6-7.5 did not.  We ought to
be consistent - the answer depends on the API, not the hardware generation.

The Sandybridge PRM says about RASTRULE_UPPER_RIGHT:

   "To match OpenGL point rasterization rules (round to +infinity, where
    this is the upper right direction wrt OpenGL screen origin of lower
    left).

So this is likely the one we should use.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
4878ab9bd4 i965: Always set AALINEDISTANCE_TRUE on Sandybridge.
We set this unconditionally on every other platform.  Zero (Manhattan)
isn't even listed as an option in the Sandybridge docs - only "true".

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
b625bcc601 i965: Use true AA line distance on G45/Ironlake.
The original Broadwater and Crestline platforms computed antialiased
line distances using "manhattan" distance, aka a + b = c.  Eaglelake
and Cantiga added "true" distance, which apparently does something
like max(a, b) + min(a, b) / 4.  Not exactly "true", but at least
more accurate.

The G45 documentation indicates that the old manhattan distance setting
is "only for debug purposes" and should never be used.  The Ironlake
documentation no longer mentions AALINEDISTANCE_MANHATTAN, though it
does still contain the narrative about the feature.

At any rate, we should use the more accurate mode.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-28 17:03:33 -07:00
Andres Gomez
81149c8f52 docs: add news item and link release notes for 17.0.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-04-29 01:21:17 +03:00
Andres Gomez
e06aec99f2 docs: add sha256 checksums for 17.0.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 6cb65ce2d3)
2017-04-29 01:20:51 +03:00
Andres Gomez
0ad8c4f375 docs: add release notes for 17.0.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 61b134a862)
2017-04-29 01:19:51 +03:00
Marek Olšák
7a515a607c radeonsi: don't load unused compute shader input SGPRs and VGPRs
Basically, don't load GRID_SIZE or BLOCK_SIZE if they are unused, determine
whether to load BLOCK_ID for each component separately, and set the number
of THREAD_ID VGPRs to load. Now we should get the maximum CS launch wave
rate in most cases.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:57:44 +02:00
Marek Olšák
46e48d4044 tgsi/scan: record compute shader system value usage
v2: just do indexing with swizzle[i]

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
fa15436e63 radeonsi: add a HUD query for draw calls with primitive restart
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
55445ff189 radeonsi: tell LLVM not to remove s_barrier instructions
LLVM 5.0 removes s_barrier instructions if the max-work-group-size
attribute is not set. What a surprise.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
0490074cab radeonsi: fix tess offchip offset for per-patch attributes
We need 4 more bits there. I don't know what is fixed by this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
4e50062028 radeonsi: pass tessellation ring addresses via user SGPRs
This removes s_load_dword latency for tess rings.

We need just 1 SGPR for the address if we use 64K alignment. The final asm
for recreating the descriptor is:

    // s2 is (address >> 16)
    s_mov_b32 s3, 0
    s_lshl_b64 s[4:5], s[2:3], 16
    s_mov_b32 s6, -1
    s_mov_b32 s7, 0x27fac

v2: bitcast the descriptor type from v2i64 to v4i32

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
2823e15f60 radeonsi: use si_insert_input_ret in si_llvm_emit_tcs_epilogue
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
9fd9a7d0ba radeonsi: remove VS epilog code, compile VS with PrimID export on demand
The use of PrimID in the pixel shader is too rare to deserve such
a sizable support code.

The initial idea of the VS epilog was to move the clipping code there and
remove it based on states, but optimized variants are now used to do that
and are easier to support, so the VS epilog has turned out to be not so
useful.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
3b2e93e472 radeonsi: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3
VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
678d568c7b radeonsi: don't load PrimID in TES if it's not used
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
808c33f6f0 radeonsi: explain (non-)monolithic shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
fc478248f3 radeonsi/gfx9: enable OpenGL 4.5
Tentatively enable it, expecting the scratch buffer support to be done before
the next Mesa release.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
ed9a51cd3b radeonsi/gfx9: 2nd shader of merged shaders should hold a reference of the 1st
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
ef40937854 radeonsi: add reference counting for shader selectors
The 2nd shader of merged shaders should take a reference of the 1st shader.
The next commit will do that.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
6c15e15af4 radeonsi/gfx9: set VGT_VERTEX_REUSE for ES in ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
887ef1de34 radeonsi/gfx9: set TES registers for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
49cd0cbfd5 radeonsi/gfx9: disallow scratch buffer for LS-HS and ES-GS
not implemented yet

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
2857b14bba radeonsi/gfx9: always compile monolithic ES-GS (asynchronously)
In addition to the non-monolithic variant.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
a82398a8f5 radeonsi/gfx9: add support for monolithic ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
6a9c20fdd5 radeonsi/gfx9: make sure the 1st shader's main part exists for merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
7df682c291 radeonsi/gfx9: select shader parts for non-monolithic ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
cd99c442c4 radeonsi/gfx9: add GS prolog support for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00