Commit graph

91471 commits

Author SHA1 Message Date
Tim Rowley
feefd3ef4e swr/rast: name threads to aid debugging
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:40 -05:00
Tim Rowley
9b907599b6 swr/rast: disable buffer overrun warning for Assemble()
Disabling buffer overrun warning for Assemble(uint32_t slot,
simdvector *verts) due to what looks like a MSVC compiler bug
when compiling the SIMD16 FE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:33 -05:00
Tim Rowley
d523b82498 swr/rast: clean up clipper comments
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:26 -05:00
Tim Rowley
8c0e0bf141 swr/rast: add SIMDAPI decorators in binner/clipper
Fixes MSVC errors with SIMD16 FE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:20 -05:00
Tim Rowley
42d804b2a3 swr/rast: add additional jit utility functions
Not used yet.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:02 -05:00
Tim Rowley
a373f1f27a swr/rast: more flexible max attribute slots
Ability to allocate space for an arbitrary number (at compile time)
of positions in the vertex layout.

Removes KNOB_NUM_ATTRIBUTES from knobs.h, replaces the VTX slot
number #defines with the SWR_VTX_SLOTS enum (which contains
replacement for NUM_ATTRIBUTES: SWR_VTX_NUM_SLOTS)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:53:39 -05:00
Kenneth Graunke
54d42cd976 i965: Drop BRW_NEW_CONTEXT from 3DSTATE_DS/GS on Gen7-7.5.
We already have BRW_NEW_BATCH, which completely covers all the cases
that BRW_NEW_CONTEXT would handle.  Drop it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
1d0e974406 i965: Drop _NEW_TRANSFORM from 3DSTATE_DS/GS on Gen7-7.5.
There's no reason for this as far as I can tell.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
a1f12574b0 i965: Set point rasterization rule to UPPER_RIGHT on Gen6-7.5.
Gen4-5 and Gen8+ already set this, but Gen6-7.5 did not.  We ought to
be consistent - the answer depends on the API, not the hardware generation.

The Sandybridge PRM says about RASTRULE_UPPER_RIGHT:

   "To match OpenGL point rasterization rules (round to +infinity, where
    this is the upper right direction wrt OpenGL screen origin of lower
    left).

So this is likely the one we should use.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
4878ab9bd4 i965: Always set AALINEDISTANCE_TRUE on Sandybridge.
We set this unconditionally on every other platform.  Zero (Manhattan)
isn't even listed as an option in the Sandybridge docs - only "true".

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
b625bcc601 i965: Use true AA line distance on G45/Ironlake.
The original Broadwater and Crestline platforms computed antialiased
line distances using "manhattan" distance, aka a + b = c.  Eaglelake
and Cantiga added "true" distance, which apparently does something
like max(a, b) + min(a, b) / 4.  Not exactly "true", but at least
more accurate.

The G45 documentation indicates that the old manhattan distance setting
is "only for debug purposes" and should never be used.  The Ironlake
documentation no longer mentions AALINEDISTANCE_MANHATTAN, though it
does still contain the narrative about the feature.

At any rate, we should use the more accurate mode.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-28 17:03:33 -07:00
Andres Gomez
81149c8f52 docs: add news item and link release notes for 17.0.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-04-29 01:21:17 +03:00
Andres Gomez
e06aec99f2 docs: add sha256 checksums for 17.0.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 6cb65ce2d3)
2017-04-29 01:20:51 +03:00
Andres Gomez
0ad8c4f375 docs: add release notes for 17.0.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 61b134a862)
2017-04-29 01:19:51 +03:00
Marek Olšák
7a515a607c radeonsi: don't load unused compute shader input SGPRs and VGPRs
Basically, don't load GRID_SIZE or BLOCK_SIZE if they are unused, determine
whether to load BLOCK_ID for each component separately, and set the number
of THREAD_ID VGPRs to load. Now we should get the maximum CS launch wave
rate in most cases.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:57:44 +02:00
Marek Olšák
46e48d4044 tgsi/scan: record compute shader system value usage
v2: just do indexing with swizzle[i]

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
fa15436e63 radeonsi: add a HUD query for draw calls with primitive restart
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
55445ff189 radeonsi: tell LLVM not to remove s_barrier instructions
LLVM 5.0 removes s_barrier instructions if the max-work-group-size
attribute is not set. What a surprise.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
0490074cab radeonsi: fix tess offchip offset for per-patch attributes
We need 4 more bits there. I don't know what is fixed by this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
4e50062028 radeonsi: pass tessellation ring addresses via user SGPRs
This removes s_load_dword latency for tess rings.

We need just 1 SGPR for the address if we use 64K alignment. The final asm
for recreating the descriptor is:

    // s2 is (address >> 16)
    s_mov_b32 s3, 0
    s_lshl_b64 s[4:5], s[2:3], 16
    s_mov_b32 s6, -1
    s_mov_b32 s7, 0x27fac

v2: bitcast the descriptor type from v2i64 to v4i32

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
2823e15f60 radeonsi: use si_insert_input_ret in si_llvm_emit_tcs_epilogue
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
9fd9a7d0ba radeonsi: remove VS epilog code, compile VS with PrimID export on demand
The use of PrimID in the pixel shader is too rare to deserve such
a sizable support code.

The initial idea of the VS epilog was to move the clipping code there and
remove it based on states, but optimized variants are now used to do that
and are easier to support, so the VS epilog has turned out to be not so
useful.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
3b2e93e472 radeonsi: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3
VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
678d568c7b radeonsi: don't load PrimID in TES if it's not used
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
808c33f6f0 radeonsi: explain (non-)monolithic shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
fc478248f3 radeonsi/gfx9: enable OpenGL 4.5
Tentatively enable it, expecting the scratch buffer support to be done before
the next Mesa release.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
ed9a51cd3b radeonsi/gfx9: 2nd shader of merged shaders should hold a reference of the 1st
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
ef40937854 radeonsi: add reference counting for shader selectors
The 2nd shader of merged shaders should take a reference of the 1st shader.
The next commit will do that.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
6c15e15af4 radeonsi/gfx9: set VGT_VERTEX_REUSE for ES in ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
887ef1de34 radeonsi/gfx9: set TES registers for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
49cd0cbfd5 radeonsi/gfx9: disallow scratch buffer for LS-HS and ES-GS
not implemented yet

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
2857b14bba radeonsi/gfx9: always compile monolithic ES-GS (asynchronously)
In addition to the non-monolithic variant.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
a82398a8f5 radeonsi/gfx9: add support for monolithic ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
6a9c20fdd5 radeonsi/gfx9: make sure the 1st shader's main part exists for merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
7df682c291 radeonsi/gfx9: select shader parts for non-monolithic ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
cd99c442c4 radeonsi/gfx9: add GS prolog support for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
e0570bc283 radeonsi/gfx9: add VS prolog support for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
6b93452b24 radeonsi/gfx9: pass GS input SGPRs and VGPRs from the ES part to GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
37e22ab65e radeonsi/gfx9: store ES outputs to LDS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
d616c57342 radeonsi/gfx9: load GS inputs from LDS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
fc781fa0ab radeonsi/gfx9: get GS wave ID from the correct input
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
bcaf905129 radeonsi/gfx9: add the function signature of merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
8b220877ad radeonsi/gfx9: set registers and shader key for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
ab197ad8d1 radeonsi/gfx9: add GS user SGPRs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
b2f5d03152 radeonsi: rename declare_tess_lds -> declare_lds_as_pointer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
e3caa1cd36 radeonsi: simplify some shader type conditions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
021e65640e radeonsi: rename the swizzle parameter of lds_store
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
dcea7e5d19 radeonsi: add si_shader::prolog2
For a GS prolog in merged ES-GS.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
eb35238ffe radeonsi/gfx9: move RW_BUFFERS to s[0:1] for merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
0af00f179e radeonsi/gfx9: add support for monolithic merged LS-HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00