Commit graph

103421 commits

Author SHA1 Message Date
Ross Burton
d7c4ce1d1d egl: fix build race in automake
There is a parallel make build issue in src/egl/drivers/dri2/
for wayland builds. Can be reproduced with:

$ rm src/egl/drivers/dri2/*.h src/egl/drivers/dri2/platform_wayland.lo
$ make -C src/egl/ drivers/dri2/platform_wayland.lo
../../../mesa-18.1.2/src/egl/drivers/dri2/platform_wayland.c:50:10: fatal error: linux-dmabuf-unstable-v1-client-protocol.h: No such file or directory

This patch adds the missing dependency.

Fixes: 02cc359372 "egl/wayland: Use linux-dmabuf interface for buffers"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

[Eric: fixed up the commit title]
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-06-29 12:49:51 +01:00
Marek Olšák
5a6414f135 radeonsi: implement vertex color clamping for tess and GS 2018-06-28 22:41:12 -04:00
Marek Olšák
034b385fc2 radeonsi: move VS_STATE_SGPR before draw SGPRs
for vertex color clamping.
2018-06-28 22:27:25 -04:00
Marek Olšák
0c554bc5d5 radeonsi: don't use malloc in si_generate_gs_copy_shader 2018-06-28 22:27:25 -04:00
Marek Olšák
7bac3b589c radeonsi: disable DCC statistics gathering on everything but Stoney
I think we don't need it on other chips.
2018-06-28 22:27:25 -04:00
Marek Olšák
0da94fa19c radeonsi: don't enable DCC statistics gathering for small surfaces 2018-06-28 22:27:25 -04:00
Marek Olšák
f8b0c54e3f radeonsi: simplify logic around vi_separate_dcc_try_enable 2018-06-28 22:27:25 -04:00
Marek Olšák
41f80373b4 radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
2018-06-28 22:27:25 -04:00
Marek Olšák
fb28bf23db radeonsi: remove references to Evergreen 2018-06-28 22:27:25 -04:00
Marek Olšák
1542169a4a radeonsi: enable shader caching for compute shaders
Compute shaders were not using the shader cache.
2018-06-28 22:27:25 -04:00
Marek Olšák
d77557c9db radeonsi: store compute local_size into tgsi_shader_info
This is kinda a hack, but it's enough for the shader cache.
2018-06-28 22:27:25 -04:00
Marek Olšák
d13f240269 radeonsi: unify duplicated code for initial shader compilation 2018-06-28 22:27:25 -04:00
Marek Olšák
8e9c57a7fe ac: set +auto-waitcnt-before-barrier when needed
This removes useless s_waitcnt before barriers.
Only radeonsi uses this function.
2018-06-28 22:27:25 -04:00
Marek Olšák
7d6ec9d43b radeonsi/gfx9: insert the barrier between merged shaders inside the if block 2018-06-28 22:27:25 -04:00
Joe M. Kniss
70425bcfe6 gallium: plumb invariant output attrib thru TGSI
Add support for glsl 'invariant' modifier for output data declarations.
Gallium drivers that use TGSI serialization currently loose invariant
modifiers in glsl shaders.

v2: use boolean for invariant instead of unsigned.

Tested: chromiumos on qemu with virglrenderer.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-06-29 11:11:54 +10:00
Francisco Jerez
c2c803be7b intel/fs: Build 32-wide FS shaders.
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-28 13:25:21 -07:00
Jason Ekstrand
b95b0e2918 intel/anv,blorp,i965: Implement the SKL 16x MSAA SIMD32 workaround
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-06-28 13:25:18 -07:00
Jason Ekstrand
d5e028a57b intel/fs: Add fields to wm_prog_data for SIMD32 dispatch
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
bcbc7d3a17 intel/fs: Fix nir_intrinsic_load_helper_invocation for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
7144247c2c intel/fs: Fix fs_builder::sample_mask_reg() for 32-wide FS dispatch.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
37c1df28c9 intel/fs: Fix Gen6+ interpolation setup for SIMD32
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
e208bc3bb7 intel/fs: Get rid of MOV_DISPATCH_TO_FLAGS
We can just emit the MOV in the two places where we use this.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
5e3028d826 intel/fs: Emit MOV_DISPATCH_TO_FLAGS once for the centroid workaround
There's no reason for us to emit it a pile of times and then have a
whole pass to clean it up.  Just emit it once like we really want.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
40fe108e2b intel/fs: Generalize the unlit centroid workaround
This generalizes the unlit centroid workaround so it's less code and now
supports SIMD32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
1d381731e0 intel/fs: Fix sample id setup for SIMD32.
v2 (Jason Ekstrand):
 - Disallow gl_SampleId in SIMD32 on gen7

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
2fd0aed89a intel/fs: Fix Gen7 compressed source region alignment restriction for SIMD32
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
6909aed90e intel/fs: Implement 32-wide FS payload setup on Gen6+
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
f6c4aace22 intel/fs: Extend thread payload layout to SIMD32
And handle 32-wide payload register reads in fetch_payload_reg().

v2 (Jason Ekstrand);
 - Fix some whitespace and brace placement

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
8f143f70d6 intel/fs: Wrap FS payload register look-up in a helper function.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
d996e5b812 intel/fs: Use fs_regs instead of brw_regs in the unlit centroid workaround
While we're here, we change to using horiz_offset() instead of abusing
half().

v2 (Jason Ekstrand):
 - Use horiz_offset() instead of half()

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
38aee1a06d intel/fs: Simplify fs_visitor::emit_samplepos_setup
The original code manually handled splitting the MOVs to 8-wide to
handle various regioning restrictions.  Now that we have a SIMD width
splitting pass that handles these things, we can just emit everything at
the full width and let the SIMD splitting pass handle it.  We also now
have a useful "subscript" helper which is designed exactly for the case
where you want to take a W type and read it as a vector of Bs so we may
as well use that too.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
244a0ff3a8 i965: Add plumbing for shader time in 32-wide FS dispatch mode.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
2d7d652d5c intel/fs: Disable opt_sampler_eot() in 32-wide dispatch.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
db6ca13efc intel/fs: Emit LINE+MAC for LINTERP with unaligned coordinates
On g4x through Sandy Bridge, src1 (the coordinates) of the PLN
instruction is required to be an even register number.  When it's odd
(which can happen with SIMD32), we have to emit a LINE+MAC combination
instead.  Unfortunately, we can't just fall through to the gen4 case
because the input registers are still set up for PLN which lays out the
four src1 registers differently in SIMD16 than LINE.

v2 (Jason Ekstrand):
 - Take advantage of both accumulators and emit LINE LINE MAC MAC
   (Based on a patch from Francisco Jerez)
 - Unify the gen4 and gen4x-6 cases using a loop

v3 (Jason Ekstrand):
 - Don't unify gen4 with gen4x-6 as this turns out to be more fragile
   than first thought without reworking the gen4 barycentric coordinate
   layout.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
566e6abd6d intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN
When we don't have PLN (gen4 and gen11+), we implement LINTERP as either
LINE+MAC or a pair of MADs.  In both cases, the accumulator is written
by the first of the two instructions and read by the second.  Even
though the accumulator value isn't actually ever used from a logical
instruction perspective, it is trashed so we need to make the scheduler
aware.  Otherwise, the scheduler could end up re-ordering instructions
and putting a LINTERP between another an instruction which writes the
accumulator and another which tries to use that result.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
73d60455e9 intel/fs: Rework INTERPOLATE_AT_PER_SLOT_OFFSET
This reworks INTERPOLATE_AT_PER_SLOT_OFFSET to work more like an ALU
operation and less like a send.  This is less code over-all and, as a
side-effect, it now properly handles execution groups and lowering so
SIMD32 support just falls out.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
74b477039d intel/fs: Add the group to the flag subreg number on SNB and older
We want consistent behavior in the meaning of the flag_subreg field
between SNB and IVB+.

v2 (Jason Ekstrand):
 - Add some extra commentary

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
2aefa5e19f intel/fs: Fix FB read header setup for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
e06f5b30cc intel/fs: Fix logical FB write lowering for SIMD32
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
ce370902d4 intel/fs: Fix FB write message control codegen for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
8b788069fb intel/fs: Don't enable dual source blend if no outputs are written
This prevents a crash in some arb_enhanced_layouts tests that would be
caused by the next commit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
48241c780a intel/fs: Fix codegen of FS_OPCODE_SET_SAMPLE_ID for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
789d20df36 intel/eu: Fix pixel interpolator queries for SIMD32.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
1650442026 intel/fs: Disable SIMD32 dispatch for fragment shaders with discard.
Current discard handling requires dedicating the second flag register to
discard.  However, control-flow in SIMD32 requires both flag registers
so it's incompatible with the current discard handling.  Just don't
support SIMD32+discard for now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
1811cbdc25 intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow
The hardware's control flow logic is 16-wide so we're out of luck
here.  We could, in theory, support SIMD32 if we know the control-flow
is uniform but we don't have that information at this point.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
d5b617a28e intel/fs: Split instructions low to high in lower_simd_width
Commit 0d905597f fixed an issue with the placement of the zip and unzip
instructions.  However, as a side-effect, it reversed the order in which
we were emitting the split instructions so that they went from high
group to low instead of low to high.  This is fine for most things like
texture instructions and the like but certain render target writes
really want to be emitted low to high.  This commit just switches the
order back around to be low to high.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 0d905597f "intel/fs: Be more explicit about our placement of [un]zip"
2018-06-28 13:19:38 -07:00
Jason Ekstrand
0b830081f0 intel/fs: Rework KSP data to be SIMD width-based
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
9d78abbef8 intel/compiler: Add and use helpers for working with KSP indices
The pixel shader dispatch table is kind-of a confusing mess.  This adds
some helpers for dealing with it and for easily extracting the correct
data from wm_prog_data.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Jason Ekstrand
85750348bc i965: Re-arrange shader kernel setup in WM state
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00
Francisco Jerez
5b6e91dd35 intel/fs: Remove program key argument from generator.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-06-28 13:19:38 -07:00