Commit graph

117163 commits

Author SHA1 Message Date
Arno Messiaen
a9391a1a01 lima: add cubemap support
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-10-31 06:29:31 +00:00
Arno Messiaen
9890590fba lima: introduce ppir_op_load_coords_reg to differentiate between loading texture coordinates straight from a varying vs loading them from a register
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-10-31 06:29:31 +00:00
Arno Messiaen
28e1d55d6e lima: add layer_stride field to lima_resource struct
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-10-31 06:29:31 +00:00
Arno Messiaen
f3686083a4 lima: fix stride in texture descriptor
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-10-31 06:29:31 +00:00
Ian Romanick
7b3f38ef69 intel/compiler: Report the number of non-spill/fill SEND messages on vec4 too
This make shader-db's report.py work on Haswell and earlier platforms.
The problem is that the script would detect the "sends" output for
scalar shaders and expect in in vec4 shaders too.  When it didn't find
it, the script would fail with:

    Traceback (most recent call last):
      File "./report.py", line 351, in <module>
        main()
      File "./report.py", line 182, in main
        before_count = before[p][m]
    KeyError: 'sends'

Fixes: f192741ddd ("intel/compiler: Report the number of non-spill/fill SEND messages")

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 21:27:03 -07:00
Tapani Pälli
b380d47998 nir: fix couple of compile warnings
Fixes "warning: braces around scalar initializer" warnings.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 00:21:44 +00:00
Bas Nieuwenhuizen
ec770085c2 radv: Fix timeout handling in syncobj wait.
libdrm returns -errno instead of directly the ioctl ret of -1.

Fixes: 1c3cda7d27 "radv: Add syncobj signal/reset/wait to winsys."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-31 00:48:17 +01:00
Ilia Mirkin
1b9d1e13d8 nv50/ir: mark STORE destination inputs as used
Observed an issue when looking at the code generatedy by the
image-vertex-attrib-input-output piglit test. Even though the test
itself worked fine (due to TIC 0 being used for the image), this needs
to be fixed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2019-10-30 19:13:18 -04:00
Ilia Mirkin
869e32593a gm107/ir: fix loading z offset for layered 3d image bindings
Unfortuantely we don't know if a particular load is a real 2d image (as
would be a cube face or 2d array element), or a layer of a 3d image.
Since we pass in the TIC reference, the instruction's type has to match
what's in the TIC (experimentally). In order to properly support
bindless images, this also can't be done by looking at the current
bindings and generating appropriate code.

As a result all plain 2d loads are converted into a pair of 2d/3d loads,
with appropriate predicates to ensure only one of those actually
executes, and the values are all merged in.

This goes somewhat against the current flow, so for GM107 we do the OOB
handling directly in the surface processing logic. Perhaps the other
gens should do something similar, but that is left to another change.

This fixes dEQP tests like image_load_store.3d.*_single_layer and GL-CTS
tests like shader_image_load_store.non-layered_binding without breaking
anything else.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "20.0" <mesa-stable@lists.freedesktop.org>
2019-10-30 19:12:36 -04:00
Lionel Landwerlin
e02c181bfd intel/dev: set default num_eu_per_subslice on gen12
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8125d7960b ("intel/dev: Add preliminary device info for Tigerlake")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-30 22:30:09 +00:00
Dylan Baker
4226952199 docs/new_features: Empty the feature list for the 20.0 cycle 2019-10-30 15:18:27 -07:00
Dylan Baker
1fdcc2494e Bump VERSION to 20.0.0-devel 2019-10-30 14:56:02 -07:00
Jordan Justen
98da208660
docs/relnotes/new_features.txt: Add note about gen12 support
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-30 14:08:51 -07:00
Jordan Justen
2b186264cc
intel/eu/validate/gen12: Add TGL to eu_validate tests.
These reworks were combined into this patch:

 * Matt Turner: i965: Disable NoDDChk/NoDDClr test on Gen12+
 * Francisco Jerez: intel/eu/validate/gen12: Disable
   qword_low_power_no_depctrl eu_validate test.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 14:08:51 -07:00
Jordan Justen
8125d7960b
intel/dev: Add preliminary device info for Tigerlake
Reworks:
 * adjust 64-bit support, hiz (Jason Ekstrand)
 * sim-id (Lionel Landwerlin)
 * adjust threads, urb size (Rafael Antognolli)
 * adjust urb size (Kenneth Graunke)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 14:08:48 -07:00
Lionel Landwerlin
632995227c intel/dump_gpu: handle context create extended ioctl
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-30 21:58:31 +02:00
Bas Nieuwenhuizen
ae454a03b7 radv: Allocate space for temp. semaphore parts.
Calculated the number for allocation and did not
reserve space ....

Fixes: 2117c53b72 "radv: Add temporary datastructure for submissions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 20:51:39 +01:00
Rafael Antognolli
3c317e8187 anv: Add Tile Cache Flush for Unified Cache. 2019-10-30 19:51:03 +00:00
Rafael Antognolli
a99c67b690 blorp: Add Tile Cache Flush for Unified Cache. 2019-10-30 19:51:03 +00:00
Rafael Antognolli
d3995c19eb iris: Add Tile Cache Flush for Unified Cache. 2019-10-30 19:51:03 +00:00
Jordan Justen
f573cd4757 intel/genxml: Add gen12 tile cache flush bit
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-30 19:51:03 +00:00
Daniel Schürmann
8678699918 aco: implement VGPR spilling
VGPR spilling is implemented via MUBUF instructions and scratch memory.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
c79972b604 aco: always set scratch_offset in startpgm
This patch also moves private_segment_buffer and
scratch_offset to Program to easily access it.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
b0de16b7de aco: omit linear VGPRs as spill variables
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
aded548e66 aco: ensure that spilled VGPR reloads are done after p_logical_start
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
a7ff1bb5b9 aco: simplify calculation of target register pressure when spilling
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Rhys Perry
e73de4e1d8 aco: fix new_demand calculation for first instructions
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
93b42a1907 aco: don't add interferences between spilled phi operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
fdf8ad0256 aco: consider loop_exit blocks like merge blocks, even if they have only one predecessor
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
d48d72e98a aco: don't insert the exec mask into set of live-out variables when spilling
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
cd20e29de1 aco: fix transitive affinities of spilled variables
Variables spilled on both branch legs need to be assigned to the same spilling slot.
These affinities can be transitive through multiple merge blocks.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
8023dcd71e aco: fix live-range splits of phis
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
655a703349 aco: remove potential critical edge on loops.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
78bca0d0ce aco: improve live variable analysis
This patch makes the live variable analysis more precise
w.r.t. killed phi operands and the block's register pressure.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:32 +00:00
Daniel Schürmann
0b8216b2cd aco: Lower to CSSA
Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes.
Previously, it was possible that phi operands have intersecting live-ranges, and thus,
couldn't get spilled to the same spilling slot. For this reason, ACO tried to avoid to
spill phis, even if it was beneficial.
This patch implements a conversion pass which is currently only called if spilling is necessary.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:32 +00:00
Jonathan Marek
329d322a16 etnaviv: fix non-pointsprite points on GC7000L
Fixes these deqp tests (and more):
dEQP-GLES2.functional.draw.draw_arrays.points.single_attribute
dEQP-GLES2.functional.draw.draw_arrays.points.multiple_attributes
dEQP-GLES2.functional.draw.draw_arrays.points.default_attribute
dEQP-GLES2.functional.draw.draw_elements.points.single_attribute
dEQP-GLES2.functional.draw.draw_elements.points.multiple_attributes
dEQP-GLES2.functional.draw.draw_elements.points.default_attribute

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-30 19:44:41 +00:00
Jonathan Marek
ad5cbbd228 etnaviv: stencil fix
The final version of previous stencil fix patch ended up breaking one-sided
stencil.

Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L):
dEQP-GLES2.functional.fragment_ops.depth_stencil.*

Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0

Fixes: 05da025f ("etnaviv: fix two-sided stencil")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-30 19:44:41 +00:00
Jonathan Marek
7b524e1acb etnaviv: fix depth bias
Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L):
dEQP-GLES2.functional.polygon_offset.*

Fixes: 6c3c05dc ("etnaviv: fix polygon offset")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-30 19:44:41 +00:00
Jordan Justen
b529db00ee
iris: Set MOCS for external surfaces to uncached
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 12:42:54 -07:00
Rafael Antognolli
ffb46b2bb7 iris: Align fast clear color state buffer to a page.
On gen11 and older, compressed images are tiled and aligned to 4K. On
gen12 this 4K alignment restriction was removed. However, only aligning
the fast clear color buffer to 64B (a cacheline, as it's on the
documentation) is causing some bugs where the fast clear color is not
converted during the fast clear operation. Aligning things to 4K seems
to fix it.

v2: Fix typo case in the comment (Nanley)
v3: Rebase and fix conflicts.
v4: Fix rebase mistake (Nanley).

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-30 19:41:29 +00:00
Rafael Antognolli
e51722a7c7 anv: Align fast clear color state buffer to a page.
On gen11 and older, compressed images are tiled and aligned to 4K. On
gen12 this 4K alignment restriction was removed. However, only aligning
the fast clear color buffer to 64B (a cacheline, as it's on the
documentation) is causing some bugs where the fast clear color is not
converted during the fast clear operation. Aligning things to 4K seems
to fix it.

v2: Assert that image->planes[plane].offset is 4K aligned (Nanley)

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-30 19:41:29 +00:00
Erik Faye-Lund
477f019812 zink: only enable KHR_external_memory_fd if supported
While we're at it, make sure we error out if it's not supported when
required.

This brings us a bit closer to being able to test on SwiftShader, which
doesn't currently support KHR_external_memory_fd.
2019-10-30 19:40:50 +00:00
Bas Nieuwenhuizen
780c937a5d radv: Start signalling semaphores in WSI acquire.
Winsys semaphores without signal operation get silently ignored.

Not so for syncobjs, so actually signal them.

Fixes: 84d9551b23 "radv: Always enable syncobj when supported for all fences/semaphores."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2030
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 19:42:10 +01:00
Rhys Perry
e1bcc7a828 aco: rename README to README.md
Closes: #1974
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 18:16:00 +00:00
Rhys Perry
d4684a294b aco: a couple loop handling fixes for GFX10 hazard pass
It was joining from the wrong blocks and block.kind is a bitmask instead
of an enum.

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-30 18:13:53 +00:00
Matt Turner
12d3b11908 intel/compiler: Add instruction compaction support on Gen12
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
c8fbc8823f intel/compiler: Make separate src0/src1 index tables
TGL uses different data (and even a different format!) for each source.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
cde73625f8 intel/compiler: Inline get_src_index()
TGL will have separate tables for src0 and src1, so the shared function
will no longer make sense.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
d0eff8a539 intel/compiler: Restructure instruction compaction in preparation for Gen12
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
ded9fb2b18 intel/compiler: Remove unreachable() from brw_reg_type.c
The EU compaction unit test fuzzes the compaction code by flipping bits.
We use a simple skip_bits() function with a list of reserved bits to
ignore, but for more complex cases like invalid combinations of register
file:type, we need either machinery to check validity or for these
functions to simply inform us whether a combination was valid.

enum brw_reg_type a 4-bit field in brw_reg, so rather than expanding it
with an "INVALID" value, just return -1 and let the caller check for
that.

Scott suggested redefining unreachable() within the unit test to
longjmp() which would allow driver code like this to still use it and
allow the test to handle expected failures like this. If that plan works
out, I plan to revert this.
2019-10-30 11:11:50 -07:00