Commit graph

89540 commits

Author SHA1 Message Date
Bas Nieuwenhuizen
d4ecc3c929 ac/nir: Add loading from LDS for merged GS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:29 +01:00
Bas Nieuwenhuizen
ec53e52742 ac/nir: Add ES output to LDS for GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:18 +01:00
Bas Nieuwenhuizen
3e77333030 ac/nir: Add merged GS function.
[airlied: merged fixup + and fixed up a couple more bits].

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:14 +01:00
Bas Nieuwenhuizen
f82797b56d radv: Only emit TES when it exists.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:14:14 +01:00
Bas Nieuwenhuizen
6e21b7a294 radv: Use control shader presence for detecting tess.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:11:10 +01:00
Dave Airlie
5bc5e07d81 radv: fixup tess eval shader when combined.
This fixes some access to the tess eval shader when it's combined
with geometry on gfx9.

This is a review of Bas's commit:
radv: Prevent crashing by accessing TES for VGT reuse depth.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:11:10 +01:00
Bas Nieuwenhuizen
e6acc20b6a radv: Set VGT_GS_MODE properly for gfx9
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 05:55:11 +01:00
Dave Airlie
99281c1e8f radv: ensure correct outinfo is picked.
This struct used to rely on being in a union, it isn't anymore,
so we have to pick the correct outinfo struct now.

This should fix a regression since the union became a struct.

dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set

Fixes: 6078a3bd51 (ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 14:44:09 +10:00
George Kyriazis
f9d239e11f swr: Rework scratch space allocation
Remove allocation of > 2kbyte buffers into context memory in
swr_copy_to_scatch_space() (which is used to copy small vertex/index buffers
and shader constants to a scratch space to be used by the upcoming draw.)

Large shader constant allocations need to be done in the circular scratch
buffer instead of context memory, because their values persist across
render calls.

Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger
buffers will get too large for the circular scratch space.

Fixes render issues with CEI Ensight.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 20:18:09 -05:00
Bas Nieuwenhuizen
ffaf4d608a radv: Enable tessellation shaders for GFX9.
It mostly works now.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:43 +02:00
Dave Airlie
1dda214d9c ac/nir: init full exec mask for merged shaders.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:40 +02:00
Dave Airlie
14978a1c3b radv: drop unused r600_htile_info.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 00:38:57 +01:00
Dave Airlie
c8eb3558cc radv: fix CLEAR_STATE packet length.
Looking at shader traces I noticed some registers were missing,
one of them was being eaten by the wrong clear state length.

Fixes: 4f42ea4dc (radv: use CLEAR_STATE for initializing some registers)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-19 23:56:48 +01:00
Dylan Baker
a447f9fe7b meson: don't build gallium dri target if gallium is disabled
Otherwise -Dgallium-drivers= will cause libmesa_gallium to be built and
the megadriver install script to attempt to install drivers without any
actual drivers being built.

fixes: 66f97f6640 ("meson: build radeonsi")
Reported-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2017-10-19 15:17:34 -07:00
Timothy Arceri
087e010b2b radv: copy indirect lowering settings from radeonsi
It looks the original indirect mask was probably copied from
ANV.

Sascha Willems demo results:

tessellation ~4000 -> ~4200 fps

V2: continue lowering local indirects due to llvm deficiencies.

Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 08:01:26 +11:00
Timothy Arceri
5549b47d7b radv: stop redundant setting of active_stages
We already set it when above in the nir compilation loop.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 08:01:26 +11:00
Timothy Arceri
bebfeb7e1c ac: move some code out of loop in store_tcs_output()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 08:01:26 +11:00
Bas Nieuwenhuizen
228325f4b7 radv: Modify rsrc1/rsrc2 generation for merged tess.
No OC_LDS_EN for HS, and the included LS vgpr_comp_cnt is at
a different offset.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:44 +02:00
Bas Nieuwenhuizen
8250efb90a radv: Set correct registers for merged shader rings.
We need different regs to end up in s0/s1.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:39 +02:00
Bas Nieuwenhuizen
6a074f87be radv: Add GFX9 HS emitting code.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:34 +02:00
Bas Nieuwenhuizen
b096245030 radv: Remove remaining hard coded references to VS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:31 +02:00
Bas Nieuwenhuizen
91b033f4f6 radv: Update GFX9 user data regs for GS/tess.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:27 +02:00
Bas Nieuwenhuizen
ce03c119ce radv: Add code to compile merged shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:23 +02:00
Bas Nieuwenhuizen
640f2c458f ac/nir: Add LS-HS input VGPR workaround.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:19 +02:00
Bas Nieuwenhuizen
0a182e73d9 ac/nir: Compile the bodies of multiple shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:15 +02:00
Bas Nieuwenhuizen
56d8af1ec5 ac/nir: Expand user SGPR descriptions a bit.
To prevent VS/TCS collisions in merged shaders.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:07 +02:00
Bas Nieuwenhuizen
25efef40d2 ac/nir: Don't write to the dynamic HS word on GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:04 +02:00
Bas Nieuwenhuizen
d8bd693d03 ac/nir: Add function creation for merged LS+HS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:00 +02:00
Bas Nieuwenhuizen
0cdc8b26f8 ac/nir: Make scan_shader_output_decl less dependent on the context.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:56 +02:00
Bas Nieuwenhuizen
6078a3bd51 ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:51 +02:00
Bas Nieuwenhuizen
a996ed1f9b ac/nir: Change interface to allow multiple source shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:47 +02:00
Bas Nieuwenhuizen
872b21487c ac/nir: Add HS calling convention.
Needed for GFX9 merged shaders.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:42 +02:00
Bas Nieuwenhuizen
163a4bf386 ac: Parse the new HS RSRC1 register.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:20 +02:00
Tim Rowley
bfda35c8dd swr: knob overrides for Intel Xeon Phi
Architecture benefits from having more threads/work outstanding.

Patch by Jan Zielinski.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
028ffa5e18 swr/rast: Add api to override draws in flight
Allow draws in flight to be overridden via SWR_CREATECONTEXT_INFO.

Patch by Jan Zielinski.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
2559f2b93e swr/rast: Widen fetch shader to SIMD16 (disabled for now)
Refactored the gather operation to process 16 elements at a time via
paired SIMD8 operations.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
49090ccf54 swr/rast: Change DS memory allocation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
04ea03d99d swr/rast: Fix indentation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
62e2d657c8 swr/rast: Miscellaneous viewport array code changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
ed1db803fa swr/rast: Minor changes for os-x
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Kenneth Graunke
82144b7392 i965: Don't disable aux buffers for non-overlapping miplevels.
Meta's GenerateMipmap implementation binds the same image for both
sampling and rendering - but it samples from one miplevel while
rendering the next.  This is a false self-dependency, and there's
no need to disable auxiliary buffers in this case.  In fact, we really
want to leave it enabled so the new miplevels gain color compression.

Thankfully, the texture object's _MaxLevel is always one shy of the
miplevel being rendered.  So we can simply check if irb->mt_level is
overlaps with the texture's defined levels.  If not, there's no self-
dependency and we can leave the auxiliary buffers enabled.

Fixes a performance regression in GFXBench4 Car Chase, which apparently
calls glGenerateMipmap() on every frame.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103247
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by; Jason Ekstrand <jason@jlekstrand.net>
2017-10-19 11:10:00 -07:00
Kenneth Graunke
fa6ca6991b i965: Remove the intel_miptree_prepare_fb_fetch wrapper.
Now that intel_miptree_prepare_texture takes levels and layers, there's
not much use in this anymore.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by; Jason Ekstrand <jason@jlekstrand.net>
2017-10-19 11:10:00 -07:00
Kenneth Graunke
e208d7f874 i965: Only resolve texture levels/layers that are accessed.
This should avoid unnecessary resolves when working with texture views.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by; Jason Ekstrand <jason@jlekstrand.net>
2017-10-19 11:10:00 -07:00
Kenneth Graunke
0954ce1000 i965: Make intel_miptree_prepare_texture() take level/layer arguments.
This effectively exports intel_miptree_prepare_texture_slices() as
intel_miptree_prepare_texture().  The hope is to avoid resolves for
when using texture views that access a subset of the levels/layers.

For now, we pass the same arguments to separate the mechanical change
from the one that actually modifies our behavior.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by; Jason Ekstrand <jason@jlekstrand.net>
2017-10-19 11:10:00 -07:00
Tim Rowley
33bdbc1db4 gallium: add more exceptions to tgsi_util_get_inst_usage_mask
A number of double/int64 operations don't have matching
read and write usage masks, which the fallthrough case of
tgsi_util_get_inst_usage_mask assumes for componentwise
tagged instructions.

No regressions in llvmpipe piglit; fixes a large number of
swr regressions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-19 12:49:32 -05:00
Kenneth Graunke
113a6a639f isl: Fix width check in isl_gen7_choose_msaa_layout.
The restriction is supposed to apply if the width *field* is >= 8192,
meaning the actual width *value* is >= 8193.

The code also incorrectly used == for some reason.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-19 10:21:45 -07:00
Kenneth Graunke
68f69ebdcc i965: Use is_scheduling_barrier instead of schedule_node::is_barrier.
Commit a73116ecc6 tried to make add_barrier_deps()
walk to the next barrier, and stop.  To accomplish that, it added an
is_barrier flag.  Unfortunately, this only works half of the time.

The issue is that add_barrier_deps() walks both backward (to the
previous barrier), and forward (to the next barrier).  It also sets
is_barrier.  Assuming that we're processing instructions in forward
order, this means that is_barrier will be set for previous instructions,
but not future ones.  So we'll never see it, and walk further than we
need to.

dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
now compiles its shaders in 3.6 seconds instead of 3.3 minutes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Pallavi G <pallavi.g@intel.com>
2017-10-19 10:19:20 -07:00
Kenneth Graunke
3d112a7cd4 i965: Move fs_inst::has_side_effects()'s eot check to the parent class.
This eliminates a layer of wrapping, and makes a backend_instruction
sufficient.  The downside is that it exposes 'eot' to the vec4 backend,
which it doesn't need, but can basically happily ignore.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Pallavi G <pallavi.g@intel.com>
2017-10-19 10:19:20 -07:00
Roland Scheidegger
77b8392858 tgsi: fix tgsi_util_get_inst_usage_mask
The logic for handling shadow coords was completely broken.
Fixes be3ab867bd.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103265

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-19 16:33:39 +02:00
Iago Toral Quiroga
2d87caa279 glsl/linker: produce error when invalid explicit locations are used
We only need to add a check to validate output locations here. For
inputs with invalid locations we will fail to link when we can't
find a matching output in the same (invalid) location.

v2: compute location slots properly depending on shader stage and
    variable type / direction

Fixes:
KHR-GL45.enhanced_layouts.varying_location_limit

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-19 11:27:12 +02:00