33056 commits

Author SHA1 Message Date
Rob Clark
e6c6495d3a freedreno: add debug option to force emulated indirect
Useful mostly for debugging indirect draw.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
f93f2f7b1e freedreno: also mark draw-indirect buffer as read
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
4b1d0d2844 freedreno: small cleanups
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
91730fb0ff freedreno: avoid unnecessary batch flush
In some cases we can end up trying to add a write dependency on ourself,
which shouldn't trigger a flush.

Avoids a couple of extra flushes per frame in stk.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
4ab6ab8036 freedreno: avoid mem2gmem for invalidated buffers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
2fcf6faa06 freedreno: deferred flush support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:41 -05:00
Rob Clark
15ebf387fc freedreno: rework fence tracking
ctx->last_fence isn't such a terribly clever idea, if batches can be
flushed out of order.  Instead, each batch now holds a fence, which is
created before the batch is flushed (useful for next patch), that later
gets populated after the batch is actually flushed.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:40 -05:00
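The per-batch fence scheme described in 15ebf387fc can be sketched roughly as follows. This is a minimal illustration, not the driver's code: struct fence, struct batch, batch_init(), batch_get_fence() and batch_flush() are hypothetical names, and the timestamp field is an assumption standing in for whatever the kernel returns on submit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration; not freedreno's real structures. */
struct fence {
    bool     populated;  /* false until the owning batch is flushed */
    uint32_t timestamp;  /* only meaningful once populated */
};

struct batch {
    struct fence fence;  /* one fence per batch, rather than one per context */
    bool         flushed;
};

/* The fence exists as soon as the batch does, before any flush. */
void batch_init(struct batch *b)
{
    b->fence.populated = false;
    b->fence.timestamp = 0;
    b->flushed = false;
}

/* Callers may take a reference to the fence before the batch is flushed. */
struct fence *batch_get_fence(struct batch *b)
{
    return &b->fence;
}

/* Flushing fills in the fence that was handed out earlier. */
void batch_flush(struct batch *b, uint32_t submit_timestamp)
{
    assert(!b->flushed);
    b->flushed = true;
    b->fence.timestamp = submit_timestamp;
    b->fence.populated = true;
}
```

Because each batch owns its fence, flushing batches out of order still leaves every fence pointing at the submit it belongs to, which a single context-wide "last fence" cannot guarantee.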
Rob Clark
deb57fb237 freedreno: proper locking for iterating dependent batches
In transfer_map(), when we need to flush batches that read from a
resource, we should be holding screen->lock to guard against race
conditions.  Somehow deferred flush seems to make this existing
race more obvious.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:40 -05:00
Rob Clark
ef6313ffd3 freedreno/a5xx: correct max_indicies for indirect draws
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-03 14:17:40 -05:00
Eric Anholt
0ed952c7e9 broadcom/vc4: Use a single-entry cached last_hindex value.
Since almost all BOs will be in one CL at a time, this cache will almost
always hit except for the first usage of the BO in each CL.

This didn't show up as statistically significant on the minetest trace
(n=340), but if I lop off the throttled lobe of the bimodal distribution,
it very clearly does (0.74731% +/- 0.162093%, n=269).
2017-12-01 15:37:28 -08:00
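The single-entry cache described in 0ed952c7e9 can be sketched along these lines. The names here (struct bo, struct cl, cl_get_hindex(), MAX_HANDLES, the searches counter) are hypothetical and chosen for illustration; the sketch assumes each BO remembers the last index it was given and that a cached index is trusted only if the slot still holds that BO's handle.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_HANDLES 64

/* Hypothetical buffer object: remembers the last handle index it was given. */
struct bo {
    uint32_t handle;
    uint32_t last_hindex;
};

/* Hypothetical command list with a flat array of referenced BO handles. */
struct cl {
    uint32_t handles[MAX_HANDLES];
    uint32_t count;
    uint32_t searches;  /* number of slow-path lookups, for demonstration */
};

/* Return the index of bo's handle in cl->handles, appending it if needed. */
uint32_t cl_get_hindex(struct cl *cl, struct bo *bo)
{
    uint32_t i;

    /* Fast path: the cached index is valid if that slot still holds us. */
    if (bo->last_hindex < cl->count &&
        cl->handles[bo->last_hindex] == bo->handle)
        return bo->last_hindex;

    /* Slow path: linear search, then append on a miss. */
    cl->searches++;
    for (i = 0; i < cl->count; i++) {
        if (cl->handles[i] == bo->handle)
            break;
    }
    if (i == cl->count) {
        assert(cl->count < MAX_HANDLES);
        cl->handles[cl->count++] = bo->handle;
    }

    bo->last_hindex = i;
    return i;
}
```

With this layout only the first use of a BO in a given CL pays for the search, which matches the commit's observation that the cache almost always hits.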
Eric Anholt
230e646a40 broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN.
No significant difference in the minetest replay, but it should reduce
overhead by not requiring that we write quad indices to index buffers that
we repeatedly re-upload (and making the draw packet smaller, as well).

Over the course of the series the actual game seems to be up by 1-2 fps.
2017-12-01 15:37:28 -08:00
Eric Anholt
5167367050 broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES.
Now that there's only one user of it, it's pretty obvious how to avoid
emitting redundant ones.  This should save a bunch of kernel validation
overhead.

No statistically significant difference on the minetest trace I was looking
at (n=169), but the maximum FPS is up by .3%.
2017-12-01 15:37:28 -08:00
Eric Anholt
842b05d6ad broadcom/vc4: Simplify the relocation handling for index buffers.
Originally there was CL code for handling various relocations back when I
had relocs for the TSDA/TA buffers.  Now that the kernel handles those
entirely on its own, I can inline that code into the one place using it.
2017-12-01 15:37:28 -08:00
Eric Anholt
84ab48c15c broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.
We failed to take the start into account for how many vertices to draw in
this round, so we would end up decrementing count below 0, which as an
unsigned number meant we would loop until the CLs soon ran out of space.

When I wrote the code I was thinking about how to use the previously
emitted shader state (no index bias baked into the elements) by emitting
up to 65535 and then only re-emitting with bias for the second round, but
that doesn't work if the start is over 65535.  Instead, just delay
emitting shader state until we get into the drawarrays GFXH-515 loop and
always bake the bias in when we're doing the workaround.
2017-12-01 15:37:28 -08:00
Eric Anholt
bcb6ebe91a broadcom/vc4: Fix the scaling factor for the GFXH-515 workaround.
For triangle strips, we step by max_verts - 2.
2017-12-01 15:37:28 -08:00
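The stepping rule from bcb6ebe91a, together with the start handling from 84ab48c15c, can be illustrated with a small sketch. It is not the driver's code: struct chunk and split_triangle_strip() are hypothetical, and max_verts is a parameter here so the example can use small numbers (the commit above mentions a limit of 65535). Each chunk after the first restarts two vertices early so that no triangle of the strip is lost at a chunk boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one sub-draw: first vertex and vertex count. */
struct chunk {
    uint32_t start;
    uint32_t count;
};

/*
 * Split a triangle-strip draw of `count` vertices beginning at `start`
 * into sub-draws of at most `max_verts` vertices each. Returns the number
 * of chunks written to `out`.
 */
size_t split_triangle_strip(uint32_t start, uint32_t count,
                            uint32_t max_verts,
                            struct chunk *out, size_t out_cap)
{
    size_t n = 0;

    assert(max_verts > 2);

    while (count > 0) {
        uint32_t this_count = count < max_verts ? count : max_verts;

        assert(n < out_cap);
        out[n].start = start;
        out[n].count = this_count;
        n++;

        /* Last chunk: stop here so the unsigned count never wraps. */
        if (this_count == count)
            break;

        /* Step by max_verts - 2 so the next chunk re-reads two vertices. */
        start += max_verts - 2;
        count -= max_verts - 2;
    }
    return n;
}
```

For example, 10 vertices with max_verts set to 6 yield two chunks of 6 vertices starting at offsets 0 and 4 from the original start, giving 4 + 4 = 8 triangles, the same as the unsplit strip.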
Dylan Baker
f56e964e01 meson: use dep_thread instead of dependency('threads') in freedreno
They are the same thing, but this is more consistent with the rest of
the project.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-01 15:31:43 -08:00
Dylan Baker
5e71efef44 meson: Add lmsensors support
v2: - Make -Dlmsensors=false work
    - Simplify auto and true cases

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-01 15:31:43 -08:00
Eric Engestrom
29ee934331 gallium/hud: use #ifdef to test for macro existence
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-01 13:49:42 +00:00
Eric Engestrom
13a7a2d455 amd: remove always-true BRAHMA_BUILD define
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-01 13:49:42 +00:00
George Kyriazis
95adbe1a4e swr/scons: Fix intermittent build failure
gen_rasterizer*.cpp depends on gen_ar_eventhandler.hpp.
Account for new dependency.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-12-01 07:47:13 -06:00
Dave Airlie
4e7f6437b5 r600: add ARB_shader_storage_buffer_object support (v3)
This just builds on the image support. Evergreen only has SSBOs
for the fragment and compute stages, not for any other stage.

v2: handle images and ssbo in the same shader properly (Ilia)
v3: fix RESQ on buffers,
    fix missing atom emit
    fix first element offset
    use R32 format
    write separate buffer rat store path.
(from running deqp gles3.1 tests)

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-01 06:12:31 +00:00
Dave Airlie
c758fd05d8 r600/cayman: looks like cmpxchg moved to Z
On cayman it appears the cmp component is now in Z.

Fixes:
arb_shader_image_load_store-dead-fragments on cayman.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-01 03:59:17 +00:00
Dave Airlie
4f3e73516c r600/shader: fix 64->32 conversions
These didn't handle the TGSI properly at all. This fixes
them to use the common path for 64->32, then adds the 32->int
conversion at the end.

Fixes:
generated_tests/spec/arb_gpu_shader_fp64/execution/conversion/*

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-01 03:48:35 +00:00
Marek Olšák
ed4780383c radeonsi/gfx9: fix importing shared textures with DCC
VI has 11 dwords at least. GFX9 has 10 dwords.

Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-30 18:46:11 +01:00
Tapani Pälli
faccbaf3fa mesa: add AllowGLSLCrossStageInterpolationMismatch workaround
This fixes issues seen with certain versions of Unreal Engine 4 editor
and games built with that using GLSL 4.30.

v2: add driinfo_gallium change (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97852
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103801
Acked-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-30 11:43:10 +02:00
Wladimir J. van der Laan
f1a9a724f9 etnaviv: GC7000: Factor out state based texture functionality
Prepare for two texture handling paths, the descriptor-based
path will be added in a future commit. These are structured
so that the texture implementation handles its own state
emission.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:33:20 +01:00
Wladimir J. van der Laan
075f8cd7de etnaviv: GC7000: Move active_samplers_bits to texture
This needs to be shared between texture_plain and texture_desc.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:33:16 +01:00
Wladimir J. van der Laan
260a5e2a1a etnaviv: GC7000: Factor out incompatible texture handling logic
This will be shared with the texture descriptor path.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:33:11 +01:00
Wladimir J. van der Laan
9d1f8805b0 etnaviv: GC7000: Track dirty sampler views
Need this to efficiently emit texture descriptor invalidations.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:33:07 +01:00
Wladimir J. van der Laan
5cc36f9f21 etnaviv: GC7000: Make point sprites work on HALTI5
Track varying component offset of the point size output, as well as
provide the offset of the point coord input.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:33:02 +01:00
Wladimir J. van der Laan
3d09bb390a etnaviv: GC7000: State changes for HALTI3..5
Update state objects to add new state, and emit function to emit new
state.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:32:33 +01:00
Wladimir J. van der Laan
acd3dff463 etnaviv: GC7000: Update screen specs for HALTI5
- This core must load shaders from memory (AFAIK)
- Yet another new location for UNIFORMS

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:32:21 +01:00
Wladimir J. van der Laan
c6033e84bb etnaviv: GC7000: Update context reset for ..HALTI5
Update context reset for HALTI3..HALTI5, sorting states for the HALTI
version that has them.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:28:09 +01:00
Wladimir J. van der Laan
baff59ebf0 etnaviv: GC7000: No RS align when using BLT
RS align is not necessary and might even be harmful when using the BLT
engine for blitting.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:28:02 +01:00
Wladimir J. van der Laan
dd3a04c2c3 etnaviv: GC7000: BLT engine blitting support
Add an implementation of key clear_blit functions using the BLT engine
that replaced the RS on GC7000.

Also set level->size correctly for imported resources. This is important
for the BLT resolve-in-place path to work for them.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:27:57 +01:00
Wladimir J. van der Laan
079bbaec0c etnaviv: GC7000: Factor out RS blit functionality
Prepare for BLT-based blitting path by moving RS-based
blitting to the RS implementation file, making this
self-contained.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:27:53 +01:00
Wladimir J. van der Laan
77768b1859 etnaviv: GC7000: Move etna_coalesce to emit header file
Want to be able to emit state from the texture implementation,
and the blitter implementation.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:27:48 +01:00
Wladimir J. van der Laan
571d980695 etnaviv: GC7000: Support BLT as recipient for etna_stall
When the BLT is involved as source or target, add an extra BLT
enable/disable sequence around the sync sequence.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:27:43 +01:00
Wladimir J. van der Laan
150d8766ea etnaviv: Use only DRAW_INSTANCED on GC3000+
The blob does this, as DRAW_INSTANCED can fully replace all the other
draw commands. It is also required to handle integer vertex formats.
The other path is only there for compatibility and might go away (or at
least rot and become buggy due to disuse) in newer hardware.

As a side effect this changes the behavior for GC3000-, by no longer using
the index offset for DRAW_INDEXED but instead adding it to INDEX_ADDR.
This should make no difference.

Preparation for GC7000 support.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-11-30 07:26:55 +01:00
Wladimir J. van der Laan
23630ab1b6 etnaviv: Emit SCALE for vertex attributes
This is used by HALTI2+ (GC3000+) when drawing with DRAW_INSTANCED.

It is also necessary when switching between integer and floating point
vertex element formats.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-30 07:26:46 +01:00
Dave Airlie
2c4861e453 r600: no need to reinit compute regs
Compute setup gets emitted into the normal gfx state buffer,
so no need to reinit the basics.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-30 09:53:22 +10:00
Dave Airlie
ea355e29f7 r600: split cb setup code out from evergreen compute path.
This just makes it easier to bypass for TGSI later.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-30 09:39:25 +10:00
Dave Airlie
77c70e5fe5 r600: add support for compute pkt flags to debug dumping.
This just lets us see packets marked for compute.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-30 09:32:31 +10:00
Dave Airlie
779306c8b6 r600: fix bfe where src/dst are same.
This fixes overlaps where src/dst are the same.

Fixes a bunch of the deqp bitfield tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-30 09:32:31 +10:00
Adam Jackson
0d044351b7 gallium/dri2: Enable {GLX_ARB,EGL_KHR}_context_flush_control
Reviewed-and-tested-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-29 16:00:24 -05:00
Marek Olšák
2c5f2936af r300,r600,radeonsi: replace RADEON_FLUSH_* with PIPE_FLUSH_*
and handle PIPE_FLUSH_HINT_FINISH in r300.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
950221f923 radeonsi: remove r600_common_screen
Most files in gallium/radeon now include si_pipe.h.

chip_class and family are now here:
    sscreen->info.family
    sscreen->info.chip_class

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
4d1fe8f964 radeonsi: remove r600_pipe_common::barrier_flags::compute_to_L2
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
c0d44fe0e9 radeonsi: remove query/apply_opaque_metadata callbacks
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
2208b760f3 radeonsi: move shader debug helpers out of r600_pipe_common.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00