Commit graph

28438 commits

Author SHA1 Message Date
Eric Anholt
3da4e38f48 vc4: Add QPU scheduling to handle MUL rotate sources.
We need MUL rotates to do ddx/ddy support.
2016-08-25 17:24:11 -07:00
Eric Anholt
b0b99a7952 vc4: Add disassembly for constant MUL rotates 2016-08-25 17:24:11 -07:00
Eric Anholt
b160708e03 vc4: Add real validation for MUL rotation.
Caught problems in the upcoming DDX/DDY implementation.
2016-08-25 17:24:11 -07:00
Eric Anholt
31da39ddc9 vc4: Add a QIR value for the QPU element register.
This will be used in the ddx/ddy support for "Am I the top half?" or "Am I
the left half?" checks.
2016-08-25 17:24:11 -07:00
Marek Olšák
a491b9e945 radeonsi: don't use allocas for arrays with LLVM 3.8
It crashes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97413
2016-08-25 21:19:17 +02:00
Marek Olšák
fe91ae06d3 gallium/radeon: unify and simplify checking for an empty gfx IB
We can take advantage of the fact that multi_fence does the obvious thing
with NULL fences.

This fixes unflushed fences that can get stuck due to empty IBs.
2016-08-25 21:19:17 +02:00
Marek Olšák
3ff0b67e1b radeonsi: disable SDMA texture copying on Carrizo
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-08-25 14:51:08 +02:00
Marek Olšák
1276316d67 gallium/noop: use 3-space indentation
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-25 14:09:48 +02:00
Marek Olšák
9daaa6f5a6 gallium: add a pipe_context parameter to resource_get_handle
radeonsi needs to do some operations (DCC decompression) for OpenGL-OpenCL
interop and this is the only way to make it coherent with the current
context. It can optionally be set to NULL.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-25 14:09:48 +02:00
Samuel Pitoiset
a227b0a4f1 nvc0: invalidate textures/samplers on GK104+
Like Fermi, textures and samplers are aliased between 3D and compute,
especially the TIC_FLUSH/TSC_FLUSH methods and we have to re-validate
these resources when switching between the two pipelines.

This fixes a GPU hang with Elemental (and most likely with other UE4 demos).

Tested on GK107 and GM107.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
CC: <mesa-stable@lists.freedesktop.org>
2016-08-24 22:26:36 +02:00
Rhys Kidd
c9c989763a gallium/ttn: Remove duplicated TGSI_OPCODE_DP2A initialization
The duplicate line is currently at line 1535.

Identified by Clang, when run through Eric Anholt's Travis harness.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-24 11:54:50 -07:00
Eric Anholt
87a88f2daa vc4: Fix GPU hangs with >16 varying values.
Fixes glsl-routing in piglit and hangs in glbenchmark 2.0.2.
2016-08-24 10:43:22 -07:00
Leo Liu
5277f25480 vl/rbsp: fix another three byte not detected
This happens when the three-byte sequence "00 00 03" is only partly loaded
into vlc->buffer: the valid bits at the bottom of the buffer end in "00" or
"00 00", while the remaining "00 03" or "03" is still in the data, so the
three-byte emulation check does not detect the sequence. The reason is that
the escaped bit was set to 0 at rbsp init.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2016-08-24 11:17:16 -04:00
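The boundary case in the vl/rbsp commit above can be illustrated with a minimal sketch. This is not the vl/rbsp code: `epb_state` and `strip_emulation_bytes` are hypothetical names, and the approach shown (carrying a count of trailing zero bytes between loads) is only one way to catch a "00 00 03" sequence that straddles two loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scanner state: number of consecutive 0x00 bytes seen so far.
 * It is kept between calls so a sequence split across two loads is found. */
struct epb_state {
   unsigned zeros;
};

/* Copies src to dst, dropping each 0x03 that follows two or more zero bytes.
 * Returns the number of bytes written; dst must hold at least len bytes. */
static size_t
strip_emulation_bytes(struct epb_state *st, const uint8_t *src, size_t len,
                      uint8_t *dst)
{
   size_t out = 0;

   assert(st && src && dst);

   for (size_t i = 0; i < len; i++) {
      if (st->zeros >= 2 && src[i] == 0x03) {
         st->zeros = 0;   /* escape byte: skip it and restart the count */
         continue;
      }
      st->zeros = (src[i] == 0x00) ? st->zeros + 1 : 0;
      dst[out++] = src[i];
   }
   return out;
}
```

Because the zero count lives in the state struct rather than in a local variable, feeding "12 00" and then "00 03 01" as two separate loads still drops the "03".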
Marek Olšák
2c13abb491 radeonsi: fix VM faults due to NULL internal const buffers on CIK
They are harmless, but the interrupts do decrease performance.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97039

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-08-24 15:39:57 +02:00
Tomasz Figa
577f85e2bb gallium/winsys/kms: Look up the GEM handle after importing a prime FD
drmPrimeFDToHandle() will return the same GEM handle every time the same
buffer is imported, even from a different prime FD. Since GEM handles
are not reference counted, we need to make sure that each GEM handle is
referenced only by one display target struct, by looking it up in
kms_sw->bo_list first and bumping the refcount of the found dt on hit
and falling back to creating a new dt only on miss.

v2: Split into separate function.
    Use helper function for lookup.

v3 [Emil Velikov]:
    Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-08-24 14:39:23 +01:00
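The lookup-then-create pattern described in the prime FD import commit above can be sketched as follows. This is a simplified illustration rather than the kms_sw winsys code: `display_target`, `dt_find_and_ref` and `dt_import` are hypothetical names, and a plain linked list stands in for kms_sw->bo_list.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a display target; not the real kms_sw struct. */
struct display_target {
   uint32_t handle;      /* GEM handle; the same buffer yields the same one */
   int ref_count;
   struct display_target *next;
};

/* Returns the existing entry for handle with its refcount bumped, or NULL. */
static struct display_target *
dt_find_and_ref(struct display_target *list, uint32_t handle)
{
   for (struct display_target *dt = list; dt; dt = dt->next) {
      if (dt->handle == handle) {
         dt->ref_count++;
         return dt;
      }
   }
   return NULL;
}

/* Import path: reuse the entry on a hit, create one with a single
 * reference on a miss, so each handle is owned by exactly one entry. */
static struct display_target *
dt_import(struct display_target **list, uint32_t handle)
{
   struct display_target *dt;

   assert(list);

   dt = dt_find_and_ref(*list, handle);
   if (dt)
      return dt;

   dt = calloc(1, sizeof(*dt));
   if (!dt)
      return NULL;

   dt->handle = handle;
   dt->ref_count = 1;
   dt->next = *list;
   *list = dt;
   return dt;
}
```

Importing the same handle twice returns the same entry with a refcount of 2, so the handle is released only after both users have dropped their reference.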
Tomasz Figa
0465c72d46 gallium/winsys/kms: Move display target handle lookup to separate function
As a preparation to use the lookup in more than one place, move the
code that looks up given KMS/GEM handle to a separate function. This
change should not introduce any functional changes.

v2: Split into separate patch.
    Move lookup code into separate function.

v3 [Emil Velikov]:
    Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-08-24 14:39:23 +01:00
Tomasz Figa
e71b78ebf9 gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2)
Currently kms_sw_displaytarget_add_from_prime() allocates the struct and
fills in only some of the fields, resulting in a half-baked struct that
needs to be further completed by the caller. To make this a bit more
consistent, pass width, height and stride to this function and fill in
everything there, so that the caller can take the returned struct as is.

v2: Split from one big patch into four fixing one thing at a time.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-08-24 14:39:23 +01:00
Tomasz Figa
0aa6a818ef gallium/winsys/kms: Fix double refcount when importing from prime FD (v2)
Currently the code creates a display target struct with refcount field
initialized to 1 and then the caller again increments it, leading to
a leaked reference. Let's remove the unnecessary increment.

v2: Split from one big patch into four fixing one thing at a time.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-08-24 14:39:22 +01:00
Eric Engestrom
9411eb67ec gallium/cso: avoid unnecessary null dereference
The label `out:` calls `destroy()` which dereferences `ctx`.
This is unnecessary as there is nothing to destroy.
Immediately return instead.

CovID: 1258255
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-24 11:35:05 +01:00
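The control-flow change in the gallium/cso commit above follows a common C pattern, sketched here with hypothetical names (`ctx`, `ctx_create`, `ctx_destroy`); it is not the cso_context code. When the very first allocation fails there is nothing to clean up, so the function returns immediately instead of jumping to a label whose cleanup would dereference the NULL pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context type used only for this illustration. */
struct ctx {
   int *table;
};

static int destroy_calls;   /* counts cleanups so the behavior is observable */

static void
ctx_destroy(struct ctx *c)
{
   assert(c);               /* destroy dereferences c, so it must not be NULL */
   destroy_calls++;
   free(c->table);
   free(c);
}

/* The fail_* flags simulate allocation failures at each step. */
static struct ctx *
ctx_create(int fail_alloc, int fail_table)
{
   struct ctx *c = fail_alloc ? NULL : calloc(1, sizeof(*c));
   if (!c)
      return NULL;          /* nothing to destroy yet: return right away */

   c->table = fail_table ? NULL : calloc(16, sizeof(int));
   if (!c->table)
      goto out;             /* c exists, so the shared cleanup is safe */

   return c;

out:
   ctx_destroy(c);
   return NULL;
}
```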
Eric Engestrom
f6b9fb6e4c st/xvmc: fix a couple 'unused-but-set-variable' warnings
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-08-24 11:32:00 +01:00
Kai Wasserbäch
f033d97155 st/va: Remove unused variable coded_size from vlVaEndPicture()
Removes the following GCC warning:
 ../../../../../src/gallium/state_trackers/va/picture.c:542:17: warning:
  unused variable 'coded_size' [-Wunused-variable]
    unsigned int coded_size;
                 ^~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2016-08-24 10:35:53 +02:00
Kai Wasserbäch
83d08d4cab st/va: Remove else case in vlVaEndPicture() made superfluous by c59628d11b
Commit c59628d11b made the else statement
and duplication of the context->decoder->end_frame() call superfluous.

Cc: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2016-08-24 10:35:20 +02:00
Eric Engestrom
cd340052ad st/va: add missing mutex_unlock
Fixes: c59628d11b ("st/va: enable dual instances encode by sync surface")

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-08-24 10:33:07 +02:00
Ilia Mirkin
361678edd7 st/dri: respect driver's request to avoid mixed color/depth bit configs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-23 18:30:53 -04:00
Ilia Mirkin
9515d651f9 gallium: add a cap to expose whether driver supports mixed color/zs bits
Some hardware can't render to color/depth buffers of mixed bitness. When
that happens a fallback has to happen, but this allows the driver to
express that this isn't an optimal scenario. The purpose of this is to
remove such fbconfigs from the GLX/EGL config list.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-23 18:30:49 -04:00
Ilia Mirkin
528390021f dri: add a way to request that modes have matching color/zs depths
Some GPUs, notably nv3x/nv4x can't render to mismatched color/zs
framebuffer depths. Fallbacks can be done by the driver, with shadow
surfaces, but no reason to encourage applications to select non-matching
glx visuals.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-23 18:30:30 -04:00
Ilia Mirkin
092f994a03 nv50/ir: make sure cfg iterator always hits all blocks
In some very specially-crafted cases, we could attempt to visit a node
that has already been visited, and then run out of bb's to visit, while
there were still cross blocks on the list. Make sure that those get
moved over in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96274
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2016-08-23 18:30:12 -04:00
Eric Anholt
47e3cc7557 vc4: Tell state_tracker that we would prefer NIR.
Before this series, the code generation path was:

GLSL IR -> TGSI -> NIR -> NIR clone -> QIR -> QPU

Now it's (generally)

GLSL IR -> NIR -> NIR clone -> QIR -> QPU
2016-08-22 12:11:08 -07:00
Eric Anholt
f4d143f0d9 vc4: Use proper type sizes for uniforms. 2016-08-22 11:52:26 -07:00
Eric Anholt
bdb54cdc16 vc4: Add VARYING_SLOT_PNTC support.
We end up with this when doing GLSL-to-NIR.
2016-08-22 11:52:26 -07:00
Eric Anholt
3c1ea6e651 vc4: Fix vc4_nir_lower_io for non-vec4 I/O.
To support GLSL-to-NIR, we need to be able to support actual
float/vec2/vec3 varyings.
2016-08-22 11:52:26 -07:00
Eric Anholt
e8378fee0c nir: Define system values for vc4's blending-lowering arguments.
In the GLSL-to-NIR conversion of VC4, I had a bit of trouble with what I
was calling the "state uniforms" that I was putting into the NIR fighting
with its other lowering passes.  Instead of using magic uniform base
numbers in the backend, follow the lead of load_user_clip_plane and just
define system values for them.

v2: Fix unintended change to channel_num, drop unspecified const_index
    value on blend_const_color_r_float.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-22 11:52:26 -07:00
Marek Olšák
0328b20050 gallium/hud: round max_value to print nicely rounded numbers next to graphs
This improves readability a lot.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Marek Olšák
0f1befe926 gallium/hud: generalize code for drawing numbers next to graphs
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Marek Olšák
a33eb48d61 gallium/hud: draw numbers with 3 decimal places if those aren't 0
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Marek Olšák
b9c9551c09 gallium/hud: use sRGB for nicer AA lines
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Marek Olšák
6ffde82083 gallium/hud: use AA lines for graphs
This looks a lot better (with the next patch).

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Marek Olšák
6902f9e82a gallium/hud: don't enable blending for all objects
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-22 16:01:35 +02:00
Roland Scheidegger
0849621891 llvmpipe: fix issues with depth clamp
We only did depth clamp when the value was written from the fs.
This is very wrong both for d3d10 and GL, and only passed the
corresponding piglit test due to pure luck (it no longer does
with the enhanced test).
Also, interpolation always clamped values to 1.0, even though values above
1.0 can legitimately occur if depth clip is disabled, so fix that as well
(untested).
There is one unresolved issue left, d3d10 always does depth clamping,
whereas GL does not (but does [0,1] clamp instead for fs depth outputs)
- this information isn't in any gallium state object, leave it as-is
for now (though it looks like llvmpipe misses the [0,1] clamp as well).
This (with the previous patch) fixes piglit depth-clamp-range test.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-08-20 04:05:33 +02:00
Roland Scheidegger
b0a647f284 llvmpipe: fix depth clamping wrt reversed near/far values
This wasn't handled before: whenever clamping actually happened, the result
always ended up as the near value, no matter which value got clamped.
Fix this by using the util helper for that. The math is otherwise "mostly"
the same; "mostly" because there could actually be differences due to float
rounding, but I don't even know which one would be more correct.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-08-20 04:05:33 +02:00
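The reversed near/far case from the llvmpipe commit above can be illustrated with a small sketch. It is not the Mesa util helper the commit refers to; `clamp_depth` is a hypothetical function that orders the two bounds before clamping, so the result no longer collapses to the near value when near is greater than far.

```c
#include <assert.h>

/* Clamp z to the depth range, tolerating near_val > far_val.
 * Assumes neither bound is NaN. */
static float
clamp_depth(float z, float near_val, float far_val)
{
   float lo = near_val < far_val ? near_val : far_val;
   float hi = near_val < far_val ? far_val : near_val;

   assert(lo <= hi);

   if (z < lo)
      return lo;
   if (z > hi)
      return hi;
   return z;
}
```

With near = 0.75 and far = 0.25, a depth of 1.5 clamps to 0.75 and a depth of 0.0 clamps to 0.25, while values inside the range pass through unchanged.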
Ilia Mirkin
89f00f749f a4xx: make sure to actually clamp depth as requested
We were previously ... not clamping. I guess this meant that everything
got clamped to 1/0, which was enough to pass the existing tests. Or
perhaps the clamping would only happen to the rasterized depth value and
not the frag shader's output depth value.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2016-08-19 19:40:04 -04:00
Ilia Mirkin
cd8e30452f a4xx: only disable depth clipping, not all clipping, when requested
The previous bit disables the whole clipper, including the regular
viewport-related clipping that would go on. The two new bits disable
near and far clipping (separately, as verified with the
depth-clamp-range piglit).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2016-08-19 19:40:04 -04:00
Eric Anholt
5adee83806 vc4: Switch store_output to using nir_lower_io_to_scalar / component. 2016-08-19 13:11:36 -07:00
Eric Anholt
f8fecc396a vc4: Use the intrinsic's first_component for vattr VPM index.
Avoids another multiplication by 4 of the base in the NIR.
2016-08-19 13:11:36 -07:00
Eric Anholt
cbf8c19410 vc4: Convert to using nir_lower_io_to_scalar for FS inputs.
The scalarizing of FS inputs can be done in a non-driver-dependent manner,
so extract it out of the driver.
2016-08-19 13:11:36 -07:00
Eric Anholt
c30b22c421 vc4: Switch to using the intrinsic accessors.
The const_index[] values have always felt magic, and this documents them a
bit better.
2016-08-19 13:11:36 -07:00
Eric Anholt
c078c41520 ttn: Use nir_load_front_face instead of the TGSI-style input.
This reduces the diff between GLSL-to-NIR and TGSI-to-NIR, and gives NIR
more optimization to work on.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-19 13:11:36 -07:00
Eric Anholt
ed92241d78 ttn: Make FRAG_RESULT_DEPTH be a float variable to match gtn and ptn.
This lets TTN-using drivers handle FRAG_RESULT_DEPTH the same between all
their source paths.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-08-19 13:11:36 -07:00
Eric Anholt
d80d03b830 vc4: Dump the TGSI before trying to convert it to NIR.
In the case of debugging a crash in TTN, this is nice to have.
2016-08-19 13:11:36 -07:00
Boyuan Zhang
c0be51f270 radeon/vce: set flag based on dual instance enablement
Set the flag on when dual instance encoding is supported,
otherwise set it to off.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
2016-08-19 10:36:44 -04:00