Commit graph

106697 commits

Author SHA1 Message Date
Eric Anholt
a7e15a5086 v3d: Avoid assertion failures when removing end-of-shader instructions.
After generating VIR, we leave c->cursor pointing at the end of the
shader.  If the shader had dead code at the end (for example from preamble
instructions in a shader with no side effects), we would assertion fail
that we were leaving the cursor pointing at freed memory.  Since anything
following DCE should be setting up a new cursor anyway, just clear the
cursor at the start.
2018-12-14 17:48:01 -08:00
Eric Anholt
5b2cc03852 v3d: Add support for draw indirect for GLES3.1.
In trying to enable compute shaders, I found that a bunch of deqp-gles31's
compute stuff wanted to interact with indirect dispatch.  This was easy to
do on its own.
2018-12-14 17:48:01 -08:00
Eric Anholt
ff80e58b38 v3d: Add missing flagging of SYNCB as a TSY op.
Fixes: f2e41daac5 ("broadcom/vc5: Update QPU instruction pack/unpack for v4.2.")
2018-12-14 17:48:01 -08:00
Eric Anholt
3f9bcf9136 v3d: Make sure that a thrsw doesn't split a multop from its umul24.
The thrsw will invalidate rtop, just like accumulators and flags.  Caught
by simulator assertions in CS imulextended/umulextended tests.

Fixes: 90269ba353 ("broadcom/vc5: Use THRSW to enable multi-threaded shaders.")
2018-12-14 17:48:01 -08:00
Eric Anholt
332a5cf6a5 v3d: Add safety checks for resource_create().
This should ease my debugging next time I screw it up.
2018-12-14 17:48:01 -08:00
Eric Anholt
6ad9e8690d v3d: Add support for texturing from linear.
Just like vc4, we have to support linear shared BOs for X11 on arbitrary
displays.  When we're faced with a request to texture from one of those,
make a shadow image that we copy using the TFU at the start of the draw
call.
2018-12-14 17:48:01 -08:00
Eric Anholt
976ea90bdc v3d: Add support for using the TFU to do some blits.
This will be useful in particular for blits from raster to UIF for X11.
2018-12-14 17:48:01 -08:00
Eric Anholt
e5b4d1f55f v3d: Don't forget to bump the number of writes when doing TFU ops.
generatemipmap is just filling out the rest of the mipmap that's already
been written (by a mapping or a draw call), so it didn't matter.  As I
reuse the TFU code for linear-to-UIF conversions, it'll start mattering.
2018-12-14 17:48:01 -08:00
Eric Anholt
485df2574e v3d: Set up the right stride for raster TFU.
I didn't have any raster images in the generatemipmap path, so the
pixels-vs-bytes mixup didn't matter here.
2018-12-14 17:48:01 -08:00
Eric Anholt
e731d53716 v3d: Don't forget to wait for our TFU job before rendering from it.
Otherwise we may race to read old contents.  This didn't show up in the
CTS and piglit for me, but it did once I started using the TFU to do
linear->UIF blits for X11.

Fixes: 2ebca177dc ("v3d: Use the TFU to do generatemipmap.")
2018-12-14 17:48:01 -08:00
Ilia Mirkin
153d3fc5f9 nvc0: always keep TSC slot 0 bound to fix TXF
Same as on nv50, the TXF op always uses the TSC bound to slot 0,
returning blank values if nothing is bound.

An earlier change arranges for the TSC entries list to always have valid
data at entry 0, so here we just make use of it.

Fixes arb_texture_buffer_object-subdata-sync among others.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-14 20:01:31 -05:00
Ilia Mirkin
4aeaf89aa7 nvc0: replace use of explicit default_tsc with entry 0
This was used for implementing FBFETCH. However that uses TXF, which
doesn't do much with a TSC. The only important bit is that sRGB-decoding
works as expected, which we can achieve since all samplers we ever
generate enable sRGB-decoding. Always point to entry 0 in the TSC table,
and ensure that even before it ever gets initialized, the sRGB-decoding
enable bit is set.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-12-14 20:01:31 -05:00
Rob Clark
5f9085638a freedreno/a6xx: fix corrupted uniforms
For older gen's fd_wfi() is used to conditionally insert a WFI if there
hasn't already been one since last draw.  But this doesn't work out well
with stateobj since the order the stateobj is evaluated might not be
what you expect.  (Ie. stateobj might not be evaluated until a later
draw if there is no geometry from the current draw in a given tile.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-14 15:01:30 -05:00
Alex Deucher
4db4b3447d pci_ids: add new vega20 pci id
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-12-14 14:48:39 -05:00
Alex Deucher
56cf25a114 pci_ids: add new vega10 pci ids
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2018-12-14 14:48:18 -05:00
Rafael Antognolli
5c454661c6 i965/gen9: Add workarounds for object preemption.
Gen9 hardware requires some workarounds to disable preemption depending
on the type of primitive being emitted.

We implement this by adding a function that checks the primitive type
and number of instances right before the 3DPRIMITIVE.

For now, we just ignore blorp.  The only primitive it emits is
3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can
safely leave preemption enabled when it happens. Or it will be disabled
by a previous 3DPRIMITIVE, which should be fine too.

v3:
 - Apply missing workarounds for instanced rendering and line loop (Ken)
 - Move workaround code to brw_draw_single_prim()

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-14 09:40:27 -08:00
Rafael Antognolli
d8b50e152a i965/gen10+: Enable object level preemption.
Set bit when initializing context.

v3:
 - Always toggle preemption bool to false before enabling it for the
 first time, so the state gets emitted (Chris Wilson).
 - Emit end of pipe sync with PIPE_CONTROL_RENDER_TARGET_FLUSH (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-14 09:40:27 -08:00
Rafael Antognolli
019a92ffa4 intel/genxml: Add register for object preemption.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-12-14 09:40:27 -08:00
Ian Romanick
a6b7d1151c util/slab: Rename slab_mempool typed parameters to mempool
Now everything with type 'struct slab_child_pool *' is name pool, and
everything with type 'struct slab_mempool *' is named mempool.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-14 07:36:05 -08:00
Ian Romanick
ba5402ec9a nir/phi_builder: Internal users should use nir_phi_builder_value_set_block_def too
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-14 07:36:05 -08:00
Christian Gmeiner
489ffaf0c1 etnaviv: drop redundant ctx function parameter
There is no need to have an extra ctx paramter as all the other
parameters carry all the needed information.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2018-12-14 11:23:00 +01:00
Kenneth Graunke
0b44644ca6 genxml: Consistently use a numeric "MOCS" field
When we first started using genxml, we decided to represent MOCS as an
actual structure, and pack values.  However, in many places, it was more
convenient to use a numeric value rather than treating it as a struct,
so we added secondary setters in a bunch of places as well.

We were not entirely consistent, either.  Some places only had one.
Gen6 had both kinds of setters for STATE_BASE_ADDRESS, but newer gens
only had the struct-based setters.  The names were sometimes "Constant
Buffer Object Control State" instead of "Memory", making it harder to
find.  Many had prefixes like "Vertex Buffer MOCS"...in a vertex buffer
packet...which is a bit redundant.

On modern hardware, MOCS is simply an index into a table, but we were
still carrying around the structure with an "Index to MOCS Table" field,
in addition to the direct numeric setters.  This is clunky - we really
just want a number on new hardware.

This patch eliminates the struct-based setters, and makes the numeric
setters be consistently called "MOCS".  We leave the struct definition
around on Gen7-8 for reference purposes, but it is unused.

v2: Drop bonus "Depth Buffer MOCS" fields on Gen7.5 and Gen9

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2018-12-14 00:44:54 -08:00
Timothy Arceri
a2ec78883f nir: fix opt_if_loop_last_continue()
The pass did not correctly handle loops ending in:

	if ssa_7 {
		block block_8:
		/* preds: block_7 */
		continue
		/* succs: block_1 */
	} else {
		block block_9:
		/* preds: block_7 */
		break
		/* succs: block_11 */
	}

The break will get eliminated by another opt but if this pass gets
called first (as it does on RADV) we ended up inserting
instructions after the break.

Fixes: 5921a19d4b ("nir: add if opt opt_if_loop_last_continue()")
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-12-14 17:21:35 +11:00
Rob Clark
0ac5acaeaa freedreno/a6xx: fix resource_copy_region()
pctx->resource_copy_region() needs to fall back to sw copy for
non-renderable formats.  But previously for things that we could
not use the blitter for, would fall back to 3d.  Which won't work
if 3d can't render to the dst format either.

Instead rework things to fallback to fd_resource_copy_region(),
which will try 3d core and then fall back to memcpy().

Fixes (for example) dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
4ec2f6129b freedreno: move fd_resource_copy_region()
Code-motion prep for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
57b76ee2a8 freedreno/a6xx: more blitter fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
d15fc787bc freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
532f8c0043 gallium/aux: add is_unorm() helper
We already had one for is_snorm() but not unorm.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
85cd4df47f freedreno/a6xx: fix blitter crash
Fixes a crash with unsupported formats in dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot

Also fixes gpu hangs with some formats that are supported, but which we
don't know what internal-format to use for the blitter, for ex
dEQP-GLES3.functional.texture.format.sized.2d_array.rgb10_a2_pot

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
cca1e9606c freedreno/ir3: don't remove unused input components
Fixes: 0d240c2214 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
c19c4bf488 freedreno/ir3: fix crash
Fixes a crash in dEQP-GLES3.functional.shaders.fragdepth.compare.fragcoord_z

Fixes: 0d240c2214 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
3e8e033f4c freedreno: also set DUMP flag on shaders
If we emit shader as a pointer to a GEM object, also set the RELOC_DUMP
flag as a hint to kernel that this is a useful buffer to snapshot for
debug dumps.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
4cd016b5d6 freedreno: debug GEM obj names
With a recent enough kernel, set debug names for GEM BOs, which will
show up in $debugfs/gem

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Rob Clark
7ef722861b freedreno/drm: sync uapi and enable softpin
Pull in updated UAPI and use kernel API version to enable softpin.
Since MSM_SUBMIT_BO_DUMP flag was added at same time, use that to
signal to kernel that cmdstream buffers are useful to dump for
debugging/cmdstream-traces.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-12-13 15:51:01 -05:00
Eric Anholt
4407e688cd nir: Move intel's half-float image store lowering to to nir_format.h.
I needed the same function for v3d.  This was originally in d3e046e76c
("nir: Pull some of intel's image load/store format conversion to
nir_format.h") before we made am istake about simplifying the function.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 12:24:26 -08:00
Eric Anholt
3a417a044e Revert "intel: Simplify the half-float packing in image load/store lowering."
This reverts commit 06fbcd2cd5.
nir_pack_half_2x16_split *isn't* vectorizable, it's 1-component only, thus
why we had this split-scalar code in the first place.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 12:24:24 -08:00
Eric Anholt
c2c44dba7a nir: Print the format of image variables.
This helps a lot when debugging image load/store lowering on large
testcases.  Unfortunately the Mesa enum name stuff is under src/mesa and
we can't get at it from the compiler.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 12:24:12 -08:00
Eric Anholt
19ffcba161 mesa/st: Expose compute shaders when NIR support is advertised.
We have a NIR path, and V3D doesn't have TGSI input for compute (only what
TTN can handle for the various gallium-internal shaders).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-12-13 11:44:47 -08:00
Dave Airlie
b3f2b03ece radv/xfb: fix counter buffer bounds checks.
If we gave this function 0 counter buffers, we'd still try and
access pCounterBuffers[0] as this check was incorrect.

Fixes crash with ext_transform_feedback-pipeline-basic-primgen
on zink on radv.

Fixes: 677b496b6 (radv: fix begin/end transform feedback with 0 counter buffers.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-13 19:27:05 +00:00
Jason Ekstrand
9ebc00f32e i965: Enable nir_opt_idiv_const for 32 and 64-bit integers
The pass should work for all bit sizes but it's less clear that the
extra instructions are worth it on small integers.  Also, the hardware
doesn't do mul_high on anything other than 32-bit integers and, absent
any decent mechanism for testing the pass on 8 and 16-bit types, it's
probably best to just leave it disabled for now.

Shader-db results on Sky Lake:

    total instructions in shared programs: 15105795 -> 15111403 (0.04%)
    instructions in affected programs: 72774 -> 78382 (7.71%)
    helped: 0
    HURT: 265

Note that hurt here actually means helped because we're getting rid of
integer quotient operations (which are a send on some platforms!) and
replacing them with fairly cheap ALU ops.

Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Jason Ekstrand
455ec7327d i965/vec4: Implement nir_op_uadd_sat
Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Ian Romanick
e639d39faf i965/fs: Implement nir_op_uadd_sat
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 17:49:48 +00:00
Jason Ekstrand
74492ebad9 nir: Add a pass for lowering integer division by constants
It's a reasonably well-known fact in the world of compilers that integer
divisions by constants can be replaced by a multiply, an add, and some
shifts.  This commit adds such an optimization to NIR for easiest case
of udiv.  Other division operations will be added in following commits.
In order to provide some additional driver control, the pass takes a
minimum bit size to optimize.

Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Ian Romanick
090e282407 nir: Add a saturated unsigned integer add opcode
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 17:49:48 +00:00
Jason Ekstrand
39198a1238 nir/lower_int64: Add support for [iu]mul_high
Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Jason Ekstrand
9525971e2b nir: Allow [iu]mul_high on non-32-bit types
Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Emil Velikov
a95ec13879 glx: mandate xf86vidmode only for "drm" dri platforms
Currently we have the three dri "platforms" - drm, apple and windows.

Since xf86vidmode is a thing only for the drm one, adjust the
preprocessor guards and correctly check for the dependency.

v2: terminate the GLX_USE_WINDOWSGL hunk

Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Fixes: 5bc509363b ("glx: make xf86vidmode mandatory for direct rendering")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-13 17:38:19 +00:00
Alejandro Piñeiro
c7bdcd67aa nir: remove unused variable
To avoid the following warning:
./src/compiler/nir/nir_loop_analyze.c:807:16: warning: unused variable ‘ns’ [-Wunused-variable]
    nir_shader *ns = impl->function->shader;
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-13 16:35:21 +01:00
Erik Faye-Lund
e888f28d1f virgl: work around bad assumptions in virglrenderer
Virglrenderer does the wrong thing when given an instance divisor;
it tries to use the element-index rather than the binding-index as
the argument to glVertexBindingDivisor(). This worked fine as long
as there was a 1:1 relationship between elements and bindings,
which was the case util 19a91841c3 "st/mesa: Use Array._DrawVAO in
st_atom_array.c.".

So let's detect instance divisors, and restore a 1:1 relationship in
that case. This will make old versions of virglrenderer behave
correctly. For newer versions, we can consider making a better
interface, where the instance divisor isn't specified per element,
but rather per binding. But let's save that for another day.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 19a91841c3 "st/mesa: Use Array._DrawVAO in st_atom_array.c."
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-13 16:12:10 +01:00
Erik Faye-Lund
8447b64238 virgl: wrap vertex element state in a struct
This just has one member for now; the handle. But this is about to
change.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
2018-12-13 16:12:10 +01:00