Commit graph

180 commits

Author SHA1 Message Date
Eric Anholt
f49d112a01 v3d: Implement ALPHA_TO_COVERAGE.
There's a convenient "FTOC" instruction for generating the coverage now,
unlike vc4.  This fixes
dEQP-GLES3.functional.multisample.fbo_4_samples.proportionality_alpha_to_coverage
2018-06-20 09:30:46 -07:00
Eric Anholt
07b243674f v3d: Add missing always_flush debug flag.
The #define existed and was checked in the driver.
2018-06-19 09:42:20 -07:00
Eric Anholt
778594ae12 v3d: Limit shader threading according to our maximum TMU fifo usage.
Fixes simulator assertion failures in
dEQP-GLES3.functional.shaders.texture_functions.texture.samplercubeshadow_bias_fragment
and similar complicated cases.
2018-06-15 16:09:39 -07:00
Eric Anholt
e130ada243 v3d: Fix shaders using pixel center W but no varyings.
The docs called this field "uses both center W and centroid W", but
actually it's "do you need center W even if varyings don't obviously call
for it?"

Fixes dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_w
2018-06-15 16:09:39 -07:00
Eric Anholt
d91e06a065 v3d: Fix configuration setup of mixed f32 and f16 render targets.
Fixes dEQP-GLES3.functional.fragment_out.random.26 and 6 others.
2018-06-14 16:52:25 -07:00
Eric Anholt
48011c42aa v3d: Remove unused QUNIFORM_STENCIL left over from vc4. 2018-06-14 16:52:25 -07:00
Eric Anholt
a40bc33b11 v3d: Fix undefined results for a swap_color_rb RT from a float shader output.
Fixes segfaults and undefined behavior in
dEQP-GLES3.functional.fragment_out.basic.fixed.srgb8_alpha8_lowp_float
2018-06-14 16:52:25 -07:00
Eric Anholt
9d5860310d v3d: Enable the new NIR bitfield operation lowering paths.
These together get the GLSL 3.00 unorm/snorm pack functions and
MESA_shader_integer operations working.

v2: Fix commit message typo.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-06-06 13:44:28 -07:00
Eric Anholt
2b1b2cbf61 v3d: Be more explicit about include directory from our generated code.
You'd need src/broadcom/cle/ in the -I previously, for srcdir != builddir.
nir was fine at that, but automake didn't have it.

Bugzilla: https://github.com/anholt/mesa/issues/104
2018-06-05 12:44:49 -07:00
Eric Anholt
97894b1267 v3d: Add support for glSampleMask / glSampleCoverage. 2018-05-17 15:09:46 +01:00
Eric Anholt
9bbc3f8cf1 v3d: Enable NaN propagation in the VS and CS as well.
Fixes piglit vs-isnan-*.shader_test at the expense of gl-1.0-spot-light.
2018-05-17 15:09:12 +01:00
Eric Anholt
8c47ebbd23 v3d: Rename the driver files from "vc5" to "v3d". 2018-05-16 21:19:07 +01:00
Eric Anholt
c4c488a2ae v3d: Rename the vc5_dri.so driver to v3d_dri.so.
This allows the driver to load against the merged kernel DRM driver.  In
the process, rename most of the build system variables and gallium
plumbing functions.
2018-05-16 21:19:07 +01:00
jenny.q.cao
ff7521c9ba android: change include "cutils/log.h" to "log/log.h" on Android API >=26
There is a compile warning from Android 8 (API version 26) from "include cutils/log.h"
warning: "Deprecated: don't include cutils/log.h, use either android/log.h or log/log.h"-W#warnings,
Change to include "log/log.h" on Android 8 or later major version to avoid this warning

Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-14 08:08:31 +03:00
Eric Anholt
76ee9edcb4 broadcom/vc5: Add support for centroid varyings.
It would be nice to share the flags packet emit logic with flat shade
flags, but I couldn't come up with a good way while still using our pack
macros.  We need to refactor this to shader record setup at compile time,
anyway.

Fixes ext_framebuffer_multisample-interpolation * centroid-*
2018-04-26 11:30:22 -07:00
Eric Anholt
77b4f30bae broadcom/vc5: Add validation that we don't violate GFXH-1633 requirements.
We don't use ldunifa yet, but we will eventually for UBOs.
2018-04-26 11:30:22 -07:00
Eric Anholt
089c32eefd broadcom/vc5: Add validation that we don't violate GFXH-1625 requirements.
We don't use TMUWT yet, but we will once we do SSBOs.
2018-04-26 11:30:22 -07:00
Eric Anholt
dc4cb04ee5 broadcom/vc5: Add QPU validation for register writes after thrend.
The next shader gets to start writing the register file during these
slots, so make sure we don't stomp over them.

The only case of hitting this that I could imagine would be dead writes.
2018-04-26 11:30:22 -07:00
Eric Anholt
503716fa86 broadcom/vc5: Remove leftover vc4 MSAA lowering setup in the FS key. 2018-04-25 09:21:54 -07:00
Eric Anholt
5710532e9e broadcom/vc5: Fix tile load/store of MSAA surfaces on 4.x.
For single-sample we have to always program SAMPLE_0, but for multisample
we want to store all the samples.
2018-04-25 09:21:54 -07:00
Ian Romanick
d76c204d05 util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero
The new name make the zero-input behavior more obvious.  The next
patch adds a new function with different zero-input behavior.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:09:23 -07:00
Aaron Watry
1dae92f150 broadcom/vc4: Fix out-of-tree build with automake.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-03-28 17:48:41 -07:00
Eric Anholt
81f82ecc56 broadcom/vc5: Start using nir_opt_move_load_ubo().
In the absence of a general NIR or VIR-level scheduler, this at least
avoids spilling in
GTF-GLES3.gtf.GL3Tests.uniform_buffer_object.uniform_buffer_object_storage_layouts
2018-03-28 17:48:41 -07:00
Eric Anholt
c2b13627d9 broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes.
Just like TLB without a config uniform, we don't have a register index.
2018-03-26 17:46:23 -07:00
Eric Anholt
d7a015cbc6 broadcom/vc5: Account for InstanceID/VertexID in VPM segment size.
Fixes failure in
GTF-GLES3.gtf.GL3Tests.draw_instanced.draw_instanced_attrib_size
2018-03-22 15:12:21 -07:00
Eric Anholt
ba29b89dc7 broadcom/vc5: Set up a vertex position if the shader doesn't.
Our backend needs some sort of vertex position value to emit the scaled
viewport values and such.  Fixes potential segfaults in
KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx
2018-03-22 15:12:21 -07:00
Eric Anholt
baeb6a4b4a broadcom/vc5: Fix up the NIR types of FS outputs generated by NIR-to-TGSI.
Unfortunately TGSI doesn't record the type of the FS output like GLSL
does, but VC5's TLB writes depend on the output's base type.  Just record
the type in the key at variant compile time when we've got a TGSI input
and then fix it up.

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a
GPU hang that breaks most tests that come after it.
2018-03-21 14:02:34 -07:00
Eric Anholt
00910e3057 broadcom/vc5: Don't annotate dumps with stale live intervals.
As you're debugging register allocation, you may have changed the
intervals and not recomputed yet.  Just skip the dump in that case.
2018-03-19 16:44:20 -07:00
Eric Anholt
facc3c6f58 broadcom/vc5: Add support for register spilling.
Our register spilling support is nice to have since vc4 couldn't at all,
but we're still very restricted due to needing to not spill during a TMU
operation, or during the last segment of the program (which would be nice
to spill a value of, when there's a long-lived value being passed through
with little modification from the start to the end).

We could do better by emitting unspills for the last-segment values just
before the last thrsw, since the last segment is probably not the maximum
interference area.

Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3
others.
2018-03-19 16:44:06 -07:00
Eric Anholt
271fc58ba1 broadcom/vc5: Remove redundant last_inst lookup.
The point was to get the MOV, which the MOV_dest already returned.
2018-03-19 16:42:59 -07:00
Eric Anholt
34dc64f627 broadcom/vc5: On QPU pack error, dump the instruction and return cleanly.
This is nice for debugging when you've made a bad instruction.
2018-03-19 16:42:59 -07:00
Eric Anholt
d721348dcd broadcom/vc5: Add cursors to the compiler infrastructure, like NIR's.
This will let me do lowering late in compilation using the same
instruction builder as we use in nir_to_vir.
2018-03-19 16:42:59 -07:00
Eric Anholt
c81d681742 broadcom/vc5: Move the umul macro to a header.
Anywhere we want to multiply, we probably want this.
2018-03-19 16:42:59 -07:00
Eric Anholt
9e28c18cd1 broadcom/vc5: Correct the arg count of TIDX/EIDX. 2018-03-19 16:42:59 -07:00
Eric Anholt
55bf298333 broadcom/vc5: Re-do live variables after removing thrsws.
Otherwise our start/ends ips won't line up with the actual instructions.
2018-03-19 16:42:59 -07:00
Eric Anholt
c3a504f470 broadcom/vc5: Add a QPU helper for instructions using the TLB.
This will be used for detecting last thread segment in register spilling.
2018-03-19 16:42:59 -07:00
Eric Anholt
09c4dd1971 broadcom/vc5: Introduce v3d_qpu_reads_vpm()/v3d_qpu_writes_vpm().
These helpers will be used in register spilling to determine where to add
a last thrsw if needed, and might help refactor QPU scheduling.
2018-03-19 16:42:59 -07:00
Eric Anholt
407f21ef1b broadcom/vc5: The ldvpm signal also a case of using the VPM.
The QPU scheduling code calling this function already separately checked
this signal.
2018-03-19 16:42:59 -07:00
Eric Anholt
4760040c09 broadcom/vc5: Extract v3d_qpu_writes_tmu() helper.
This will be reused in register spilling.
2018-03-19 16:42:59 -07:00
Timothy Arceri
a050ea60ee nir: add lower_ldexp to nir compiler options
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-28 09:23:49 +11:00
Eric Anholt
e29988c908 broadcom/vc5: Fix "hardwrae" typo in a field name in XML. 2018-02-05 13:53:38 +00:00
Eric Anholt
8bb000f460 broadcom/vc5: Try to merge more than 2 QPU instructions together.
Obviously it would be good to have an ADD and a MUL and a signal together,
but we can even potentially have multiple signals merged, as well.

total instructions in shared programs: 100423 -> 97874 (-2.54%)
instructions in affected programs:     78812 -> 76263 (-3.23%)
2018-02-05 09:29:37 +00:00
Eric Anholt
dc78643ace broadcom/vc5: Remove no-op MOVs after register allocation.
We emit some MOVs to track lifetimes of payload registers, but we don't
need there to be actual MOV instructions for them.

total instructions in shared programs: 101045 -> 100423 (-0.62%)
instructions in affected programs:     37083 -> 36461 (-1.68%)
2018-02-05 09:29:37 +00:00
Eric Anholt
f3978a7380 broadcom/vc5: Add missing shader-db instruction counting.
I must have misplaced it in the instruction packing rework.
2018-02-05 09:29:37 +00:00
Eric Anholt
353b42ccc7 broadcom/vc5: Fix a segfault on mix of booleans.
We don't have a src1 to look up if the compare instruction is "i2b".
2018-02-01 11:02:29 -08:00
Timothy Arceri
9a2e085680 nir: add lower_all_io_to_temps flag
This will be used for freedreno and vc4 which require all inputs
and outputs to be copied to temps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Eric Anholt
71c7e9bea1 broadcom/vc5: Enable CLIF dumping of V3D 4.2. 2018-01-27 19:04:21 +11:00
Eric Anholt
91f899cbc1 broadcom/vc5: Update the compiler for V3D 4.2. 2018-01-27 19:04:21 +11:00
Eric Anholt
f2e41daac5 broadcom/vc5: Update QPU instruction pack/unpack for v4.2.
After the 4.1 spec, 4.2 retroactively renamed patchid to barrierid because
it's used for other barriers in compute.
2018-01-27 19:03:55 +11:00
Eric Anholt
96d3e8f134 broadcom/vc5: Add XML for V3D 4.2. 2018-01-27 18:57:58 +11:00