Commit graph

6 commits

Author SHA1 Message Date
Eric Anholt
2780a99ff8 v3d: Move the stores for fixed function VS output reads into NIR.
This lets us emit the VPM_WRITEs directly from
nir_intrinsic_store_output() (useful once NIR scheduling is in place so
that we can reduce register pressure), and lets future NIR scheduling
schedule the math to generate them.  Even in the meantime, it looks like
this lets NIR DCE some more code and make better decisions.

total instructions in shared programs: 6429246 -> 6412976 (-0.25%)
total threads in shared programs: 153924 -> 153934 (<.01%)
total loops in shared programs: 486 -> 483 (-0.62%)
total uniforms in shared programs: 2385436 -> 2388195 (0.12%)

Acked-by: Ian Romanick <ian.d.romanick@intel.com> (nir)
2019-03-05 10:59:40 -08:00
Eric Anholt
81b9361b68 v3d: Stop scalarizing our uniform loads.
We can pull a whole vector in a single indirect load.  This saves a bunch
of round-trips to the TMU, instructions for setting up multiple loads,
references to the UBO base in the uniforms, and apparently manages to
reduce register pressure as well.

instructions in affected programs: 3086665 -> 2454967 (-20.47%)
uniforms in affected programs: 919581 -> 721039 (-21.59%)
threads in affected programs: 1710 -> 3420 (100.00%)
spills in affected programs: 596 -> 522 (-12.42%)
fills in affected programs: 680 -> 562 (-17.35%)

Improves 3dmmes performance by 2.29312% +/- 0.139825% (n=5)
2019-01-04 15:41:23 -08:00
Eric Anholt
b0e0086257 v3d: Remove dead switch cases and comments from v3d_nir_lower_io.
Moving things to NIR left this mess around.  All we lower now is uniforms.
2019-01-04 15:41:23 -08:00
Eric Anholt
2977c77758 v3d: Use the original bit size when scalarizing uniform loads.
Prevents a regression in jekstrand's 1-bit series.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-16 21:03:01 +00:00
Eric Anholt
cc54e1acf9 v3d: Use nir_remove_unused_io_vars to handle binner shader output DCE
We were doing this late after nir_lower_io, but we can just reuse the core
code.  By doing it at this stage, we won't even set up the VS attributes
as inputs, reducing our VPM size.
2018-10-30 10:46:52 -07:00
Eric Anholt
ade416d023 broadcom: Add VC5 NIR compiler.
This is a pretty straightforward fork of VC4's NIR compiler to VC5.  The
condition codes, registers, and I/O have all changed, making the backend
hard to share, though their heritage is still recognizable.

v2: Move to src/broadcom/compiler to match intel's layout, rename more
    "vc5" to "v3d", rename QIR to VIR ("V3D IR") to avoid symbol conflicts
    with vc4, use new v3d_debug header, add compiler init/free functions,
    do texture swizzling in NIR to allow optimization.
2017-10-10 11:42:04 -07:00