64380 commits

Author SHA1 Message Date
Eric Anholt
d9d1c14430 vc4: Add dead code elimination.
This cleans up a bunch of noise in the compiled coordinate shaders (since
we don't need the varying outputs), and also from writemasked instructions
with negated src operands.
2014-08-08 18:59:47 -07:00
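
A minimal sketch of dead code elimination on a straight-line SSA program, for illustration only. It is not the vc4 implementation: the `Inst` class, the opcode names, and the `has_side_effects` flag are hypothetical. Because each SSA temporary is written once and definitions precede uses, one backward pass is enough to find every instruction whose result is never read.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Inst:
    # Hypothetical instruction record; not the real QIR structure.
    op: str
    dst: Optional[str]
    srcs: Tuple[str, ...] = ()
    has_side_effects: bool = False


def eliminate_dead_code(insts):
    """Keep only instructions with side effects or with a result that is read."""
    live = set()
    kept = []
    # Walk backwards so every use is seen before its definition.
    for inst in reversed(insts):
        if inst.has_side_effects or inst.dst in live:
            kept.append(inst)
            live.update(inst.srcs)
    kept.reverse()
    return kept


prog = [
    Inst("MOV", "t0", ("in0",)),
    Inst("FMUL", "t1", ("t0", "t0")),
    Inst("FADD", "t2", ("t0", "in1")),  # t2 is never read, so this is dead
    Inst("WRITE_OUT", None, ("t1",), has_side_effects=True),
]
optimized = eliminate_dead_code(prog)
```

In this toy program the unused `FADD` plays the role of an unneeded varying output in a coordinate shader: nothing reads it, so it disappears.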
Eric Anholt
1d23d55ae9 vc4: Add an initial pass of algebraic optimization.
There was a lot of extra noise in my piglit shader dumps because of silly
CMPs.
2014-08-08 18:59:47 -07:00
Eric Anholt
4c53087c67 vc4: Add support for CMP.
This took a couple of tries, and this is the squash of those attempts.

v2: Fix register file conflicts on the args in the
    destination-is-accumulator case.
v3: Rebase on helper change and qir_inst4 change.
2014-08-08 18:59:47 -07:00
Eric Anholt
eea1d36915 vc4: Make scheduling of NOPs a separate step from QIR -> QPU translation.
This should also be used as a way to pair QIR instructions into QPU
instructions later.
2014-08-08 18:59:46 -07:00
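
A rough sketch of NOP scheduling as its own pass over already-translated instructions. The hazard rule used here (a register may not be read by the instruction immediately following its write) is an assumption made for the example, and the tuple layout and register names are hypothetical.

```python
def schedule_nops(insts):
    """Insert a NOP wherever an instruction reads the previous one's result.

    insts is a list of (dst, srcs) tuples. The one-instruction hazard
    window is an assumption of this sketch, not a documented rule.
    """
    out = []
    prev_dst = None
    for dst, srcs in insts:
        if prev_dst is not None and prev_dst in srcs:
            out.append(("nop", ()))
        out.append((dst, srcs))
        prev_dst = dst
    return out


translated = [
    ("ra0", ("unif",)),
    ("ra1", ("ra0",)),   # reads ra0 right after it is written
    ("ra2", ("unif",)),
]
scheduled = schedule_nops(translated)
```

Keeping this step separate from translation means a later pass could try to fill the slot with an independent instruction instead of a NOP.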
Eric Anholt
c293927511 vc4: Add WIP support for varyings.
It doesn't do all the interpolation yet, but more tests can run now.

v2: Rebase on helpers.
2014-08-08 18:59:46 -07:00
Eric Anholt
db9f41ea88 vc4: Use r3 instead of r5 for temps, since r5 only has 32 bits of storage
Reserving a whole accumulator for temps is awful in the first place, but
I'll fix that later.
2014-08-08 18:59:46 -07:00
Eric Anholt
23b2bad991 vc4: Fix emit of ABS
v2: Rebase on qir helpers.
2014-08-08 18:59:46 -07:00
Eric Anholt
cf2d777fbe vc4: Add shader variant caching to handle FS output swizzle. 2014-08-08 18:59:46 -07:00
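
One way to picture shader variant caching is a lookup keyed on the shader plus the state that changes the generated code, here the output swizzle. This is only a sketch: `VariantCache`, `compile_fn`, and the key layout are hypothetical and do not reflect the driver's actual data structures.

```python
class VariantCache:
    """Compile each (shader, swizzle) combination at most once."""

    def __init__(self, compile_fn):
        self._compile = compile_fn   # hypothetical compile callback
        self._variants = {}

    def get(self, shader_src, swizzle):
        key = (shader_src, tuple(swizzle))
        if key not in self._variants:
            self._variants[key] = self._compile(shader_src, swizzle)
        return self._variants[key]


compile_calls = []


def fake_compile(shader_src, swizzle):
    # Stand-in for real code generation; records each call.
    compile_calls.append((shader_src, tuple(swizzle)))
    return "code(%s,%s)" % (shader_src, "".join(swizzle))


cache = VariantCache(fake_compile)
rgba = cache.get("fs0", "rgba")
bgra = cache.get("fs0", "bgra")
rgba_again = cache.get("fs0", "rgba")
```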
Eric Anholt
6cf86dd487 vc4: Load the tile buffer before incrementally drawing.
We will want to occasionally disable this again when we do clear support.

v2: Squash with the previous commit (I accidentally committed at two
    stages of writing the change)
2014-08-08 18:59:46 -07:00
Eric Anholt
c3f96060a8 vc4: Don't reallocate the tile alloc/state bos every frame.
This was a problem for the simulator since we don't free memory back to
it, and it would soon just run out.
2014-08-08 18:59:46 -07:00
Eric Anholt
21db430210 vc4: Add VC4_DEBUG env option
v2: Fix an accidental deletion of some characters from the copyright
    message (caught by Ilia Mirkin)
2014-08-08 18:59:46 -07:00
Eric Anholt
2e35981d4d vc4: Add support for SNE/SEQ/SGE/SLT. 2014-08-08 18:59:46 -07:00
Eric Anholt
7108c24fd0 vc4: Use the user's actual first vertex attribute.
This is hardcoded to read it as RGBA32F so far, but starts to get more
tests working.
2014-08-08 18:59:46 -07:00
Eric Anholt
427f934f9e vc4: Fix UBO allocation when no uniforms are used.
We do rely on a real BO getting allocated, so make sure we ask for a non-zero size.
2014-08-08 18:59:46 -07:00
Eric Anholt
db8712bcbc vc4: Add initial support for math opcodes 2014-08-08 18:59:46 -07:00
Eric Anholt
792d1c92df vc4: Switch to actually generating vertex and fragment shader code from TGSI.
This introduces an IR (QIR, for QPU IR) to do optimization on.  It's a
scalar, SSA IR in general.  It looks like optimization is pretty easy this
way, though I haven't figured out if it's going to be good for our weird
register allocation or not (or if I want to reduce to basically QPU
instructions first), and I've got some problems with it having some
multi-QPU-instruction opcodes (SEQ and CMP, for example) which I probably
want to break down.

Of course, this commit mostly doesn't work, since many other things are
still hardwired, like the VBO data.

v2: Rewrite to use a bunch of helpers (qir_OPCODE) for emitting QIR
    instructions into temporary values, and make qir_inst4 take the 4 args
    separately instead of an array (all later callers wanted individual
    args).
2014-08-08 18:59:46 -07:00
Eric Anholt
e59890aebb vc4: Start converting the driver to use vertex shaders.
Note: This is the cutoff point where I switched from developing primarily
on the Pi to developing on the simulator.  As a result, from this point on
the code is untested on the Pi (the kernel code I have currently wasn't
rendering anything at this commit, though the simulator renders
successfully, suggesting kernel bugs).
2014-08-08 18:59:46 -07:00
Eric Anholt
1850d0a1cb vc4: Initial skeleton driver import.
This mostly just takes every draw call and turns it into a sequence of
commands that clear the FBO and draw a single shaded triangle to it,
regardless of the actual input vertices or shaders.  I copied the initial
driver skeleton mostly from freedreno, and I've preserved Rob Clark's
copyright for those.  I also based my initial hardcoded shaders and
command lists on Scott Mansell (phire)'s "hackdriver" project, though the
bit patterns of the shaders emitted end up being different.

v2: Rebase on gallium megadrivers changes.
v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change.
v4: Rely on simpenrose actually being installed when building for
    simulation.
v5: Add more header duplicate-include guards.
v6: Apply Emil's review (protection against vc4 sim and ilo at the same
    time, and dropping the dricommon drm bits) and fix a copyright header
    (thanks, Roland)
2014-08-08 18:59:46 -07:00
Roland Scheidegger
f017e32c0a draw: (trivial) use information about gs being present from variant key
This is a purely cosmetic change.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-08-09 03:52:58 +02:00
Roland Scheidegger
6d2ecdb4a6 draw: don't use clipvertex output if user plane clipping is disabled
The non-llvm path made sure that both clip and pre_clip_pos point to the data
output by position, not clipvertex, if user based clipping is disabled.
However, the llvm path did not, which apparently led to failures if
gl_ClipVertex was written but user plane clipping not enabled (bug 80183).
Why I have no idea really, but just make it match the non-llvm behavior...

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-08-09 03:52:58 +02:00
Chris Forbes
0f4c5a70c6 i965: Get rid of backend_instruction::sampler
The generators no longer use this.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:35 +12:00
Chris Forbes
298da9fa2a i965/vec4/Gen8: Use src1 for sampler_index instead of ->sampler field
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:33 +12:00
Chris Forbes
6be68767b9 i965/vec4/Gen4-7: Use src1 for sampler_index instead of ->sampler field
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:31 +12:00
Chris Forbes
1a3fd11aef i965/vec4: Pass sampler index in src1 for texture ops
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:29 +12:00
Chris Forbes
2f4e12a835 i965/vec4: Collect all emits of texture ops into one place
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:27 +12:00
Chris Forbes
db09fd5957 i965/fs/Gen8: Pass sampler_index to generate_tex
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:25 +12:00
Chris Forbes
ba5f7a361a i965/fs/Gen4-7: Pass sampler_index to generate_tex
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:23 +12:00
Chris Forbes
191bc64f82 i965/blorp: Put sampler index in src1 of texture ops
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:21 +12:00
Chris Forbes
a578592fd2 i965/fs: pass sampler as src1 of texture op
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:17 +12:00
Chris Forbes
f6a0192f7d i965/fs: Collect all emits of texture ops for Gen5/6 into one place
Reduces duplication, and will do so even more when we change the sampler
plumbing.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:13 +12:00
Chris Forbes
d1b136fdd0 i965/fs: Collect all emits of texture ops for Gen4 into one place
Reduces duplication, and will do so even more when we change the sampler
plumbing.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:11:33 +12:00
Pali Rohár
39a4cc45a4 configure: check for dladdr via AC_CHECK_FUNC/AC_CHECK_LIB
Use both macros as in some cases using AC_CHECK_FUNCS alone may fail.
Thus HAVE_DLADDR will not be defined, and as a result most of the code
in megadriver_stub.c will not be compiled, breaking backwards
compatibility between older libGL/xserver(s) and DRI megadrivers.

Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
[Emil Velikov] Commit message.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-08-08 19:26:39 +01:00
Emil Velikov
16826a36ef util: remove ralloc_test
The test is an empty stub, which we're currently building twice.
If anyone is interested in expanding it (adding actual tests) they
can always bring it back.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-08-08 19:23:25 +01:00
Darius Goad
5492296318 gallivm: Handle MSAA textures in emit_fetch_texels
This support is preliminary due to the fact that MSAA is not
actually implemented.

However, this patch does fix the piglit test:
spec/!OpenGL 3.2/glsl-resource-not-bound 2DMS (bug #79740).

(v2 RS: don't emit 4th coord as explicit lod)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-08-08 18:54:08 +02:00
Roland Scheidegger
394ea139c7 draw: hack around weird primitive id input in gs
The distinction between system values and ordinary inputs is not very
obvious in gallium - further fueled by the fact that they use the same
semantic names.
Still, if there's any value which imho really is a system value, it's the
primitive id input into the gs (while earlier (tessellation) stages could read
it, it is _always_ generated by the system). For some odd reason though (which
I'd classify as a bug but seems too complicated to fix) the glsl compiler in
mesa treats this as an ordinary varying, and everything else after that
(including the state tracker and other drivers) just goes along with that.
But input fetching in gs for llvm based draw was definitely limited to the
ordinary (2-dimensional) inputs so only worked with other state trackers,
the code was also additionally relying on tgsi_scan_shader filling
uses_primid correctly which did not happen either (would set it only for
all stages if it was a system value, but only set it for the fragment shader
if it was an input value).
This fixes piglit glsl-1.50-geometry-primitive-id-restart and primitive-id-in
in llvmpipe.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-08-08 18:54:08 +02:00
Roland Scheidegger
92a059d294 draw: fix prim id float cast for non-llvm path
These values are always uints, casting them to floats does no good.
Fixes piglit glsl-1.50-geometry-primitive-id-restart tests for softpipe.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-08-08 18:54:07 +02:00
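
To illustrate why a value cast is harmful for an integer primitive id, the sketch below compares a value conversion with a plain bit reinterpretation. It assumes the consumer reads the stored 32 bits back as an unsigned integer; the helper names are made up for this example.

```python
import struct


def float_bits_as_uint(f):
    """Reinterpret the 32-bit pattern of a float as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", f))[0]


def uint_bits_as_float(u):
    """Reinterpret the 32-bit pattern of an unsigned integer as a float."""
    return struct.unpack("<f", struct.pack("<I", u))[0]


prim_id = 3

# Value cast: 3 becomes 3.0, whose bit pattern is no longer the integer 3.
cast_bits = float_bits_as_uint(float(prim_id))

# Bit-preserving path: the integer survives a round trip unchanged.
kept_bits = float_bits_as_uint(uint_bits_as_float(prim_id))
```

After the value cast, a shader reading the slot as a uint sees `0x40400000` rather than 3.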
Bruno Jiménez
ec73778f1f clover: Add support for CL_MAP_WRITE_INVALIDATE_REGION
OpenCL 1.2 CL_MAP_WRITE_INVALIDATE_REGION sounds a lot like
PIPE_TRANSFER_DISCARD_RANGE:

From OpenCL 1.2 spec:
    The contents of the region being mapped are to be discarded.

From p_defines.h:
    Discards the memory within the mapped region.

v2: Move the code for validating flags to the front-end as
    suggested by Francisco Jerez

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-08-08 18:06:14 +03:00
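
A hedged sketch of the flag translation this message describes. The constant names come from the message and the two APIs, but the numeric values are placeholders chosen for this example, `translate_map_flags` is a hypothetical helper, and treating the invalidate flag as implying a write is an assumption.

```python
# Placeholder bit values for this sketch only.
CL_MAP_READ = 1 << 0
CL_MAP_WRITE = 1 << 1
CL_MAP_WRITE_INVALIDATE_REGION = 1 << 2

PIPE_TRANSFER_READ = 1 << 0
PIPE_TRANSFER_WRITE = 1 << 1
PIPE_TRANSFER_DISCARD_RANGE = 1 << 8


def translate_map_flags(cl_flags):
    """Translate OpenCL map flags into gallium-style transfer flags."""
    pipe_flags = 0
    if cl_flags & CL_MAP_READ:
        pipe_flags |= PIPE_TRANSFER_READ
    if cl_flags & CL_MAP_WRITE:
        pipe_flags |= PIPE_TRANSFER_WRITE
    if cl_flags & CL_MAP_WRITE_INVALIDATE_REGION:
        # Assumption: the old contents are discarded and the range is written.
        pipe_flags |= PIPE_TRANSFER_WRITE | PIPE_TRANSFER_DISCARD_RANGE
    return pipe_flags
```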
Chia-I Wu
8d853468bd ilo: break down the format table
The PRMs no longer have a single table for format capabilities.  Multiple
tables take up less space, and are easier to maintain.

Encode typed write information while at it.
2014-08-08 20:23:56 +08:00
Kenneth Graunke
ae95b9dd9b i965: Emit a performance warning on conditional rendering.
We have a CPU-side implementation of conditional rendering; it really
should be done on the GPU.  It's not necessarily that hard, but nobody
has gotten to fixing it yet.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-08-08 00:52:10 -07:00
Kenneth Graunke
e9a9d441f0 i965: Set ExecSize to 16 for loop instructions in SIMD16 shaders.
Previously, we explicitly set the execution size to BRW_EXECUTE_8 and
disabled compression for loop instructions.  I can't imagine how this
could be correct in SIMD16 mode.

Looking at the history, it appears that this code has used BRW_EXECUTE_8
since 2007, when we had a SIMD8 backend that supported control flow and
a separate SIMD16 backend that didn't.  Presumably, when we added SIMD16
support for shaders with control flow, we simply neglected to update it.

Note that Gen4-5 don't support SIMD16 on shaders with control flow.

This might be a candidate for stable, but would need to be rewritten
completely due to the brw_inst API changes in master.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-08-08 00:51:50 -07:00
Kenneth Graunke
e64dbd050d i965/eu: Merge brw_CONT and gen6_CONT.
The only difference is setting PopCount on Gen4-5.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-08-08 00:51:44 -07:00
Kenneth Graunke
e7a7b3317c i965/eu: Drop redundant brw_set_src0/brw_set_dest from gen6_CONT.
We shouldn't need to set them, then set them differently.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-08-08 00:51:34 -07:00
Juha-Pekka Heikkila
d64be94294 util: add src/util/format_srgb.c to .gitignore
format_srgb.c is generated by the format_srgb.py Python script. Adding
format_srgb.c to the git ignore list will silence git complaints about
an untracked file.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-08-08 09:49:52 +03:00
Ian Romanick
89d92fc00e mesa: Fold _mesa_uniform_merge_location_offset into its only caller
Also delete the comment before that function.  Everything in that
comment was either stale, wrong, or captured elsewhere.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-07 16:17:55 -07:00
Ian Romanick
1c759e32d8 mesa: Fold _mesa_uniform_split_location_offset into its only caller
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-07 16:17:53 -07:00
Ian Romanick
e0c867372a glsl_to_tgsi: Delete unused function set_uniform_initializer
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-07 16:17:50 -07:00
Ian Romanick
8f81f4e185 mesa: Use MAX2 to calculate maximum uniform element
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-07 16:17:48 -07:00
Ian Romanick
411abcb237 mesa: Have validate_uniform_parameters return the gl_uniform_storage pointer
This simplifies all the callers, and it enables the removal of one of
the function parameters.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-08-07 16:17:45 -07:00
Carl Worth
f28a105868 glsl/glcpp: Rename one test to avoid a duplicate test number
With two tests both numbered 118, there was a confusing off-by-two difference
between the last test number and the total number of tests (as reported by
glcpp-test).

With this rename, there's only an off-by-one difference left, (which is easy
to understand given the zero-based test numbering).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-07 16:08:29 -07:00
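
The arithmetic behind the off-by-two can be sketched as follows. The test counts are invented for illustration; only the duplicated number 118 comes from the message.

```python
def last_number_and_count(test_numbers):
    """Return (highest test number, number of tests)."""
    return max(test_numbers), len(test_numbers)


# Hypothetical suite: numbers 0..118, with 118 used twice (120 tests).
before = list(range(0, 119)) + [118]
# After renaming the duplicate to the next free number: 0..119.
after = list(range(0, 120))

last_before, count_before = last_number_and_count(before)
last_after, count_after = last_number_and_count(after)
```

With zero-based numbering the last number is always one less than the count, so a single duplicate widens the gap to two.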
Carl Worth
41540997fb glsl/glcpp: Fix handling of commas that result from macro expansion
Here is some additional stress testing of nested macros where the expansion
of macros involves commas, (and whether those commas are interpreted as
argument separators or not in subsequent function-like macro calls).

Credit to the GCC documentation that directed my attention toward this issue:

	https://gcc.gnu.org/onlinedocs/gcc-3.2/cpp/Argument-Prescan.html

Fixing the bug required only removing code from glcpp. When first testing the
details of expansions involving commas, I had come to the mistaken conclusion
that an expanded comma should never be treated as an argument separator, (so
had introduced the rather ugly COMMA_FINAL token to represent this).

In fact, an expanded comma should be treated as a separator, (as tested here),
and this treatment can be avoided by judicious use of parentheses (as also
tested here).

With this simple removal of the COMMA_FINAL token, the behavior of glcpp
matches that of gcc's preprocessor for all of these hairy cases.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-07 16:08:29 -07:00
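
A small model of the behavior described above, assuming a preprocessor that splits macro arguments on commas at parenthesis depth zero. `split_args` is a hypothetical helper, not glcpp code. It shows the two cases from the message: a comma produced by expansion acts as a separator on rescan, and wrapping the text in parentheses prevents that.

```python
def split_args(text):
    """Split macro argument text on commas not nested inside parentheses."""
    args, depth, cur = [], 0, ""
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            args.append(cur.strip())
            cur = ""
        else:
            cur += ch
    args.append(cur.strip())
    return args


# Suppose a macro expands to "a,b" and is substituted into an inner call:
# on rescan the expanded comma separates two arguments.
expanded = split_args("a,b")

# Parenthesizing the expansion keeps it as a single argument.
protected = split_args("(a,b)")
```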