Commit graph

66141 commits

Author SHA1 Message Date
José Fonseca
3fd220e2eb gallivm: Fix white-space.
Replace tabs with spaces.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-23 10:19:33 +01:00
José Fonseca
013ff2fae1 gallivm,llvmpipe,clover: Bump required LLVM version to 3.3.
We'll need to update gallivm for the interface changes in LLVM 3.6, and
the fewer the number of older LLVM versions we support the less hairy that
will be.

As consequence HAVE_AVX define can disappear.  (Note HAVE_AVX meant
whether LLVM version supports AVX or not.  Runtime support for AVX is
always checked and enforced independently.)

Verified llvmpipe builds and runs with with LLVM 3.3, 3.4, and 3.5.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-10-23 10:18:56 +01:00
Ilia Mirkin
9ad80d1d18 mesa: remove conditional render and rgtc from ES3 requirements
The functionality exposed by those extensions does not appear in ES3

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-23 00:45:08 -04:00
Brian Paul
c9a6ec1978 u_blitter: put a comment on util_blitter_cache_all_shaders()
Trivial.
2014-10-22 17:33:40 -06:00
Brian Paul
f82a84c097 u_blitter: use ctx->bind_fs_state(), not pipe->bind_fs_state()
Consistently use the function pointer we saved earlier.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-22 17:33:40 -06:00
Brian Paul
0bcd9f5469 u_blitter: create basic fs shaders in util_blitter_cache_all_shaders()
We need to create all fs shaders in this function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-22 17:33:40 -06:00
Brian Paul
27de89d266 u_blitter: do error checking assertions for shader caching
If the user calls util_blitter_cache_all_shaders() set a flag and assert
that we never try to create any new fragment shaders after that point.
If the assertions fails, it means we missed generating some shader in
util_blitter_cache_all_shaders().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-22 17:33:40 -06:00
Anuj Phogat
7a652c41b4 glsl: Use signed array index in update_max_array_access()
Avoids a crash in case of negative array index is used in a
shader program.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-22 16:13:37 -07:00
Anuj Phogat
6f0089e92e glsl: Fix crash due to negative array index
Currently Mesa crashes with a shader like this:

[fragmnet shader]
float[5] array;
int idx = -2;
void main()
{
   gl_FragColor = vec4(0.0, 1.0, 0.0, array[idx]);
}

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-10-22 16:13:37 -07:00
Marek Olšák
8ec40adf7e radeonsi: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:05:00 +02:00
Marek Olšák
a3591da1a0 r600g: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:04:58 +02:00
Marek Olšák
8ddd2f7aee r300g: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-10-22 21:04:56 +02:00
Michel Dänzer
ae879718c4 r600g: Drop references to destroyed blend state
Fixes use-after-free when the currently bound blend state is destroyed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85267
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84140

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

Cc: mesa-stable@lists.freedesktop.org
2014-10-22 17:09:43 +09:00
Kenneth Graunke
6dc6e6e0d9 i965/vec4: Generate better code for ir_triop_csel.
Previously, we generated an extra CMP instruction:

   cmp.ge.f0(8)    g6<1>D          g1<0,4,1>F      0F
   cmp.nz.f0(8)    null            g6<4,4,1>D      0D
   (+f0) sel(8)    g5<1>F          g1.4<0,4,1>F    g2<0,4,1>F

The first operand is always a boolean, and we want to predicate the SEL
on that.  Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.

Now we generate:

   cmp.ge.f0(8)    null            g1<0,4,1>F      0F
   (+f0) sel(8)    g5<1>F          g1.4<0,4,1>F    g2<0,4,1>F

No difference in shader-db.

v2: Remember to delete the old code (thanks Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:14:03 -07:00
Kenneth Graunke
f5c3f095b9 i965/vec4: Simplify visit(ir_expression *)'s result_src/dst setup.
Using dst_reg(this, ir->type) automatically sets the writemask to the
proper size for the type; src_reg(dst_reg) preserves that.  This should
be equivalent, but less code.

Note that src_reg(dst_reg) either uses SWIZZLE_XXXX or SWIZZLE_XYZW, so
the old code did need the manual writemask adjustment, since it
constructed the registers the other way around.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:14:00 -07:00
Kenneth Graunke
cb36e79f96 i965/vec4: Delete some dead code in visit(ir_expression *).
Nothing uses the vector_elements temporary variable.

Setting this->result.file is dead because we overwrite this->result a
few lines later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:13:37 -07:00
Kenneth Graunke
4d34c4b582 i965/fs: Generate better code for ir_triop_csel.
Previously, we generated an extra CMP instruction:

   cmp.ge.f0(8)   g4<1>D          g2<0,1,0>F      0F
   cmp.nz.f0(8)   null            g4<8,8,1>D      0D
   (+f0) sel(8)   g120<1>F        g2.4<0,1,0>F    g3<0,1,0>F

The first operand is always a boolean, and we want to predicate the SEL
on that.  Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.

Now we generate:

   cmp.ge.f0(8)    null            g2<0,1,0>F      0F
   (+f0) sel(8)    g124<1>F        g2.4<0,1,0>F    g3<0,1,0>F

total instructions in shared programs: 5473459 -> 5473253 (-0.00%)
instructions in affected programs:     6219 -> 6013 (-3.31%)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-21 21:13:37 -07:00
Kenneth Graunke
32364a1fe5 glsl: Delete unused gl_uniform_driver_format enum values.
A while back, Matt made the uniform upload functions simply upload
ctx->Const.UniformBooleanTrue for boolean values instead of 0/1, which
removed the need to convert it later.  We also set UniformBooleanTrue to
1.0f for drivers which want to treat booleans as 0.0/1.0f.

Nothing ever sets these, so they are dead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-21 18:53:13 -07:00
Rob Clark
36310d9d56 freedreno/a3xx: fix depth/stencil restore format
Also fix z16 restore format which was completely wrong.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark
2bc2ab66d9 freedreno/a3xx: fix viewport state during clear
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark
3eb8289aa4 freedreno: mark scissor state dirty when enable bit changes
We don't have a scissor enable bit in hw, so when a raster state change
results in scissor enable bit changing, we need to also mark scissor
state as dirty.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Rob Clark
01b757e2b0 freedreno: clear vs scissor
The optimization of avoiding restore (mem2gmem) if there was a clear
falls down a bit if you don't have a fullscreen scissor.  We need to
make the decision logic a bit more clever to keep track of *what* was
cleared, so that we can (a) completely skip mem2gmem if entire buffer
was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that
were completely cleared.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-21 20:08:49 -04:00
Vinson Lee
1ab6543431 clover: Fix build error with LLVM 3.4.
DataLayoutPass was added in LLVM 3.5 r202168, commit
57edc9d4ff1648568a5dd7e9958649065b260dca "Make DataLayout a plain
object, not a pass.".

This patch fixes this build error with LLVM 3.4.

  CXX      llvm/libclllvm_la-invocation.lo
llvm/invocation.cpp: In function 'void {anonymous}::optimize(llvm::Module*, unsigned int, const std::vector<llvm::Function*>&)':
llvm/invocation.cpp:324:18: error: expected type-specifier
       PM.add(new llvm::DataLayoutPass(mod));
                  ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85189
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-10-21 15:40:47 -07:00
Marek Olšák
43b2432368 r600g,radeonsi: convert TGSI shader type to LLVM shader type
The values are hardcoded in the LLVM backend, but the TGSI definitions are
going to be changed with tessellation, e.g. TGSI_PROCESSOR_COMPUTE will be
increased by 2.

We'll use VS for LS and HS, because there's nothing special about them
from the LLVM backend point of view, even though the hardware side is
different. We do the same for ES.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:50 +02:00
Marek Olšák
c5a44cf3f8 radeonsi: add some missing register definitions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:50 +02:00
Marek Olšák
fc3b3354d7 radeonsi: load ring resource descriptors only once
v2: document the new functions

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:39:35 +02:00
Marek Olšák
d787608957 radeonsi: clarify shader constant load functions
I'll need indexed loads without the meta data flag for tessellation later.
Also rename load_const to buffer_load_const to distinguish it from indexed
const loads.

v2: add comments

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:35:44 +02:00
Marek Olšák
55a9b778c8 radeonsi: statically declare resource and sampler arrays
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:48 +02:00
Marek Olšák
e827bb6fe7 radeonsi: remove conversion of DX9 FACE input to GL
st/mesa and gallium expect the DX9 format, so this is useless.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:41 +02:00
Marek Olšák
a18f803a86 radeonsi: revert hack for random failures in glsl-max-varyings
This reverts commit 032e5548b3.

I've run glsl-max-varyings 30 times and it always passed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:29 +02:00
Marek Olšák
b9b0973db2 radeonsi: generate shader pm4 states right after shader compilation
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:26 +02:00
Marek Olšák
c94af8f0d7 radeonsi: make pm4 state generation for shaders independent of the context
The si_pm4_delete_state calls became useless, because the pm4 state is
always generated only once.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:22 +02:00
Marek Olšák
139bde061a radeonsi: inline si_pm4_alloc_state
It seemed like the function needed a context pointer. Let's remove it
to make it less confusing.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-21 22:17:15 +02:00
Marek Olšák
22c5886f3f r300g: replace r300_get_num_samples with a util variant 2014-10-21 22:03:55 +02:00
Marek Olšák
013850a1b7 glsl_to_tgsi: use _mesa_copy_linked_program_data
This deduplicates some code.
2014-10-21 22:01:16 +02:00
Marek Olšák
9ec305ead7 glsl_to_tgsi: fix the value of gl_FrontFacing with native integers
We must convert it to boolean from the DX9 float encoding that Gallium
specifies.

Later, we should probably define that FACE should be 0 or ~0 if native
integers are supported.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2014-10-21 22:01:16 +02:00
Marek Olšák
e8764a4673 st/mesa: add ST_DEBUG=wf option which enables wireframe rendering
Useful for tessellation.
2014-10-21 22:01:16 +02:00
Marek Olšák
5f5b83cbba gallium: add PIPE_SHADER_CAP_MAX_OUTPUTS and use it in st/mesa
With 5 shader stages and various combinations of enabled and disabled shaders,
the maximum number of outputs in one shader doesn't have to be equal to
the maximum number of inputs in the following shader.

v2: return 32 for softpipe and llvmpipe
2014-10-21 21:59:02 +02:00
Eric Anholt
ef280c95f2 vc4: Fix SRC_ALPHA_SATURATE blending.
Fixes glean blendFunc.
2014-10-21 15:46:48 +01:00
Eric Anholt
cc298023c9 vc4: Fix stencil writemask handling.
If the writemask doesn't compress, then we want to put in the uncompressed
writemask, not the compressed writemask failure value (all-on).

Fixes glean's stencil2 and fbo-clear-formats on stencil.
2014-10-21 15:16:41 +01:00
Eric Anholt
48f6351940 vc4: Don't look at back stencil state unless two-sided stencil is enabled.
Fixes regressions in the next bugfix, because gallium util stuff leaves
the back stencil state as 0 if !back->enabled.
2014-10-21 15:16:41 +01:00
Rob Clark
4f17e026bb freedreno/ir3: add debug flag to disable cp
FD_MESA_DEBUG=nocp will disable copy propagation pass.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Ilia Mirkin
f0ca26725e freedreno: positions come out as integers, not half-integers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark
3fcb021201 freedreno/a3xx: disable early-z when we have kill's
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark
8a0ffedd8d freedreno/ir3: fix potential gpu lockup with kill
It seems like the hardware is unhappy if we execute a kill instruction
prior to last input (ei).  Probably the shader thread stops executing
and the end-input flag is never set.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark
ab33a24089 freedreno/ir3: comment + better fxn name
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark
94bb33617d freedreno/a3xx: only emit dirty consts
If app only updates (for example) vertex uniforms, it would be nice to
only re-emit those and not also frag uniforms.  Means we need to mark
the first frag shader const buffer dirty after a clear.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Rob Clark
74069e324e freedreno/a3xx: more layer/level fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-20 21:42:44 -04:00
Brian Paul
aafbd89c5e mesa: fix 'feeedback' typo in comment
Trivial.
2014-10-20 11:53:34 -06:00
Brian Paul
4676c6c25b mesa: fix 'misalgned' typos in error messages
Trivial.
2014-10-20 11:50:49 -06:00