Commit graph

67987 commits

Author SHA1 Message Date
Axel Davy
5c61f6344a st/nine: fix early basetexture destruction
Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Patrick Rudolph
dfeca90419 st/nine: Do not leak private data in volume9.
This->data was allocated by nine, but not freed.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-02-06 00:07:18 +01:00
Patrick Rudolph
b3afcc0968 st/nine: Check block alignment for compressed textures in NineSurface9_CopySurface
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-02-06 00:07:18 +01:00
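Note: a minimal sketch of what a block-alignment check for a copy rectangle might look like. The helper name and its parameters are hypothetical and are not taken from the patch; it assumes the rule is simply that the origin and size of the rectangle must be multiples of the format's block dimensions (for example 4x4 for DXT-style formats).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not the function from the patch.
 * Returns true when the rectangle starts and ends on block boundaries. */
static bool
rect_is_block_aligned_sketch(unsigned x, unsigned y,
                             unsigned width, unsigned height,
                             unsigned block_w, unsigned block_h)
{
   assert(block_w > 0 && block_h > 0);   /* avoid division by zero */
   return x % block_w == 0 && y % block_h == 0 &&
          width % block_w == 0 && height % block_h == 0;
}
```

With block dimensions of 1x1 (an uncompressed format) every rectangle passes, so the same check can be applied unconditionally.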
Axel Davy
65ce2b2848 st/nine: Commit sampler views again if srgb state changed.
This fixes a wine test and some minor visual issues on some games.

The patch is not optimal; there is probably a more efficient way to
fix this issue, but the code there already has some inefficiencies.
There are plans to rewrite that part of the code to make it more
efficient.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
2d2286d17c st/nine: Fix use of D3DSP_NOSWIZZLE
D3DSP_NOSWIZZLE already contains the shift.
Detected with Clang.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Reviewed-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
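Note: a small sketch of why a constant that "already contains the shift" must not be shifted or compared against an unshifted field. The SKETCH_ macros are local stand-ins that assume the layout of the public d3d9types.h header (swizzle field at bit 16, identity swizzle x,y,z,w); the buggy function shows one plausible shape of such a mistake and is not the code the patch changed.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins (assumed layout): two bits per component, field at bit 16. */
#define SKETCH_SWIZZLE_SHIFT 16
#define SKETCH_SWIZZLE_MASK  0x00FF0000u
#define SKETCH_NOSWIZZLE \
   ((0u << (SKETCH_SWIZZLE_SHIFT + 0)) | (1u << (SKETCH_SWIZZLE_SHIFT + 2)) | \
    (2u << (SKETCH_SWIZZLE_SHIFT + 4)) | (3u << (SKETCH_SWIZZLE_SHIFT + 6)))

/* Correct: compare the masked field against the pre-shifted constant. */
static int
token_has_no_swizzle_sketch(uint32_t token)
{
   return (token & SKETCH_SWIZZLE_MASK) == SKETCH_NOSWIZZLE;
}

/* Buggy: the field is shifted down to 0..0xFF, so it can never equal
 * the pre-shifted constant. Compilers may warn about such comparisons. */
static int
token_has_no_swizzle_buggy_sketch(uint32_t token)
{
   uint32_t swz = (token & SKETCH_SWIZZLE_MASK) >> SKETCH_SWIZZLE_SHIFT;
   return swz == SKETCH_NOSWIZZLE;
}

/* Usage example: an identity-swizzled token is only detected by the
 * correct variant. */
static void
noswizzle_sketch_selfcheck(void)
{
   assert(token_has_no_swizzle_sketch(0x80E40001u));
   assert(!token_has_no_swizzle_buggy_sketch(0x80E40001u));
}
```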
Axel Davy
1f3b7d4039 st/nine: Check for the correct number of constants.
This removes an unneeded hack for Anno 1404.
This app does not check the number of supported
constants, and relies on the shader compilation to fail
if it uses too many constants.

This patch also checks for the correct number of constants for ps.

Note that we don't check the official limitations for old vs and ps
versions. The restrictions were fixed, unlike for the number of vertex
shader constants for later versions. Apps likely use the correct number,
and it's not a problem for us if one wants to use more.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
d0aeb4422b st/nine: Introduce failure handling for shader parsing.
Instead of crashing on buggy shaders, we should return an error.
This patch introduces this behaviour in the case of invalid constant
access.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
6fcc2c8872 st/nine: Print warnings for r500 when shader is likely to go wrong
r500 doesn't have enough float constants for vs to cover all needs.
Overlapping issues can happen with complex shaders.
The fix would be to recompile shaders to include the integer
and boolean constants, instead of reserving slots for them.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
70a523818f st/nine: Declare constants only up to the maximum needed.
Previously 276 constants were declared every time.

This patch makes shaders declare constants up to the maximum
constant needed and moves the moment we print the TGSI
shader after the moment we declare the constants.

This is needed for r500: when indirect addressing is used,
it cannot reduce the number of constants needed, and it is
restricted to 256 constant slots.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
a249c7a161 st/nine: Refactor how user constbufs sizes are calculated
Count explicitly the slots for float, int and bool constants,
and deduce the constbuf size in nine_shader.

Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
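Note: a minimal sketch of deriving a constant buffer size from explicit slot counts. The function names are hypothetical, and the layout is an assumption: one vec4 slot per float or int constant, four booleans packed per slot, 16 bytes per slot. That packing was chosen for illustration because, with 256 float, 16 int and 16 bool constants (also assumed), it reproduces the figure of 276 mentioned in the log.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical: number of vec4 slots needed for the given constant counts.
 * Assumes four booleans share one slot. */
static unsigned
const_slots_sketch(unsigned num_float, unsigned num_int, unsigned num_bool)
{
   return num_float + num_int + (num_bool + 3) / 4;
}

/* Hypothetical: buffer size in bytes, assuming four 32-bit components
 * (16 bytes) per slot. */
static unsigned
constbuf_size_bytes_sketch(unsigned num_float, unsigned num_int,
                           unsigned num_bool)
{
   unsigned slots = const_slots_sketch(num_float, num_int, num_bool);
   assert(slots <= UINT_MAX / 16);   /* guard against overflow */
   return slots * 16;
}
```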
Axel Davy
65ca8e4b3d st/nine: Explicit nine requirements
This patch raises nine requirements and disables nine for old
hw that doesn't match them.

Currently, for these cards, only games that don't have tight requirements
would work well with nine. However, nine is missing several checks
regarding these limitations.
To make code and future patches less heavy, dropping support for these old
cards seems a good solution.

That makes r500 the only dx9 generation cards supported by nine. It seems to be the one
with the fewest limitations for nine. Still, not everything is ok, and we'll have,
for example, to implement shader recompilation for these cards to include
integer and boolean constants in the shader.
Eventually, when this is done, we can reintroduce support for older cards.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Axel Davy
eb1c12d20d gallium: Add MULTISAMPLE_Z_RESOLVE cap
Resolving a multisampled depth texture into
a single-sampled texture is supported on >= SM4.1
hw. It is possible that some earlier hw supports it.

The ability was tested on radeonsi and nvc0.
Apparently it is also supported for radeon >= r700.

This patch adds the MULTISAMPLE_Z_RESOLVE cap and
adds it to the drivers. It is advertised only for drivers
that are known to support the ability.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-02-06 00:07:18 +01:00
Laura Ekstrand
77cc799853 GL: Update glext.h to Revision 29735 (20150202).
Khronos modified glext.h to get rid of GL_TEXTURE_BINDING, a special enum
added for ARB_direct_state_access.  This enum was ruled unimplementable.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Laura Ekstrand <laura@jlekstrand.net>
2015-02-05 11:41:26 -08:00
Jose Fonseca
08efcc0960 llvmpipe: Trivially advertise PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT.
Nothing special needs to be done.

Even though llvmpipe copies constant (ie uniform) buffers internally, the
application is supposed to flush and sync, so all should work.

All bufferstorage piglit tests pass.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-02-05 16:16:47 +00:00
Matt Turner
2335153ff2 i965: Remove now unnecessary Gen8 CMP destination type override.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-04 12:14:35 -08:00
Matt Turner
6b3a301f61 i965: Set CMP's destination type to src0's type.
Allows CMP instructions with float sources to be compacted and coissued.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-04 12:14:34 -08:00
Matt Turner
7e60794392 i965/fs: Implement the WaCMPInstFlagDepClearedEarly work-around.
Prevents piglit regressions from the next patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-04 12:14:34 -08:00
Jose Fonseca
661c8bb220 gallium/util: Don't implement u_bit_scan64 on MSVC.
As ffsll doesn't exist in MSVC yet, and u_bit_scan64 is only used by
radeonsi which is never built with MSVC.

This is just a stop-gap fix to unbreak MSVC build until we refactor these
mathematical portability wrappers into src/util.

Trivial.
2015-02-04 15:22:59 +00:00
Jose Fonseca
46f1033067 gallium/util: Define ffsll on MinGW.
Trivial.

(Fixing MSVC will be far less so, as _BitScanForward64 is only supported on x64.)
2015-02-04 14:58:20 +00:00
Marek Olšák
6c5af1dc4e radeonsi: implement polygon stippling
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
6895dfb184 radeonsi: add polygon stipple texture slot
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
1fe7ba8c69 radeonsi: deduce rasterizer primitive type at the beginning of draw_vbo
I will need this for polygon stippling.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
8f65e6eae8 radeonsi: allow 64 descriptors per array
We need a slot for the stipple texture and the pixel shader already uses
32 textures (16 API slots + 16 FMASK slots).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
9af943c32e radeonsi: add support for sampler views where resource = NULL
The hardware obeys swizzles even if the resource is NULL.
This will be used by set_polygon_stipple.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
70e4243f07 radeonsi: add support for NULL texture sampler views that return (0,0,0,1)
This used to hang.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
82f64a68a4 radeonsi: fix a crash when binding a NULL sampler view list
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
b142dd2f24 radeonsi: move the buffer descriptor to the end of the image descriptor
This will allow supporting NULL textures.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
afe1e6acdd radeonsi: don't use tgsi_parse_context to get processor type
Also remove unused "tokens".

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
50908a8918 radeonsi: fix instanced arrays with non-zero start instance
Fixes piglit ARB_base_instance/arb_base_instance-drawarrays.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
658f1d4cfe r600g,radeonsi: don't append to streamout buffers that haven't been used yet
The FILLED_SIZE counter is uninitialized at the beginning, so we can't use it.
Instead, use offset = 0, which is what we always do when not appending.

This unexpectedly fixes spec/ARB_texture_multisample/sample-position/*.
Yes, the test does use transform feedback.

Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
b616429ca8 gallium: set PIPE_MAX_SAMPLERS to 18
So that drivers that use higher slots do not crash in tgsi_shader_info.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
8fc542aa89 gallium/u_pstipple: add ability to specify a fixed texture unit
E.g. r600g can use slot 17, which is outside of the API range.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-04 14:34:13 +01:00
Marek Olšák
50433ea526 gallium/util: add u_bit_scan64
Same as u_bit_scan, but for uint64_t.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-04 14:34:13 +01:00
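Note: a minimal sketch of a 64-bit bit scan of the kind described above: return the index of the lowest set bit and clear it from the mask. The function name is a hypothetical stand-in, and the sketch assumes a GCC- or Clang-compatible compiler providing __builtin_ctzll; it is not the implementation from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: pop the lowest set bit of *mask and return
 * its index (0..63). The mask must be non-zero. */
static inline int
bit_scan64_sketch(uint64_t *mask)
{
   assert(*mask != 0);               /* ctz of zero is undefined */
   int i = __builtin_ctzll(*mask);   /* index of the lowest set bit */
   *mask &= *mask - 1;               /* clear that bit */
   return i;
}

/* Usage example: visit every set bit exactly once and count them. */
static inline int
count_bits_sketch(uint64_t mask)
{
   int n = 0;
   while (mask) {
      (void)bit_scan64_sketch(&mask);
      n++;
   }
   return n;
}
```

The loop form is the typical use: iterate over only the slots that are actually enabled in a 64-bit mask.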
Marek Olšák
f2328ffdc8 tgsi: add tgsi_get_processor_type helper from radeon
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-04 14:34:13 +01:00
Kenneth Graunke
ccbe15f332 i965/fs: Fix saturate on MAD and LRP with the NIR backend.
Fixes misrendering in "Witcher 2" with INTEL_USE_NIR=1, and probably
many other programs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-04 00:34:57 -08:00
Iago Toral Quiroga
1b029f8a4a mesa: Fix _mesa_format_convert fallback path when src is not an array format
When a rebase swizzle is provided and we call _mesa_swizzle_and_convert
after unpacking the source format we were always passing normalized=false.
We should pass true or false depending on the formats involved in the
conversion for the byte and float paths (the integer path cannot ever be
normalized).

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2015-02-04 08:08:34 +01:00
Park, Jeongmin
6fd4a61ad6 st/osmesa: Fix osbuffer->textures indexing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930
Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-03 15:46:56 -07:00
Connor Abbott
ab24e12706 i965/nir: use redundant phi optimization
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 16:00:13 -05:00
Connor Abbott
a135f34080 nir: add an optimization to remove useless phi nodes
This removes phi nodes whose sources all point to the same thing.

Shader-db results:

total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%)
NIR instructions in affected programs:     126564 -> 122480 (-3.23%)
helped:                                615
HURT:                                  0

total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%)
FS instructions in affected programs:     24622 -> 23174 (-5.88%)
helped:                                138
HURT:                                  0

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 16:00:13 -05:00
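Note: a simplified sketch of the test behind this optimization, checking whether all sources of a phi refer to the same value. The struct and function names are hypothetical and values are modelled as plain integer ids; the real pass works on NIR data structures and may handle further cases.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified phi node: each source is a value id (>= 0). */
struct phi_sketch {
   int dest;
   size_t num_srcs;
   const int *srcs;
};

/* Returns the value id that every source refers to, or -1 when the
 * sources differ (or there are none). A phi with a single common source
 * is useless: its uses can be rewritten to that source directly. */
static int
phi_single_source_sketch(const struct phi_sketch *phi)
{
   assert(phi != NULL);
   if (phi->num_srcs == 0)
      return -1;

   int first = phi->srcs[0];
   for (size_t i = 1; i < phi->num_srcs; i++) {
      if (phi->srcs[i] != first)
         return -1;
   }
   return first;
}
```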
Jason Ekstrand
572d1f6e41 nir/validate: Ensure that phi sources are SSA-only
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 12:52:42 -08:00
Jason Ekstrand
5420774510 nir/validate: Validate that only float ALU outputs are saturated
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 12:46:55 -08:00
Jason Ekstrand
c0df85cca4 nir/lower_source_mods: Don't lower saturate for non-float outputs
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 12:46:38 -08:00
Jason Ekstrand
8776b1b14b i965/fs_nir: Get rid of get_alu_src
Originally, get_alu_src was supposed to handle resolving swizzles and
things like that.  However, now that basically every instruction we have
only takes scalar sources, we don't really need it anymore.  The only case
where it's still marginally useful is for the mov and vecN operations that
are left over from SSA form.  We can handle those cases as a special case
easily enough.  As a side-effect, we don't need the vec_to_movs pass
anymore.

v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Rework the way we detect if we need an extra copy for swizzling.  The
   old code involved a pile of confusing switch fall-throughs; we now use a
   loop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-03 12:33:11 -08:00
Jason Ekstrand
112d738b91 i965/fs: Use NIR's scalarizing abilities and stop handling vectors
Now that we can scalarize with NIR, there's no need for all this code
anymore.  Let's get rid of it and just do scalar operations.

v2: run copy prop before lowering phi nodes

v3: Get rid of the "emit(...)->saturate = foo" pattern

v4: Run alu_to_scalar as an optimization pass

total instructions in shared programs: 5998321 -> 5974070 (-0.40%)
instructions in affected programs:     732075 -> 707824 (-3.31%)
helped:                                3137
HURT:                                  191
GAINED:                                18
LOST:                                  0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-03 12:33:11 -08:00
Jason Ekstrand
f2adcd36cb nir: Add a pass to lower vector phi nodes to scalar phi nodes
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Add better comments
 - Use nir_ssa_dest_init and nir_src_for_ssa more places
 - Fix some void * casts

v3 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Rework the way we determine whether or not to scalarize a phi node to
   make the recursion non-bogus
 - Treat load_const instructions as scalarizable

v4 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Allow uniform and input loads to be scalarizable

v5 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Also consider loads of inputs (varying, uniform, or ubo) to be
   scalarizable.  We were already doing this for load_var on uniforms and
   inputs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-03 12:33:11 -08:00
Matt Turner
e87928a494 i965/fs: Add support for constant propagating into sources with modifiers.
All but 16 of the programs helped were ARB fp programs.

total instructions in shared programs: 5949286 -> 5945470 (-0.06%)
instructions in affected programs:     275162 -> 271346 (-1.39%)
helped:                                1197
GAINED:                                1

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-02-03 12:25:14 -08:00
Matt Turner
cfa2165642 i965/vec4: Use abs/negate functions in const propagation.
No changes in shader-db.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-02-03 12:25:14 -08:00
Matt Turner
dbd4c22a37 i965: Add function to take the abs of immediates.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-02-03 12:25:14 -08:00
Matt Turner
638beee24a i965: Add function to negate immediates.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-02-03 12:25:14 -08:00
Matt Turner
1f4bdad316 i965: Mark UB/B immediates as unreachable.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-02-03 12:25:14 -08:00