Commit graph

27608 commits

Author SHA1 Message Date
Patrick Rudolph
7f58ba45a8 st/nine: Fix Volumetexture9_LockBox
Check for valid locked box dimensions.

Fixes failing wine tests device.c test_lockbox_invalid.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Axel Davy
35047681ff st/nine: Fix ATI2 pitch for non-square
Fixes crash for non-square textures.
We were using the height instead of the
width for some calculations.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
eeeab8d6b4 st/nine: Support D3DFMT_R8G8B8
Add support for D3DFMT_R8G8B8. It allows format conversion for
surfaces of pool scratch.

Usually gallium formats equivalents for d3d9 formats
have their names reversed.

The gallium format PIPE_FORMAT_R8G8B8_UNORM is the right
equivalent here, and its name is likely wrong (reversed).

Fixes a crash in TmNationsForever.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
a3e7525ada st/nine: Use cso for viewport
Use CSO to catch redundant viewport changes.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
495727af6b st/nine: Fix shade mode flat
Shade mode flat is only working if pixelshaders have interpolate
set to TGSI_INTERPOLATE_COLOR on color inputs.

Fixes failing WINE tests visual.c test_shademode().

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
fa887ba65b st/nine: Clear rendertarget on creation
Clear every rendertarget on creation.
Fixes https://github.com/iXit/Mesa-3D/issues/139

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
b142f61621 st/nine: Allow ColorFill on D3DFMT_NULL surfaces
Report success instead of failing as there's no resource for those surfaces.
Fixes a crash in Crysis: Warhead.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Axel Davy
04e22a04a6 st/nine: Introduce STREAMFREQ state
Previous vertex elements code update
was protected by
'if ((group & (NINE_STATE_VDECL | NINE_STATE_VS)) ||
state->changed.stream_freq & ~1)'
itself protected by
'if (group & (NINE_STATE_COMMON | NINE_STATE_VS))'

If no state is changed except the stream frequency,
no update would happen.

This patch solves the problem by adding a new
NINE_STATE_STREAMFREQ state.
Another way would be to add state->changed.stream_freq & ~1
check to the main test.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
2016-02-04 22:12:17 +01:00
Axel Davy
15ce2778fb st/nine: Catch redundant SetStreamSourceFreq calls
Some apps do redundant SetStreamSourceFreq calls.
Catch them to improve performance.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
ea3f504f7c st/nine: Squash indexbuffer9 and vertexbuffer9
The indexbuffer9 codebase was lagging behind the one of vertexbuffer9.

Add buffer9 as common code base for indexbuffer9 and vertexbuffer9.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Axel Davy
b6bb8d561a st/nine: Unset vtxbuf on reset
We forgot to reset vtxbuf.
This fixes some crashes.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
2016-02-04 22:12:17 +01:00
Axel Davy
b63c144d1e st/nine: Use pipe_resource_reference for vtxbuf
This seems cleaner to actually reference the resources for vtxbuf,
rather than relying on the fact the bound d3d streams do.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Axel Davy
b5876e4762 st/nine: Use ff vertex shader when position_t is used
When an application sets a vertex shader, we are supposed
to use it, and when no vertex shader are set, we are supposed
to revert to fixed function vertex shader.

It seems there is an exception: when the vertex declaration
has a position_t index, we should revert to fixed function
vertex shader.

Up to know we were checking if device->state.vs is set
to know whether to use programmable shader or not.

With this commit we determine whether we use programmable shader
or not when vertex shader/declaration are set, but
stateblocks do complicate things a bit.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
531acbc56b st/nine: Don't increment refcount on VertexDeclaration creation failure
NineUnknown_ctor increments the refcount even in case of an error.
Restructure the code to prevent refcount increments.
Fixes a couple of wine tests.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Axel Davy
b39fd5b1da st/nine: Change StretchRect check order
Textures in SYSTEMMEM don't have resources attached.
Instead of returning an error for them, StretchRect
was crashing.
This changes the check order to fix that case.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
2016-02-04 22:12:17 +01:00
Axel Davy
a82e67812a st/nine: Initialize lights in stateblocks
This fixes a crash.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
9c1d93f8e7 st/nine: Fix fixed-function blendweights
The last weighted element is one minus the sum of all previous weights.
Fixes WINE test visual.c test_vertex_blending.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
cc830dc214 st/nine: Always normalize hitDir
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
ed7e1046b6 st/nine: Replace r[0] with tmp
Replace r[0] with tmp to ease code reading.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
9856203f5a st/nine: Fix ff calculation of midVec
In case of non local viewer the value has to be subtracted.
Fixes failing WINE tests in test_specular_lighting() (visual.c)

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
921f3eac58 st/nine: Implement D3DRS_SPECULARENABLE
Implement fixed function D3DRS_SPECULARENABLE.
Fixes failing WINE tests in test_specular_lighting() (visual.c)

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
9c26fa1b13 st/nine: Fix D3DRS_LOCALVIEWER being ignored
Set key->localviewer to D3DRS_LOCALVIEWER.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Axel Davy
aa4454ae85 st/nine: Fix rounding issue with vs1.1 a0 reg
vs1.1 rounds a0 to lowest integer, while
other versions do round to closest.

To use the same path as the other versions (with ARR),
we were substracting 0.5 for vs1.1 to get round to lowest.

This gives wrong result if a0 is set to 0:
round(0 - 0.5) = -1

Instead just use ARL for vs1.1

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
2016-02-04 22:12:17 +01:00
Axel Davy
dbb03f6b5b st/nine: Fix D3DPMISCCAPS_FOGANDSPECULARALPHA support
The documentation of the flag doesn't make sense.
To sum up the doc, if not set, specular alpha contains fog,
and if set specular alpha contains 0 (except for ff).

However in practice when the flag is there, apps do use specular alpha
as if it could be used normally, which makes much more sense than the doc.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
2016-02-04 22:12:17 +01:00
Patrick Rudolph
9298a0b81b st/nine: Fix AlphaCmpCaps
AlphaCmpCaps should advertise D3DPCMPCAPS_NEVER as well.

Fixes https://github.com/iXit/Mesa-3D/issues/142

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-02-04 22:12:17 +01:00
Marek Olšák
bff640b3e0 radeonsi: implement PK2H and UP2H opcodes
Based on a gallivm patch by Ilia Mirkin.

+8 piglit regressions due to precision issues (I blame the tests)

The benefit is that we'll get v_cvt_f32_f16 and v_cvt_f16_f32 instead
of emulation with integer instructions. They are GLSL 4.00 intrinsics.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-02-04 19:52:28 +01:00
Marek Olšák
8ec24678ac radeonsi: fix Hyper-Z on Stoney
Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-04 16:47:41 +01:00
Ilia Mirkin
edd494ddf0 nv50/ir: make sure to fetch all sources before creating instruction
We must fetch all sources into the instruction stream before generating
the instruction that uses them. Otherwise we'll define values after
using them, which won't work so well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-02-03 18:40:38 -05:00
Ilia Mirkin
a9d5c64c34 nv50: avoid freeing the symbols if they're about to be stored
Spotted by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-02-03 18:40:26 -05:00
Nicolai Hähnle
43a401a792 gallium: fix the documentation of PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE
This parameter is equivalent to the corresponding OpenGL implementation
limit which is in texels, not bytes.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-02-03 14:12:37 +01:00
Nicolai Hähnle
7dd31b81fe gallium/radeon: support PIPE_CAP_SURFACE_REINTERPRET_BLOCKS
This is already used internally in si_resource_copy_region for compressed
textures, so the only real change here is the adjusted surface size
computation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-03 14:10:37 +01:00
Nicolai Hähnle
6af6d7b08a gallium: Add PIPE_CAP_SURFACE_REINTERPRET_BLOCKS
This cap indicates whether pipe->create_surface can reinterpret a texture
as a surface with a format of different block width/height (but equal
block size).

v2: fix whitespace

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-03 14:10:34 +01:00
Nicolai Hähnle
3abb548ef6 gallium: Add PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY
This cap indicates that the driver only supports R, RG, RGB and RGBA
formats for PIPE_BUFFER sampler views.

v2: move into "unsupported features" section for nouveau (Ilia Mirkin)

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-02-03 14:10:34 +01:00
Leo Liu
6ad2e55a14 st/omx/dec/h264: fix corruption when scaling matrix present flag set
The scaling list should be filled out with zig zag scan

v2: integrate zig zag scan for list 4x4 to vl(Christian)
v3: move list determination out from the loop(Ilia)

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-02-02 20:29:47 -05:00
Leo Liu
4f598f2173 vl: add zig zag scan for list 4x4
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-02-02 20:29:43 -05:00
Roland Scheidegger
848a023c05 llvmpipe: use scissor_planes_needed helper function
So it doesn't get out of sync in multiple places.
2016-02-03 01:25:45 +01:00
Niels Ole Salscheider
fb44cfadce winsys/radeon: Do not deinit the pb cache if it was not initialized
This fixes a crash in pb_cache_release_all_buffers.

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-02-02 21:11:15 +01:00
Marek Olšák
84a6d2d7d6 tgsi/scan: add tgsi_shader_info::reads_samplemask 2016-02-02 21:04:52 +01:00
Marek Olšák
0d68b91220 radeonsi: rework RB+ for Stoney
This fixes it.

States which also need to be taken into account:
- SPI color formats - each down-conversion format supports only a limited set
  of SPI formats
- whether MSAA resolving and logic op are enabled

These need special handling:
- blending
- disabled channels

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-02 21:03:19 +01:00
Marek Olšák
066d76c2f4 radeonsi: rename cb_target_mask state to cb_render_state
and rename a variable in the function.

SX_PS_DOWNCONVERT will be emitted here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-02 21:03:19 +01:00
Marek Olšák
5f0f9a5619 radeonsi: treat intensity render targets exactly like red
The motivation is to simplify the Stoney RB+ code.
Intensity is already treated as red except here.

No piglit regressions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-02 21:03:18 +01:00
Marek Olšák
f96f94966d tgsi: set correct src type for UP2H
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-02 21:02:26 +01:00
Dave Airlie
e7a27f70b9 virgl: mark function as static
This is fallout from the previous changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93961

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-02 17:55:40 +10:00
Roland Scheidegger
7221b8aec6 gallivm: add PK2H/UP2H support
Add support for these opcodes, the conversion functions were already
there albeit need some new packing stuff.
Just like the tgsi version, piglit won't like it for all the same
reasons, so it's disabled (UP2H passes piglit arb_shader_language_packing
tests, albeit since PK2H won't due to those rounding differences I don't
know if that one works or not as the piglit test is rather difficult to
deal with).

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-02 05:58:20 +01:00
Roland Scheidegger
5171ec9ca9 gallivm: add PK2H/UP2H support
Add support for these opcodes, the conversion functions were already
there albeit need some new packing stuff.
Just like the tgsi version, piglit won't like it for all the same
reasons, so it's disabled (UP2H passes piglit arb_shader_language_packing
tests, albeit since PK2H won't due those rounding differences I don't
know if that one works or not as the piglit test is rather difficult to
deal with).
2016-02-02 05:58:19 +01:00
Roland Scheidegger
dc16086e3b tgsi: add PK2H/UP2H support
The util functions handle the half-float conversion.
Note that piglit won't like it much due to:
a) The util functions use magic float mul conversion but when run inside
softpipe/llvmpipe, denorms are flushed to zero, therefore when the conversion
is from/to f16 denorm the result will be zero. This is a bug which should be
fixed in these functions (should not rely on denorms being available), but
will happen elsewhere just the same (e.g. conversion to f16 render targets).
b) The util functions use trunc round mode rather than round-to-nearest. This
is NOT a bug (as it is a d3d10 requirement). This will result of rounding not
representable finite values to MAX_F16 rather than INFINITY. My belief is the
piglit tests are wrong here but it's difficult to tell (generally glsl
rounding mode is undefined, however I'm not sure if rounding mode might need
to be consistent for different operations). Nevertheless, for gl it would be
better to use round-to-nearest, but using different rounding for GL and d3d10
is an unsolved problem (as it affects things like conversion to f16 render
targets, clear colors, this shader opcode).

Hence for now don't enable the cap bit (so the code is unused).
(Code is from imirkin, comment from sroland)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmvware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-02 05:58:19 +01:00
Roland Scheidegger
99bd96abbb llvmpipe: drop scissor planes early if the tri is fully inside them
If the tri is fully inside a scissor edge (or rather, we just use the
bounding box of the tri for the comparison), then we can drop these
additional scissor "planes" early. We do not even need to allocate
space for them in the tri.

The math actually appears to be slightly iffy due to bounding boxes
being rounded, but it doesn't matter in the end.

Those scissor rects are costly - the 4 planes from the scissor are
already more expensive to calculate than the 3 planes from the tri itself,
and it also prevents us from using the specialized raster code for small
tris.

This helps openarena performance by about 8% or so. Of course, it helps
there that while openarena often enables scissoring (and even moves the
scissor rect around) I have not seen a single tri actually hit the
scissor rect, ever.

v2: drop individual scissor edges, and do it earlier, not even allocating
space for them.
v3: help the compiler a bit with simpler code, suggested by Brian.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-02 05:58:19 +01:00
Roland Scheidegger
9d2a34e105 llvmpipe: minor cleanup of sse2 for calc_fixed_position
Just slightly simpler assembly.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-02 05:58:19 +01:00
Roland Scheidegger
8aa168eb8f llvmpipe: use vector loads for (optimized) tri raster funcs
When we switched to 64bit rasterization, we could no longer use straight
aligned loads for loading the plane data. However, what the code actually
does for loading 3 planes, is 12 scalar loads + 9 unpacks, and then there's
another 8 unpacks for the transpose we need (!).

It would be possible to do the (scalar) loads of course already transposed
(at least saving the additional unpacks), however instead just use
(un)aligned vector loads, and recalculate the eo values, which is much less
instructions (note in case of the triangle_32_3_4 case, the eo values are
not even used, making the scalar loads + unpacks for them all the more
pointless).

This drops execution time of the triangle_32_3_4 function considerably,
albeit it doesn't really make a measurable difference (for small tris we're
essentially limited by vertex throughput in any case), for triangle_32_3_16
it's essentially noise (the loop is more costly than the initial code there).

(I'm thinking about just ditching storing the eo values in the plane data,
so could switch back to using aligned planes, however right now they are
still used in the other raster functions dealing with planes with scalar
code. Also not touching the ppc code, might not be that bad there in any
case.)

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-02 05:58:19 +01:00
Roland Scheidegger
116e4dc995 mesa: fix typo in python scripts
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-02-02 05:58:19 +01:00