
24332 commits

Author SHA1 Message Date
Axel Davy
a30684712e st/nine: Revert to sw cursor in case of failure to set hw cursor
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Axel Davy
df6f1f77cc st/nine: Do not call ID3DPresent_GetCursorPos for sw cursor
For sw cursor we do not tell wine the cursor position (the app
tells us directly). We shouldn't use ID3DPresent_GetCursorPos.

device->cursor.pos already contains the coordinates the app
gave us.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Axel Davy
78b304e2f9 st/nine: Force hw cursor for Windowed mode
According to the spec, Windowed mode must
have a hw cursor.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Axel Davy
1b20eaff67 st/nine: Hide hardware cursor when we don't use it
We have either hardware cursor or software cursor.
When we use software cursor, we should hide the hardware
cursor.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Axel Davy
3470878383 st/nine: fix D3DRS_DITHERENABLE wrong state group
D3DRS_DITHERENABLE was assigned to the rasterizer state
group, but it was used for the blend group.

Assign it to the blend group.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-08-21 22:21:45 +02:00
Patrick Rudolph
1b645df2f3 st/nine: Account POINTSIZE_MIN and POINTSIZE_MAX for point size
When using D3DRS_POINTSIZE make sure the value is at least
D3DRS_POINTSIZE_MIN but not greater than D3DRS_POINTSIZE_MAX.

Fixes some Wine tests.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-08-21 22:21:45 +02:00
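The clamping rule described in commit 1b645df2f3 above can be sketched as follows. This is only an illustration and not the st/nine code: the helper name `clamp_point_size` and its parameters are hypothetical, with the three arguments standing in for the values of D3DRS_POINTSIZE, D3DRS_POINTSIZE_MIN and D3DRS_POINTSIZE_MAX.

```c
/* Hypothetical helper (not taken from Mesa): clamp a requested point
 * size into the range [min_size, max_size]. */
static inline float clamp_point_size(float size, float min_size, float max_size)
{
    if (size < min_size)
        return min_size;   /* too small: raise to the minimum */
    if (size > max_size)
        return max_size;   /* too large: lower to the maximum */
    return size;           /* already within range */
}
```

With a minimum of 1.0 and a maximum of 64.0, a requested size of 0.5 would be raised to 1.0 and a requested size of 100.0 would be lowered to 64.0.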
Patrick Rudolph
886227d363 st/nine: Align texture memory
Align texture memory on a 32-byte boundary to allow
SSE/AVX memcpy to work on locked rects.

This fixes some crashes with games using SSE.

Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-08-21 22:21:45 +02:00
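As a rough illustration of what a 32-byte boundary means for commit 886227d363 above, the sketch below rounds an address up to the next multiple of a power-of-two alignment. The helper name `align_up` is hypothetical, and the commit message does not say how st/nine actually obtains aligned memory, so this shows only the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of alignment.
 * alignment must be a non-zero power of two (for example 32). */
static inline uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}
```

An address that is already a multiple of 32 is returned unchanged; any other address moves forward by at most 31 bytes.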
Axel Davy
3c4864fa55 st/nine: Always set point_quad_rasterization to 1
Both Points and Point Sprites are rasterized like quads,
according to d3d9 doc and gallium rasterizer doc.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-08-21 22:21:45 +02:00
Axel Davy
74de849bd4 st/nine: Fix Swizzle for ATI2 format
We had red and green in the wrong channels
for the ATI2 format (RGTC2).

Found thanks to wine tests.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2015-08-21 22:21:45 +02:00
Patrick Rudolph
cb2d680232 target/d3dadapter9: Return Windows like card names
Add support for multiple cards and fill in Windows-like
card name, driver name and version info.
Use fallback for unknown vendors and unknown card names.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
2015-08-21 22:21:45 +02:00
Zoltan Gilian
df5cdec132 clover: fix llvm 3.5 build error
There is no MDOperand in llvm 3.5.

v2: Check if kernel metadata is present to avoid crash (EdB).
v3: Second attempt to avoid crash: switch off metadata query for llvm < 3.6.

Reviewed-by: Serge Martin (EdB) <edb+mesa@sigluy.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-08-21 14:18:10 +03:00
Eric Anholt
8cae9f2fda vc4: Add algebraic opt for rcp(1.0).
We're generating rcps as part of backend lowering of the packed coordinate
in the CS, and we don't want to lower them in NIR because of the extra
Newton-Raphson steps in the common case.  However, GLB2.7 is moving a
vertex attribute with a 1.0 W component to the position, and that makes us
produce some silly RCPs.

total instructions in shared programs: 97590 -> 97580 (-0.01%)
instructions in affected programs:     74 -> 64 (-13.51%)
2015-08-20 23:43:04 -07:00
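The percentages in the instruction-count lines of the vc4 commits appear to be the relative change from the "before" count to the "after" count. A minimal sketch of that arithmetic, with a hypothetical helper name `percent_change`:

```c
/* Hypothetical helper: relative change from 'before' to 'after', in percent.
 * A negative result means the count went down. */
static inline double percent_change(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}
```

For the affected programs in commit 8cae9f2fda, `percent_change(74, 64)` is about -13.51, which matches the figure reported above.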
Eric Anholt
c800fef2e2 vc4: Allow unpack_8[abcd]_f's src to stay in r4.
I had QPU emit code to do it, but forgot to flag the register class.

total instructions in shared programs: 97974 -> 97590 (-0.39%)
instructions in affected programs:     25291 -> 24907 (-1.52%)
2015-08-20 23:43:04 -07:00
Eric Anholt
8b36d107fd vc4: Pack the unorm-packing bits into a src MUL instruction when possible.
Now that we do non-SSA QIR instructions, we can take a NIR SSA src that's
only used by the unorm packing and just stuff the pack bits into it.

total instructions in shared programs: 98136 -> 97974 (-0.17%)
instructions in affected programs:     4149 -> 3987 (-3.90%)
2015-08-20 23:43:04 -07:00
Eric Anholt
572a48366d vc4: Add a QIR helper for whether the op is a MUL type.
2015-08-20 23:42:59 -07:00
Eric Anholt
fd74da11c4 vc4: Drop an unused algebraic op.
NIR now handles this optimization for us.
2015-08-20 23:42:53 -07:00
Eric Anholt
98728ce071 vc4: Switch QPU_PACK_SCALED to be two non-SSA instructions.
total instructions in shared programs: 98159 -> 98136 (-0.02%)
instructions in affected programs:     12279 -> 12256 (-0.19%)
2015-08-20 23:42:45 -07:00
Eric Anholt
69ef08d303 vc4: Make the pack-to-unorm instructions be non-SSA.
This helps ensure that the register allocator doesn't force the later pack
operations to insert extra MOVs.

total instructions in shared programs: 98170 -> 98159 (-0.01%)
instructions in affected programs:     2134 -> 2123 (-0.52%)
2015-08-20 23:42:17 -07:00
Eric Anholt
0bba4fa070 vc4: Allow QIR registers to be non-SSA.
Now that we have NIR, most of the optimization we still need to do is
peepholes on instruction selection rather than general dataflow
operations.  This means we want to be able to have QIR be a lot closer to
the actual QPU instructions, just with virtual registers.  Allowing
multiple instructions writing the same register opens up a lot of
possibilities.
2015-08-20 23:40:22 -07:00
Eric Anholt
ceb1a31842 vc4: We can now move TEX_RESULT accesses across other r4 ops.
No difference on shader-db.
2015-08-20 23:40:16 -07:00
Ilia Mirkin
8483577f6b nv50/ir: pre-compute BFE arg when both bits and offset are imm
Due to a quirk in how the nv50 opt passes run, the algebraic
optimization that looks for these BFE's happens before the constant
folding pass. Rearranging these passes isn't a great idea, but this is
easy enough to fix. Allows a following cvt to eliminate the bfe in
certain situations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 22:16:46 -04:00
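To illustrate what pre-computing the BFE argument could look like when both values are immediates (commit 8483577f6b above), the sketch below packs an offset and a bit count into a single operand. The layout used here, offset in the low byte and bit count in the next byte, is an assumption made for this example, and the helper name `pack_bfe_arg` is hypothetical; neither is stated in the commit message.

```c
#include <stdint.h>

/* Hypothetical helper. Assumed layout (not confirmed by the commit text):
 *   bits [7:0]  = offset of the field
 *   bits [15:8] = number of bits to extract */
static inline uint32_t pack_bfe_arg(uint32_t offset, uint32_t bits)
{
    return ((bits & 0xffu) << 8) | (offset & 0xffu);
}
```

When both inputs are known at compile time, this value is a constant, so it can be emitted as a single immediate instead of being built at run time.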
Glenn Kennard
4237dfb978 r600g: Fix handling of TGSI_OPCODE_ARR with SB
FLT_TO_INT goes in the vector pipes on evergreen/NI,
not the trans unit as on earlier chips.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-21 09:46:13 +10:00
Edward O'Callaghan
7a32652231 r600: Turn 'r600_shader_key' struct into union
This struct was getting a bit crowded. Following the lead of
radeonsi, mirror the idea of having sub-structures for each
shader type. Turning 'r600_shader_key' into a union saves
a trivial amount of memory and CPU cycles for the shader keys.

[airlied: drop as_ls, and reorder so larger fields at start.]
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-21 09:46:13 +10:00
Edward O'Callaghan
e2145de74d r600: Rewrite r600_shader_selector_key() to use a switch stmt
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-21 09:46:13 +10:00
Tobias Klausmann
3e6adbd761 nv50/ir: Handle OP_CVT when folding constant expressions
[imirkin: handle more type combinations, use macro]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Ilia Mirkin
f5b926183d nvc0/ir: undo more shifts still by allowing a pre-SHL to occur
This happens with unpackSnorm lowering. There's yet another
bitfield-extract behind it, but there's too much variation to be worth
cutting through.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Ilia Mirkin
9ebe7dc094 nvc0/ir: don't require AND when the high byte is being addressed
unpackUnorm* lowering doesn't AND the high byte/word as it's
unnecessary. Detect that situation as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Ilia Mirkin
63cb85e567 nvc0/ir: detect i2f/i2i which operate on specific bytes/words
Some Unigine shaders have been observed to unpack bytes out of 32-bit
integers and convert them to floats. I2F/I2I can handle this sort of
thing directly. Detect the handleable situations.

This misses 16-bit word capabilities in nv50, but I haven't seen shaders
that would actually make use of that.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Ilia Mirkin
51499bb5ff nvc0/ir: detect AND/SHR pairs and convert into EXTBF
Some shaders appear to extract bits using shift/and combos. Detect
(some) of those and convert to EXTBF instead.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
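The shift/and pattern that commit 51499bb5ff above looks for can be sketched in plain C. A shift right followed by an AND with a mask of the form 2^w - 1 is equivalent to extracting a w-bit field, which is what lets it be expressed as one bitfield extract. The helper names below are hypothetical and the code is not taken from the nvc0 peephole pass.

```c
#include <stdint.h>

/* Hypothetical helper: return the field width if mask has the form
 * 2^w - 1 (contiguous low bits set), otherwise return -1. */
static inline int mask_to_width(uint32_t mask)
{
    int width = 0;
    if (mask == 0 || (mask & (mask + 1u)) != 0)
        return -1;
    while (mask != 0) {
        width++;
        mask >>= 1;
    }
    return width;
}

/* Hypothetical helper: extract 'width' bits starting at bit 'offset'.
 * offset must be less than 32. */
static inline uint32_t extract_bits(uint32_t x, unsigned offset, unsigned width)
{
    uint32_t mask = (width >= 32) ? 0xffffffffu : ((1u << width) - 1u);
    return (x >> offset) & mask;
}
```

For example, `(x >> 8) & 0xff` has a mask with width 8, so it matches `extract_bits(x, 8, 8)`, while a mask such as `0xf0` is not of the required form and would be left alone.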
Chih-Wei Huang
2a4af36517 nv50/ir: support different unordered_set implementations
If built with the C++11 standard, use std::unordered_set.

Otherwise, if built on an old Android version with stlport,
use std::tr1::unordered_set with a wrapper class.

Otherwise, use std::tr1::unordered_set.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-20 17:58:30 -04:00
Marek Olšák
3b1e283d88 radeonsi: fix a typo as_es -> as_ls in a string
Trivial.
2015-08-19 12:04:51 +02:00
Marek Olšák
5fb0180592 winsys/amdgpu: fix the type of memory usage counters
If the 32-bit types overflowed, the driver could submit an IB that uses much
more memory than is available.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-08-19 12:03:01 +02:00
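The overflow that commit 5fb0180592 above guards against can be reproduced with a small sketch. The function names and buffer sizes are hypothetical and are not the winsys/amdgpu code; the point is only that a 32-bit running total wraps around once the sum passes 4 GiB, while a 64-bit total does not.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: sum buffer sizes into a 32-bit counter (wraps at 4 GiB). */
static inline uint32_t sum_usage_u32(const uint32_t *sizes, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += sizes[i];          /* unsigned wrap-around is well defined */
    return total;
}

/* Hypothetical: the same sum with a 64-bit counter. */
static inline uint64_t sum_usage_u64(const uint32_t *sizes, size_t count)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += sizes[i];
    return total;
}
```

With three 2 GiB buffers, the 32-bit total reads as 2 GiB although 6 GiB is in use, so a check of the form "usage is less than available memory" would pass when it should fail.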
Marek Olšák
421b809db1 radeonsi: fix indirect indexing of MSAA textures
FMASK wasn't handled correctly.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-08-19 12:03:01 +02:00
Jason Ekstrand
f01bdb0484 util/ra: Make allocating conflict lists optional
Since i965 is now using make_reg_conflicts_transitive and doesn't need
q-value computations, they are disabled on i965.  They stay enabled
everywhere else so that other drivers keep the old behavior.  This reduces the time
spent in eglInitialize() on BDW by around 10-15%.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-08-18 17:48:53 -07:00
Rob Clark
4a0bea3863 freedreno: use fd_pipe_wait_timeout()
To properly support the case of waiting on a fence with a 0 timeout, we
still need to call down to the kernel, which requires the use of the
new fd_pipe_wait_timeout() API.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-08-18 15:36:30 -04:00
Rob Clark
fd7a14f8dd freedreno: fence fix
Don't take current timestamp/fence from current ring, as we might have
already rolled over to new rb.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-08-18 15:36:30 -04:00
Neil Roberts
885762e182 Add mesa.icd to the .gitignore
Since 4d7e0fa8c7 this file is generated by the configure script.
Reviewed-by: Tapani Palli <tapani.palli@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-08-18 12:12:15 -07:00
Grazvydas Ignotas
97f5d00648 radeon/uvd: remove unused variables
Recent commits introduced new unused variable warnings, fix them.

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-08-18 14:11:48 +02:00
Marcos Paulo de Souza
df97126731 nouveau: recognize tess stages in nouveau_compiler
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 23:05:00 -04:00
Marcos Paulo de Souza
723a5a2e68 tgsi: fix parsing of tessellation shader inputs/outputs
Tessellation control shaders write to outputs as OUT[ADDR[0].x][0], make
sure to parse the indirect dimension on outputs.

Also tess control inputs/outputs and tess eval input declarations need
to receive the same treatment as geometry shader inputs.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 23:05:00 -04:00
Marcos Paulo de Souza
a37fa7653b tgsi: set implicit array size for tess stages
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 22:50:16 -04:00
Ilia Mirkin
5af71fb5ac freedreno/a3xx: add s3tc texture format support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 11:38:38 -04:00
Ilia Mirkin
581cbfdec1 freedreno/a3xx: fix up logic for handling block formats
This only appears in cubemaps which have packed layers, so are very
sensitive to any layout disagreement between sw and hw.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 11:38:38 -04:00
Ilia Mirkin
12e1bf0b68 freedreno/a3xx: double the polygon offset value
A few other drivers do this; it fixes the gl-1.4-polygon-offset piglit test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 11:38:38 -04:00
Ilia Mirkin
1af0641db3 nvc0: implement the color buffer 0 is integer rule for alpha-to-one/cov
The hardware checks for multisampling being enabled, but does not have
the rule about cbuf0 being an integer format. Only enable
alpha-to-one/alpha-to-coverage if cbuf0 is not an integer format.

Fixes piglits
  ext_framebuffer_multisample-int-draw-buffers-alpha-to-one
  ext_framebuffer_multisample-int-draw-buffers-alpha-to-coverage

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 04:21:18 -04:00
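The rule from commit 1af0641db3 above reduces to a small predicate. The sketch below is an illustration with hypothetical names, not the nvc0 state code: the driver only passes the feature on to the hardware when color buffer 0 does not use an integer format, and the multisampling check is left to the hardware as the commit message describes.

```c
#include <stdbool.h>

/* Hypothetical helper: decide whether alpha-to-one or alpha-to-coverage
 * should be enabled, given what the application requested and whether
 * color buffer 0 uses an integer format. */
static inline bool alpha_feature_enabled(bool requested, bool cbuf0_is_integer)
{
    return requested && !cbuf0_is_integer;
}
```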
Ilia Mirkin
2f5ee9bf27 gk110/ir: fix sched calculator to consider all registers in the ISA
GK110/GK208 have 256 registers, not 64. Find out the number of registers
from the target to avoid unnecessary iteration for pre-GK110.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 02:46:16 -04:00
Ilia Mirkin
ae5cf4f3f7 nvc0: program smooth line width when multisampling is enabled
There are separate line widths for smooth and aliased lines. The smooth
one is selected when multisampling is enabled even if line smoothing
isn't explicitly turned on.

Fixes the ext_framebuffer_multisample-line-smooth piglits

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 01:01:02 -04:00
Ilia Mirkin
884b4df3b6 nvc0: bind a fake tess control program when there isn't one available
Apparently this is necessary in order for tess factors to work in a tess
eval program without a tess control program bound. Probably because it
uses the fake program's shader header to work out the number of patch
constants.

Fixes vs-tes-tessinner-tessouter-inputs

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 01:01:02 -04:00
Ilia Mirkin
f13073b775 gm107/ir: avoid letting the lowering pass get out of sync
There's a lot of functionality duplicated in the gm107 lowering pass
from the nvc0 pass. As that one gets updated, the gm107 one falls
behind. Avoid this by sharing the code.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-17 01:01:02 -04:00
Ilia Mirkin
2514c78fba nv50,nvc0: take level into account when doing eng2d multi-layer blits
This fixes arb_get_texture_sub_image-get, and any situation where the 2d
engine was being used for multi-layer blits to a non-0 level.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
2015-08-17 01:01:02 -04:00