Commit graph

68364 commits

Author SHA1 Message Date
Eduardo Lima Mitev
2aa71e9485 mesa: Fix error validating args for TexSubImage3D
The zoffset and depth values were not being considered when calling
error_check_subtexture_dimensions().

Fixes 2 dEQP tests:
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_offset
* dEQP-GLES3.functional.negative_api.texture.texsubimage3d_invalid_offset

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedestkop.org>
2015-02-24 08:58:53 +01:00
Samuel Iglesias Gonsalvez
fbd6eba72b i965/blorp: round to nearest when converting float into integer
Fixes:

dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_y_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_y_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_y_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_y_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_x_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_x_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_y_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_y_linear

No piglit regressions.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-24 08:58:53 +01:00
Carl Worth
4a6c6c49a7 i965: Perform program state upload outside of atom handling
Across the board of the various generations, the intial few atoms in
all of the atom lists are basically the same, (performing uploads for
the various programs). The only difference is that prior to gen6
there's an ff_gs upload in place of the later gs upload.

In this commit, instead of using the atom lists for this program state
upload, we add a new function brw_upload_programs that calls into the
per-stage upload functions which in turn check dirty bits and return
immediately if nothing needs to be done.

This commit is intended to have no functional change. The motivation
is that future code, (such as the shader cache), wants to have a
single function within which to perform various operations before and
after program upload, (with some local variables holding state across
the upload).

It may be worth looking at whether some of the other functionality
currently handled via atoms might also be more cleanly handled in a
similar fashion.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-23 14:54:15 -08:00
Vivek Kasireddy
1e96eece30 egl, wayland: RGB565 format support on Back-buffer
In current code, color format is always hardcoded to
__DRI_IMAGE_FORMAT_ARGB8888 when buffer or DRI image is
allocated in function calls, get_back_bo and dri2_get_buffers,
regardless of current target's color format. This problem
may leads to incorrect render pitch calculation, which
eventually ends up with wrong offset of pixels in
the frame buffer when the image is in different color format
from dri surf's, especially with different bpp. (e.g. RGB565-16bpp)

Attached code patch simply adds RGB565 and XRGB8888 cases to two
functions noted above to resolve the issue.

v2: added a case of XRGB8888, format and bpp selection is done
    via switch-case (not "if-else" anymore)

Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-23 14:07:02 -08:00
Brian Paul
cbd287f094 mesa: move math-related function into new c99_math.h file
The alternative would be to include math.h in c99_compat.h but that
seems heavy-handed.

This patch also replaces INLINE with inline in the c99 math function
wrappers.

Fixes MSVC build.

Acked-by: Matt Turner <mattst88@gmail.com>
2015-02-23 14:45:14 -07:00
Jason Ekstrand
9b9ef2aeee nir/gcm: Add some missing break statements
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-23 13:20:13 -08:00
Jason Ekstrand
cb4b2ad44a nir: Copy-propagate vecN operations that are actually moves
We were already do this for ALU operations but we haven't for non-ALU
operations.  This changes that.

total NIR instructions in shared programs: 2039883 -> 2022338 (-0.86%)
NIR instructions in affected programs:     1768850 -> 1751305 (-0.99%)
helped:                                    14244
HURT:                                      124

total FS instructions in shared programs: 4083960 -> 4084036 (0.00%)
FS instructions in affected programs:     7302 -> 7378 (1.04%)
helped:                                   12
HURT:                                     51

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-23 13:19:05 -08:00
Francisco Jerez
f80af89d48 ra: Disable round-robin strategy for optimistically colorable nodes.
The round-robin allocation strategy is expected to decrease the amount
of false dependencies created by the register allocator and give the
post-RA scheduling pass more freedom to move instructions around.  On
the other hand it has the disadvantage of increasing fragmentation and
decreasing the number of equally-colored nearby nodes, what increases
the likelihood of failure in presence of optimistically colorable
nodes.

This patch disables the round-robin strategy for optimistically
colorable nodes.  These typically arise in situations of high register
pressure or for registers with large live intervals, in both cases the
task of the instruction scheduler shouldn't be constrained excessively
by the dense packing of those nodes, and a spill (or on Intel hardware
a fall-back to SIMD8 mode) is invariably worse than a slightly less
optimal scheduling.

Shader-db results on the i965 driver:

total instructions in shared programs: 5488539 -> 5488489 (-0.00%)
instructions in affected programs:     1121 -> 1071 (-4.46%)
helped:                                1
HURT:                                  0
GAINED:                                49
LOST:                                  5

v2: Re-enable round-robin already for the lowest one of the nodes
    pushed optimistically onto the sack (Connor).
v3: Use UINT_MAX instead of ~0, open-code MIN2 (Jason, Connor).

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-23 20:55:40 +02:00
Francisco Jerez
34c93fd7f1 i965/fs: Fix lower_load_payload() not to use an incorrect half for immediates and uniforms.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-23 20:55:40 +02:00
Francisco Jerez
ea7b4d25c8 i965/fs: Fix lower_load_payload() to take into account non-zero reg_offset.
Fixes metadata guess when instructions in the program specify a
destination register with non-zero reg_offset and when the payload of
a LOAD_PAYLOAD spans several registers.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-23 20:55:40 +02:00
Francisco Jerez
08b4c8f7bf i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().
MRFs cannot be read from anyway so they cannot possibly be a valid
source of LOAD_PAYLOAD.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-23 20:55:40 +02:00
Francisco Jerez
8e47f51a5a i965/fs: Less broken handling of force_writemask_all in lower_load_payload().
It's perfectly fine to read the second half of a register written with
force_writemask_all from a first half MOV instruction or vice versa, and
lower_load_payload shouldn't mark the whole MOV as belonging to the second
half in that case.  Replicate the same metadata to both halves of the
destination when writemasking is disabled.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-23 20:55:40 +02:00
Matt Turner
57d80d11b1 mesa/vbo: Use unreachable to silence uninitialized var warning.
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:49:57 -08:00
Matt Turner
bb2a897dbc mesa: Move START/END_FAST_MATH macros to their only use.
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:49:48 -08:00
Matt Turner
08bc7cf8f6 mesa: Remove definition of NULL.
If your stdlib.h doesn't define this you should fix your stdlib.h.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:49:47 -08:00
Matt Turner
bfcdb84383 mesa: Use assert() instead of ASSERT wrapper.
Acked-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:49:47 -08:00
Matt Turner
52049f8fd8 mesa: Remove CHECK macro.
There's some commentary about how it's defined by other "modules", and
maybe that was true in 2000 when the code was added.

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:41:22 -08:00
Matt Turner
6a587a4461 mesa: Remove dead CAPI define.
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-23 10:41:22 -08:00
Matt Turner
14ded5ee61 gallium: Use util_cpu_to_le{16,32} in many more places.
... and util_le{16,32}_to_cpu. I think I've used the right ones for
describing the actual operation performed (even though they're both just
"byte-swap this if I'm on big-endian").

The Linux Kernel has typedefs __le32/__be32 and friends that static
analysis tools can use to check that byte-orderings are correct. It
might be interesting to apply that here as well.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:41:22 -08:00
Matt Turner
3492e88090 gallium/util: Use HAVE___BUILTIN_* macros.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:41:22 -08:00
Matt Turner
5a191f49ad mesa: Move C99 MSVC compatibility code from u_math.h to c99_compat.h.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:41:21 -08:00
Matt Turner
0b6d43e329 i965: Link test programs with gtest before pthreads.
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540962
2015-02-23 10:41:21 -08:00
Brian Paul
5dc6c8c570 osmesa: add gallium include dirs to Makefile.am
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89260
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:07:48 -07:00
Brian Paul
44375a3b13 util: move pipe_prim_names array into u_prim_name()
Also, wrapping the array in #ifdef DEBUG / #endif doesn't seem necessary.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:02:39 -07:00
Brian Paul
f1c67e37e6 util: rewrite debug_print_transfer_flags() using debug_dump_flags()
Add add missing PIPE_TRANSFER_PERSISTENT, PIPE_TRANSFER_COHERENT flags.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-23 10:02:39 -07:00
Eduardo Lima Mitev
0bfe21e8e0 mesa: Adds missing error condition in _mesa_check_sample_count()
This corrects a trivial error introduced in commit
19252fee46. That patch was merged recently
and omits one condition (that 'samples' is greater than zero) in one of
the error checks. That error will definitely cause regressions.

Also corrects the reference to the specification above the error check,
which was wrongly quoting OpenGL instead of OpenGL-ES.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-02-23 15:04:26 +01:00
Marek Olšák
050bf75c8b radeonsi: fix a warning caused by previous commit
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-02-23 11:45:00 +01:00
Marek Olšák
7820a11e3d radeonsi: fix point sprites
Broken by a27b74819a.

This fix is critical and should be ported to stable ASAP.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
2015-02-23 11:40:55 +01:00
Ben Widawsky
6e62a52865 i965/skl: Use 1 register for uniform pull constant payload
When under dispatch_width=16 the previous code would allocate 2 registers for
the payload when only one is needed. This manifested itself through bugs on SKL
which needs to mess with this instruction.

Ken though this might impact shader-db, but apparently it doesn't

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Timo Aaltonen <timo.aaltonen@canonical.com>
2015-02-22 12:27:35 -08:00
Eric Anholt
4359954d84 nir: Generalize the optimization of subs of subs from 0.
I initially wrote this based on the "(('fneg', ('fneg', a)), a)" above,
but we can generalize it and make it more potentially useful.  In the
specific original case of a 0 for our new 'a' argument, it'll get further
algebraic optimization once the 0 is an argument to the new add.

No shader-db effects.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
345c2b288a nir: Collapse repeated bcsels on the same argument.
vc4 results:
total instructions in shared programs: 39881 -> 39794 (-0.22%)
instructions in affected programs:     6302 -> 6215 (-1.38%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
a38038ca5e nir: When faced with a csel on !condition, just flip the arguments.
total NIR instructions in shared programs: 39426 -> 39411 (-0.04%)
NIR instructions in affected programs:     3748 -> 3733 (-0.40%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
8e1152cb33 nir: Allow nir_opt_algebraic to see booleanness through &&, ||, ^, !.
We have some useful optimizations to drop things like 'ine a, 0' on a
boolean argument, but if 'a' came from logical operations on bools, it
couldn't tell.  These kinds of constructs appear as a result of TGSI->NIR
quite frequently (at least with if flattening), so being a little more
aggressive in detecting booleans can pay off.

v2: Add ixor as a booleanness-preserving op (Suggestion by Connor).

vc4 results:
total instructions in shared programs: 40207 -> 39881 (-0.81%)
instructions in affected programs:     6677 -> 6351 (-4.88%)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
dc982f4a85 nir: Add a couple of simplifications of csel operations.
vc4 was already cleaning these up, but it does shave 4 NIR instructions in
shader-db.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Ilia Mirkin
c2ece77678 glsl: ensure that enter/leave record get a record type
May make life easier for tools like Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-21 17:27:24 -05:00
Ilia Mirkin
1763494b31 tgsi: avoid returning pointer to local var, make it static
Spotted by Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-21 17:27:24 -05:00
Rob Clark
51e335742e freedreno/a4xx: set PC_PRIM_VTX_CNTL.VAROUT properly
Fixes xonotic, some webgl stuff, and really pretty much anything with
more than 4 varyings.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:02 -05:00
Rob Clark
fb1301e40a freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:02 -05:00
Rob Clark
bdf023482a freedreno/a4xx: bit of cleanup
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:02 -05:00
Rob Clark
9153dd4b7e loader: not having a pci-id should not be a warn
If there is no pci-id, which is valid for vc4 and freedreno, just emit
an info msg.  Keep malformed but existing pci-id's as a warning.

Mostly just to clean up a warning that confuses users for the non-pci
devices.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:02 -05:00
Rob Clark
e17437386c freedreno: implement fence
I never actually implemented the stubbed out fence stuff back in the
early days.  Fix that.

We'll need a few libdrm_freedreno changes to handle timeout properly,
so ignore that for now to avoid a libdrm_freedreno dependency bump.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:02 -05:00
Rob Clark
6855226653 freedreno/a2xx: fix increment in assert
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88883
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-02-21 17:11:01 -05:00
Jordan Justen
49a938a265 i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data
The brw_imm_ud will yield a HW_REG which then will introduce a barrier
for certain optimization opportunities.

No piglit regressions seen with gen8 (simd8vs).

Suggested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-21 11:40:53 -08:00
Jordan Justen
17fbd854e0 i965/fs: Set pixel/sample mask for compute shaders atomic ops
For fragment programs, we pull this mask from the payload header. The same
mask doesn't exist for compute shaders, so we set all bits to enabled.

Previously we were setting 0xff to support SIMD8 VS, but with CS we
support SIMD16, and therefore we change this to 0xffff.

Related commits for SIMD8 VS:

commit d9cd982d55
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Sun Feb 15 20:06:59 2015 -0800
    i965/simd8vs: Fix SIMD8 atomics

commit 4a95be9772
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Tue Feb 17 09:57:35 2015 -0800
    i965/simd8vs: Fix SIMD8 atomics (read-only)

Note: this mask is ANDed with the execution mask, so some channels may not end
up issuing the atomic operation.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-02-21 11:40:53 -08:00
Chia-I Wu
9fe81879c5 ilo: R32G32B32_FLOAT need no special care on Gen8+
Gen8+ must use VALIGN_4.  Unlike prior Gens, R32G32B32_FLOAT should supposedly
support VALIGN_4.
2015-02-21 11:33:54 +08:00
Chia-I Wu
226109436f ilo: 128 BPP formats can use TiledY on Gen7.5+
The restriction is lifted.
2015-02-21 11:33:54 +08:00
Ilia Mirkin
f8e4792b22 nvc0: enable double support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:51:50 -05:00
Ilia Mirkin
5491458843 nvc0/ir: remove merge/split pairs to allow normal propagation to occur
Because the TGSI interface creates merges for each instruction source
and then splits them back out, there are a lot of unnecessary
merge/split pairs which do essentially nothing. The various modifier/etc
propagation doesn't know how to walk though those, so just remove them
when they're unnecessary.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:51:50 -05:00
Ilia Mirkin
93812dc10a nvc0/ir: add support for new TGSI double opcodes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:51:43 -05:00
Ilia Mirkin
ef8f09be33 nvc0/ir: handle zero and negative sqrt arguments
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-20 19:30:28 -05:00