Commit graph

80736 commits

Author SHA1 Message Date
Michel Dänzer
860210ccfc clover: Fix build against clang SVN >= r267772
(Re-pushing previous fix for clang SVN r265359, which was reverted in
the meantime)

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2016-04-28 12:57:03 +09:00
Lars Hamre
32cb7d61a9 glsl: fix lowering outputs for early/nested returns
Return statements in conditional blocks were not having their
output varyings lowered correctly.

This patch fixes the following piglit tests:
/spec/glsl-1.10/execution/vs-float-main-return
/spec/glsl-1.10/execution/vs-vec2-main-return
/spec/glsl-1.10/execution/vs-vec3-main-return

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-04-28 11:01:51 +10:00
Connor Abbott
122d27e998 nir: rewrite nir_foreach_block and friends
Previously, these were functions which took a callback. This meant that
the per-block code had to be in a separate function, and all the data
that you wanted to pass in had to be a single void *. They walked the
control flow tree recursively, doing a depth-first search, and called
the callback in a preorder, matching the order of the original source
code. But since each node in the control flow tree has a pointer to its
parent, we can implement a "get-next" and "get-previous" method that
does the same thing that the recursive function did with no state at
all. This lets us rewrite nir_foreach_block() as a simple for loop,
which lets us greatly simplify its users in some cases. This does
require us to rewrite every user, although the transformation from the
old nir_foreach_block() to the new nir_foreach_block() is mostly
trivial.

One subtlety, though, is that the new nir_foreach_block() won't handle
the case where the current block is deleted, which the old one could.
There's a new nir_foreach_block_safe() which implements the standard
trick for solving this. Most users don't modify control flow, though, so
they won't need it. Right now, only opt_select_peephole needs it.

The old functions are reimplemented in terms of the new macros, although
they'll go away after everything is converted.

v2: keep an implementation of the old functions around
v3 (Jason Ekstrand): A small cosmetic change and a bugfix in the loop
   handling of nir_cf_node_cf_tree_last().
v4 (Jason Ekstrand): Use the _safe macro in foreach_block_reverse_call

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 15:05:40 -07:00
Connor Abbott
958300137f nir/opt_cp: use nir_block_get_following_if()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 15:05:34 -07:00
Jordan Justen
aaaa22c775 vbo: Return INVALID_OPERATION during draw with a mapped buffer
Fixes the OpenGLES 3.1 CTS:
 * ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos

Because this is triggering the error message after the normal API
validation phase, we don't have the API function name available, and
therefore we generate an error message without the draw call name:

Mesa: User error: GL_INVALID_OPERATION in draw call (vertex buffers are mapped)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95142
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 14:30:06 -07:00
Nanley Chery
28d0bc72fb anv/formats: Return proper error code for unsupported formats
Fixes some failures in dEQP-VK.api.info.image_format_properties.* and
enables the test group to execute without assert failing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 11:28:30 -07:00
Nanley Chery
5f7e8eac42 anv/device: Set the compressed texture feature flags correctly
Sampling from an ETC2 texture is supported on Bay Trail and
from Gen8 onwards. While ASTC_LDR is supported on Gen9, the
logic to handle such formats has not yet been implemented in
the driver.

Fixes dEQP-VK.api.info.format_properties.compressed_formats.

v2: Enable ETC2 for Bay Trail (Kenneth Graunke)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-27 11:28:30 -07:00
Jason Ekstrand
e0806930ad nir/algebraic: Add a bit-size validator
This commit adds a validator that ensures that all expressions passed
through nir_algebraic are 100% non-ambiguous as far as bit-sizes are
concerned.  This way it's a compile-time error rather than a hard-to-trace
C exception some time later.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand
8a3e344180 nir/opt_algebraic: Fix some expressions with ambiguous bit sizes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand
7e0ee3a38b nir/search: Respect the bit_size parameter on nir_search_value
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand
fcc1c8a437 nir/algebraic: Add a mechanism for specifying the bit size of a value
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand
cafb885e45 nir/algebraic: Use "uint" instead of "unsigned" for uint types
This is consistent with the rename done for the rest of NIR.  Currently,
"bool" is the only type specifier used in nir_opt_algebraic.py so this is
really a no-op.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Jason Ekstrand
736ee0bef7 nir/algebraic: Do better error reporting of bad expressions
Previously, if an exception was encountered anywhere, nir_algebraic would
just die in a fire with no indication whatsoever as to where the actual bug
is.  This commit makes it print out the particular search-and-replace
expression that is causing problems along with the exception.  Also, it
will now report all of the errors it finds and then exit at the end like a
standard C compiler would do.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-27 11:21:06 -07:00
Alejandro Piñeiro
b1dcedf393 isl: move -lm at the end of tests_ldadd
The test was failing to build with "undefined reference to `roundf'" errors,
so Make check on mesa was failing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-27 20:14:56 +02:00
Topi Pohjolainen
aef6a6c382 i965/blorp/gen8: Fix blitting of interleaved msaa surfaces
Fixes ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8_multisample.

Current logic divides given layer of one by number of samples (four)
trashing the layer to zero. Layer adjustment is only to be used with
non-interleaved msaa surfaces where samples for particular layer are
in multiple slices.

I copy-pasted a bit of documentation from
brw_blorp.c::brw_blorp_compute_tile_offsets().

Also took the opportunity to fix the comment regarding sampling
as 2D, cube textures are the only exception.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-04-27 19:57:40 +03:00
Brian Paul
1d242b6882 llvmpipe: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
23c55e5c23 tgsi: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
419e386571 os: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
d902504a67 hud: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
e522a76226 gallivm: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
489df4a71a draw: s/Elements/ARRAY_SIZE/
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Brian Paul
f93802c465 softpipe: s/Elements/ARRAY_SIZE/
Try to standardize on the later, which is defined in the common util/
directory.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-27 10:23:19 -06:00
Nicolai Hähnle
562c4a17b7 winsys/radeon: remove use_reusable_pool parameter from buffer_create
All callers set this parameter to true.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:41 -05:00
Nicolai Hähnle
13acf2b243 gallium/radeon: remove use_reusable_pool parameter from r600_init_resource
All callers set it to true.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:41 -05:00
Nicolai Hähnle
c868974396 radeon/video: always use the reusable buffer pool
A semantic error was introduced in a past refactoring that caused the bind
parameter to be passed into the use_reusable_pool parameter of buffer_create.
Since this clearly makes no sense, and there is no clear reason why the
cache _shouldn't_ be used, just use the cache always.

Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:41 -05:00
Nicolai Hähnle
8c43c06e04 radeonsi: work around an MSAA fast stencil clear problem
A piglit test (arb_texture_multisample-stencil-clear) has been sent.
This problem was discovered analyzing

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle
7a215a3e27 radeonsi: expclear must be disabled on first Z/S clear
The documentation and the HW team say so.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle
01a3bb5d8b radeonsi: move blend choice out of loop in si_blit_decompress_color
It does not depend on the level or layer.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle
450ff0f0d5 radeonsi: use level mask for early out in si_blit_decompress_color
Mostly for consistency with the other decompress functions, but note that
in the non-DCC decompress case, the function can now early-out in slightly
more (albeit probably rare) cases.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle
0ff05b55c6 radeonsi: si_blit_decompress_depth is only used for staging
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle
0b70fc2db4 radeonsi: only decompress the required ZS planes from si_blit
This happens to "fix" a rendering bug in KotOR2, because it avoids a still
not quite understood bug with MSAA fast stencil clear decompress. For the
stencil clear bug, I have sent a piglit test (arb_texture_multisample-stencil-clear).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle
def53a0b3d radeonsi: decompress Z & S planes in one pass
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle
dc6fc2f390 radeonsi: early out of si_blit_decompress_depth_in_place based on dirty mask
Avoid dirtying the db_render_state atom when possible.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle
d14d6c3f58 radeonsi: use MIN2 instead of expanded ?: operator
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle
159f182a57 radeonsi: fix brace style
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle
91fb4bb2e9 gallium/util: add u_bit_consecutive for generating a consecutive range of bits
There are some undefined behavior subtleties, so having a function to match
the u_bit_scan_consecutive_range makes sense.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Tim Rowley
504df3a1d7 swr: s/Elements/ARRAY_SIZE/
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 11:07:34 -05:00
Nicolai Hähnle
836cab51c8 radeonsi: emit s_waitcnt for shader memory barriers and volatile
Turns out that this is needed after all to satisfy some strengthened
coherency tests. Depends on support in LLVM, added in r267729.

v2: updated to reflect changes to the LLVM intrinsic

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2016-04-27 10:54:05 -05:00
Tim Rowley
e7201bd31b swr: [rasterizer] warning cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:54 -05:00
Tim Rowley
24f23817d2 swr: [rasterizer core] implement legacy depth bias enable
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:45 -05:00
Tim Rowley
fa36f8ec9c swr: [rasterizer jitter] support for dumping x86 asm
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:32 -05:00
Tim Rowley
a646ffdacf swr: [rasterizer core] more backend refactoring
BackendPixelRate should be easier to read/maintain now hopefully.

Small perf bump by moving some of the pfn's to inline functions
without template params.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:21 -05:00
Tim Rowley
8e815ff72c swr: [rasterizer jitter] add mSimdInt1Ty
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:41:12 -05:00
Tim Rowley
4e1e0b3a32 swr: [rasterizer core] backend refactor
Lump all template args into a bundle of traits, and add some
functionality to the MSAA traits.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-27 10:40:44 -05:00
Brian Paul
43f46caf76 svga: use the SVGA3D_DEVCAP_MAX_FRAGMENT_SHADER_INSTRUCTIONS query
Instead of a hard-coded 512.  The query typically returns 65536 now.
Fall back to 512 if the query fails as we do for vertex shaders (which
should never happen).

Note that we don't actually enforce this limit in our shaders but it gets
reported via the glGetProgramivARB(GL_MAX_PROGRAM_INSTRUCTIONS_ARB) query.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-04-27 08:43:33 -06:00
Hans de Goede
b5e7907f30 nouveau: codegen: LOAD: Take src swizzle into account
The llvm TGSI backend uses pointers in registers and does things
like:

LOAD TEMP[0].y, MEMORY[0], TEMP[0]

Expecting the data at address TEMP[0].x to get loaded to
TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be
loaded instead.

This commit adds support for a swizzle suffix for the 1st source
operand, which allows using:

LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0]

And actually getting the desired behavior

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-27 16:11:48 +02:00
Hans de Goede
90f45357ab nouveau: codegen: LOAD: Do not call fetchSrc(1) if the address is immediate
"off" later gets set to NULL when the address is immediate, so move the
fetchSrc(1) call to the non-immediate branch of the if-else. This brings
handleLOAD's offset handling inline with how it is done in handleSTORE.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-27 16:11:48 +02:00
Hans de Goede
1958397a58 nouveau: codegen: LOAD: Always use component 0 when getting the address
LOAD loads upto 4 components from the specified resource starting at
the passed in x value of the 2nd source operand, the y, z and w
components of the address should not be used.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-27 16:11:48 +02:00
Stefan Dirsch
7d25ed7036 dri3: Check for dummyContext to see if the glx_context is valid
According to the comments in src/glx/glxcurrent.c __glXGetCurrentContext()
always returns a valid pointer. If no context is made current, it will
contain dummyContext. Thus a test for NULL will always fail.

https://lists.freedesktop.org/archives/mesa-dev/2016-April/113962.html

Signed-off-by: Stefan Dirsch <sndirsch@suse.de>
Reviewed-by: Egbert Eich <eich@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-27 13:03:34 +01:00
Egbert Eich
4d9b518ad2 dri2: Check for dummyContext to see if the glx_context is valid
According to the comments in src/glx/glxcurrent.c __glXGetCurrentContext()
always returns a valid pointer. If no context is made current, it will
contain dummyContext. Thus a test for NULL will always fail.

https://bugzilla.opensuse.org/show_bug.cgi?id=962609

Tested-by: Olaf Hering <ohering@suse.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-27 13:03:11 +01:00