Commit graph

57701 commits

Author SHA1 Message Date
Ian Romanick
01c21c459f glsl: Fix bad indentation
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-19 15:04:04 -07:00
Ian Romanick
47e2a74a5a i965: Silence unused parameter warning
brw_fs_visitor.cpp:2400:1: warning: unused parameter 'ir' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-19 15:04:01 -07:00
Ian Romanick
22b9641edf i965: Silence 'comparison is always true' warning
The parameter is an int16_t, and we're check that it's value will fit in
16-bits.  Yes, the value that is stored in 16-bits will surely fit in
16-bits.

brw_inst.h: In function 'brw_inst_set_gen6_jump_count':
brw_inst.h:321:66: warning: comparison is always true due to limited range of data type [-Wtype-limits]
brw_inst.h:321:66: warning: comparison is always true due to limited range of data type [-Wtype-limits]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-19 15:03:57 -07:00
Ian Romanick
1946612b7d i965: Silence many unused parameter warnings
brw_inst.h: In function 'brw_inst_set_src1_vstride':
brw_inst.h:118:76: warning: unused parameter 'brw' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-19 15:03:49 -07:00
Jason Ekstrand
ecd3e89b32 main/format_pack: Fix a wrong datatype in pack_ubyte_R8G8_UNORM
Before it was only storing one of the color components due to truncation.
With this patch it now properly stores all of them.

Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-07-18 18:34:36 -07:00
Jason Ekstrand
f14d217f5c Add support for RGBA8 and RGBX8 textures in intel_texsubimage_tiled_memcpy
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2014-07-17 18:20:09 -07:00
Jason Ekstrand
765f4b8c04 i965: Improve debug output in intelTexImage and intelTexSubimage
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2014-07-17 18:20:09 -07:00
Marek Olšák
d808de31bd radeonsi: only update vertex buffers when they need updating
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák
6210d6fdc2 radeonsi: remove nr_vertex_buffers
Unused.

Also inline util_set_vertex_buffers_count and simplify it.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák
0ed0bf0696 radeonsi: move vertex buffer descriptors from IB to memory
This removes the intermediate storage (pm4 state) and generates descriptors
directly in a staging buffer.

It also reduces the number of flushes, because the descriptors no longer
take CS space.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák
1635ded828 radeonsi: add support for fine-grained sampler view updates
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák
bea8f2f46d radeonsi: move si_set_sampler_views to si_descriptors.c
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák
dd46841bc9 radeonsi: move sampler descriptors from IB to memory
Sampler descriptors are now represented by si_descriptors.
This also adds support for fine-grained sampler state updates and
the border color update is now isolated in a separate function.

Border colors have been broken if texturing from multiple shader stages is
used. This patch doesn't change that.

BTW, blitting already makes use of fine-grained state updates.
u_blitter uses 2 textures at most, so we only have to save 2.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák
2a7b57ad42 radeonsi: implement ARB_draw_indirect
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:58 +02:00
Marek Olšák
887b69a233 radeonsi: don't add info->start to the index buffer offset
info->start will be invalid once info->indirect isn't NULL, so it shouldn't
be added to ib.offset.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:58 +02:00
Marek Olšák
09056b352d radeonsi: use an SGPR instead of VGT_INDX_OFFSET
The draw indirect packets cannot set VGT_INDX_OFFSET, they can only set user
data SGPRs. This is the only way to support start/index_bias with indirect
drawing.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:58 +02:00
Marek Olšák
a66d934139 radeonsi: assume LLVM 3.4.2 is always present
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:58 +02:00
Marek Olšák
3a86ca54df st/mesa,gallium: add a workaround for Unigine Heaven 4.0 and Valley 1.0
Most (all?) Unigine shaders fail to compile without this if sample shading
is advertised. This is, of course, Unigine developers' fault.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-07-18 01:58:58 +02:00
Marek Olšák
b0ff18bd34 glsl: add a mechanism to allow #extension directives in the middle of shaders
This is needed to make Unigine Heaven 4.0 and Unigine Valley 1.0 work
with sample shading.

Also, if this is disabled, the error message at least makes sense now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-18 01:58:58 +02:00
Glenn Kennard
392c9f8dfe r600g: Implement GL_ARB_texture_gather
Only supported on evergreen and later. Currently limited
to single component textures as the hardware GATHER4
instruction ignores texture swizzles.

Piglit quick run passes on radeon 6670 with all
applicable textureGather tests, no regressions.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2014-07-18 01:58:58 +02:00
Anuj Phogat
984a02ba55 i965: Fix z_offset computation in intel_miptree_unmap_depthstencil()
The bug is triggered by using glTexSubImage2d() with GL_DEPTH_STENCIL
as base internal format and non-zero x, y offsets. Currently x, y
offsets are ignored while updating the texture image.

Fixes Khronos GLES3 CTS tests:
npot_tex_sub_image_2d
npot_tex_sub_image_3d
npot_pbo_tex_sub_image_2d
npot_pbo_tex_sub_image_2d

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-07-17 15:52:27 -07:00
Anuj Phogat
5d9f5cd35b Revert "i965: Extend compute-to-mrf pass to understand blocks of MOVs"
This reverts commit bbefb15e01.
Fixes the 11 regressions caused in framebuffer_blit tests in
Khronos GLES3 CTS tests:

Original patch reduced the instruction count but had no performance
benefits. So, it's safe to revert it without causing any performance
regressions.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-07-17 15:49:46 -07:00
Adel Gadllah
b656e3c603 i915: Fix up intelInitScreen2 for DRI3
Commit 442442026e updated both i915 and i965 for DRI3 support,
but one check in intelInitScreen2 was missed for i915 causing crashes
when trying to use i915 with DRI3.

So fix that up.

Reported-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
References: https://bugzilla.redhat.com/show_bug.cgi?id=1115323
References: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=754297
Tested-by: František Zatloukal <Zatloukal.Frantisek@gmail.com>
Tested-by: Dirk Griesbach <spamthis@freenet.de>
Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-07-17 14:42:35 -07:00
Pavel Popov
4ceb612a10 mesa: Fix regression introduced by commit "mesa: fix packing of float texels to GL_SHORT/GL_BYTE".
This commit "mesa: fix packing of float texels to GL_SHORT/GL_BYTE" replaced *_TO_BYTE to *_TO_BYTE_TEX because *_TO_FLOAT_TEX are used to unpack the texels to floats.
In this case *_TO_FLOATZ in function extract_float_rgba also should be replaced to *_TO_FLOAT_TEX. Underline that these macros automatically preserve zero when converting.

The regression was observed on 3 oglconform tests:
    snorm-textures basic.getTexImage
    snorm-textures advanced.mipmap.manual.getTex
    snorm-textures advanced.mipmap.upload.getTex

Signed-off-by: Pavel Popov <pavel.e.popov@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-07-18 08:01:07 +12:00
Thorsten Glaser
3cfe6bc9cc nv50: fix build failure on m68k due to invalid struct alignment assumptions
Make alignment assumptions explicit by inserting correct padding with
unknown struct members.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
2014-07-17 10:31:30 -04:00
Tom Stellard
74dfd86ed6 clover: Call end_query before getting timestamp result v2
v2:
  - Move the end_query() call into the timestamp constructor.
  - Still pass false as the wait parameter to get_query_result().

Reviewed-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

CC: "10.2" <mesa-stable@lists.freedesktop.org>
2014-07-17 09:33:37 -04:00
Tapani Pälli
48deb4dbf2 glsl: handle a switch where default is in the middle of cases
This fixes following tests in es3conform:

   shaders.switch.default_not_last_dynamic_vertex
   shaders.switch.default_not_last_dynamic_fragment

and makes following tests in Piglit pass:

   glsl-1.30/execution/switch/fs-default-notlast-fallthrough
   glsl-1.30/execution/switch/fs-default_notlast

No Piglit regressions.

v2: take away unnecessary ir_if, just use conditional assignment
v3: use foreach_in_list instead of foreach_list

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v2)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v3)
2014-07-17 07:39:12 +03:00
Kenneth Graunke
9e47ed2f77 glsl: Make the tree rebalancer use vector_elements, not components().
components() includes matrix columns, so if this code encountered a
matrix, it would ask for something like a vec9 or vec16.  This is
clearly not what you want.

Earlier code now prevents this from seeing matrices, but we should still
use vector_elements, for clarity.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-16 15:43:13 -07:00
Kenneth Graunke
7db75927ca glsl: Guard against error_type in the tree rebalancer.
This helped me track down the bug fixed in the previous commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-16 15:43:13 -07:00
Kenneth Graunke
9697f8088f glsl: Make the tree rebalancer bail on matrix operands.
It doesn't handle things like (vector * matrix) correctly, and
apparently Matt's intention was to bail.

Fixes shader compilation in Natural Selection 2.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-16 15:43:13 -07:00
Kenneth Graunke
99f8ea295f Revert "i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams."
This reverts commit 3178d2474a.

This caused GPU hangs on Ivybridge for some users and huge (80%)
performance regressions across the board on multiple platforms.

We need to find a better solution.  I've made several attempts, but none
of them have worked yet.  In the meantime, we should revert this.

Reverting it breaks GL_PRIMITIVES_GENERATED for non-zero streams, but
that's okay, since we don't expose GL_ARB_gpu_shader5 yet.

Fixes Piglit's EXT_transform_feedback/generatemipmap prims_generated
test case on Haswell.
2014-07-16 14:19:29 -07:00
Chia-I Wu
1661f7559b ilo: add some missing formats
Map more pipe formats to hardware formats.  Enable more VB formats on Haswell.
2014-07-16 14:31:59 +08:00
Chia-I Wu
69cd3ebd6f ilo: update and tailor the surface format table
Recreate the table from scratch with the help of a pdf-table-to-csv converter.
Switch to a form that is more suitable for ilo.
2014-07-16 14:31:59 +08:00
Kenneth Graunke
a2de656278 i965: Don't copy propagate abs into Broadwell logic instructions.
It's not clear what abs on logical instructions means on Broadwell, and
it doesn't appear to do anything sensible.

Fixes 270 Piglit tests (the bitand/bitor/bitxor tests with abs).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81157
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-07-15 22:12:15 -07:00
Kenneth Graunke
cf1b5eee7f i965/fs: Use WE_all for gl_SampleID header register munging.
This code should execute without regard to the currently executing
channels.  Asking for gl_SampleID inside control flow might break in
strange ways.  It appears to break even at the top of the program in
SIMD16 mode occasionally as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2014-07-15 22:10:10 -07:00
Kenneth Graunke
e5adc560cc i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.
gen8_fs_generator uses these to decide whether to set the execution size
to 8 or 16, so we incorrectly made both of these MOVs the full width in
SIMD16 shaders.  (It happened to work out on Gen4-7.)

Setting them should also help inform optimization passes what's really
going on, which could help avoid bugs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2014-07-15 22:10:06 -07:00
Kenneth Graunke
2eaf3f670f i965: Set execution size to 8 for instructions with force_sechalf set.
Both inst->force_uncompressed and inst->force_sechalf mean that the
generated instruction should be uncompressed and have an execution size
of 8.  We don't require the visitor to set both flags - setting
inst->force_sechalf by itself is supposed to be enough.

On Gen4-7, guess_execution_size() demoted instructions to 8-wide based
on the default compression state.  On Gen8+, we instead set a default
execution size, which worked great...except that we forgot to check
inst->force_sechalf when deciding whether to use 8 or 16.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2014-07-15 22:09:49 -07:00
Christoph Bumiller
4198711006 nvc0: fix translate path for PRIM_RESTART_WITH_DRAW_ARRAYS
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-15 17:57:45 -04:00
Christoph Bumiller
a284a0afa2 nvc0: add support for indirect drawing
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-15 17:57:45 -04:00
Ilia Mirkin
bbc4a7bd31 nouveau: check if a fence has already been signalled
nouveau_fence_update does real work unconditionally. Avoid doing that if
the fence we're checking on has already been signalled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-15 17:57:45 -04:00
Matt Turner
c11096c749 glsl: Don't declare variables in for-loop declaration.
Reported-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-07-15 12:17:48 -07:00
Connor Abbott
58270c2fac exec_list: Make various places use the new length() method.
Instead of hand-rolling it.

v2 [mattst88]: Rename get_size to length. Expand comment in ir_reader.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
2014-07-15 11:16:16 -07:00
Connor Abbott
7b0f69225a exec_list: Add a function to give the length of a list.
v2 [mattst88]: Remove trailing whitespace. Rename get_size to length.
               Mark as const.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
2014-07-15 11:16:16 -07:00
Connor Abbott
28c4fd4bc6 exec_list: Add a prepend function.
This complements the existing append function. It's implemented in a
rather simple way right now; it could be changed if performance is a
concern.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
2014-07-15 11:16:16 -07:00
Ian Romanick
9a723b970e mesa: Don't allow GL_TEXTURE_{LUMINANCE,INTENSITY}_* queries outside compat profile
There are no queries for GL_TEXTURE_LUMINANCE_SIZE,
GL_TEXTURE_INTENSITY_SIZE, GL_TEXTURE_LUMINANCE_TYPE, or
GL_TEXTURE_INTENSITY_TYPE in any version of OpenGL ES or desktop OpenGL
core profile.

NOTE: Without changes to piglit, this regresses
required-sized-texture-formats.

v2: Rebase on different initial change.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2 <mesa-stable@lists.freedesktop.org>
2014-07-15 10:46:33 -07:00
Ian Romanick
750286600b mesa: Don't allow GL_TEXTURE_BORDER queries outside compat profile
There are no texture borders in any version of OpenGL ES or desktop
OpenGL core profile.

Fixes piglit's gl-3.2-texture-border-deprecated.

v2: Rebase on different initial change.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2 <mesa-stable@lists.freedesktop.org>
2014-07-15 10:46:33 -07:00
Ian Romanick
ee58c71a65 mesa: Handle uninitialized textures like other textures in get_tex_level_parameter_image
Instead of catching the special case early, handle it by constructing a
fake gl_texture_image that will cause the values required by the OpenGL
4.0 spec to be returned.

Previously, calling

    glGenTextures(1, &t);
    glBindTexture(GL_TEXTURE_2D, t);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, 0xDEADBEEF, &value);

would not generate an error.

Anuj: Can you verify this does not regress proxy_textures_invalid_size?

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Suggested-by: Brian Paul <brianp@vmware.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
2014-07-15 10:46:33 -07:00
Matt Turner
83214edf8a i965/fs: Relax interference check in register coalescing.
A similar attempt was made in commit 5ff1e446 and was reverted in commit
a39428cf after causing a regression in an ES 3 conformance test. The
test still passes after this commit.

total instructions in shared programs: 1994827 -> 1992858 (-0.10%)
instructions in affected programs:     128247 -> 126278 (-1.54%)
GAINED:                                0
LOST:                                  1

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-15 10:12:29 -07:00
Matt Turner
1d97212007 i965/fs: Perform CSE on sends-from-GRF rather than textures.
Should potentially allow a few more cases, while avoiding doing CSE on
texture operations on Gen <= 6 with the MRF.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80211
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: lu hua <huax.lu@intel.com>
2014-07-15 10:12:29 -07:00
Matt Turner
103716a862 glsl: Update expression types after rebalancing the tree.
If we saw a tree that looked like

            vec3
           /   \
         vec3 float
        /   \
      vec3 float
     /   \
   vec3 float

We would see that all of the expression types were vec3, and then
rebalance to

           vec3
        /        \
      vec3       vec3 <-- should be float
     /   \      /    \
   vec3 float float float

This patch adds code to visit the rebalanced tree and update the
expression types from the bottom up.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80880
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-15 10:12:29 -07:00