Commit graph

85652 commits

Author SHA1 Message Date
Kristian Høgsberg Kristensen
ab36eae5e7 anv: Remove left-over bits of sparse-descriptor code 2016-03-05 13:50:07 -08:00
Jason Ekstrand
1afdfc3e6e anv/pipeline: Implement the depth compare EQUAL workaround on gen8+ 2016-03-05 09:59:28 -08:00
Jason Ekstrand
7c1660aa14 anv: Don't allow D16_UNORM to be combined with stencil
Among other things, this can cause the depth or stencil test to spurriously
fail when the fragment shader uses discard.
2016-03-05 09:59:28 -08:00
Jason Ekstrand
9a90176d48 anv/pipeline: Calculate the correct max_source_attr for 3DSTATE_SBE 2016-03-05 09:59:28 -08:00
Brian Paul
a4678311be st/mesa: 78-column wrapping in st_extensions.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:21:05 -07:00
Brian Paul
9e6a6bd575 gallium/util: add new comments, assertions in u_debug_refcnt.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:34 -07:00
Brian Paul
b6a607b221 gallium/util: update comments and URL in u_debug_refcnt.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:28 -07:00
Brian Paul
cbca6964e2 gallium/util: make stream variable static in u_debug_refcnt.c
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:23 -07:00
Brian Paul
fb0abedce7 gallium/util: re-indent u_debug_refcnt.[ch]
Wrap comments to 78 columns, etc.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-03-05 09:20:14 -07:00
Brian Paul
a7ba29f6d8 gallium/tests: silence warning in compute.c
compute.c: In function ‘launch_grid’:
compute.c:435:20: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
         info.input = input;
                    ^

Maybe the pipe_grid_info::input field should be const void *?

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-05 09:15:44 -07:00
Timothy Arceri
31943e6ba5 glsl: replace remaining tabs in link_varyings.cpp
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2016-03-05 20:50:10 +11:00
Timothy Arceri
e2415e8467 glsl: replace remaining tabs in link_uniforms.cpp
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2016-03-05 20:50:05 +11:00
Jordan Justen
81f30e2f50 anv/hsw: Move query code to genX file for Haswell
This fixes many CTS cases, but will require an update to the kernel
command parser register whitelist. (The CS GPRs and TIMESTAMP
registers need to be whitelisted.)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-05 01:08:07 -08:00
Timothy Arceri
3322cb7b8d docs: mark align layout qualifier as DONE
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:39:13 +11:00
Timothy Arceri
037f68d81e glsl: apply align layout qualifier rules to block offsets
From Section 4.4.5 (Uniform and Shader Storage Block Layout
Qualifiers) of the OpenGL 4.50 spec:

  "The align qualifier makes the start of each block member have a
  minimum byte alignment.  It does not affect the internal layout
  within each member, which will still follow the std140 or std430
  rules. The specified alignment must be a power of 2, or a
  compile-time error results.

  The actual alignment of a member will be the greater of the
  specified align alignment and the standard (e.g., std140) base
  alignment for the member's type. The actual offset of a member is
  computed as follows: If offset was declared, start with that
  offset, otherwise start with the next available offset. If the
  resulting offset is not a multiple of the actual alignment,
  increase it to the first offset that is a multiple of the actual
  alignment. This results in the actual offset the member will have.

  When align is applied to an array, it affects only the start of
  the array, not the array's internal stride. Both an offset and an
  align qualifier can be specified on a declaration.

  The align qualifier, when used on a block, has the same effect as
  qualifying each member with the same align value as declared on
  the block, and gets the same compile-time results and errors as if
  this had been done. As described in general earlier, an individual
  member can specify its own align, which overrides the block-level
  align, but just for that member.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:39:07 +11:00
Timothy Arceri
5a27fefffe glsl: parse align layout qualifier
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:39:01 +11:00
Timothy Arceri
22b0082b9d docs: mark explicit byte offsets as DONE
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:55 +11:00
Timothy Arceri
802262c0af glsl: use explicit offset when lowering buffer access
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:49 +11:00
Timothy Arceri
96527c3cf2 glsl: copy explicit offset to uniform storage
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:44 +11:00
Timothy Arceri
e12a49ac12 glsl: update comment on offset field
The old comment was for the location not the offset, we now use
the field for block members so mention that also.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:39 +11:00
Timothy Arceri
9f24f42c49 glsl: add offset to glsl interface type
In this patch we also copy the offset value from the ast and
implement offset linking rules by adding it to the record_compare()
function.

From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers)
of the GLSL 4.50 spec:

   "Two blocks linked together in the same program with the same block
   name must have the exact same set of members qualified with
   offset and their integral-constant-expression values must be the
   same, or a link-time error results."

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:34 +11:00
Timothy Arceri
8abed7f185 glsl: apply compile-time rules for the offset layout qualifier
This implements the rules for the offset qualifier on block members.

From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers)
of the GLSL 4.50 spec:

   "The offset qualifier can only be used on block members of blocks
   declared with std140 or std430 layouts."

   ...

   "It is a compile-time error to specify an offset that is smaller than
   the offset of the previous member in the block or that lies within the
   previous member of the block."

   ...

   "The specified offset must be a multiple of the base alignment of the
   type of the block member it qualifies, or a compile-time error results."

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:30 +11:00
Timothy Arceri
6f45484ac7 glsl: enable offset layout qualifier for ARB_enhanced_layouts
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-03-05 19:38:26 +11:00
Timothy Arceri
1824ff1c2a glsl: reject invalid input layout qualifiers
Global in validation is already handled, this will do the validation
for variables, blocks and block members.

This fixes some CTS tests for the new enhanced layouts transform
feedback qualifiers.

V2: add some more valid input flags
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:07:09 +11:00
Timothy Arceri
bd53cc7b45 glsl: only apply default stream to output blocks
This is needed to allow invalid qualifier checks on inputs.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:07:04 +11:00
Timothy Arceri
78d3098c05 glsl: rework parsing of blocks
Previously interface blocks were giving the global default flags of
uniform blocks. This meant we could not check for invalid qualifiers
on interface blocks because they always contained invalid flags.

This changes parsing so that interface blocks now get an empty
set of layouts.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:07:00 +11:00
Timothy Arceri
d244986bf2 glsl: don't apply uniform/buffer layouts to interface blocks
If the following patch we will stop setting these layouts by default
on interface blocks, so we need to do this to avoid hitting the
assert.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-05 19:06:56 +11:00
Nanley Chery
4e75f9b219 anv: Implement VK_REMAINING_{MIP_LEVELS,ARRAY_LAYERS}
v2: Subtract the baseMipLevel and baseArrayLayer (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-04 21:25:23 -08:00
Kenneth Graunke
4ba7ad6cc1 i965: Only magnify depth for 3D textures, not array textures.
When BaseLevel > 0, we magnify the dimensions to fill out the size of
miplevels [0..BaseLevel).  In particular, this was magnifying depth,
thinking that the depth doubles at each level.  This is perfectly
reasonable for 3D textures, but dead wrong for array textures.

Changing the depth != 1 condition to a target == GL_TEXTURE_3D check
should make this only happen in the appropriate cases.

Fixes about 32 dEQP tests:
- dEQP-GLES31.functional.texture.gather.*.level_{1,2}

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2016-03-04 21:25:08 -08:00
Jason Ekstrand
c1436e80ef anv/meta_clear: Set the right number of dynamic states 2016-03-04 19:18:20 -08:00
Juan A. Suarez Romero
2f76a9924e i965/vec4: add opportunistic behaviour to opt_vector_float()
opt_vector_float() transforms several scalar MOV operations to a single
vectorial MOV.

This is done when those MOV covers all the components of the destination
register. So something like:

mov vgrf3.0.xy:D, 0D
mov vgrf3.0.w:D, 1065353216D
mov vgrf3.0.z:D, 0D

is transformed in:

mov vgrf3.0:F, [0F, 0F, 0F, 1F]

But there are cases where not all the components are written. For
example, in:

mov vgrf2.0.x:D, 1073741824D
mov vgrf3.0.xy:D, 0D
mov vgrf3.0.w:D, 1065353216D
mov vgrf4.0.xy:D, 1065353216D
mov vgrf4.0.w:D, 0D
mov vgrf6.0:UD, u4.xyzw:UD

Nor vgrf3 nor vgrf4 .z components are written, so the optimization is
not applied.

But it could be applied anyway with the components covered, using a
writemask to select the ones written. So we could transform it in:

mov vgrf2.0.x:D, 1073741824D
mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F]
mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F]
mov vgrf6.0:UD, u4.xyzw:UD

This commit does precisely that: opportunistically apply
opt_vector_float() when possible.

total instructions in shared programs: 7124660 -> 7114784 (-0.14%)
instructions in affected programs: 443078 -> 433202 (-2.23%)
helped: 4998
HURT: 0

total cycles in shared programs: 64757760 -> 64728016 (-0.05%)
cycles in affected programs: 1401686 -> 1371942 (-2.12%)
helped: 3243
HURT: 38

v2: change vectorize_mov() signature (Matt).
v3: take in account predicates (Juan).
v4 [mattst88]: Update shader-db numbers. Fix some whitespace issues.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2016-03-04 19:16:52 -08:00
Jason Ekstrand
cc57efc67a anv/pipeline: Fix depthBiasEnable on gen7
The first time I tried to fix this, I set the wrong fields.
2016-03-04 17:56:12 -08:00
Jason Ekstrand
653261285e anv/cmd_buffer: Reset the state streams when resetting the command buffer 2016-03-04 17:54:29 -08:00
Jason Ekstrand
f700d16a89 anv/cmd_buffer: Include Haswell in set_subpass 2016-03-04 17:54:29 -08:00
George Kyriazis
feb71117ae st/xlib: Don't destroy screen on XCloseDisplay()
screen may still be used by other resources that are not yet freed.
To correctly fix this there will be a need to account for resources
differently, but this quick fix is not any worse than the original
code that leaked screens anyway.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-04 18:14:46 -07:00
Nanley Chery
a6fb62a864 isl: Fix RenderTargetViewExtent for mipmapped 3D surfaces
Match the comment stated above the assignment.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-04 13:20:44 -08:00
Nanley Chery
b80c8ebc45 isl: Get rid of isl_surf_fill_state_info::level0_extent_px
This field is no longer needed.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-03-04 13:20:03 -08:00
Jason Ekstrand
d154a5ebd6 anv/cmd_buffer: Let the pipeline set StencilBufferWriteEnable on gen9 2016-03-04 12:23:01 -08:00
Jason Ekstrand
f374765ce6 anv/cmd_buffer: Mask stencil reference values 2016-03-04 12:22:32 -08:00
Jason Ekstrand
d61dcec64d anv/clear: Pull the stencil write mask from the pipeline
The stencil write mask wasn't getting set at all so we were using whatever
write mask happend to be left over by the application.
2016-03-04 12:03:00 -08:00
Jason Ekstrand
ec18fef88d anv/pipeline: Set StencilBufferWriteEnable from the pipeline
The hardware docs say that StencilBufferWriteEnable should only be set if
StencilTestEnable is set.  It seems reasonable to set them together.
2016-03-04 12:03:00 -08:00
Jason Ekstrand
fcd8e57185 anv/pipeline: More competent gen8 clipping 2016-03-04 12:03:00 -08:00
Jason Ekstrand
a8afd29653 anv/pipeline: Use the right provoking vertex for triangle fans 2016-03-04 12:03:00 -08:00
Jason Ekstrand
fa8539dd6b anv/pipeline: Respect pRasterizationState->depthBiasEnable 2016-03-04 12:03:00 -08:00
Matt Turner
1f862e923c i965/fs: Optimize float conversions of byte/word extract.
instructions in affected programs: 31535 -> 29966 (-4.98%)
   helped: 23

   cycles in affected programs: 272648 -> 266022 (-2.43%)
   helped: 14
   HURT: 1

The patch decreases the number of instructions in the two Unigine
programs by:

 #1721: 4374 -> 4155 instructions (-5.01%)
 #1706: 3582 -> 3363 instructions (-6.11%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-04 11:52:34 -08:00
Matt Turner
905ff86198 nir: Recognize open-coded extract_u16.
No shader-db changes, but does recognize some extract_u16 which enables
the next patch to optimize some code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-04 11:52:34 -08:00
Matt Turner
76289fbfa8 nir: Recognize open-coded extract_u8.
Two shaders that appear in Unigine benchmarks (Heaven and Valley) unpack
three bytes from an integer and convert each into a float:

   float((val >> 16u) & 0xffu)
   float((val >>  8u) & 0xffu)
   float((val >>  0u) & 0xffu)

Instead of shifting, masking, and type converting like this:

   shr(8)          g15<1>UD        g25<8,8,1>UD    0x00000010UD
   and(8)          g16<1>UD        g15<8,8,1>UD    0x000000ffUD
   mov(8)          g17<1>F         g16<8,8,1>UD

   shr(8)          g18<1>UD        g25<8,8,1>UD    0x00000008UD
   and(8)          g19<1>UD        g18<8,8,1>UD    0x000000ffUD
   mov(8)          g20<1>F         g19<8,8,1>UD

   and(8)          g21<1>UD        g25<8,8,1>UD    0x000000ffUD
   mov(8)          g22<1>F         g21<8,8,1>UD

i965 can simply extract a byte and convert to float in a single
instruction:

   mov(8)          g17<1>F         g25.2<32,8,4>UB
   mov(8)          g20<1>F         g25.1<32,8,4>UB
   mov(8)          g22<1>F         g25.0<32,8,4>UB

This patch implements the first step: recognizing byte extraction. A
later patch will optimize out the conversion to float.

   instructions in affected programs: 28568 -> 27450 (-3.91%)
   helped: 7

   cycles in affected programs: 210076 -> 203144 (-3.30%)
   helped: 7

This patch decreases the number of instructions in the two Unigine
programs by:

 #1721: 4520 -> 4374 instructions (-3.23%)
 #1706: 3752 -> 3582 instructions (-4.53%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-04 11:52:34 -08:00
Kenneth Graunke
9d7faadd8a anv: Fix backwards shadow comparisons
sample_c is backwards from what GL and Vulkan expect.

See intel_state.c in i965.

v2: Drop unused vk_to_gen_compare_op.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-04 11:35:46 -08:00
George Kyriazis
01e92e7010 st/xlib: Hang off screen destructor off main XCloseDisplay() callback.
This resolves some order dependencies between the already existing
callback the newly created one.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-04 10:57:24 -07:00
George Kyriazis
51e562c3ea st/xlib: Support unlimited number of display connections
There is a limit of 10 display connections, which was a
problem for apps/tests that were continuously opening/closing display
connections.

This fix uses XAddExtension() and XESetCloseDisplay() to keep track
of the status of the display connections from the X server, freeing
mesa-related data as X displays get destroyed by the X server.

Poster child is the VTK "TimingTests"

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-03-04 10:57:09 -07:00