82384 commits

Author SHA1 Message Date
Ilia Mirkin
912babba7b mesa/copyimage: allow width/height to not be multiples of block
For compressed textures, the image size is not necessarily a multiple of
the block size (e.g. the last mip levels). Section 18.3.2 (Copying
Between Images) of the OpenGL 4.5 Core Profile spec says:

    An INVALID_VALUE error is generated if the dimensions of either
    subregion exceeds the boundaries of the corresponding image
    object, or if the image format is compressed and the dimensions of
    the subregion fail to meet the alignment constraints of the
    format.

and Section 8.7 (Compressed Texture Images) says:

    An INVALID_OPERATION error is generated if any of the following
    conditions occurs:

      * width is not a multiple of four, and width + xoffset is not
        equal to the value of TEXTURE_WIDTH.
      * height is not a multiple of four, and height + yoffset is not
        equal to the value of TEXTURE_HEIGHT.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2015-11-11 14:37:55 -05:00
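The two spec passages quoted in this commit can be read as a per-dimension rule: the subregion must stay inside the image, and its size must be a whole number of blocks unless it ends exactly at the image edge. A minimal sketch of that rule follows. The function name and parameters are hypothetical and this is not Mesa's actual validation code.

```c
#include <stdbool.h>

/* Hypothetical helper (not Mesa's code): checks one dimension of a
 * compressed-texture subregion against the rule quoted from the spec. */
static bool
subregion_dim_ok(int offset, int size, int block_size, int image_size)
{
   /* The subregion may not exceed the boundaries of the image. */
   if (offset < 0 || size < 0 || offset + size > image_size)
      return false;

   /* The size must be a multiple of the block size, unless the subregion
    * ends exactly at the image edge (e.g. in the last mip levels). */
   return size % block_size == 0 || offset + size == image_size;
}
```

With 4-wide blocks, a 2-texel-wide mip level copied in full passes the check, while a 2-texel-wide copy from the middle of an 8-texel-wide level does not.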
Jason Ekstrand
80890eb0d3 i965/brw_reg: Add a brw_VxH_indirect helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-11 10:52:30 -08:00
Brian Paul
68993f77cd mesa: remove old comments in arrayobj.c 2015-11-11 09:38:22 -07:00
Brian Paul
9870a5c6c9 st/wgl: clarify code in stw_framebuffer_from_hwnd_locked()
Just a minor code change to make it obvious that NULL is returned when
we don't find the given HWND.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-11-11 09:38:22 -07:00
Brian Paul
004ed6f4a9 st/wgl: improve some function comments
In particular, explain when stw_framebuffer objects are
locked/unlocked/etc.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-11-11 09:38:22 -07:00
Brian Paul
b93cb6c1dc st/wgl: whitespace/formatting fixes 2015-11-11 09:38:22 -07:00
Brian Paul
eb812921ac st/wgl: fix locking issue in stw_st_framebuffer_present_locked()
When stw_st_framebuffer_present_locked() is called, the
stw_framebuffer's mutex will already be locked.  Normally, the
stw_framebuffer_present_locked() function calls
stw_framebuffer_release() to unlock the mutex when it's done.  But if
for some reason the 'resource' pointer in
stw_st_framebuffer_present_locked() is null, we'd return without
unlocking the stw_framebuffer.  This fixes that to avoid potential
deadlocks.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2015-11-11 09:38:22 -07:00
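The bug class described here is an early return from a function that is entered with a mutex held and is expected to release it. A minimal sketch using POSIX threads is shown below. The struct, the function name and the return value are hypothetical stand-ins, not the st/wgl code.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a framebuffer object guarded by a mutex. */
struct fb_sketch {
   pthread_mutex_t mutex;
};

/* Called with fb->mutex already locked. Every return path must unlock it.
 * Returns true if something was presented. */
static bool
present_locked(struct fb_sketch *fb, const void *resource)
{
   if (resource == NULL) {
      /* Without this unlock the early return would leave the mutex held,
       * and the next locker could deadlock. */
      pthread_mutex_unlock(&fb->mutex);
      return false;
   }

   /* ... present the resource here ... */

   pthread_mutex_unlock(&fb->mutex);
   return true;
}
```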
Kenneth Graunke
e42a29531a i965: Print force_writemask_all in dump_instructions().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-11-11 08:35:15 -08:00
Kenneth Graunke
ecb5e0a986 i965: Combine BRW_NEW_*_BINDING_TABLE dirty bits.
A while back, we moved to directly emitting the Gen7+ state when
constructing the binding tables.  These flags are only used on
Gen4-6, which emit all the binding table pointers at once.

We gain nothing by having separate flags, so combine them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-11-11 08:33:58 -08:00
Kenneth Graunke
a2987ff57f i965: Map GL_PATCHES to 3DPRIM_PATCHLIST_n.
Inspired by a patch by Fabian Bieler.

Fabian defined a _3DPRIM_PATCHLIST_0 macro (which isn't actually a valid
topology type); I instead chose to make a macro that takes an argument.
He also took the number of patch vertices from _mesa_prim (which was set
to ctx->TessCtrlProgram.patch_vertices) - I chose to use it directly to
avoid the need for the VBO patch.

v2: Change macro to 0x20 + (n - 1) instead of 0x1F + n to better match
    the documentation (suggested by Ian).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-11-11 08:33:48 -08:00
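The v2 note says the macro computes 0x20 + (n - 1) rather than 0x1F + n. The two forms give the same value, as the sketch below shows. The macro names here are hypothetical, since the commit message only states the formula.

```c
/* Hypothetical names; only the formulas come from the commit message.
 * n is the number of vertices per patch, starting at 1. */
#define PATCHLIST_TOPOLOGY(n)     (0x20 + ((n) - 1))
#define PATCHLIST_TOPOLOGY_V1(n)  (0x1F + (n))

/* Wrapper so the mapping can also be called with a runtime value. */
static int
patchlist_topology(int patch_vertices)
{
   return PATCHLIST_TOPOLOGY(patch_vertices);
}
```

Written this way, a one-vertex patch maps to 0x20 and each extra vertex adds one, which makes the "first entry plus index" structure visible in the expression.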
Emil Velikov
cbb7d90e57 docs: add news item and link release notes for 11.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-11 11:18:32 +00:00
Emil Velikov
6435d8ac5a docs: add sha256 checksums for 11.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 66c949d0a1)
2015-11-11 11:16:43 +00:00
Emil Velikov
07948b03fb docs: add release notes for 11.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ee57c22141)
2015-11-11 11:16:42 +00:00
Jason Ekstrand
3a3d79b38e anv/gen7: Implement the VS state depth-stall workaround 2015-11-10 16:42:34 -08:00
Jason Ekstrand
750b8f9e98 anv/gen7: Properly handle a GS with zero invocations 2015-11-10 16:41:23 -08:00
Jason Ekstrand
9d18555c8d anv/gen7: Add push constant support 2015-11-10 15:14:11 -08:00
Glenn Kennard
3f45d29fe4 r600g: Pass conservative depth parameters to hw
Supported on R700 and up.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-11-11 09:06:25 +10:00
Dave Airlie
b3e793f2db Revert "r600g: Pass conservative depth parameters to hw"
This reverts commit a1fc78911e.

I pushed the wrong patch.
2015-11-11 09:05:50 +10:00
Jason Ekstrand
427978d933 anv/device: Use an actual int64_t in WaitForFences 2015-11-10 15:02:52 -08:00
Jason Ekstrand
d9079648d0 anv/meta: Create a sampler in meta_emit_blit 2015-11-10 14:43:18 -08:00
Glenn Kennard
c878d61124 r600g: Implement ARB_texture_view
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-11-11 08:36:08 +10:00
Glenn Kennard
a1fc78911e r600g: Pass conservative depth parameters to hw
Supported on R700 and up.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-11 08:32:35 +10:00
Eduardo Lima Mitev
de51676b41 i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const
When both fadd and fmul instructions have at least one operand that is a
constant and it is only used once, the total number of instructions can
be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd), because
the constants will be propagated as immediate operands of fmul and fadd.

This patch detects these situations and prevents fusing fmul+fadd into ffma.

Shader-db results on i965 Haswell:

total instructions in shared programs: 6235835 -> 6225895 (-0.16%)
instructions in affected programs:     1124094 -> 1114154 (-0.88%)
total loops in shared programs:        1979 -> 1979 (0.00%)
helped:                                7612
HURT:                                  843
GAINED:                                4
LOST:                                  0

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-10 21:13:35 +01:00
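The condition described in this commit can be sketched as a small predicate over the operands of the two instructions. The types and names below are hypothetical and far simpler than NIR's real data structures. The sketch only restates the rule from the message: skip fusion when both fmul and fadd have a constant operand that is used exactly once.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified operand description (not a NIR type). */
struct operand {
   bool is_const;       /* operand comes from a load_const */
   unsigned use_count;  /* number of instructions using that value */
};

/* True if any operand is a constant with exactly one use. */
static bool
has_single_use_const(const struct operand *srcs, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (srcs[i].is_const && srcs[i].use_count == 1)
         return true;
   }
   return false;
}

/* Fuse fmul+fadd into ffma unless both have a single-use constant, in
 * which case keeping them separate lets the constants become immediates. */
static bool
should_fuse_ffma(const struct operand fmul_srcs[2],
                 const struct operand fadd_srcs[2])
{
   return !(has_single_use_const(fmul_srcs, 2) &&
            has_single_use_const(fadd_srcs, 2));
}
```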
Eduardo Lima Mitev
fb3b5669ce util: Add list_is_singular() helper function
Returns whether the list has exactly one element.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-11-10 21:13:35 +01:00
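For a circular doubly linked list with a sentinel head, "exactly one element" means the head's next node is not the head itself, and the node after that is the head again. The sketch below is a self-contained illustration under that assumption. It is not a copy of the helper in Mesa's util code, and the supporting functions are written only for this example.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list with a sentinel head. */
struct list_head {
   struct list_head *prev;
   struct list_head *next;
};

/* An empty list is a head that points at itself. */
static void
list_inithead(struct list_head *head)
{
   head->prev = head;
   head->next = head;
}

/* Insert item right after the head. */
static void
list_add(struct list_head *item, struct list_head *head)
{
   item->prev = head;
   item->next = head->next;
   head->next->prev = item;
   head->next = item;
}

/* True when the list holds exactly one element. */
static bool
list_is_singular(const struct list_head *head)
{
   return head->next != head && head->next->next == head;
}
```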
Eduardo Lima Mitev
94ff35204d nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver
Because the next patch will add an optimization that is specific to i965,
we want to move this lowering pass to that driver altogether.

This is safe because i965 is the only consumer.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-10 21:13:35 +01:00
Kristian Høgsberg Kristensen
96b22fb080 glsl: Use array deref for access to vector components
We've assumed that we could lower per-component vector access from

  vec[i] = scalar

to

  vec = ir_triop_vector_insert(vec, scalar, i)

but with SSBOs (and compute shader SLM and tessellation outputs) this is
no longer valid. If a vector is "externally visible", multiple threads
can write independent components simultaneously. With lowering to
ir_triop_vector_insert, each thread reads the entire vector, changes one
component, then writes out the entire vector. This is racy.

Instead of generating an ir_binop_vector_extract when we see v[i], we
generate ir_dereference_array. We then add a separate lowering pass that
lowers the ir_dereference_array to ir_binop_vector_extract for rvalues and
to ir_triop_vector_insert for lvalues.

The resulting IR is the same as before, but we now have a window between
ast->ir conversion and the lowering pass where v[i] appears in the IR as
an array deref. This lets us run lowering passes that lower the vector
access to I/O (eg for SSBO load/store) before we lower the per-component
access to full vector writes.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2015-11-10 12:02:46 -08:00
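The race described in this commit is a lost update. The sketch below replays one bad interleaving sequentially so the outcome is deterministic: two "threads" each read the whole vector, change a different component, and write the whole vector back. The types and function names are hypothetical and exist only to illustrate the point. No real threads or GLSL IR are involved.

```c
/* Hypothetical stand-in for an externally visible vec4. */
struct vec4 {
   float c[4];
};

/* Whole-vector lowering: take a copy, change one component, return all
 * four components for writing back. */
static struct vec4
vector_insert(struct vec4 v, float scalar, int i)
{
   v.c[i] = scalar;
   return v;
}

/* Replays an interleaving where both threads read before either writes. */
static struct vec4
whole_vector_writes(struct vec4 shared)
{
   struct vec4 t0 = shared;               /* thread 0 reads */
   struct vec4 t1 = shared;               /* thread 1 reads */
   shared = vector_insert(t0, 1.0f, 0);   /* thread 0 writes all components */
   shared = vector_insert(t1, 2.0f, 1);   /* thread 1 overwrites c[0] */
   return shared;
}

/* Per-component stores touch only the component each thread owns. */
static struct vec4
per_component_writes(struct vec4 shared)
{
   shared.c[0] = 1.0f;                    /* thread 0 */
   shared.c[1] = 2.0f;                    /* thread 1 */
   return shared;
}
```

In the whole-vector version the write to component 0 is lost, because thread 1 writes back the stale value it read earlier. The per-component version keeps both writes regardless of ordering.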
Kristian Høgsberg Kristensen
60dd5287ff glsl: Lower UBO and SSBO access in glsl linker
All GLSL IR consumers run this lowering pass so we can move it to the
linker. This moves the pass up quite a bit, but that's the point: it
needs to run before we throw away information about per-component vector
access.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2015-11-10 12:02:46 -08:00
Kristian Høgsberg Kristensen
f0e95c2500 glsl: Drop exec_list argument to lower_ubo_reference
We always pass in shader->ir and we already pass in the shader, so just
drop the exec_list. Most passes either take just an exec_list or a
shader, so this seems more consistent.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2015-11-10 12:02:46 -08:00
Jason Ekstrand
b461744c52 anv/gen7: Properly handle VS with VertexID but no vertices 2015-11-10 11:31:31 -08:00
Jason Ekstrand
aafc87402d anv/device: Work around the i915 kernel driver timeout bug
There is a bug in some versions of the i915 kernel driver where it will
return immediately if the timeout is negative (it's supposed to wait
indefinitely).  We've worked around this in mesa for a few months but never
implemented the work-around in the Vulkan driver.

I rediscovered this bug while working on Ivy Bridge because the
drive in my Ivy Bridge currently has Fedora 21 installed, which has one of
the offending kernels.
2015-11-10 11:24:11 -08:00
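The commit message does not spell out how the work-around is implemented. One plausible shape, assumed here purely for illustration, is to never hand the kernel a negative timeout and to substitute the largest positive value instead. The function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: an assumption about the work-around, not the
 * driver's actual code. A negative timeout means "wait indefinitely",
 * so replace it with the largest value an affected kernel handles. */
static int64_t
sanitize_wait_timeout(int64_t timeout_ns)
{
   return timeout_ns < 0 ? INT64_MAX : timeout_ns;
}
```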
Connor Abbott
213f86416f nir/glsl: switch to using the builder
v2: use nir_builder_cf_insert (Ken)

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:56:43 -05:00
Connor Abbott
fbbfb7c025 nir/glsl: make emit() take nir_ssa_def * sources
Again, this matches what the builder will have to do.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:56:35 -05:00
Connor Abbott
a60e990dd2 nir/glsl: convert nir_visitor::result to a nir_ssa_def *
Its only user now returns a nir_ssa_def *, and we'll need this since the
builder returns a nir_ssa_def *.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:55:54 -05:00
Connor Abbott
30fe8eaa8e nir/glsl: make evaluate_rvalue() return a nir_ssa_def *
A long time ago, before NIR was even merged to master, glsl_to_nir used
registers and these sources were actually register sources. But nowadays
everything in glsl_to_nir is an SSA value, so stop pretending that by
evaluating an rvalue we can get an arbitrary nir_src. Most importantly,
we need this since the builder takes nir_ssa_def * sources directly.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:55:14 -05:00
Jose Fonseca
6f42162329 st/mesa: Destroy buffer object's mutex.
Ideally we should have a _mesa_cleanup_buffer_object function in
src/mesa/bufferobj.c so that the destruction logic resides in a single
place.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-11-10 11:04:28 +00:00
Kenneth Graunke
db54673b54 nir: Store PatchInputsRead and PatchOutputsWritten in nir_shader_info.
These tessellation shader related fields need plumbing through NIR.

v2: Use uint32_t instead of uint64_t to match the source type of
    GLbitfield (caught by Iago Toral).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-11-10 01:03:43 -08:00
Eric Anholt
437d7b6119 vc4: Avoid loading undefined (newly-allocated) FBO contents.
Since X has undefined contents in new pixmaps, it will allocate new
textures for an FBO and draw to them without an explicit clear.  For
VC4, it's much faster to emit a clear than the load of the actual
undefined memory contents, so just do that instead.
2015-11-09 19:17:36 -08:00
Eric Anholt
5980389bbf vc4: Return NULL when we can't make our shadow for a sampler view.
I'm not sure that what the caller does is appropriate (just have a NULL
sampler at this slot), but it fixes the immediate crash.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-11-09 19:17:36 -08:00
Eric Anholt
eb8fb0064d vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails.
I was afraid our callers weren't prepared for this, but it looks like
at least for resource creation, mesa/st throws an error appropriately.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-11-09 19:17:36 -08:00
Eric Anholt
84608e07e7 vc4: Add CL dumping for GL_ARRAY_PRIMITIVE. 2015-11-09 19:17:36 -08:00
Eric Anholt
855a3ca598 vc4: Fix a compiler warning. 2015-11-09 19:17:36 -08:00
Jordan Justen
fb3da129d1 glsl: Use shared storage variable type for shared variables
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2015-11-09 17:21:24 -08:00
Jordan Justen
32746fc9b4 glsl: Add shared variable type
Shared variables are stored in a common pool accessible by all threads
in a compute shader local work group.

These variables are similar to OpenCL's local/__local variables.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2015-11-09 17:21:24 -08:00
Jordan Justen
c0ac4740a7 glsl: Add space to shader_storage in print_visitor
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2015-11-09 17:21:17 -08:00
Jordan Justen
007d96730e glsl: Align comments on variables types
v2:
 * Split from patch to add ir_var_shader_shared (tarceri)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2015-11-09 17:21:17 -08:00
Jordan Justen
8b28b35531 glsl: Parse shared keyword for compute shader variables
v2:
 * Move shared parsing under storage qualifiers (tarceri)
 * Fail to compile if shared is used in non-compute shader (tarceri)
 * Use separate shared_storage bit for shared variables (tarceri)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2015-11-09 17:21:12 -08:00
Timothy Arceri
a4a46fe3fa glsl: simplify interface block stream qualifier validation
Qualifiers on member variables are redundant; all we need to do
is check if they match the stream associated with the block and
throw an error if they do not.

Reviewed-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-10 12:02:30 +11:00
Jason Ekstrand
06f466a770 anv/nir: Fix codegen in lower_push_constants 2015-11-09 16:29:05 -08:00
Jason Ekstrand
abede04314 anv/gen7: Fix the length of 3DSTATE_SF 2015-11-09 16:04:07 -08:00
Jason Ekstrand
e8c2a52a70 anv/gen7: Properly handle missing color-blend state 2015-11-09 16:04:06 -08:00