In i965 Gen7, Mesa has for a long time used the "depth coordinate
offset X/Y" settings (in 3DSTATE_DEPTH_BUFFER) to cause the GPU to
render to miplevels other than 0. Unfortunately, this doesn't work,
because these offsets must be aligned to multiples of 8, and miplevels
in the depth buffer are only guaranteed to be aligned to multiples of
4. When the offsets aren't aligned to a multiple of 8, the GPU
sometimes hangs.
As a temporary measure, to avoid GPU hangs, this patch smashes the 3
LSB's of "depth coordinate offset X/Y" to 0. This results in
incorrect rendering to mipmapped depth textures, but that seems like a
reasonable stopgap while we figure out a better solution.
Avoids GPU hangs in piglit test "depthstencil-render-miplevels" at
texture sizes that are not powers of 2.
Reviewed-by: Chad Verace <chad.versace@linux.intel.com>
Cherry-picked from 714b4f6184
Conflicts:
src/mesa/drivers/dri/i965/gen7_misc_state.c
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50271
In i965 Gen6, Mesa has for a long time used the "depth coordinate
offset X/Y" settings (in 3DSTATE_DEPTH_BUFFER) to cause the GPU to
render to miplevels other than 0. Unfortunately, this doesn't work,
because these offsets must be aligned to multiples of 8, and miplevels
in the depth buffer are only guaranteed to be aligned to multiples of
4. When the offsets aren't aligned to a multiple of 8, the GPU
sometimes hangs.
As a temporary measure, to avoid GPU hangs, this patch smashes the 3
LSB's of "depth coordinate offset X/Y" to 0. This results in
incorrect rendering to mipmapped depth textures, but that seems like a
reasonable stopgap while we figure out a better solution.
(Note that we have only ever observed this GPU hang on Gen6 when HiZ
is enabled, so another possible stopgap would be to disable HiZ).
Avoids GPU hangs in piglit test "depthstencil-render-miplevels" at
texture sizes that are not powers of 2.
Reviewed-by: Chad Verace <chad.versace@linux.intel.com>
Cherry-picked from a683012a80
Conflicts:
src/mesa/drivers/dri/i965/brw_misc_state.c
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50271
If no format is matched in the loop the value of xconf was undefined.
NOTE: This is a candidate for the 8.0 branch.
(cherry picked from commit fe2a7b7e7f)
Since live intervals are based on ip, removing an instruction trashes
the intervals unless we were to go do some surgery. These happen to
usually remove a use of a grf, so it's time to recalculate, anyway.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 8.0 release branch.
(cherry picked from commit 2343fe9a5d)
The i965 back-end needs to compile dFdy() differently for FBOs and
window system framebuffers, because Y coordinates are flipped between
the two (see commit 82d2596: i965: Compute dFdy() correctly for FBOs).
This patch avoids unnecessarily recompiling shaders that don't use
dFdy(), by only setting render_to_fbo in the wm program key if the
shader actually uses dFdy().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit d08fdacd58)
Conflicts:
src/mesa/drivers/dri/i965/brw_wm.c
The i965 back-end needs to compile dFdy() differently for FBOs and
window system framebuffers, because Y coordinates are flipped between
the two (see commit 82d2596: i965: Compute dFdy() correctly for FBOs).
This boolean will allow it to avoid unnecessarily recompiling shaders
that don't use dFdy().
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 5e310e9f83)
On i965, dFdx() and dFdy() are computed by taking advantage of the
fact that each consecutive set of 4 pixels dispatched to the fragment
shader always constitutes a contiguous 2x2 block of pixels in a fixed
arrangement known as a "sub-span". So we calculate dFdx() by taking
the difference between the values computed for the left and right
halves of the sub-span, and we calculate dFdy() by taking the
difference between the values computed for the top and bottom halves
of the sub-span.
However, there's a subtlety when FBOs are in use: since FBOs use a
coordinate system where the origin is at the upper left, and window
system framebuffers use a coordinate system where the origin is at the
lower left, the computation of dFdy() needs to be negated for FBOs.
This patch modifies the fragment shader back-ends to negate the value
of dFdy() when an FBO is in use. It also modifies the code that
populates the program key (brw_wm_populate_key() and
brw_fs_precompile()) so that they always record in the program key
whether we are rendering to an FBO or to a window system framebuffer;
this ensures that the fragment shader will get recompiled when
switching between FBO and non-FBO use.
This will result in unnecessary recompiles of fragment shaders that
don't use dFdy(). To fix that, we will need to adapt the GLSL and
NV_fragment_program front-ends to record whether or not a given shader
uses dFdy(). I plan to implement this in a future patch series; I've
left FIXME comments in the code as a reminder.
Fixes Piglit test "fbo-deriv".
NOTE: This is a candidate for stable release branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 82d25963a8)
This fixes piglit/getteximage-formats on r600g.
NOTE: This is a candidate for stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 5e7e7d96b3)
Without passing the -ldflags parameter before $(LDFLAGS) in some cases
flags will be passed to MKLIB which it does not understand.
This might be -m64, -m32 or similar.
NOTE: This is a candidate for the 8.0 branch.
Signed-off-by: Thomas Gstädtner <thomas@gstaedtner.net>
Signed-off-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 93594f38be)
Fixes uninitialized member defects reported by Coverity.
NOTE: This is a candidate for the 8.0 branch.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 70d038e46e)
This should help to prevent gpu lockups.
See https://bugs.freedesktop.org/show_bug.cgi?id=48472
NOTE: This is a candidate for the stable branches.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 783e4da72a)
Add the maximum base vertex offset to max_index for computing the
buffer size. Fixes a failed assertion in the u_upload_mgr.c code with
the VMware svga driver.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=48141
v2: incorporate Marek's suggestions.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit 299c9052e8)
Function calls may have side effects that alter variables used inside
the loop. In the fragment shader, they may even terminate the shader.
This means our analysis about loop-constant or induction variables may
be completely wrong.
In general it's impossible to determine whether they actually do or not
(due to the halting problem), so we'd need to perform conservative
static analysis. For now, it's not worth the complexity: most functions
will be inlined, at which point we can unroll them successfully.
Fixes Piglit tests:
- shaders/glsl-fs-unroll-out-param
- shaders/glsl-fs-unroll-side-effect
NOTE: This is a candidate for release branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0405bd08ca)
The function that counts the number of TGSI immediates also needs to
emit the immediates. This fixes assorted failures when using polygon
stipple with fragment shaders that have their own immediates.
NOTE: This is a candidate for the 8.0 branch.
(cherry picked from commit 7f16246ace)
The image height or depth is the array_size for array textures.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=47742
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit 318669f196)
If we failed to allocate a memory resource for the texture we'd crash
when we tried to map it. Now we propogate the NULL back up to the
texstore code and generate GL_OUT_OF_MEMORY.
Fixes a crash with the upcoming piglit max-texture-size test.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 03f8a97d71)
Similar to the previous commit. Also fix incorrect setting of the
sampler view's state after it's created. We need to specify the
first/last_level fields in the template instead.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 0315cb9f8f)
The st_renderbuffer_alloc_storage() function is used to allocate both
window-system buffers and user-created renderbuffers. The later kind
are never directly displayed so don't set PIPE_BIND_DISPLAY_TARGET for
those surfaces.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 5a70e12fc0)
Should avoid dangling pointer derreference with
glean --run results --overwrite --quick --tests texSwizzle
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 07635a4799)
This function releases the buffer that contains user-space vertex data.
The buffer_offset field points into that buffer. So reset the
buffer_offset to zero when we release the buffer so that subsequent
draws don't inadvertantly get a bad offset.
Fixes error messages / failed assertions (in the draw module's bounds/size
checking code) when running piglit's polygon-mode test.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 04341e51ce)
To fix failed assertions when calling glCopyBufferSubData().
svga_texture() asserts that the resource is a texture. Simply move the
calls to svga_texture() after the code that handles non-texture copies
so that we don't call it with non-texture resources.
Fixes glean bufferObject failure.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 7f2e12812a)
Two assignments to num_immediates were missing in
get_pixel_transfer_visitor() and get_bitmap_visitor().
The uninitialized value led to valgrind errors and crashes in some
cases.
Added new assertions to catch future problems in this area. Also
changed num_immediates to unsigned to avoid signed/unsigned
comparison warnings.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit fdae0eaf22)
If we don't find an exact PIPE_FORMAT_x for a GL_(COMPRESSED)_RED/RG format,
try uncompressed formats. We were already doing this for the RGB(A) formats.
Fixes piglit arb_texture_compression-internal-format-query test.
NOTE: This is a candidate for the stable branches.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 75f37ddba7)
Otherwise the fence will never arrive.
Also check for a NULL i915->batch.
NOTE: This is a candidate for the 8.0 branch.
(cherry picked from commit 32b07bb149)
The legal range for the device is apparently [-16.0, +15.0].
Limiting the range to [-15, +15] fixes piglit's lodbias test.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit a9eda41539)
The interaction between the mipmap lod min/max limits and the texture
base/max level limits is kind of tricky. Changing the base level
didn't work as expected before.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit fd890873b2)
This makes lod clamping more consistent with other drivers.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 5abcd198b8)
In single precision, 1.5707963 becomes 1.5707962513 which is too
small. However, 1.5707964 becomes 1.5707963705 which is just right.
The value 1.5707964 is already used in asin.ir.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
(cherry picked from commit 4bfdc83135)
Conflicts:
src/glsl/builtins/ir/acos.ir
This is an array of uniforms, not a single one.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
NOTE: This is a candidate for the 8.0 branch.
(cherry picked from commit e2e9b4b10f)
When we have multiple shared contexts, and one of them is
long-running, this will lead to never freeing those resources
since they are shared. Instead, free them right away on context
destruction since we know the other context isn't using them.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
NOTE: This is a candidate for the 8.0 branch.
(cherry picked from commit 53feb8ecdc)
Works around crashes when X connections break.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
NOTE: This is a candidate for the 8.0 branch.
(cherry picked from commit 0256edd709)
While ~loop_state() is already freeing the loop_variable_state objects
via ralloc_free(this->mem_ctx), the ~loop_variable_state() destructor
was never getting called, so the hash table inside loop_variable_state
was never getting destroyed.
Fixes a memory leak in any shader with loops.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 3603fdcebf)
OpenGL allows you to declare user-defined fragment shader outputs with
less than four components:
out ivec2 color;
This makes sense if you're rendering to an RG format render target.
Previously, we assumed that all color outputs had four components (like
the built-in gl_FragColor/gl_FragData variables). This caused us to
call emit_color_write for invalid indices, incrementing the output
virtual GRF's reg_offset beyond the size of the register.
This caused cascading failures: split_virtual_grfs would allocate new
size-1 registers based on the virtual GRF size, but then proceed to
rewrite the out-of-bounds accesses assuming that it had allocated enough
new (contiguously numbered) registers. This resulted in instructions
that accessed size-1 GRFs which register numbers beyond
virtual_grf_next (i.e. registers that were never allocated).
Finally, this manifested as live variable analysis and instruction
scheduling accessing their temporary array with an out of bounds index
(as they're all sized based on virtual_grf_next), and the program would
segfault.
It looks like the hardware's Render Target Write message requires you to
send four components, even for RT formats such as RG or RGB. This patch
continues to use all four MRFs, but doesn't bother to fill any data for
the last few, which should be unused.
+2 oglconforms.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 2f18698220)
Conflicts:
src/mesa/drivers/dri/i965/brw_fs.h
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Commit 4650aea7a5 fixed texelFetchOffset()
on Ivybridge, but didn't update the Ironlake/Sandybridge code.
+18 piglits on Sandybridge.
NOTE: This and 4650aea7a5 are both candidates for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit cb18472eca)
It appears that when using 'ld' with the offset bits, address bounds
checking happens before the offset is applied, so parts of the drawing
in piglit texelFetchOffset() with a negative texcoord go black.
(cherry picked from commit 4650aea7a5)
Commit f41ecade7b fixed texelFetchOffset()
on Ivybridge, but didn't update the Ironlake/Sandybridge code.
+15 piglits on Sandybridge.
NOTE: This and f41ecade7b are both candidates for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 217b62bf00)
This isn't saved/restored by _mesa_meta_begin, so we need to do it
manually (like we do for the read/draw framebuffers). Additionally,
we neglected to re-bind before the glRenderbufferStorage call.
+13 oglconforms.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 7fde071f04)
DeleteBuffer needs to unbind from these binding points as well, based on
the same rationale as the previous patch.
+51 oglconforms (together with the last patch).
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 3edd2ba22b)