Harvested GPUs have some of their render backends disabled, so
in order to prevent the hardware from trying to render things
with these disabled backends we need to correctly program
the PA_SC_RASTER_CONFIG register.
v2:
- Write RASTER_CONFIG for all SEs.
v3:
- Set GRBM_GFX_INDEX.INSTANCE_BROADCAST_WRITES bit.
- Set GRBM_GFX_INFEX.SH_BROADCAST_WRITES bit when done setting
PA_SC_RASTER_CONFIG.
- Get num_se and num_sh_per_se from kernel.
v4:
- Get correct value for num_se
- Remove loop for setting PA_SC_RASTER_CONFIG
- Only compute raster config when a backend has been disabled.
v5: Michel Dänzer
- Fix computation for chips with multiple SEs
https://bugs.freedesktop.org/show_bug.cgi?id=60879
CC: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 67dcbcd92c)
The GS has an interesting use for mul. Because the GS can emit multiple
vertices per input vertex, and it also has a unique count at the top of the URB
payload, the GS unit needs to be able to dynamically specify URB write offsets
(relative to the global offset). The documentation in the function has a very
good explanation from Paul on the mechanics.
This fixes around 2000 piglit tests on BSW.
v2:
Reworded commit message (Ben) no mention of CHV (Matt)
Change SHRT_MAX to USHRT_MAX (Ken, and Matt)
Update comment in code to reflect the use of UW (Ben)
Add Gen7+ assertion for the relevant GS code, since it won't work on Gen6- (Ken)
Drop the bogus hunk in emit_control_data_bits() (Ken)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes)
Cc: "10.4 10.3 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit f13870db09)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit c3bed13604)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit fb434e675f)
The index_bias (aka base_vertex) applies to the downstream draw just as
much, since the actual index values are never modified.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 1dfa039168)
height=0 is legal for 1D array textures (as depth=0 is legal for
2D arrays). Fixes new piglit ext_texture_array-errors test.
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit 4e6244e80f)
We need parenthesis around the expression which computes the number of
blocks per row.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 991d5cf8ce)
Looks like none of the mad variants do u16 * u16 + u32, so just add in
the extra value "by hand".
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit de83ef677f)
This fixes arb_color_buffer_float-render GL_RGBA16F.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 3de9fa8ff4)
madsh.m16 can't handle a const in src1, make sure to unconst it
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 37fe347542)
This was just returning the same value as GL_CURRENT_MATRIX_ARB.
Spotted while investigating something else in apitrace.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 2b4fe85f0e)
When converting a uniform array reference to a pull constant load, the
`reladdr` expression itself may have its own `reladdr`, arbitrarily
deeply. This arises from expressions like:
a[b[x]] where a, b are uniform arrays (or lowered const arrays),
and x is not a constant.
Just iterate the lowering to pull constants until we stop seeing these
nested. For most shaders, there will be only one pass through this loop.
Fixes the piglit test:
tests/spec/glsl-1.20/linker/double-indirect-1.shader_test
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit adefccd12a)
Conflicts:
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
We need 2.4.57 for fd_bo_dmabuf() / fd_bo_from_dmabuf().
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 40aabc0e80)
Emil Velikov: Required by commit 2a90f0fb.
Nominated-by: Rob Clark <robclark@freedesktop.org>
res->bind is not an indicator of how the resource is currently bound.
buffers can be rebound across different binding points without changing
underlying storage.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fecae4625c)
The number of vertex buffers has nothing to do with the number of bound
constbufs.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e80a0a7d9a)
Fixes regression of WebGL Conformance test texture-size-limit [1] on
Ivybridge Mobile GT2 0x0166 with Google Chrome R38.
Regression introduced by
commit 6c04423153
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Sun Feb 2 02:58:42 2014 -0800
i965: Bump GL_MAX_CUBE_MAP_TEXTURE_SIZE to 8192.
The test regressed because the pointer offset arithmetic in
intel_miptree_map_gtt() overflows for large textures. The pointer
arithmetic is not 64-bit safe.
[1] 52f0dc240f/sdk/tests/conformance/textures/texture-size-limit.html
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=78770
Fixes: Intel CHRMOS-1377
Reported-by: Lu Hua <huax.lu@intel.com>
Reviewed-by: Ian Romanic <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit b69c7c5dac)
For 1D and 2D arrays we don't want the other coordinates being
offset and affecting where we sample. I wrote this patch 6 months
ago but lost it.
Fixes:
./bin/tex-miplevel-selection textureLodOffset 1DArray
./bin/tex-miplevel-selection textureLodOffset 2DArray
./bin/tex-miplevel-selection textureOffset 1DArray
./bin/tex-miplevel-selection textureOffset 1DArrayShadow
./bin/tex-miplevel-selection textureOffset 2DArray
./bin/tex-miplevel-selection textureOffset(bias) 1DArray
./bin/tex-miplevel-selection textureOffset(bias) 2DArray
v2: rewrite to handle more cases and be consistent with code
above.
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1830138cc0)
Otherwise we seem to lose the split_gs_inputs and try and
pull from an uninitialised register.
fixes 9 texelFetch geom shader tests.
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d4c342f67e)
Not all drivers can set gl_Layer from VS. Add a fallback that passes the
instance id from VS to GS, and then uses the GS to set the layer.
Tested by adding
quad_buffers |= clear_buffers;
clear_buffers = 0;
to the st_Clear logic, and forcing set_vertex_shader_layered in all
cases. No piglit regressions (on piglits with 'clear' in the name).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 68db29c434)
Some of the geom shader tests produce an empty vertex shader,
on cayman we'd crash in the finaliser because last_cf was NULL.
cayman doesn't need the NOP workaround, so if the code arrives
here with no last_cf, just emit an END.
fixes crashes in a bunch of piglit geom shader tests.
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4e520101e6)
It appears on cayman the TG4 outputs were reordered.
This fixes a lot of piglit tests.
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 27e1e0e710)
The sampler_array_size field was added by "mesa/st: add support for
dynamic sampler offsets". But the field wasn't getting copied in
the get_pixel_transfer_visitor() or get_bitmap_visitor() functions.
The count_resources() function then didn't properly compute the
glsl_to_tgsi_visitor::samplers_used bitmask. Then, we didn't declare
all the sampler registers in st_translate_program(). Finally, we
asserted when we tried to emit a tgsi ureg src register with File =
TGSI_FILE_UNDEFINED.
Add the missing assignments and some new assertions to catch the
invalid register sooner.
Cc: "10.3, 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 11abd7b2bc)
Using the asynchronous DMA engine for multi-dimensional operations seems
to cause random GPU lockups for various people. While the root cause for
this might need to be fixed in the kernel, let's disable it for now.
Before re-enabling this, please make sure you can hit all newly enabled
paths in your testing, preferably with both piglit and real world apps,
and get in touch with people on the bug reports below for stability
testing.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85647
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83500
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
(cherry picked from commit ae4536b4f7)
A nomination unadorned with a specific version is now interpreted as
being aimed at the 10.4 branch, which was recently opened.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
So when checking/building sse code we have three possibilities:
1 Old compiler, throws an error when using -msse*
2 New compiler, user disables sse* (-mno-sse*)
3 New compiler, user doesn't disable sse
The original code, added code for #1 but not #2. Later on we patched
around the lack of handling #2 by wrapping the code in __SSE4_1__.
Yet it lead to a missing/undefined symbol in case of #1 or #2, which
might cause an issue for #2 when using the i965 driver.
A bit later we "fixed" the undefined symbol by using #1, rather than
updating it to handle #2. With this commit we set things straight :)
To top it all up, conventions state that in case of conflicting
(-enable-foo -disable-foo) options, the latter one takes precedence.
Thus we need to make sure to prepend -msse4.1 to CFLAGS in our test.
v2: Clean the #includes. Suggested by Ilia, Matt & Siavash.
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Tested-by: David Heidelberg <david@ixit.cz>
Tested-by: Siavash Eliasi <siavashserver@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1a6ae84041)
See 546d6c8d for the corresponding fix in freedreno.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2b6e703863)
On Windows, DllMain calls and thread creation/destruction are
serialized, so when llvmpipe is destroyed from DllMain waiting for the
rasterizer threads to finish will deadlock.
So, instead of waiting for rasterizer threads to have finished, simply wait for the
rasterizer threads to notify they are just about to finish.
Verified with this very simple program:
#include <windows.h>
int main() {
HMODULE hModule = LoadLibraryA("opengl32.dll");
FreeLibrary(hModule);
}
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=76252
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 706ad3b649)
Squashed together with:
llvmpipe: Call pipe_thread_wait() on Linux.
To address http://lists.freedesktop.org/archives/mesa-dev/2014-November/070569.html
In short, revert 706ad3b649 for non-Windows
OSes.
(cherry picked from commit d5b1731178)
MSVC replaces the "F" in "255.0F" with the macro argument which leads
to an error. s/F/FLT/ to avoid that.
It turns out we weren't using this macro at all on MSVC until the
recent "mesa: Drop USE_IEEE define." change.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 9608193cbc)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85918
Nominated-by: Roland Scheidegger <sroland@vmware.com>
This reverts commit 20836c8185.
255 is a huge number. If you have a loop with 255 iterations, unrolling it
will exceed the SM3 instruction limit. Let's use the default again.
The comment about a SM3 limit doesn't make sense. For SM3, we generally
want 32 (default) or a lower number due to the SM3 instruction limit, which
is 512 instructions. For SM4, we can try higher numbers if needed, but
some shaders can end up being pretty huge and shader compilation can take
more time.
This fixes a shader compile failure on R500/SM3. Reported on IRC.
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6fcb5520b7)
Avoids a crash in case of negative array index is used in a
shader program.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit 7a652c41b4)
Conflicts:
src/glsl/ast_array_index.cpp
Remap table for uniforms may contain empty entries when using explicit
uniform locations. If no active/inactive variable exists with given
location, remap table contains NULL.
v2: move remap table bounds check before existence check (Ian Romanick)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Erik Faye-Lund <kusmabite@gmail.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83574
(cherry picked from commit 9bd139e451)