See 546d6c8d for the corresponding fix in freedreno.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2b6e703863)
On Windows, DllMain calls and thread creation/destruction are
serialized, so when llvmpipe is destroyed from DllMain, waiting for the
rasterizer threads to finish will deadlock.
So, instead of waiting for the rasterizer threads to finish, simply wait
for them to signal that they are just about to finish.
Verified with this very simple program:
   #include <windows.h>
   int main() {
      HMODULE hModule = LoadLibraryA("opengl32.dll");
      FreeLibrary(hModule);
   }
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=76252
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 706ad3b649)
Squashed together with:
llvmpipe: Call pipe_thread_wait() on Linux.
To address http://lists.freedesktop.org/archives/mesa-dev/2014-November/070569.html
In short, revert 706ad3b649 for non-Windows OSes.
(cherry picked from commit d5b1731178)
MSVC replaces the "F" in "255.0F" with the macro argument, which leads
to an error. Rename the macro parameter (s/F/FLT/) to avoid that.
It turns out we weren't using this macro at all on MSVC until the
recent "mesa: Drop USE_IEEE define." change.
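A minimal sketch of the rename follows. The macro name and body are a
hypothetical stand-in, not the actual Mesa macro. The point is only that
a parameter called FLT cannot be confused with the "F" suffix of the
float literals in the body:

```c
/* Hypothetical stand-in for the affected macro; the name and body are
 * illustrative only.  The parameter is called FLT rather than F so that
 * it cannot collide with the "F" suffix of the literals below. */
#define SCALE_TO_UBYTE(FLT) ((unsigned char)((FLT) * 255.0F + 0.5F))
```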
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 9608193cbc)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85918
Nominated-by: Roland Scheidegger <sroland@vmware.com>
This reverts commit 20836c8185.
255 is a huge number. If you have a loop with 255 iterations, unrolling it
will exceed the SM3 instruction limit. Let's use the default again.
The comment about a SM3 limit doesn't make sense. For SM3, we generally
want 32 (default) or a lower number due to the SM3 instruction limit, which
is 512 instructions. For SM4, we can try higher numbers if needed, but
some shaders can end up being pretty huge and shader compilation can take
more time.
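As a rough back-of-the-envelope check of the numbers above, assuming a
deliberately crude cost model (unrolled size = body size times iteration
count, no loop overhead), even a 3-instruction body unrolled 255 times
no longer fits in 512 instructions, while 32 iterations do. The helper
name is hypothetical:

```c
#include <stdbool.h>

/* SM3 instruction limit quoted in the text above. */
#define SM3_MAX_INSTRUCTIONS 512u

/* Hypothetical helper with a crude cost model: the unrolled loop costs
 * body_instructions * iterations and nothing else. */
bool
unrolled_loop_fits(unsigned body_instructions, unsigned iterations,
                   unsigned max_instructions)
{
   return body_instructions * iterations <= max_instructions;
}
```

For example, unrolled_loop_fits(3, 255, SM3_MAX_INSTRUCTIONS) is false
(765 instructions), whereas unrolled_loop_fits(3, 32,
SM3_MAX_INSTRUCTIONS) is true (96 instructions).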
This fixes a shader compile failure on R500/SM3. Reported on IRC.
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 6fcb5520b7)
Avoids a crash when a negative array index is used in a
shader program.
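The kind of check involved can be sketched as below. This is an
illustration in C with a hypothetical helper name, not the code from
ast_array_index.cpp. The sign is tested before the index is compared
against the unsigned array size, so a negative value is rejected rather
than wrapping around to a huge unsigned number:

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if a constant index may be used to
 * address an array of 'size' elements.  The sign is tested first, so a
 * negative index is rejected before it is converted to unsigned. */
bool
array_index_is_valid(int idx, unsigned size)
{
   return idx >= 0 && (unsigned)idx < size;
}
```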
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
(cherry picked from commit 7a652c41b4)
Conflicts:
src/glsl/ast_array_index.cpp
The remap table for uniforms may contain empty entries when explicit
uniform locations are used. If no active or inactive variable exists at
a given location, the remap table contains NULL there.
v2: move remap table bounds check before existence check (Ian Romanick)
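The ordering from v2 can be sketched as follows. The struct and function
names are simplified, hypothetical stand-ins rather than the real Mesa
types. The location is validated against the table size first, and only
then is the (possibly NULL) entry read:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real data structures. */
struct uniform_storage {
   int id;
};

struct program {
   unsigned num_remap_entries;
   struct uniform_storage **remap_table; /* entries may be NULL */
};

/* Returns the uniform at 'location', or NULL if the location is out of
 * range or no uniform exists there. */
struct uniform_storage *
lookup_uniform(const struct program *prog, int location)
{
   /* Bounds check first: only then is it safe to index the table. */
   if (location < 0 || (unsigned)location >= prog->num_remap_entries)
      return NULL;

   /* Existence check second: an empty slot is simply NULL, which the
    * caller must treat as "no uniform at this location". */
   return prog->remap_table[location];
}
```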
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Erik Faye-Lund <kusmabite@gmail.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83574
(cherry picked from commit 9bd139e451)
Fix the slot count used by vector types and add 1 slot to be used by
image and sampler types.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82921
(cherry picked from commit 1cb81d3a9b)
We don't have a scissor enable bit in hw, so when a raster state change
results in the scissor enable bit changing, we also need to mark the
scissor state as dirty.
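A minimal sketch of that idea, with hypothetical names (the flag values,
structs and bind function below are not the freedreno ones): when a new
rasterizer state is bound, compare the old and new scissor enable and
flag the scissor state as dirty if they differ.

```c
#include <stdbool.h>

/* Hypothetical dirty bits. */
enum {
   DIRTY_RASTERIZER = 1 << 0,
   DIRTY_SCISSOR    = 1 << 1,
};

struct rasterizer_state {
   bool scissor; /* scissor enable, part of the rasterizer state */
};

struct context {
   const struct rasterizer_state *rasterizer;
   unsigned dirty;
};

void
bind_rasterizer_state(struct context *ctx,
                      const struct rasterizer_state *rast)
{
   bool old_scissor = ctx->rasterizer && ctx->rasterizer->scissor;
   bool new_scissor = rast && rast->scissor;

   ctx->rasterizer = rast;
   ctx->dirty |= DIRTY_RASTERIZER;

   /* No enable bit in hw: a change in the enable has to be expressed by
    * re-emitting the scissor state, so mark it dirty as well. */
   if (old_scissor != new_scissor)
      ctx->dirty |= DIRTY_SCISSOR;
}
```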
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 3eb8289aa4)
The optimization of avoiding restore (mem2gmem) if there was a clear
falls down a bit if you don't have a fullscreen scissor. We need to
make the decision logic a bit more clever to keep track of *what* was
cleared, so that we can (a) completely skip mem2gmem if entire buffer
was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that
were completely cleared.
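The per-tile decision can be sketched as below, assuming the cleared
region is tracked as a single rectangle. The struct and function names
are hypothetical and the real bookkeeping may differ. Case (a) falls out
of case (b): if the cleared rectangle covers the whole buffer, every
tile is contained in it and no tile needs a restore.

```c
#include <stdbool.h>

/* Half-open rectangle: covers [x0, x1) x [y0, y1). */
struct rect {
   unsigned x0, y0, x1, y1;
};

/* True if 'inner' lies completely within 'outer'. */
bool
rect_contains(const struct rect *outer, const struct rect *inner)
{
   return outer->x0 <= inner->x0 && outer->y0 <= inner->y0 &&
          outer->x1 >= inner->x1 && outer->y1 >= inner->y1;
}

/* Hypothetical helper: a tile needs mem2gmem unless a clear happened
 * and the cleared rectangle covers the tile completely. */
bool
tile_needs_restore(bool cleared, const struct rect *cleared_rect,
                   const struct rect *tile)
{
   if (!cleared)
      return true;
   return !rect_contains(cleared_rect, tile);
}
```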
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 01b757e2b0)
FD_MESA_DEBUG=nocp will disable copy propagation pass.
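For illustration only, a flag such as "nocp" could be looked up in the
variable's value as sketched below. The helper is hypothetical, and the
comma-separated format is an assumption of this sketch, not something
stated above:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if 'flag' appears as a whole token
 * in the comma-separated string 'value' (which may be NULL). */
bool
debug_flag_is_set(const char *value, const char *flag)
{
   size_t flag_len = strlen(flag);

   if (value == NULL || flag_len == 0)
      return false;

   while (*value != '\0') {
      /* Length of the current token, up to the next comma or the end. */
      size_t token_len = strcspn(value, ",");

      if (token_len == flag_len && strncmp(value, flag, flag_len) == 0)
         return true;

      value += token_len;
      if (*value == ',')
         value++;
   }
   return false;
}
```

A caller would pass the environment value, for example
debug_flag_is_set(getenv("FD_MESA_DEBUG"), "nocp"), with <stdlib.h>
included for getenv().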
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 4f17e026bb)
Conflicts:
src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
It seems like the hardware is unhappy if we execute a kill instruction
prior to the last input (ei). Probably the shader thread stops executing
and the end-input flag is never set.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 8a0ffedd8d)
Because we reuse various bits of emit code (for state/vertex/prog/etc)
for both regular draws and internal draws (gmem<->mem, clear, etc), the
number of parameters getting passed around has been growing. Refactor
to group these into fd3_emit. This simplifies fxn signatures, avoids
passing around shader key on the stack, etc. It also gives us a nice
place to cache shader-variant lookup to avoid looking up shader variants
multiple times per draw (without having to *also* pass them around as
fxn args everywhere).
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit d595987ea3)
Get rid of fd3_vertex_buf and use fd_vertex_state directly for all
draws. Removes a tiny bit of CPU overhead for munging around the vertex
state every time it is emitted, but more importantly it cleans things up
for later optimizations, so the emit paths don't have to special case
internal draws (gmem<->mem, clears, etc) with regular draws.
Instead of constructing the fd3_vertex_buf array each time for internal
draws, pre-create solid_vbuf_state and blit_vbuf_state at context init
time.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit d5d80b3739)
Fixes a few issues, including a potential empty IB (which triggers GPU
hangs in piglit occlusion_query_meta_no_fragments).
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 7297bdbd50)
Possibly we should map the front color to black (zeroes). But not sure
there is a way to do that without generating a shader variant.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit a262c601d3)
Shaders like:
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL TEMP[0], LOCAL
IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000}
0: TEX TEMP[0], IN[0].xyyy, SAMP[0], 2D
1: MOV OUT[0], IMM[0].xyxx
2: END
cause unhappiness. They have an IN[], but once this is compiled the
useless TEX instruction goes away, leaving a varying that is never
fetched, which makes the hw unhappy.
In the process fix a signed vs unsigned compare. If the vertex shader
has max_reg=-1, MAX2() vs an unsigned would not give the desired result.
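The signed vs unsigned pitfall can be reproduced in isolation as below.
The function names are hypothetical, and keeping both operands signed is
just one way to avoid the problem, not necessarily what the fix does.
The explicit cast in the first function spells out what the usual
arithmetic conversions do implicitly when an int is compared with an
unsigned:

```c
#include <limits.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Mixed comparison: -1 is converted to UINT_MAX, so it "wins" against
 * any real register index. */
unsigned
max_reg_mixed(int vs_max_reg, unsigned fs_max_reg)
{
   return MAX2((unsigned)vs_max_reg, fs_max_reg);
}

/* Signed comparison: -1 ("no registers used") stays below any real
 * register index. */
int
max_reg_signed(int vs_max_reg, int fs_max_reg)
{
   return MAX2(vs_max_reg, fs_max_reg);
}
```

With max_reg=-1 on one side and 3 on the other, the mixed version yields
UINT_MAX while the signed version yields 3.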
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit af4d088395)
Still failing a bunch of the fairly picky texelFetch tests, but the
1D(Array) ones are full passes.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 33c9ad97bf)
Experimentally, this makes *ArrayShadow tex-miplevel-selection tests
pass.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 5bba74c64b)
Since the RA has to be done such that each one gets its own (adjacent)
register, it would complicate matters if instructions were allowed to be
repeated. This enables copy-propagation use in situations where
previously that might have happened.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 3dd9a0d6fd)
Makes the command stream a bit tighter when there are lots of
immediates.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit f5eeb8a6dc)