If we fails in reserve_explicit_locations, we leak uniform_map.
Reported-by: coverity scanner.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
i965 will have more than 32 bits when BRW_STATE_COMPUTE_PROGRAM is added.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The set of state atoms for compute shaders is currently empty; it will
be filled in by future patches.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
The hardware state for compute shaders is almost entirely orthogonal
to the hardware state for 3D rendering. To avoid sending unnecessary
state to the hardware, we'll need to have a separate set of state
atoms for the compute pipeline and the 3D pipeline. That means we
need to maintain two separate sets of dirty bits to determine which
state atoms need to be run.
But the dirty bits are not completely independent; for example, if
BRW_NEW_SURFACES is flagged while doing 3D rendering, then not only do
we need to re-run 3D state atoms that depend on BRW_NEW_SURFACES, but
we also need to re-run compute state atoms that depend on
BRW_NEW_SURFACES. But we'll also need to re-run those state atoms the
next time the compute pipeline is run.
To accomplish this, we record two sets of dirty bits, one for each
pipeline. When bits are dirtied (via SET_DIRTY_BIT() or
SET_DIRTY_ALL()) we set them to the dirty state in both pipelines.
When brw_state_upload() is run, we clear the dirty bits just for the
pipeline that was run.
Note that since the number of pipelines is known at compile time to be
2, the compiler should unroll the loops in SET_DIRTY_BIT() and
SET_DIRTY_ALL().
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
coverity reported this, Matt said it look like missing parens,
not bad identing, so lets try that.
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This one path doesn't goto fail, so it seems to leak dec.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The current code... makes no sense. Use nouveau_bo_ref to attach the bo
to the exposed resource so as to have the proper lifetime guarantees.
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
With VP2, nv50_miptree is faked because the underlying bo's have to be
laid out in a certain way. This is done by adjusting the address. Make
sure that blits (and everything else for consistency) use the mt address
rather than the bo address as a base.
This fixes retrieving chroma plane with VDPAU.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82255
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
The mt address is about to be used more, make sure it's set
appropriately.
Reported-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
When constant folding a MAD operation, we first fold the multiply and
generate an ADD. However we do so without making sure that the immediate
can be handled in the saturate case. If it can't, load the immediate in
a separate instruction.
Reported-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Experimentally, the sampler doesn't appear to like these, neither as
buffer nor as rect textures. So remove 1D from the list of texture types
to make linear when used for staging.
This fixes the OSD in mplayer for VDPAU.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Samplers are only defined up to num_samplers, so set all samplers above
nr to NULL so that we don't try to read them again later.
Tested-by: Christian Ruppert <idl0r@qasl.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
In certain circumstances, findFirstUses could end up doubling back on
instructions it had already processed, resulting in an infinite
recursion. Avoid this by keeping track of already-visited instructions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
We always left them enabled, which turned off HiZ in some cases.
This should improve performace with Hyper-Z.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This should be as fast as no HTILE for stencil. I think we can still get full
performance with depth-only rendering even if stencil is present in the buffer
but not used, but I'm not 100% sure. This may be revisited when HiS and fast
stencil clear are implemented.
This fixes a hang in Brutal Legend.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64471
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This is a golden setting on RV740, but there is a hw bug which recommends
setting it on all R7xx chipsets.
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
It's almost the same.
This enables tiling for HTILE. It also enables Hyper-Z for other texture
targets (1D, 1D_ARRAY, 2D_ARRAY, CUBE, CUBE_ARRAY, 3D, RECT).
2D array depth textures are tested by Unigine Sanctuary and my new piglit
test.
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
This fixes rendering to a non-zero layer/face/slice with HTILE.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72685
v2: added the assertion
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This should make a machine which is running piglit more responsive at times.
e.g. streaming-texture-leak can easily eat 600 MB because of how fast it
creates new textures.
We were only using it to get at its type, which we already know because
it's a builtin variable.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We were only using it to get at its type, which we already know because
it's a builtin variable.
v2 (Ken): Rebase on Matt's optimized gl_FrontFacing calculations.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The replicated data clear shader needs to be SIMD16, or else the GPU
will hang. So, compile it even if INTEL_DEBUG=no16 is set.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>