These need to be freed too.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 667e5cec76)
Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts
of pCreateInfo members are moved to the earliest points at which they
should not be NULL.
This fixes a segfault seen in the McNopper demo, VKTS_Example09.
v3 (Jason Ekstrand):
- Fix disabled rasterization check
- Revert opaque detection of color attachment usage
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a4a5917248)
To reduce confusion, clarify that the state being copied is not dynamic.
This agrees with the Vulkan spec's usage of the term. Various sections
specify that the various pipeline state which have VkDynamicState enums
(e.g. viewport, scissor, etc.) may or may not be dynamic.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a0d84a9ef9)
We already check that the address is not "too far", but we should also
clamp the UBO index in order to avoid looking at the wrong place in the
driver cb. This is a pretty rare situation though.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7f257abc1b)
Removes the need to set LIBVA_DRIVER_NAME=gallium for supported targets and is
consistent with vdpau and general gallium drivers.
Note: some versions of libva can detect the gallium name and use the
backend. Although that behaviour seems inconsistent since it only works
for some platforms/backends.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 0c0f841e5d)
When building from a release tarball (where the generated/built files
are in srcdir) in an OOT fashion we need to have both builddir and
srcdir in the includes list.
Otherwise we'll error out, as the file (header gen_knobs.h in this case)
won't be in the location where we are looking.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit fcb5a75a66)
Will allows us to catch missing files and build issues before getting
the tarball out for general consumption.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f4d26856df)
Otherwise drivers such as SWR that depend on providing their own values
will fail to build.
v2: Add -mcpu for good measure (Chuck)
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chuck Atkins <chuck.atkins@kitware.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
(cherry picked from commit bab5ab6940)
With earlier commit we've handled the `make distclean' out of tree
build, yet we failed to attribute that for in-tree builds the test
condition will return 1. Thus effectively the target will be considered
as "failed".
Fixes: b7f7ec7843 ("mesa: automake: distclean git_sha1.h when building
OOT")
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 8229fe68b5)
We were programming the number of threads per subslice, when we should
have been programming the total number of threads on the GPU as a whole.
Thanks to Curro and Jordan for helping track this down!
On Skylake GT3e:
- Improves performance in Unreal's Elemental Demo by roughly 1.5-1.7x.
- Improves performance in Synmark's Gl43CSDof by roughly 3.7x.
- Improves performance in Synmark's Gl43GSCloth by roughly 1.18x.
On Broadwell GT2:
- Improves performance in Unreal's Elemental Demo by roughly 1.2-1.5x.
- Improves performance in Synmark's Gl43CSDof by roughly 2.0x.
- Improves performance in Synmark's Gl43GSCloth by 1.47035% +/-
0.255654% (n=25).
On Haswell GT3e:
- Improves performance in Unreal's Elemental Demo (in GL 4.3 mode)
by roughly 1.10x.
- Improves performance in Synmark's Gl43CSDof by roughly 1.18x.
- Decreases performance in Synmark's Gl43CSCloth by -1.99484% +/-
0.432771% (n=64).
On Ivybridge GT2:
- Improves performance in Unreal's Elemental Demo (in GL 4.2 mode)
by roughly 1.03x.
- Improves performance in Synmark's G/43CSDof by roughly 1.25x.
- No change in Synmark's Gl43CSCloth (n=28).
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 0fb85ac08d)
I don't know that anything actually guarantees this, but if we exceed
the limits, we may end up overflowing and trashing random buffers that
happen to be nearby in the VMA space, leading to rendering corruption,
hangs, or worse.
We should really fix this properly. However, the pitfall has existed
for ages, so for now we should at least detect it.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 1db37ebecf)
These are linear, not powers of two, and much more limited.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a42a93dc12)
Most scratch stages use power of two sizes, in kilobytes, where
0 means 1kB. But compute shaders on Haswell have a minimum of 2kB,
and use a representation where 0 = 2kB.
This meant that we were effectively telling the hardware to allocate
each thread twice as much space as we meant to, while simultaneously
not allocating that much space in the buffer, leading to overflows.
Note that the existing code is completely wrong for Ivybridge,
but that will take additional work to sort out, so I've left it
as is for now. A subsequent commit will take care of that.
Together with the previous patches, this fixes rendering corruption
on Synmark's Gl43CSDof on Haswell.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 147a90d82a)
Curro figured this out by investigating the simulator. Apparently
there's also a workaround in the Windows driver. I'm not sure it's
actually documented anywhere.
We were underallocating the scratch buffer by a factor of 128/70.
v2: Rename threads_per_subslice to scratch_ids_per_subslice
(suggested by Jordan Justen).
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit a7d029d3df)
We were allocating enough space for the number of threads per subslice,
when we should have been allocating space for the number of threads in
the entire GPU.
Even though we currently run with a reduced thread count (due to a bug),
we might still overflow the scratch buffer because the address
calculation is based on the FFTID, which can depend on exactly which
threads, EUs, and threads are executing. We need to allocate enough
for every possible thread that could run.
Fixes rendering corruption in Synmark's Gl43CSDof on Gen8+.
Earlier platforms need additional bug fixes.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 2213ffdb4b)
We'll use this for compute shader thread counts and scratch space
calculations shortly.
Note that subslices are referred to as "half slices" on Ivybridge.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 9cd8f95809)
This was fixed in revision 47 of the ARB_dsa spec in Oct 22, 2015. Since
it's horrible to have differing APIs across library versions, we should
attempt to minimize the impact by backporting it as far as possible and
hope no one notices.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7d7e015381)
This brings in the fixed glClearNamedFramebufferfi definition, as well
as a lot of GLsizei -> GLsizeiptr changes.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 92351a71a8)
This brings in defines from GL_EXT_window_rectangles and fixes the
glClearNamedFramebufferfi definition.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f81374fd3e)
>From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1)
to the destination rectangle bounded by the locations (dstX0, dstY 0)
and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive,
while the upper bounds are exclusive."
So, the rectangles sharing just an edge shouldn't overlap.
-----------
| |
------- ---
| | |
| | |
------- ---
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit 466b320163)
>From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels":
"The pixels corresponding to these buffers are copied from the source
rectangle bounded by the locations (srcX0, srcY 0) and (srcX1, srcY 1)
to the destination rectangle bounded by the locations (dstX0, dstY 0)
and (dstX1, dstY 1). The lower bounds of the rectangle are inclusive,
while the upper bounds are exclusive."
So, the rectangles sharing just an edge shouldn't overlap.
-----------
| |
------- ---
| | |
| | |
------- ---
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit f8679badd4)
This reworks the #if guards a bit. When Emil originally wrote them, he
just guarded everything. However, part of what anv_entrypoints_gen.py
generates is a hash table for looking up entrypoints based on their name.
This table *cannot* get out of sync between C and python regardless of
preprocessor flags. In order to prevent this, this commit makes us use
void pointers in the dispatch table for those entrypoints which aren't
available. This means that the dispatch table size and entry order is
constant and it should never get out-of-sync with the python.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8d37556ec9)
This is a bit cleaner than generating the types ourselves when making the
table.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9ed0d9dd06)
When an applications specifies mip levels _before_ setting a mipmap texture
filter, we will initially guess a single texture level. When the second level
image is created, we try to allocate the full texture -- however, we get the
base level size guess wrong if that size is odd. This leads to yet another
re-allocation of the texture later during st_finalize_texture.
Even worse, this re-allocation breaks a (reasonable) assumption made by
st_generate_mipmaps, because the re-allocation in the finalization call will
again allocate a single-level pipe texture (based on the non-mipmap texture
filter!). As a result, mipmap generation fails in interesting ways.
All of this can be avoided by just using the fact that we already know the
size of the base level.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95529
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 42624ea837)
At this point, the limits are probably more-or-less correct. If there is
an invalid limit, that's a bug not a FINSHME.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1e69930e4)
This gets ANV_ENABLE_PIPELINE_CACHE=false working again.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 4f5bbf804b)
This way the the bind map (which we're caching) is mostly independent of
the pipeline layout. The only coupling remaining is that we pull the array
size of a binding out of the layout. However, that size is also specified
in the shader and should always match so it's not really coupled. This
rendering issues in Dota 2.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a1a25db699)
Since applications are allowed to specify some set of bindings which need
not be dense they also need not be in order. For most things, this doesn't
matter, but it could result getting the wrong dynamic offsets. This adds a
quick-and-dirty sort to ensure that everything is always in increasing
order of binding index.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c13c5ac561)
This allows for some extra validation and makes it easier to see what's
going on when poking around in gdb.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e2265926f2)
With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets
renewed in each frame, so when we receive a new pixmap, should get a new
front buffer for it.
This also fixes Totem player playback corruption.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2ad443e4cc)
From original commit, the macro "if HAVE_DRI3" was in Makefile.sources,
this file is shared with SCons, SCons is not able to parse this marco,
the SCons build failed. Jose quickly gave two approaches and quick fix
with his second approach, thanks Jose for the solutions and fixes.
This patch is Jose's first approach, and it's more proper, because the
dri3 c file should not be included to build when DRI3 is not enabled.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0ef8500aab)
GLX documentation states:
glXCreateNewContext can generate the following errors: (...)
GLXBadFBConfig if config is not a valid GLXFBConfig
Function checks if the given config is a valid config and sets proper
error code.
Fixes currently crashing glx-fbconfig-bad Piglit test.
v2: coding style cleanups (Emil, Topi)
use DefaultScreen macro (Emil)
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cf804b4455)
When Kristian implemented GL_TEXTURE_EXTERNAL_OES, he hooked it up for gen8
but not for gen7 or earlier. It all works, we just need to emit the states
for the extra planes.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 037ce5d734)
When calling virgl_fence_wait() with timeout=0,
virgl_{drm,vtest}_resource_is_busy() is called. However, it returns TRUE
for a busy resource, whereace virgl_fence_wait() should return TRUE for
a completed (non-busy) resource.
This fixes running supertuxkart in a VM (I could not reproduce locally
with vtest though there is a similar fix)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit dc81b3ad43)
The width0/height0/depth0 on stObj may not have been set at this point.
Observed in a trace that set up levels 2..9 of a 2d texture, and set the base
level to 2, with height 1. This made the guess logic always bail.
Originally investigated by Ilia Mirkin, this patch gets rid of the somewhat
redundant storage of width0/height0/depth0 and makes sure we always compute
pipe texture sizes that are compatible with the base level image of the
GL texture.
Fixes the gl-1.2-texture-base-level piglit test provided by Brian Paul.
v2:
- try to re-use an existing pipe texture when possible
- handle a corner case where the base level is not level 0 and it is of
size 1x1x1
v3:
- ptHeight = ptWidth in cube map 1x1 case (suggested by Brian)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit bd5c41fe5f)
We were previously unconditionally doing this for arrays and ubo's, and
ignoring texture/storage/atomic buffers. Instead use the usage history
to determine which atoms need to be revalidated.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6e6fd911da)
R9G9B9E5 is the only uncompressed one hopefully.
This fixes incorrect rendering not discovered (due to a lack of tests)
until DCC mipmapping was enabled.
Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit d4d733e39d)
v2:
- TG4 does not calculate derivatives (Ilia)
- also handle SAMPLE* instructions (Roland)
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit d3a584defe)
A texture may be redefined with _NEW_TEXTURE, which might have been
bound to a shader image slot. We have to revalidate the image atoms to
pick up on the new resource.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c81b090c92)
Sometimes a register source can actually be double- or even quad-wide.
We must make sure that the inserted texbars take that width into
account.
Based on an earlier patch by Samuel Pitoiset.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 71ad8a173f)
This fixes some cases of:
GL45-CTS.cull_distance.functional
on Skylake.
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c295923d13)
Like floats, we should use the round toward 0 mode instead of the
nearest one (which is the default) for doubles to integers.
This fixes all arb_gpu_shader_fp64 piglits which convert doubles to
integers (16 tests).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 08ddfe7b2f)
The GL4.5 spec quote seems clear on this:
"The value -1 will be returned by either command if an error occurs,
if name does not identify an active variable on programInterface,
or if name identifies an active variable that does not have a valid
location assigned, as described above."
This fixes:
GL45-CTS.program_interface_query.output-built-in
[airlied: use _mesa_program_resource_location_index as
suggested by Eduardo]
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 07403014c3)
Found randomly while skimming the code. This might have caused VM faults in
robustness tests.
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ec2b52e2d9)
When we are not packing a double input varying, we might need to
read its data in a non-aligned to 64-bit offset, so we read
the wrong data. This is happening when using explicit locations
in varyings because Mesa disables packing varying for that case.
const_index is in 32-bit size units but offset() is multiplying
it by destination type size units. When operating with double
input varyings, const_index value could be not aligned to 64 bits.
To fix it, we load the double vector as if it was a float based vector
with twice the number of components.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 2d6f82a294)