This ports the workaround from radeonsi, that was missing in radv.
This fixes Talos rendering when MSAA is enabled on my Tahiti card.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <airlied@redhat.com>
The configuration option --with-sha1 is no longer required for the
MESA_SHADER_READ_PATH, MESA_SHADER_DUMP_PATH environment variables
to take effect.
1- removed the "--with-sha1" sentence from docs/shading.html
2- added an extra note: that the corresponding dumped and replacement
shaders must have the same filenames for the feature to take effect.
Acked-by: Tapani Pälli <tapani.palli@intel.com>
This mirrors what Marek has done for radeonsi, and uses
a separate counter to handle the fmask surface for MSAA
MRTs.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just copies the code from the -pro shaders,
and fixes the tests on CIK.
With this CIK passes the same set of conformance
tests as VI.
Fixes: 83e58b03 (radv: flush f32->f16 conversion denormals to zero. (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
R8_UNORM textures can be emulated by means of L8 and a swizzle.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
This patch adds support for large shaders on GC3000. For example the "terrain"
glmark benchmark with a large fragment shader will work after this.
If the GPU supports ICACHE, shaders larger than the available state area will
be uploaded to a bo of their own and instructed to be loaded from memory on
demand. Small shaders will be uploaded in the usual way. This mimics the
behavior of the blob.
On GPUs that don't support ICACHE, this patch should make no difference.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
GC3000 has changed from a separate store for VS and PS uniforms
to a single, unified one. There is backwards compatibilty functionalty,
however this does not work correctly together with ICACHE.
This patch adds explicit support, although in the simplest way possible:
the PS/VS uniforms split is still fixed and hardcoded. It should
make no difference on hardware that does not have unified uniform
memory.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
The argument here is a bitmask, so the old code selected .xy, which
got silently truncated to .x when constructing the vec4 from components,
instead of using .w.
Fixes: 588185eb6b "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <airlied@redhat.com>
It justs works with the fragment shader resolve, so no need to do
a custom conversion. In fact with SRGB dest, it actually gives
wrong results.
Fixes: 69136f4e63 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
These seem to store very bogus results. Luckily there is some code
that converts srgb->linear already, so just making the descriptor
format UNORM should work.
Fixes: 588185eb6b "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <airlied@redhat.com>
These need to match for interop compatibility queries.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This is required for interop use cases. The same device must report
identical UUIDs through the GL and Vulkan APIs so that users can
identify when it is safe to perform a memory object import.
v2: use ac helpers to calculate the uuid
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
These are just basic implementations.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
We need vulkan and gl to produce the same UUIDs. Therefore we should
keep the mechanism to compute these in a common location to guarantee
they are updated in lockstep.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
These are used by EXT_external_objects to present UUIDs for the device
and the driver.
v2 (Timothy Arceri):
- remove extra break
- use _mesa_problem() rather the _mesa_error() for unimplemented
support for value types
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
v2: use PIPE_CAP_MEMOBJ to guard the extension
v3 (Timothy Arceri):
- expose extensions via the cap_mappings array
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Include no_error variants as well.
v2 (Timothy Arceri):
- reduced code churn by squashing some changes into
previous commits
v3 (Timothy Arceri):
- drop unused function declaration
v4 (Timothy Arceri):
- fix Driver function assert()
- add missing GL errors
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Use a memory object instead of user memory.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Instead of allocating memory to back a texture, use the provided memory
object.
v2: split off extension exposure logic
v3: de-duplicate code with st_AllocTextureStorage
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Plumbing for using memory objects as texture storage.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
v2: pass dedicated flag
v3 (Timothy Arceri):
- remove unrequired _mesa_init_memory_object_functions()
call in the state tracker.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
V2 (Timothy Arceri):
- fix copy and paste error with error message
V3 (Timothy Arceri):
- drop the Protected field for now as its unused
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Used by EXT_external_objects and EXT_external_objects_fd
V2 (Timothy Arceri):
- Throw GL_OUT_OF_MEMORY error if CreateMemoryObjectsEXT()
fails.
- C99 tidy ups
- remove void cast (Constantine Kharlamov)
V3 (Timothy Arceri):
- rename mo -> memObj
- check that the object is not NULL before initializing
- add missing "EXT" in function error message
V4 (Timothy Arceri):
- remove checks for (memory objecy id == 0) and catch in
_mesa_lookup_memory_object() instead.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Includes implementation stubs.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The device version is the maximum CL version that the device supports.
device_version and device_clc_version are not necessarily the same for
devices that support CL 1.0, but have a 1.1 compiler and the necessary
extensions.
Eventually, this will be based on the features/extensions of the actual
device, but for now move it a bit closer to its eventual destination.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesey <jan.vesely@rutgers.edu>
This is a bug in the app, but I'd rather avoid hanging the GPU,
esp if someone is running in validation and it takes out their
development environment.
v2: get it right, reverse the polarity.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Having two callbacks to manage a single int seems like an overkill.
Use a cached copy and update that when needed.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
---
Might want to look if the dimensions dance in .query_surface ...
speaking of which close to nobody implements that ...
If we get a xfixes v1.x we'll error out, without freeing the
xfixes_query reply.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Currently xmlconfig is conditionally used, only when --enable-dri is
available.
As the library has moved to src/util and has wider wisebase, this guard
is no longer correct. Strictly speaking - it wasn't since the
introduction of xmlconfig into st/nine a while ago.
Unconditionally enable xmlconfig and drop the linking. As said before
there's other users of the library, so depending on the configure
options we will get multiple definitions of said symbols.
NOTE: To avoid breaking other combinations, this commit adds the
xmlconfig link to the required places - throughout gallium and the DRI
loaders.
Cc: Aaron Watry <awatry@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
The kernel only cares about whether the object is to be written to or
not, only reduces (reloc.read_domains, reloc.write_domain) down to just
!!reloc.write_domain. When we use NO_RELOC, the kernel doesn't even read
those relocs and instead userspace has to pass that information in the
execobject.flags. We can simplify our reloc api by also removing the
unused read/write domains and only pass the resultant flags.
The caveat to the above are when we need to make the kernel aware that
certain objects need to take into account different work arounds.
Previously, this was done using the magic (INSTRUCTION, INSTRUCTION)
reloc domains. NO_RELOC requires this to be passed in the execobject
flags as well, and now we push that up the callstack.
The API is more compact, more expressive of what happens underneath, but
unfortunately requires more knowledge of the system at the point of use.
Conversely it also means that knowledge is specific and not generally
applied and so not overused.
text data bss dec hex filename
8502991 356912 424944 9284847 8dacef lib/i965_dri.so (before)
8500455 356912 424944 9282311 8da307 lib/i965_dri.so (after)
v2: (by Ken) Rebase.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Based on a patch by Chris Wilson (who also wrote this commit message).
Passing the index of the target buffer via the reloc.target_handle is
marginally more efficient for the kernel (it can avoid some allocations,
and can use a direct lookup rather than a hash or search). It is also
useful for ourselves as we can use the index into our exec_bos for other
tasks.
v2: Only enable HANDLE_LUT if we can use BATCH_FIRST and thereby avoid
a post-processing loop to fixup the relocations.
v3: Move kernel probing from context creation to screen init.
Use batch->use_exec_lut as it more descriptive of what's going on (Daniel)
v4: Kernel features already exists, use it for BATCH_FIRST
Rename locals to preserve current flavouring
v5: Squash in "always insert batch bo first"
v6: (by Ken) Split out BATCH_FIRST from HANDLE_LUT.
Extracted from a patch by Chris Wilson.
Now that the batch is always at the front of the validation list,
we don't need to special case it - the usual "go find an existing BO"
code will work just fine.