For the compute support, we might stick buffers as surfaces. This fixes
an assertion when executing src/gallium/tests/trivial/compute.
To avoid using these "restricted" surfaces as render targets, these
assertions have been moved. Note that it's already handled for the
framebuffer thing on nvc0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
When probing for devices, clover will call pipe_loader_probe() twice.
The first time to retrieve the number of devices, and then second time
to retrieve the device structures.
We currently assume that the return value of both calls will be the
same, but this will not be the case if a device happens to disappear
between the two calls.
When a device disappears, the pipe_loader_probe() will add a NULL
device to the device list, so we need to handle this.
v2:
- Keep range for loop
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
Correct some occurrences of -ldl and -lpthread to use
$(DLOPEN_LIBS) and $(PTHREAD_LIBS) respectively.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Other hardwares than AMD require to parse:
VAPictureParameterBufferH264.ReferenceFrames[16]
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
In general max_references cannot be based on num_render_targets.
This patch allows to allocate buffers with an accurate size.
I.e. no more than necessary. For other codecs it is a fixed
value 2.
This is similar behaviour as vaapi/vdpau-driver.
For now HEVC case defaults to num_render_targets as before.
But it could also benefits this change by setting a more
accurate max_references number in handlePictureParameterBuffer.
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
We need to emit at least one cut/emit in every
geometry shader, the easiest workaround it to
stick a single CUT at the top of each geom shader.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is specified in the docs for rv670 to work properly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
On some chips the GSVS itemsize needs to be aligned to a cacheline size.
This only applies to some of the r600 family chips.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This will allow adding tess stuff much cleaner later.
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just splits out a common pattern into an inline function
to make things cleaner to read.
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We really should initialise HS/LS_2 and SQ_LDS_ALLOC exists
on all evergreen not just cayman, so we should initialise
it as well.
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds the defines for a bunch of registers and shader
values that are required to implement tessellation.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Move some common code into one place, tess will also need
to use this function.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This helps in debugging unknown instructions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Analogous to previous commit, minus the extra dup. We are the one
opening the device thus we can directly use the fd.
Spotted by Coverity (CID 1339867, 1339877)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Analogous to previous commit.
Spotted by Coverity (CID 1339868)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Analogous to previous commit.
Spotted by Coverity (CID 1339866)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Add some checks if the original/dup'd fd is valid and ensure that we
don't leak it on error. The former is implicitly handled within the
pipe_loader, although let's make things explicit and check beforehand.
Spotted by Coverity (CID 1339865)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
In theory this wouldn't be an issue, as we'll find the correct name and
break out of the loop before we hit the sentinel.
Let's fix this and avoid issues in the future.
Spotted by Coverity (CID 1339869, 1339870, 1339871)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Earlier commit factored out the mpeg4 IQ matrix handling into separate
function, although it forgot to add a break in its case statement.
Thus the data ended up partially overwritten as the mpeg4 and h265
structs are members of the desc union.
Spotted by Coverity (CID 1341052)
Fixes: 64761a841d "st/va: move MPEG4 functions into separate file"
Cc: Julien Isorce <j.isorce@samsung.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Non-timer queries are suspended during blits. When the blits end, the queries
are resumed, but this resume operation itself might run out of CS space and
trigger a flush. When this happens, we must prevent a duplicate suspend during
preflush suspend, and we must also prevent a duplicate resume when the CS flush
returns back to the original resume operation.
This fixes a regression that was introduced by:
commit 8a125afa6e
Author: Nicolai Hähnle <nhaehnle@gmail.com>
Date: Wed Nov 18 18:40:22 2015 +0100
radeon: ensure that timing/profiling queries are suspended on flush
The queries_suspended_for_flush flag is redundant because suspended queries
are not removed from their respective linked list.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reported-by: Axel Davy <axel.davy@ens.fr>
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Tested-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Zero length arrays are non standard:
warning C4200: nonstandard extension used : zero-sized array in struct/union
Cannot generate copy-ctor or copy-assignment operator when UDT contains a zero-sized array
And all code does `N * sizeof query_result->batch[0]`, so it should work
exactly the same.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Rather than assigning inloc up front, when we don't yet know if it will
be unused, assign it last thing before the legalize pass.
Also, realize when inputs are unused (since for frag shader's we can't
rely on them being removed from ir->inputs[]). This doesn't make sense
if we don't also dynamically assign the inloc's, since we could end up
telling the hw the wrong # of varyings (since we currently assume that
the # of varyings and max-inloc are related..)
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Make the interpolation / point-sprite replacement mode setup deal with
varying packing.
In a later commit, we switch to packing just the varying components that
are actually used by the frag shader, so we won't be able to assume
everything is vec4's aligned to vec4. Which would highly confuse the
previous vinterp/vpsrepl logic.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Since the query names are not very enlightening, and there are thousands
of them, GALLIUM_HUD=help should only show the first and last query name
for each hardware block.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Otherwise assert is raised from vlVaQueryConfigProfiles's for loop.
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
This was missed in commit 59cfb21d ("targets: use the non-inline sw
helpers").
Fixes build failure:
CXXLD libXvMCgallium.la
../../../../src/gallium/auxiliary/pipe-loader/.libs/libpipe_loader_static.a(libpipe_loader_static_la-pipe_loader_sw.o):(.data.rel.ro+0x0): undefined reference to `sw_screen_create'
collect2: error: ld returned 1 exit status
Makefile:756: recipe for target 'libXvMCgallium.la' failed
make[3]: *** [libXvMCgallium.la] Error 1
Trivial.
Analogous to previous commit. As we no longer have anyone who uses NIR
we can drop the link.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Unused as of commit 23fb11455b "{st,targets}/dri: use static/dynamic
pipe-loader"
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Previously (with the inline ones) things were embedded into the
pipe-loader, which means that we cannot control/select what we want in
each target.
That also meant that at runtime we ended up with the empty
sw_screen_create() as the GALLIUM_SOFTPIPE/LLVMPIPE were not set.
v2: Cover all the targets, not just dri.
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Edward O'Callaghan <edward.ocallaghan@koparo.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Feeling rather dirty copying the inline ones, yet we need the inline
ones for swrast only targets like libgl-xlib, osmesa.
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Edward O'Callaghan <edward.ocallaghan@koparo.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
With earlier commit we've dropped the manual iteration over the fixed
size array and prepemtively set the variable storing the size, that is
to be returned. Yet we forgot to adjust the comparison, as before we
were comparing the index, now we're comparing the size.
Fixes: ff9cd8a67c "pipe-loader: directly use
pipe_loader_sw_probe_null() at probe time"
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93091
Reported-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
The compiler has more information and is able to optimize the bits
it sets in these registers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
CC: <mesa-stable@lists.freedesktop.org>
Expose most of the performance counter groups that are exposed by Catalyst.
Ideally, the driver will work with GPUPerfStudio at some point, but we are not
quite there yet. In any case, this is the reason for grouping multiple
instances of hardware blocks in the way it is implemented.
The counters can also be shown using the Gallium HUD. If one is interested to
see how work is distributed across multiple shader engines, one can set the
environment variable RADEON_PC_SEPARATE_SE=1 to obtain finer-grained performance
counter groups.
Part of the implementation is in radeon because an implementation for
older hardware would largely follow along the same lines, but exposing
a different set of blocks which are programmed slightly differently.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Performance monitor queries can become very big, especially considering that
instances of a block in different shader engines are queried separately.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>