For reference picture sets, there are cases that rps will not always
be used. Once detect the unused flag from encoded bitstream, we should
not add this rps to any list, otherwise pass the incorrect reference
and skip the correct rps.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
It will fix the corruption for frame, that only has one stort term ref
picture set, we set NULL rps for this case previously, causing taking
incorrect reference. Instead we should take that only short term set
as reference
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
The video size from format container is not always compatible with
the size from codec bitstream, the HW decoder should take the size
information from bitstream, otherwise the corruption appears with clip
that has different size info between bitstream and format container
So we are passing width(height)_in_samples from sequence parameter
set to video decoder.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This is enabled automatically if shader printing is enabled, or separately
by R600_DEBUG=checkir. Catch mal-formed IR before it crashes in a later
pass.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This changes the order of basic blocks to be equal to the order of code in the
original TGSI, which is nice for making sense of shader dumps.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Some of the existing code is needlessly complicated. The basic principle
should be: control-flow opcodes emit branches to properly terminate the
current block, _unless_ the current block already has a terminator (which
happens if and only if there was a BRK or CONT).
This also fixes a bug where multiple terminators were created in a block.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Really fix the bug that was supposed to be fixed by commits 3e7cced4b and
a48bf02d: even when virtual addresses are used, the legacy relocation-based
method with offsets relative to the kernel's buffer object are used for
video submissions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
I guess this is not needed because dead code elimination removes
the declaration.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
We can finally do this, because the opcodes are scalar now.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
si_llvm_emit_ddxy is called once per element, so we don't have to generate
code for 4 elements at once.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
do it at bind time, so that pipe_sampler_view is immutable with regard to
buffer reallocations and we don't have to remember all existing buffer
views.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
similar to gl_buffer_object::UsageHistory
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
not used anymore. It was used when the polygon stipple texture was constant.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
We want to emit IA_MULTI_VGT_PARAM less often because it's a context reg.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Caches are always flushed at IB boundary.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
If primitive restart is enabled for two consecutive draws which use
different primitive restart indices, then the first draw's primitive
restart index was incorrectly used for the second draw.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025
Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Previous fundamental change in stats gathering added a temporary
SwrWaitForIdle to begin_query and end_query. Code has been reworked to
remove stall.
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
Create worker pool now computes number of worker threads based on
things like topologies, etc. and creates the pool but doesn't actually
launch the threads. Instead there is a separate start thread pool
function. This allows thread resources to be constructed first before
threads start.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
- Immediately sleep threads until thread data is initialized
- Fix some compile bugs with AR enabled
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
- Move most jitter functionality into SwrJit namespace
- Avoid global "using namespace llvm" in headers
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Detect all of the CPUs in the system. Expose metrics
for min, max and current frequency in Hz.
Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Regression introduced by "gallium/radeon: zero all query buffers".
Cc: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This is only needed for r600 which doesn't have ARB_query_buffer_object and
therefore wouldn't really need the fences, but let's be optimistic about
filling in this feature gap eventually.
Cc: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We don't plan to use sub-allocated buffers with UVD, but just in case one
slips through, this increases the chances of things working out anyway.
Reviewed-by: Christian König <christian.koenig@amd.com>