Commit graph

28814 commits

Author SHA1 Message Date
Leo Liu
ffb863fd2c st/omx/dec/h265: fix the skip for before and after list
For reference picture sets, there are cases that rps will not always
be used. Once detect the unused flag from encoded bitstream, we should
not add this rps to any list, otherwise pass the incorrect reference
and skip the correct rps.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-04 11:09:59 -04:00
Leo Liu
c50b68e6a8 st/omx/dec/h265: set the default reference picture set for reference
It will fix the corruption for frame, that only has one stort term ref
picture set, we set NULL rps for this case previously, causing taking
incorrect reference. Instead we should take that only short term set
as reference

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-04 11:09:59 -04:00
Leo Liu
091aae0265 st/omx/dec/h265: decoder size should follow from sps
The video size from format container is not always compatible with
the size from codec bitstream, the HW decoder should take the size
information from bitstream, otherwise the corruption appears with clip
that has different size info between bitstream and format container

So we are passing width(height)_in_samples from sequence parameter
set to video decoder.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-04 11:09:59 -04:00
Leo Liu
2371119db9 st/omx/dec/h265: increase dpb max size to 32
For clip with frame delta poc over 16

Signed-off-by: Leo Liu <leo.liu@amd.com>
2016-10-04 11:09:59 -04:00
Nicolai Hähnle
8b1f9fd3b3 radeonsi: optionally run the LLVM IR verifier pass
This is enabled automatically if shader printing is enabled, or separately
by R600_DEBUG=checkir. Catch mal-formed IR before it crashes in a later
pass.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-04 16:39:33 +02:00
Nicolai Hähnle
1e9476e8c5 gallium/radeon: fix argument type of llvm.{cttz,ctlz}.i32 intrinsics
Caught by R600_DEBUG=checkir (next commit).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-04 16:39:28 +02:00
Nicolai Hähnle
1b6fb88ab2 gallium/radeon: unify the creation of basic blocks
This changes the order of basic blocks to be equal to the order of code in the
original TGSI, which is nice for making sense of shader dumps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-04 16:39:25 +02:00
Nicolai Hähnle
d377f4c1ca gallium/radeon: merge branch and loop flow control stacks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-04 16:39:21 +02:00
Nicolai Hähnle
b0d50e157d gallium/radeon: simplify if/else/endif blocks
In particular, we no longer emit an else block when there is no ELSE
instruction.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-04 16:39:18 +02:00
Nicolai Hähnle
89e9de2ea6 gallium/radeon: label basic blocks by the corresponding TGSI pc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-04 16:39:15 +02:00
Nicolai Hähnle
6f87d7a146 gallium/radeon: cleanup and fix branch emits
Some of the existing code is needlessly complicated. The basic principle
should be: control-flow opcodes emit branches to properly terminate the
current block, _unless_ the current block already has a terminator (which
happens if and only if there was a BRK or CONT).

This also fixes a bug where multiple terminators were created in a block.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-04 16:39:10 +02:00
Nicolai Hähnle
dfc1afda83 winsys/radeon: add buffer_get_reloc_offset
Really fix the bug that was supposed to be fixed by commits 3e7cced4b and
a48bf02d: even when virtual addresses are used, the legacy relocation-based
method with offsets relative to the kernel's buffer object are used for
video submissions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-04 16:37:44 +02:00
Marek Olšák
71a5cf6f3b radeonsi: don't declare LDS in PS when ds_bpermute is used
I guess this is not needed because dead code elimination removes
the declaration.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:16 +02:00
Marek Olšák
b2a694f079 radeonsi: use DDX/DDY directly in si_llvm_emit_ddxy_interp
We can finally do this, because the opcodes are scalar now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:14 +02:00
Marek Olšák
b57aef8033 radeonsi: simplify si_llvm_emit_ddxy
si_llvm_emit_ddxy is called once per element, so we don't have to generate
code for 4 elements at once.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:12 +02:00
Marek Olšák
046c199c3a radeonsi: don't call build_gep0 in si_llvm_emit_ddxy on VI
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:11 +02:00
Marek Olšák
bcc55e1f32 radeonsi: use a helper function for BuildGEP(0, x)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:10 +02:00
Marek Olšák
e20f7142a3 radeonsi: remove obsolete shader definitions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:09 +02:00
Marek Olšák
8c6ea5a6ff radeonsi: remove unnecessary #includes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:07 +02:00
Marek Olšák
3388f27d84 radeonsi: clean up lucky #include dependencies
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:06 +02:00
Marek Olšák
53d2c8f00f radeonsi: don't re-create shader PM4 states after scratch buffer update
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:05 +02:00
Marek Olšák
6c01684393 gallium/radeon: move r600_common_context::texture_buffers to r600g
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:03 +02:00
Marek Olšák
7ce19d9014 radeonsi: don't set sampler buffer offsets in create_sampler_view
do it at bind time, so that pipe_sampler_view is immutable with regard to
buffer reallocations and we don't have to remember all existing buffer
views.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:01 +02:00
Marek Olšák
7e6428e0a8 radeonsi: optimize si_invalidate_buffer based on bind_history
Just enclose each section with: if (rbuffer->bind_history & PIPE_BIND_...)

Bioshock Infinite: +1% performance

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:12:00 +02:00
Marek Olšák
e43bd861e8 radeonsi: track buffer bind history
similar to gl_buffer_object::UsageHistory

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:11:58 +02:00
Marek Olšák
b523a9ddc5 radeonsi: drop support for NULL sampler views
not used anymore. It was used when the polygon stipple texture was constant.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:11:57 +02:00
Marek Olšák
82e51e8188 radeonsi: separate IA_MULTI_VGT_PARAM and VGT_PRIMITIVE_TYPE emission
We want to emit IA_MULTI_VGT_PARAM less often because it's a context reg.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:11:56 +02:00
Marek Olšák
3ee9be42ac radeonsi: move VGT_LS_HS_CONFIG to derived tess_state
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:11:53 +02:00
Marek Olšák
f92113c5a1 radeonsi: don't check PIPE_BARRIER_MAPPED_BUFFER
Caches are always flushed at IB boundary.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:11:51 +02:00
Marek Olšák
ca1d1e0e19 radeonsi: parse SURFACE_SYNC correctly on CIK-VI
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:11:49 +02:00
Marek Olšák
37065b0583 gallium/radeon: inline r600_context_add_resource_size
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-10-04 16:11:47 +02:00
James Legg
e33f31d61f radeonsi: Fix primitive restart when index changes
If primitive restart is enabled for two consecutive draws which use
different primitive restart indices, then the first draw's primitive
restart index was incorrectly used for the second draw.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-10-04 15:57:37 +02:00
Matt Whitlock
42ed8a6c9c gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-04 11:09:03 +02:00
Matt Whitlock
ac6064f918 st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-04 11:09:01 +02:00
Matt Whitlock
0c060f691c st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-04 11:08:58 +02:00
Matt Whitlock
5d0069eca2 gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)
Without this fix, duplicated file descriptors leak into child processes.
See commit aaac913e90 for one instance
where the same fix was employed.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-04 11:08:55 +02:00
Nayan Deshmukh
b7a0f2e1f7 vl/dri3: fix warning about incompatible pointer type
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2016-10-03 12:51:30 -04:00
Bruce Cherniak
903d00cd32 swr: Removed stalling SwrWaitForIdle from queries.
Previous fundamental change in stats gathering added a temporary
SwrWaitForIdle to begin_query and end_query.  Code has been reworked to
remove stall.

Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2016-10-03 09:57:45 -05:00
Tim Rowley
cdac042733 swr: [rasterizer core] refactor thread creation
Create worker pool now computes number of worker threads based on
things like topologies, etc. and creates the pool but doesn't actually
launch the threads. Instead there is a separate start thread pool
function. This allows thread resources to be constructed first before
threads start.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:38 -05:00
Tim Rowley
114f7a92c6 swr: [rasterizer jitter] canonicalize blend compile state
Canonicalize to prevent unnecessary JIT compiles.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:31 -05:00
Tim Rowley
4198520a82 swr: [rasterizer core] archrast fixes
- Immediately sleep threads until thread data is initialized
- Fix some compile bugs with AR enabled

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:25 -05:00
Tim Rowley
aaeb07989e swr: [rasterizer jitter] fixes for icc in vs2015 compat mode
- Move most jitter functionality into SwrJit namespace
- Avoid global "using namespace llvm" in headers

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:19 -05:00
Tim Rowley
b8a6f06c85 swr: [rasterizer core] generalize compute dispatch mechanism
Generalize compute dispatch mechanism to support other types of dispatches.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:57:13 -05:00
Tim Rowley
33a1a09eb0 swr: [rasterizer common] os.h portability header changes
- Fix conflict between windows MemoryFence and llvm::sys::MemoryFence
- Declare gettid()

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-03 09:56:47 -05:00
Steven Toth
e99b9395be gallium/hud: Add support for CPU frequency monitoring
Detect all of the CPUs in the system. Expose metrics
for min, max and current frequency in Hz.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-09-30 15:18:46 -06:00
Marek Olšák
7b87190d2b Revert "gallium/hud: automatically print % if max_value == 100"
This reverts commit dbfeb0ec12.

With max_value being rounded to 100, it's often wrong.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-09-30 22:07:12 +02:00
Nicolai Hähnle
7bac5bf032 gallium/radeon: fix crash/regression in performance counters
Regression introduced by "gallium/radeon: zero all query buffers".

Cc: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-30 12:41:45 +02:00
Nicolai Hähnle
cfd870de70 gallium/radeon: update documentation of buffer_get_virtual_address
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-30 12:41:41 +02:00
Nicolai Hähnle
fd9f54223d gallium/radeon: emit relocations for query fences
This is only needed for r600 which doesn't have ARB_query_buffer_object and
therefore wouldn't really need the fences, but let's be optimistic about
filling in this feature gap eventually.

Cc: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-09-30 12:38:57 +02:00
Nicolai Hähnle
3e7cced4b9 radeon/uvd: adjust the buffer offset when relocation is used
We don't plan to use sub-allocated buffers with UVD, but just in case one
slips through, this increases the chances of things working out anyway.

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-09-30 12:38:52 +02:00