29173 commits

Author SHA1 Message Date
Ilia Mirkin
ecea2f69ef gm107/ir: fix texturing with indirect samplers
The indirect handle has to come right after the coordinates, so if there
was a sample/bias/depth compare/offset, everything would end up being
shifted by one argument position.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2016-10-18 09:56:14 -04:00
Marek Olšák
34099894c3 gallium/tgsi: add missing #include
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-18 11:20:57 +02:00
Julien Isorce
dbc8e18116 st/va: set default rt formats when calling vaCreateConfig
As specified in va.h, default value should be set on attributes
not present in the input list.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Mark Thompson <sw@jkqxz.net>
2016-10-18 08:44:14 +01:00
Nicolai Hähnle
9160b4d981 radeonsi: unify the constant load paths
Remove the split between direct and indirect.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-17 19:08:45 +02:00
Nicolai Hähnle
51f9b38ce8 radeonsi: fix indirect loads of 64 bit constants
This fixes GL45-CTS.compute_shader.fp64-case3.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-17 19:08:36 +02:00
Marek Olšák
74d145f4a8 radeonsi: shorten "shader->selector" to "sel" in si_shader_create
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-17 12:13:00 +02:00
Marek Olšák
2e74e8ead9 radeonsi: clear DB_RENDER_OVERRIDE
Vulkan doesn't set these fields even though it doesn't use HiS.
HiS is disabled by programming DB_SRESULTS_COMPARE_STATEn to 0.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-17 12:13:00 +02:00
Axel Davy
9baf4505fb st/nine: Fix multisample limit check
Fixes regression introduced by
b560305687

The regression prevents some apps from starting.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-10-17 00:02:52 +02:00
Eric Anholt
c61eb3c91c vc4: Fix fast clear color packing for 565.
Piglit didn't manage to cover this because fbo-clear-formats uses
scissors, so we don't get fast clearing.
2016-10-16 11:22:50 -07:00
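For readers unfamiliar with the term, packing a clear color for a 565 target means quantizing three floating-point channels to 5, 6 and 5 bits and merging them into one 16-bit word. The sketch below is a generic illustration, not the vc4 code and not the fix itself. The helper names `unorm` and `pack_rgb565` are hypothetical, and placing red in the top bits is an assumption about the layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp x to [0, 1] and scale it to an unsigned
 * integer with `bits` bits, rounding to nearest. */
static uint16_t unorm(float x, unsigned bits)
{
    const float max = (float)((1u << bits) - 1u);

    assert(bits > 0 && bits <= 8);
    if (!(x > 0.0f))
        return 0;               /* negative values and NaN map to 0 */
    if (x > 1.0f)
        x = 1.0f;
    return (uint16_t)(x * max + 0.5f);
}

/* Hypothetical helper: pack an RGB color into 16 bits.
 * Assumed layout: red in bits 15..11, green in bits 10..5, blue in 4..0. */
static uint16_t pack_rgb565(float r, float g, float b)
{
    return (uint16_t)((unorm(r, 5) << 11) | (unorm(g, 6) << 5) | unorm(b, 5));
}
```

Green gets the extra bit, so a per-channel width mix-up changes the packed value even for grey colors.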
Tobias Klausmann
b7d9677de8 nv50/ir: constant fold OP_SPLIT
Split the source immediate value into new values and move them into the
original defs set by the split. Since we can only have up to 64-bit
immediates, this is largely beneficial for F64 (and, in the future, U64)
operations.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: always use U32, set newi for foldCount tracking]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-10-14 23:23:57 -04:00
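The folding described in the commit above amounts to replacing a split of a known 64-bit immediate with two known 32-bit immediates. A minimal sketch of that arithmetic follows. It is not the nv50/ir code: `fold_split_u64` is a hypothetical name, and emitting the low half first is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: split a 64-bit immediate into two
 * 32-bit immediates. Assumed order: out[0] is the low half, out[1] the
 * high half. */
static void fold_split_u64(uint64_t imm, uint32_t out[2])
{
    assert(out != NULL);
    out[0] = (uint32_t)(imm & 0xffffffffu);  /* low 32 bits */
    out[1] = (uint32_t)(imm >> 32);          /* high 32 bits */
}
```

For example, the F64 constant 1.0 has the bit pattern 0x3ff0000000000000, so it splits into a low word of 0x00000000 and a high word of 0x3ff00000.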
Jose Fonseca
c6d17701c8 pipe_loader_sw: Don't invoke Unix close() on Windows.
Trivial.
2016-10-14 16:29:04 +01:00
Emil Velikov
48267b730c gallium: annotate sw_driver_descriptor instance as const data
Already treated and handled as such.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-14 11:09:00 +01:00
Emil Velikov
792148f16a gallium: annotate drm_driver_descriptor instance as const data
Already treated and handled as such.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-14 11:09:00 +01:00
Emil Velikov
c079a206ad gallium: rename drm_driver_descriptor::{, driver_}name
Historically we use "device name" for the name of the kernel module and
"driver name" for the dri/other driver.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-14 11:09:00 +01:00
Emil Velikov
9837cf13b1 gallium: remove unused drm_driver_descriptor::driver_name
Likely unused since day 1, although I've only checked back until the
st/dri unification with commit 29ca7d2c94 ("st/dri: merge dri/drm and
dri/sw backends")

Based on the comment referencing drmOpenByName, it's not something we
want to bring back.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-14 11:09:00 +01:00
Emil Velikov
0f031dcf11 gallium: fix drm_driver_descriptor::name comment
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-14 11:09:00 +01:00
Mark Thompson
0b241b7717 st/va: Fix H.264 PicOrderCnt value
TopFieldPicOrderCnt is exactly the PicOrderCnt value for a frame - see
H.264 section 8.2.1.

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-10-14 11:57:52 +02:00
Mark Thompson
1edaa33135 st/va: Baseline profile is not supported
Constrained baseline profile is supported, so use that instead.  This
matches what the encoder already does (constraint_set1_flag is always
set in the output bitstream).

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-10-14 11:57:48 +02:00
Mark Thompson
e0604eed9f st/va: Return surface formats depending on config chroma format
This makes the supported format actually match the configuration, and
allows the user to observe that NV12 is supported for video processing
where previously they couldn't (though it did always work if they
blindly tried to use it anyway).

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-10-14 11:57:44 +02:00
Mark Thompson
e7c7ef3625 st/va: Save surface chroma format in config
Both YUV420 and RGB32 configurations are supported, so we need to be
able to distinguish which is being used.

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-10-14 11:57:40 +02:00
Mark Thompson
8a931c83ba st/va: Return more useful config attributes
The encoder attributes are needed for a user of the encoder to be
able to configure it sensibly without internal knowledge.

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-10-14 11:57:25 +02:00
Tim Rowley
a42c22fdbf swr: [rasterizer core] don't construct pArContext on non-ar builds
Stops debug directory being created on non-ar builds.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-13 23:39:14 -05:00
Tim Rowley
29d07480b8 swr: [rasterizer core] remove WorkerWaitForThreadEvent bucket
Cause of bucket stop capture hang, as threads get stuck in level 1.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-13 23:39:14 -05:00
Tim Rowley
ada27b503e swr: [rasterizer core] move binner functionality to separate file
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-13 23:39:14 -05:00
Tim Rowley
f0a66c1da2 swr: [rasterizer scripts] add DEBUG_OUTPUT_DIR knob
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-13 23:39:14 -05:00
Tim Rowley
ffd0224303 swr: [rasterizer core] fix comment typo
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-13 23:39:14 -05:00
Tim Rowley
4889922210 swr: [rasterizer core/sim] 8x2 backend + 16-wide tile clear/load/store
Work in progress (disabled).

USE_8x2_TILE_BACKEND define in knobs.h enables AVX512 code paths
(emulated on non-AVX512 HW).

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-13 23:39:14 -05:00
Tim Rowley
bf1f46216c swr: [rasterizer archrast] fix event file issue with saving data
Also, tagging stats with draw id to correlate these events with
draw/dispatch events.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-13 23:39:13 -05:00
Eric Engestrom
827e038062 swr: [rasterizer common] fix assert index
Fixes: b3bd8bb611 ("swr: [rasterizer core] add support
       for "RAW" surface format")
CovID: 1373647
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-10-13 21:37:20 -05:00
Ilia Mirkin
afb6dc53bf nv50: enable ARB_enhanced_layouts
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-10-13 21:45:21 -04:00
Ilia Mirkin
a6d6eff2e6 nvc0/ir: be more careful about preserving modifiers in SHLADD creation
src2 was being given the wrong modifier, and we were not properly
managing the modifier on the SHL source either.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-10-13 21:44:03 -04:00
Brian Paul
b81546d43c tgsi: fix comment typo in tgsi_ureg.c
Trivial.
2016-10-13 17:38:49 -06:00
Eric Anholt
99d790538d vc4: Avoid loading from the texture during non-utile-aligned glTexImage().
Previously, the plan was "if the width/height we have to load/store isn't
the size the user is planning on writing, then we need to load the old
contents out beforehand to prevent writing back undefined".

However, when we're doing glTexImage() we often end up aligning the
width/height into the padding of the texture, and we don't actually
need to read out that padding.

Improves x11perf -aatrapezoid100 performance from ~460/sec to
~700/sec.
2016-10-13 14:27:30 -07:00
Axel Davy
0717cd975d st/nine: Fix possible segfault in surface ctor
Regression introduced by
ba0274c7d6

Check the resource exists before assigning it
a flag (and use This->base.resource instead
of pResource, since the former may have a newly
allocated resource, while the latter would be
NULL).

This should reintroduce the behaviour of previous
code.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-10-13 21:16:35 +02:00
Axel Davy
98b8ad61c6 st/nine: Remove useless code in nine_shader
Since 1604efa6fd,
lconsti and lconstb don't need to be initialized.

Remove some leftovers from the previous code (which
now makes invalid use of ARRAY_SIZE on a pointer instead
of an array).

Reported by Coverity.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2016-10-13 21:16:35 +02:00
Axel Davy
197cdd1bbd gallium/os: Use unsigned integers for size computation
Use uint64_t instead of int64_t in the calculation,
as the result is uint64_t.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-13 21:16:35 +02:00
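The point of the commit above can be shown with a size computation of the form "page count times page size". This is a sketch under assumptions, not the gallium/os code: `total_size_bytes` and its parameters are hypothetical. Casting both operands to uint64_t before multiplying keeps the arithmetic in the same unsigned type as the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: compute a total size in bytes from a
 * page count and a page size, both given as signed longs (as a system
 * query might return them). Returns false for non-positive inputs. */
static bool total_size_bytes(long npages, long page_size, uint64_t *size)
{
    assert(size != NULL);
    if (npages <= 0 || page_size <= 0)
        return false;

    /* Convert first, then multiply: the product is computed in uint64_t,
     * matching the type of the result. */
    *size = (uint64_t)npages * (uint64_t)page_size;
    return true;
}
```

With 1048576 pages of 4096 bytes the result is 4294967296, which would not fit in 32 bits.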
Samuel Pitoiset
4527222169 nvc0: enable ARB_enhanced_layouts
All ARB_enhanced_layouts piglit tests pass without any changes
in our compiler.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-10-13 21:13:34 +02:00
Marek Olšák
7dddf0b7ab radeonsi: adjust and clean up Z_ORDER and EXEC_ON_x settings
The table was copied from the Vulkan driver. The comment lines are as long
as the table for cosmetic reasons.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-13 19:00:51 +02:00
Marek Olšák
e12c1cab5d radeonsi: disable ReZ
This is a serious performance fix. Discovered by luck.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-13 19:00:51 +02:00
Marek Olšák
d4d9ec55c5 radeonsi: implement TC-compatible HTILE
so that decompress blits aren't needed and depth texturing needs less
memory bandwidth.

Z16 and Z24 are promoted to Z32_FLOAT by the driver, because TC-compatible
HTILE only supports Z32_FLOAT. This doubles memory footprint for Z16.
The format promotion is not visible to state trackers.

This is part of TC-compatible renderbuffer compression, which has 3 parts:
DCC, HTILE, FMASK. Only TC-compatible FMASK compression is missing now.

I don't see a measurable increase in performance though.

(I tested Talos Principle and DiRT: Showdown, the latter is improved by
 0.5%, which is almost noise, and it originally used layered Z16,
 so at least we know that Z16 promoted to Z32F isn't slower now)

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-13 19:00:51 +02:00
Marek Olšák
a077185ea9 gallium: add PIPE_RESOURCE_FLAG_TEXTURING_MORE_LIKELY
For performance tuning in drivers. It filters out window system
framebuffers and OpenGL renderbuffers.

radeonsi will use this to guess whether a depth buffer will be read
by a shader. There is no guarantee about what will actually happen.

This is a departure from PIPE_BIND flags which are defined to be strict
but they are useless in practice.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-13 19:00:51 +02:00
Nicolai Hähnle
761388a0eb radeonsi: fix regression in image atomics
Caused by a bad rebase when pushing commit 76a940893.
2016-10-13 16:04:16 +02:00
Nicolai Hähnle
76a940893d radeonsi: fix the coordinate overloading of llvm.amdgcn.image.atomic.cmpswap.*
Fixes GL45-CTS.shader_image_load_store.basic-allTargets-atomic*

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-13 10:17:42 +02:00
Emil Velikov
a4622305e6 swr: automake: add ar_eventhandlerfile_h.template to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-12 18:55:22 +01:00
Ilia Mirkin
a48a343c29 nvc0/ir: fix textureGather with a single offset
Recent fix for non-const offsets broke the case of a single offset (vs 4
offsets). The later code relies on the offs array to contain null values
to tell whether they should be added onto the srcs list.

Fixes: 5239bd592 ("nvc0/ir: fix overwriting of value backing non-constant gather offset")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2016-10-12 13:18:14 -04:00
Ilia Mirkin
300b5ad023 nv50/ir: copy over value's register id when resolving merge of a phi
The offset needs to be properly copied over to the phi value, otherwise
it will get assigned to the base of the merge instead of the proper
location.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2016-10-12 13:18:14 -04:00
Nicolai Hähnle
789119d212 st/mesa: enable ARB_enhanced_layouts and turn the cap on
v2: mark llvmpipe & softpipe properly as well (Jason Wood)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
2b460c750a tgsi/ureg: add ureg_DECL_output_layout
For specifying an exact location/component.

v2: change the order of parameters (Dave)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
047a7c7a0b tgsi/ureg: add layout/component input declarations
v2: change the order of parameters (Dave)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
f9a01f3872 tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays
v2: remove a tautological left-over assert (Marek)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2016-10-12 18:50:10 +02:00