Commit graph

29934 commits

Author SHA1 Message Date
George Kyriazis
e259efd805 swr: Update fs texture & sampler state logic
In swr_update_derived() update texture and sampler state on a new fragment
shader.  GALLIUM_HUD can update fs using a previously bound texture and
sampler.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-01-25 10:02:50 -06:00
Samuel Pitoiset
cff199ceb7 gallium/radeon: add a new HUD query for the number of mapped buffers
Useful when debugging applications which map a ton of buffers
and also because we used to run into Linux's limit on the number
of simultaneous mmap() calls.

v2: - update the commit message

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-25 15:19:21 +01:00
Marek Olšák
eac7df43ca radeonsi: handle first_non_void correctly in si_create_vertex_elements
This fixes R11G11B10_FLOAT, because it's in the category of "OTHER",
meaning that it doesn't have any channel description.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-24 23:52:01 +01:00
Roland Scheidegger
f4df21ed95 gallivm: don't try to use fast rcp for fdiv
The use of fast rcp instruction is disabled, and will always fall back
to use a division instead (1 / x). Hence, if we get a division opcode,
it doesn't make much sense trying to split that into rcp/mul.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-24 19:12:46 +01:00
Roland Scheidegger
25208949d7 gallivm: (trivial) fix ddiv cpu implementation
we can't use the cpu implementation of fdiv, as this one uses different
lp_build_context, which causes assertion failure.
Just use default fdiv action (there is no fast rcp for doubles which we
could potentially use anyway).

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-24 19:12:46 +01:00
Roland Scheidegger
3b575a955c tgsi: implement ddiv opcode
softpipe (along with llvmpipe) claims to support arb_gpu_shader_fp64,
so we really need to support that opcode.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-24 19:12:46 +01:00
Samuel Pitoiset
d90d37db73 gallium/radeon: undef the very specific UPDATE_COUNTER macro
Also, wrap this into a do { ... } while (0). Suggested by Nicolai.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-24 11:17:25 +01:00
Ilia Mirkin
b755f2f233 nv50: add support for MUL_ZERO_WINS property
This is simply keyed off the vertex shader, as that's guaranteed to be
present in any pipeline.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-01-23 20:37:14 -05:00
Ilia Mirkin
8c764a2321 nvc0: add support for MUL_ZERO_WINS property
This sets the dnz flag on all the relevant multiplication operations. At
emission time, this will only be supported by nvc0+, so nv50 will need a
different solution.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-01-23 20:37:14 -05:00
Ilia Mirkin
e1346f25bf st/nine: set the MUL_ZERO_WINS flag when supported
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2017-01-23 20:37:10 -05:00
Ilia Mirkin
6e40938fbc gallium: add PIPE_CAP_TGSI_MUL_ZERO_WINS
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2017-01-23 20:36:47 -05:00
Ilia Mirkin
a2b2cd81d1 gallium: add TGSI_PROPERTY_MUL_ZERO_WINS
This will be useful for proper D3D9 emulation, where this behavior is
expected by some shaders.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2017-01-23 20:35:55 -05:00
Marek Olšák
573bf0940a radeonsi: always set the TCL1_ACTION_ENA when invalidating L2
Some CIK-VI docs say this is the default behavior on SI. That doesn't
answer whether it's also the default behavior on CIK-VI.

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Marek Olšák
5d3dd70cab radeonsi: don't declare LDS in TES
not used since we started using the offchip tess ring

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Marek Olšák
59c5da40ed radeonsi: preload PS inputs only if KILL is used
so that most shaders can get lower VGPR usage thanks to lazy input loading.
I think this is a more accurate constraint that prevents the black transitions
in Witcher 2.

Affected shaders (7758):
Max Waves: 57437 -> 58231 (1.38 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Marek Olšák
7b32ae4df5 gallium/radeon: adjust the rule for using the LINEAR_ALIGNED layout
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Marek Olšák
e248390e93 winsys/amdgpu: drop all IBs if at least one was rejected within the context
The corruption is inevitable and hangs are possible too.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Marek Olšák
1840800860 winsys/amdgpu: report a rejected IB as a lost context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Samuel Pitoiset
aa2ace8e49 gallium/radeon: add HUD queries for monitoring some hw blocks
It's also possible to monitor them via performance counters but
the hardware can only use two counters simultaneously. It seems
easier to re-use the existing code which reads from MMIO instead
of writing a multi-pass approach.

v2: - add new lines after ':'

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-23 21:19:49 +01:00
Samuel Pitoiset
a704f19247 gallium/radeon: refactor the GRBM counters path
This will allow to expose more queries in order to know which
blocks are busy/idle.

v2: - add new lines after ':'

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-23 21:19:49 +01:00
George Kyriazis
00847e4f14 swr: Align query results allocation
Some query results struct contents are declared as cache line aligned.
Use aligned malloc, and align the whole struct, to be safe.

Fixes crash when compiling with clang.

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-01-23 14:15:54 -06:00
Bruce Cherniak
b829206b07 swr: Prune empty nodes in CalculateProcessorTopology.
CalculateProcessorTopology tries to figure out system topology by
parsing /proc/cpuinfo to determine the number of threads, cores, and
NUMA nodes.  There are some architectures where the "physical id" begins
with 1 rather than 0, which was creating and empty "0" node and causing a
crash in CreateThreadPool.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
CC: <mesa-stable@lists.freedesktop.org>
2017-01-23 13:52:26 -06:00
Christian König
1338d912f5 st/va: make sure that we call begin_frame() only once v2
This fixes "st/va: delay calling begin_frame until we have all parameters".

v2: call begin frame after decoder (re)creation as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
2017-01-23 17:00:04 +01:00
Nicolai Hähnle
e4f8f9a638 r600: implement DDIV
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-23 16:17:15 +01:00
Nicolai Hähnle
488560cfe6 r600: factor out cayman_emit_unary_double_raw
We will use it for DDIV.

Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-23 16:17:12 +01:00
Nicolai Hähnle
76b02d2fe1 r600: double multiply can handle only one multiply at a time
It seems clear that trying to multiply two pairs of doubles would result
in the temporary register getting overwritten by the second pair. So
make the code more explicit.

Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-23 16:15:45 +01:00
Rob Clark
31daeb5bf1 freedreno/a5xx: set frag shader threadsize
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:12:05 -05:00
Rob Clark
8d6af93e76 freedreno/a5xx: set fragcoordxy properly
What a3xx docs call IJPERSPCENTERREGID.. the xy coord passed into
bary.f.  We were incorrectly setting both this and gl_FragCoord.xy to
the same register resulting in all sorts of hilarity.

Fixes stk, vdrift, 0ad, probably a bunch others.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:11:43 -05:00
Rob Clark
278b97946f freedreno/ir3: setup var locations in standalone compiler
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-01-22 14:11:26 -05:00
Rob Clark
6cc93bedc1 freedreno/a5xx: fix psize
Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on
a5xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:11:15 -05:00
Rob Clark
141a4f86d6 freedreno/a5xx: srgb fix
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:11:04 -05:00
Rob Clark
69fbb458cf freedreno/a5xx: fix int vbos
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:10:54 -05:00
Rob Clark
16671e9704 freedreno/a5xx: fix clear for uint/sint formats
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:10:42 -05:00
Rob Clark
4d9aa4f67d freedreno/a5xx: fix cull state
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:10:28 -05:00
Rob Clark
4c39458460 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:09:45 -05:00
Marek Olšák
74f40d1570 Revert "radeonsi: reject invalid vertex element formats"
This reverts commit 9e4d1d8a7c.

It broke arb_vertex_type_10f_11f_11f_rev-draw-vertices, which has
first_non_void == -1.
2017-01-20 16:02:45 +01:00
Philipp Zabel
a37cf630b4 gallium: add pipe_screen::resource_changed callback wrappers
Add resource_changed to the ddebug, rbug, and trace wrappers. Since it
is optional, there is no need to add it to noop.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Suggested-by: Nicolai Hähnle <nhaehnle@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:30 +01:00
Philipp Zabel
c70ed79e79 etnaviv: implement resource_changed to invalidate internal resources derived from imported buffers
Implement the resource_changed pipe callback to invalidate internal
resources derived from imported buffers. This is needed to update the
texture for re-imported renderables.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:30 +01:00
Philipp Zabel
362edc868c etnaviv: initialize seqno of imported resources
Imported resources already have contents that we want to be copied to
texture resources derived from them. Set initial seqno of imported
resources to 1, just as if it had already been rendered to.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:29 +01:00
Philipp Zabel
2c95d6dac3 st/dri: ask the driver to update its internal copies on reimport
For imported buffers that can't be used directly as a source to the
texture samplers, the pipe driver might need to create an internal
copy, for example in a different tiling layout. When buffers are
reimported they may contain new image data, so the driver internal
copies need to be recreated.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:29 +01:00
Philipp Zabel
30853f55a3 gallium: add pipe_screen::resource_changed
Add a hook to tell drivers that an imported resource may have changed
and they need to update their internal derived resources.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:29 +01:00
Samuel Pitoiset
383fc8e9f3 gallium/hud: add missing break in hud_cpufreq_graph_install()
Fixes: e99b9395be "gallium/hud: Add support for CPU frequency monitoring"
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-01-20 10:33:47 +01:00
Marek Olšák
9e4d1d8a7c radeonsi: reject invalid vertex element formats
This should fix a coverity defect.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-01-19 16:38:37 +01:00
Marek Olšák
e490b7812c radeonsi: don't forget to add HTILE to the buffer list for texturing
This fixes VM faults. Discovered by Samuel Pitoiset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-01-19 16:38:37 +01:00
Nayan Deshmukh
31908d6a4a st/vdpau: only send buffers with B8G8R8A8 format to X
PresentPixmap only works if the pixmap depth matches with the
window depth, otherwise it returns a BadMatch protocol error.
Even if the depths match, the result won't look correctly
if the VDPAU RGB component order doesn't match the X11 one so
we only allow the X11 format.
For other buffers we copy them to a buffer which is send to X.

v2: only send buffers with format VDP_RGBA_FORMAT_B8G8R8A8
v3: reword commit message
v4: add comment explaining the code

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-01-19 15:34:02 +01:00
Nicolai Hähnle
3cd092c415 radeonsi: fix texture gather on stencil textures
At least on VI, texture gather doesn't work with a 24_8 data format, so
use 8_8_8_8 and a modified swizzle instead.

A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select
the X24S8 pipe format because we don't support stencil-only render targets
properly. With mip-mapping this can lead to a setup where the tiling is
incompatible with stencil texturing, and a flushed stencil texture is
used. For the flushed stencil, a literal X24S8 is used because there were
issues with an 8bpp DB->CB copy.

Longer term, it would be good if we could get away from these workarounds,
i.e. properly support an S8 format for stencil-only rendering and flushed
stencil. Since stencil texturing is somewhat rare, it's not a high
priority.

Fixes GL45-CTS.texture_cube_map_array.sampling.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-01-19 15:02:57 +01:00
Zachary Michaels
d7d32b3bfe radeonsi: Always leave poly_offset in a valid state
This commit makes si_update_poly_offset set poly_offset to NULL if
uses_poly_offset is false. This way poly_offset either points into the
currently queued rasterizer, or it is NULL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-19 10:50:16 +01:00
Dave Airlie
ef71b867ee gallivm: use #ifdef not #if for PIPE_ARCH_BIG_ENDIAN
This fixes the build on ppc/s390.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-19 16:00:53 +10:00
Marek Olšák
c67a2793b3 radeonsi: determine in advance which VBOs should be added to the buffer list
v2: now it should be correct

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-18 19:51:31 +01:00
Marek Olšák
1db2bf8d2b radeonsi: use fewer pointer dereferences in upload_vertex_buffer_descriptors
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-18 19:51:31 +01:00