39979 commits

Author SHA1 Message Date
Connor Abbott
96c2a2832f ttn: Fill out more info fields
We'll use these in radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-03 15:54:57 +02:00
Connor Abbott
dcc64fcfed nir: Fix num_ssbos when lowering atomic counters
Otherwise it's impossible to know the maximum SSBO index for both
internal TGSI shaders from TTN (which don't have any notion of atomic
counters and no offset) as well as shaders from GLSL.

I fixed everything I could find while grepping for num_ssbos and
num_abos, which hopefully is everything (iris was the only user I could
find that uses it in a meaningful way).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-03 15:54:54 +02:00
Alyssa Rosenzweig
5cdfccf8a6 panfrost: Remove panfrost_upload
This routine was made obsolete over a series of reworks of memory
allocation; Tomeu's changes to shader memory allocation finally made
this unused as cppcheck noted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig
42f0aae874 panfrost: Fix misc. issues flagged by cppcheck
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig
6bd18bb264 panfrost: Mark (1 << 31) as unsigned
I was not aware this incurred undefined behaviour; thank you cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-03 13:55:29 +02:00
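
Note on 6bd18bb264: the following is a minimal sketch of the class of issue cppcheck flags here, assuming a 32-bit int. The helper name top_bit is hypothetical and is not taken from the panfrost source.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * With a 32-bit int, (1 << 31) shifts a signed 1 into the sign bit,
 * which is undefined behaviour in C. Making the literal unsigned
 * gives a well-defined result. */
static inline uint32_t top_bit(void)
{
    return 1u << 31;
}
```

The only change is the u suffix on the literal.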
Zhaowei Yuan
9db06a5350 broadcom/vc4: Expand width of dst surface
Four bytes of src_surf are compressed into one 32-bit value and
stored in dst_surf, and dst_surf is read in Z-order, so its width
must be aligned to a multiple of 8 (4x2) before being divided by 2.

Signed-off-by: Zhaowei Yuan <zhaowei.yuan@samsung.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111266

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-09-03 08:47:43 +02:00
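
Note on 9db06a5350: one possible reading of the width rule described above, as a sketch only. The function names align_pot and dst_width_from_src are hypothetical, and the actual vc4 code may compute the width differently.

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * Assumes alignment is a power of two. */
static inline uint32_t align_pot(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical: align the source width to 8 first, then halve it,
 * in the order the commit message describes. */
static inline uint32_t dst_width_from_src(uint32_t src_width)
{
    return align_pot(src_width, 8) / 2;
}
```

Aligning before dividing matters: a width of 10 becomes 16 and then 8, whereas dividing first would give 5.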
Vinson Lee
538820ff5f swr: Fix make_unique build error.
swr_shader.cpp: In function ‘void (* swr_compile_gs(swr_context*, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT*)’:
swr_shader.cpp:732:44: error: ‘make_unique’ was not declared in this scope
    ctx->gs->map.insert(std::make_pair(key, make_unique<VariantGS>(builder.gallivm, func)));
                                            ^~~~~~~~~~~

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-09-02 14:52:23 -07:00
Kenneth Graunke
87fa8d9ebc iris: Lessen texture cache hack flush for blits/copies on Icelake.
Lionel found actual documentation for this at long last.  Apparently
it actually is a sampler cache limitation that was mostly fixed on
Icelake.  Unfortunately, it seems there are still issues with ASTC
and non-ASTC sampler views.  Still, we can lessen the flush condition
from "format mismatch" to "ASTC mismatch", which eliminates most of
the flushing here.

We also update the documentation to refer to the workaround name.
2019-08-31 20:17:55 -07:00
Erik Faye-Lund
52af1427c6 gallium/auxiliary/indices: consistently apply start only to input
The majority of these only apply the start argument to the input, but a
few of them also apply it to the output array. util_primconvert, the only
user of this argument, passes a non-zero start argument and does not
expect it to be applied to the output; if it is, the translation writes
outside of allocated memory, leading to VRAM corruption.

The reason this doesn't seem to have been noticed before is that no
driver currently uses util_primconvert to convert a primitive type to
itself, which is the case where this was broken. But for Zink, this
will no longer be true, because we need to eliminate the use of 8-bit
index buffers.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 28f3f8d413 ("gallium/auxiliary/indices: add start param")
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-31 19:45:52 +00:00
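
Note on 52af1427c6: a simplified sketch of the convention described above, where start offsets only the reads from the input. The function translate_ubyte_to_ushort is a hypothetical stand-in, not one of the generated translators in gallium/auxiliary/indices.

```c
#include <stdint.h>

/* Hypothetical translator: widen 8-bit indices to 16-bit.
 * start offsets the input only; the output is always written
 * from element 0, so out needs room for exactly count entries. */
static void translate_ubyte_to_ushort(const uint8_t *in, unsigned start,
                                      unsigned count, uint16_t *out)
{
    for (unsigned i = 0; i < count; i++)
        out[i] = in[start + i];
}
```

If the write were out[start + i] instead, a caller that sized the output for count entries would see writes past the end of its buffer.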
Vinson Lee
3664a6600e swr: Fix build with llvm-9.0 again.
Commit 6f7306c029 ("swr/rast: Refactor memory API between rasterizer
core and swr") unintentionally removed changes for llvm-9.0.

Fixes: 6f7306c029 ("swr/rast: Refactor memory API between rasterizer core and swr")
Fixes: 5dd9ad1570 ("swr/rasterizer: Better implementation of scatter")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-08-31 00:20:40 -07:00
Alyssa Rosenzweig
20237166b6 pan/midgard: Use shared psiz clamp pass
We already had a perfectly cromulent pass for this, but one landed in
common NIR code so let's switch and lighten our tree.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 16:06:09 -07:00
Boris Brezillon
9087cf7015 panfrost: Add transient BOs to job batches
Memory allocated through panfrost_allocate_transient() is likely to
come from the transient pool. Let's add the BO backing the allocated
memory region to the job batch so the kernel can retain this BO while
jobs are executed.

In practice that has never been a problem because the transient pool
never shrinks, and even if it did, we still control the lifetime of
the job, so there's no reason for this BO to be freed before the GPU is
done executing the batch. But it still makes sense to add the BO for
debugging purposes.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 22:13:41 +02:00
Rohan Garg
b2ff2dfc2a panfrost: protect access to shared bo cache and transient pool
Both the BO cache and the transient pool are shared across
contexts. Protect access to these with mutexes.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-30 22:10:49 +02:00
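
Note on b2ff2dfc2a: a generic sketch of guarding a structure shared between contexts with a mutex. It assumes POSIX threads; struct shared_pool and pool_add are hypothetical names, and the panfrost code may use a different locking primitive.

```c
#include <pthread.h>

/* Hypothetical shared structure: one instance is visible to every
 * context, so each access takes the lock first. */
struct shared_pool {
    pthread_mutex_t lock;
    unsigned entries;
};

static struct shared_pool pool = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Add one entry and return the new count, holding the lock for the
 * whole read-modify-write so concurrent callers cannot interleave. */
static unsigned pool_add(struct shared_pool *p)
{
    pthread_mutex_lock(&p->lock);
    unsigned n = ++p->entries;
    pthread_mutex_unlock(&p->lock);
    return n;
}
```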
Rohan Garg
6b0dc3d530 panfrost: Jobs must be per context, not per screen
Jobs _must_ only be shared within the same context; tracking
last_job in the screen causes use-after-free issues
and memory corruption.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-30 22:06:54 +02:00
Khaled Emara
6926f56d5b freedreno/a3xx: fix sysmem <-> gmem tiles transfer
Tiling mode was missing from fd3_emit_gmem_restore_tex().
emit_gmem2mem_surf() used LINEAR exclusively.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-08-30 08:54:30 -07:00
Khaled Emara
ed1954ced3 freedreno/a3xx: fix texture tiling parameters
* Fix 2D/2DArray/3D tiling parameters:
  There is a lower threshold for width and height.
* Re-enable tiling for Cubemap, after setting the right parameters.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-08-30 08:54:30 -07:00
Dave Stevenson
873b092e91 broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride.
Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with
arbitrary offset/stride." for v3d.

Allows YUV buffers with a single buffer and plane offsets to be
passed in.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-30 10:53:05 +02:00
Jan Zielinski
2263e6a895 swr/rasterizer: Fix GS attributes processing
Input to GS is just a set of attributes, so remove explicit setup of
'position' which is meaningless for GS input processing.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-30 07:31:45 +00:00
Samuel Pitoiset
9f2fd23f99 ac: drop now useless lookup_interp_param from ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-30 08:23:56 +02:00
Samuel Pitoiset
a63719db6a ac: import linear/perspective PS input parameters from radv/radeonsi
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-30 08:23:54 +02:00
Dave Airlie
a69ae76cc8 gallivm: disable accurate cube corner for integer textures.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111511
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-08-30 08:27:16 +10:00
Thong Thai
8d03a6b700 radeonsi: add JPEG decode support for VCN 2.0 devices
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2019-08-29 17:27:35 -04:00
Thong Thai
2a3a560407 Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu"
This reverts commit 5a2e65be89.

Even though CONTEXT_CONTROL is emitted by the kernel, it still needs
to be emitted by the UMD as well, or else the driver will hang.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-29 17:27:15 -04:00
Kenneth Graunke
30b9ed92ea iris: Fix partial fast clear checks to account for miplevel.
We enabled fast clears at level > 0, but didn't minify the dimensions
when comparing the box size, so we always thought it was a partial
clear and as a result never actually enabled any.

This eliminates some slow clears in Civilization VI, but they are mostly
during initialization and not the main rendering.

Thanks to Dan Walsh for noticing we had too many slow clears.

Fixes: 393f659ed8 ("iris: Enable fast clears on other miplevels and layers than 0.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-08-29 11:27:16 -07:00
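
Note on 30b9ed92ea: a sketch of the check described above, comparing the clear box against the minified dimensions of the target miplevel rather than the level-0 size. The names minify and box_covers_level are hypothetical, and level is assumed to be less than 32.

```c
#include <stdbool.h>

/* Size of a dimension at a given miplevel: halve per level,
 * never dropping below 1. Assumes level < 32. */
static inline unsigned minify(unsigned size, unsigned level)
{
    unsigned v = size >> level;
    return v > 0 ? v : 1;
}

/* Hypothetical check: the clear is a full clear only if the box
 * starts at the origin and matches the size of that miplevel. */
static bool box_covers_level(unsigned base_w, unsigned base_h, unsigned level,
                             unsigned box_x, unsigned box_y,
                             unsigned box_w, unsigned box_h)
{
    return box_x == 0 && box_y == 0 &&
           box_w == minify(base_w, level) &&
           box_h == minify(base_h, level);
}
```

For a 256x128 texture, a 64x32 box at level 2 is a full clear; comparing it against 256x128 would wrongly classify it as partial.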
Rohan Garg
394192fcee panfrost: Remove unused argument from panfrost_drm_submit_vs_fs_job()
is_scanout is not used anywhere and can be inferred within
panfrost_drm_submit_vs_fs_job() if required.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-29 19:03:17 +02:00
Kenneth Graunke
fda9fb8dcd iris: Actually describe bo_reuse driconf option
Otherwise it doesn't exist and can't be parsed, so everything dies at
screen init time.

Fixes: 6dc4ddc5f8 ("iris: use driconf for 'bo_reuse' parameter")
2019-08-29 09:40:34 -07:00
Tomeu Vizoso
aace7d3500 panfrost/ci: Print only regressions
Some functionality has been added to deqp-volt to only print
regressions, so update our version of it and use the new options.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-29 17:12:04 +02:00
Roland Scheidegger
332b21db55 gallivm: use fallback code for mul_hi with llvm >= 7.0
LLVM 7.0 ditched the pmulu intrinsics.
This is only a trivial patch to use the fallback code instead.
It'll likely produce atrocious code since the pattern doesn't match what
llvm itself uses in its autoupgrade paths, hence the pattern won't be
recognized.

Should fix https://bugs.freedesktop.org/show_bug.cgi?id=111496

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-29 16:55:49 +02:00
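
Note on 332b21db55: a scalar sketch of what a mul_hi operation computes when done by widening, multiplying, and shifting. This only illustrates the arithmetic; gallivm's fallback emits vector LLVM IR and its exact form is not reproduced here. The name mul_hi_u32 is hypothetical.

```c
#include <stdint.h>

/* High 32 bits of the 64-bit product of two unsigned 32-bit values:
 * widen to 64 bits, multiply, then shift the upper half down. */
static inline uint32_t mul_hi_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}
```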
Jan Zielinski
e64091ebd4 swr/rasterizer: Enable ARB_fragment_layer_viewport
Added loading gl_Layer and gl_ViewportIndex variables
to Pixel Shader context.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-29 12:09:05 +02:00
Tapani Pälli
6dc4ddc5f8 iris: use driconf for 'bo_reuse' parameter
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-29 09:33:52 +03:00
Kenneth Graunke
90ca709f6d iris: Don't auto-flush/dirty on transfer unmap for coherent buffers
When u_upload_mgr fills up a buffer, it unmaps and destroys it.  Our
unmap function was automatically performing the equivalent of a
FlushMappedBufferRange call in this case.  Because the buffer mapping
is persistent and coherent, we don't actually do any flushing when we
do the rest of the writes to the buffer - we were just doing one final
one at the end.  But we would be using the uploaded contents on the
GPU the whole time.

This certainly shouldn't be necessary for streaming buffers, and if
such flushing and dirtying is necessary for coherent buffers, this is
wildly insufficient.

Drops a small number of constant packets and PIPE_CONTROL flushes from
most benchmarks that I've looked at.  Doesn't seem to make much of an
impact on performance, however.

Thanks to Felix Degrood for noticing that we were emitting more
3DSTATE_CONSTANT_* packets than we needed to.
2019-08-28 22:11:05 -07:00
Timur Kristóf
5f3eb6ef29 st/nine: Properly initialize GLSL types for NIR shaders.
NIR shaders use GLSL types (note: these live outside libglsl), and
nine needs to properly initialize these just like the other state
trackers. This fixes an assertion failure when TTN is used.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
2019-08-28 23:31:34 +00:00
Tapani Pälli
da603c066e iris: build android libmesa_iris for gen12
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:34 -07:00
Jordan Justen
44ab7c265f iris: Build for gen12
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:33 -07:00
Eric Anholt
973b49386c freedreno/a6xx: Fix non-mipmap filtering selection.
We were clamping the LOD to force non-mipmap filtering, but that means
that the HW doesn't get to select between the min and mag filters.
Setting MIPFILTER_LINEAR_FAR appears to force non-mipmap filtering.

Fixes all failures in dEQP-GLES2.functional.texture.filtering.2d.*

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-28 13:14:41 -07:00
Eric Anholt
4662b70d23 gallium: Don't emit identical endian-dependent pack/unpack code.
Reduces the size of the u_format_table.c file by 140k (out of 1.64M)
and makes me less confused about endianness in gallium.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
d17ff2f7f1 gallium: Fix big-endian addressing of non-bitmask array formats.
The formats affected are:

- LA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT)
- R8G8B8 x (UNORM, SNORM, SRGB, USCALED, SSCALED, UINT, SINT)
- RG/RGB/RGBA x (64_FLOAT, 32_FLOAT, 16_FLOAT, 32_UNORM, 32_SNORM,
                 32_USCALED, 32_SSCALED, 32_FIXED, 32_UINT, 32_SINT)
- RGB/RGBA x (16_UNORM, 16_SNORM, 16_USCALED, 16_SSCALED,
              16_UINT, 16_SINT)
- RGBx16 x (UNORM, SNORM, FLOAT, UINT, SINT)
- RGBx32 x (FLOAT, UINT, SINT)
- RA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT)

The updated st_formats.c unit test checks that the formats affected by
this change are all array formats in the equivalent Mesa format (if
any).  Mesa's array format definition is clear: the value stored is an
array (increasing memory address) of values of the channel's type.
It's also the only thing that makes sense for the RGB types, or very
large types like RGBA64_FLOAT (A should not move to the low address
because the cpu is BE).

Acked-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Tested-by: Matt Turner <mattst88@gmail.com> (unit tests on BE)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
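
Note on d17ff2f7f1: a sketch of the array-format definition quoted above, using a three-channel 8-bit format as the example. The function pack_r8g8b8 is hypothetical and is not the generated u_format pack code.

```c
#include <stdint.h>

/* Hypothetical packer for an array format: each channel is stored
 * as its own element at increasing memory addresses, so R is always
 * at the lowest address regardless of CPU endianness. */
static void pack_r8g8b8(uint8_t dst[3], uint8_t r, uint8_t g, uint8_t b)
{
    dst[0] = r;
    dst[1] = g;
    dst[2] = b;
}
```

Because no multi-byte word is assembled and stored, there is nothing for the CPU byte order to reorder.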
Eric Anholt
0547fdd7ee gallium: Drop a bit of dead code from the pack/unpack python.
Nothing used this var.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
309ef968cd gallium: Drop the useless union wrapper on pack/unpack.
Nothing accessed the .value field, just the .chan.  Unwrap all the
code from the union, for clarity (and 13k less generated code).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
174240c5e4 gallium: Skip generating the pack/unpack union if we don't use it.
Shaves 30k off of the 1.6M .c file, and makes for less noise for me
trying to understand how gallium formats actually work.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Boris Brezillon
8709b865ce panfrost: Reset the damage area on imported resources
Reset the damage area in the resource_from_handle() path (as done in
panfrost_resource_create()).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-28 17:50:44 +02:00
Vasily Khoruzhick
200859f45c lima: fix texture descriptor issues
Looks like the initial RE was wrong and some fields have a different purpose.
E.g. there's no "disable_mipmap" field; it's actually part of another field
that selects mipmap filtering.

Also fix layout position.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-28 00:28:38 +00:00
Kenneth Graunke
7e095a4fbf iris: Drop swizzling parameter from s8_offset.
This is always false on Gen8+, no need for dead code and parameters.
2019-08-27 17:11:32 -07:00
Marek Olšák
360cf3c4b0 radeonsi: fix scratch buffer WAVESIZE setting leading to corruption
Cc: 19.2 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:52:32 -04:00
Marek Olšák
f95a28d361 radeonsi: unbind blend/DSA/rasterizer state correctly in delete functions
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111414

Fixes: b758eed9c3 ("radeonsi: make sure that blend state != NULL and remove all NULL checking")

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:52:30 -04:00
Marek Olšák
40e5ac45ae radeonsi: align scratch and ring buffer allocations for faster memory access
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:52:28 -04:00
Marek Olšák
d8f27552f4 radeonsi: consolidate determining VGPR_COMP_CNT for API VS
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
4dde40908f radeonsi/gfx10: set PA_CL_VS_OUT_CNTL with CONTEXT_REG_RMW to fix edge flags
We need two different values of the register, one for NGG and one for
legacy, in order to fix edge flags for the legacy pipeline.

Passing the ngg flag to emit_clip_regs would be too complicated,
so CONTEXT_REG_RMW is used for partial register updates.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
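
Note on 4dde40908f: a conceptual model of a partial (read-modify-write) register update, where only the bits selected by a mask are replaced. This is an assumption about the general idea, not a description of the PKT3_CONTEXT_REG_RMW packet layout; reg_rmw is a hypothetical name.

```c
#include <stdint.h>

/* Replace only the bits selected by mask, leaving the rest of the
 * register value as it was. */
static inline uint32_t reg_rmw(uint32_t old_value, uint32_t mask,
                               uint32_t new_bits)
{
    return (old_value & ~mask) | (new_bits & mask);
}
```

Two code paths can then each update their own bits of the same register without knowing the value the other one wrote.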
Marek Olšák
1426acf9e7 radeonsi/gfx10: remove incorrect ngg/pos_writes_edgeflag variables
It varies depending on si_shader_key::as_ngg.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Marek Olšák
2e94cb6693 radeonsi: add PKT3_CONTEXT_REG_RMW
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00