Commit graph

2769 commits

Author SHA1 Message Date
Samuel Pitoiset
02ccef7874 radv: add get_image_stride_for_r32g32b32() helper
For the special R32G32B32 paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:50:03 +02:00
Samuel Pitoiset
468c33e2f7 radv: add create_bview_for_r32g32b32() helper
For the special R32G32B32 paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:50:00 +02:00
Samuel Pitoiset
e60e3e1b3f radv: add create_buffer_from_image() helper
For the special R32G32B32 paths.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-26 10:49:58 +02:00
David McFarland
07a00a8729 util: Change remaining uint32 cache ids to sha1
After discussion with Timothy Arceri. disk_cache_get_function_identifier
was using only the first byte of the sha1 build-id.  Replace
disk_cache_get_function_identifier with implementation from
radv_get_build_id.  Instead of writing a uint32_t it now writes to a
mesa_sha1.  All drivers using disk_cache_get_function_identifier are
updated accordingly.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: 83ea8dd99b ("util: add disk_cache_get_function_identifier()")
2018-10-26 14:49:22 +11:00
Bas Nieuwenhuizen
d41c3cc013 radv: Emit enqueued pipeline barriers on event write.
Since the CPU can read them we need to execute any GPU->CPU
flushes before the event is written.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108524
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-25 16:17:54 +02:00
Bas Nieuwenhuizen
9d40ec2cf6 radv: Add support for VK_KHR_driver_properties.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-25 16:14:43 +02:00
Eric Engestrom
bb84fa146f util: use C99 declaration in the for-loop hash_table_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
Timothy Arceri
0ff1ccca25 radv: call nir_link_xfb_varyings()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-24 08:21:29 +11:00
Timothy Arceri
c769ed10de radv: move nir_lower_io_to_scalar_early() to radv_link_shaders()
nir_lower_io_to_scalar_early() is really part of the link time
optimisations. Moving it here allows the code to be simplified
and also keeps the code easy to follow in the next patch.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-24 08:21:29 +11:00
Leo Liu
b75fb8ee36 amd/common: check DRM version 3.27 for JPEG decode
JPEG was added after DRM version 3.26

Signed-off-by: Leo Liu <leo.liu@amd.com>
Fixes: 4558758c51749(amd/common: add vcn jpeg ip info query)
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-23 13:12:05 -04:00
Boyuan Zhang
4558758c51 amd/common: add vcn jpeg ip info query
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Samuel Pitoiset
69c44de798 radv: fix btoi for R32G32B32 when the dest offset is not 0
Fixes: 593996bc02 ("radv: implement buffer to image operations for R32G32B32")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-23 14:29:26 +02:00
Eric Engestrom
17b03b5320 radv: s/abs/fabsf/ for floats
Fixes: a4c4efad89 "radv: Rework guard band calculation"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-23 11:43:51 +01:00
Jason Ekstrand
ca4e465f7d anv,radv: Trivially expose two new VK_GOOGLE extensions
This patch exposes support for the following two extensions:

 * VK_GOOGLE_decorate_string
 * VK_GOOGLE_hlsl_functionality1

There's nothing for the driver to do; it's all handled in spirv_to_nir.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107971
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 10:50:20 -05:00
Connor Abbott
27fe3f5b5a ac: Fix loading a dvec3 from an SSBO
The comment was wrong, since the loop above casts to a type with the
correct bitsize already.

Fixes: 7e7ee82698 ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 09:44:51 +02:00
Connor Abbott
59535b05cf ac: Introduce ac_build_expand()
And implement ac_bulid_expand_to_vec4() on top of it.

Fixes: 7e7ee82698 ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 09:44:51 +02:00
Bas Nieuwenhuizen
68c7833540 radv: Fix WSI & PCI bus info initialization order.
Trying to access the bus info before it is initialized is not going
to work.

Fixes: baa38c144f "vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108491
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2018-10-19 13:24:19 +02:00
Andres Rodriguez
e71a87775e radv: fix check for perftest options size
It was using the debug options array size.

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:42:20 -04:00
Jason Ekstrand
baa38c144f vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching
This lets us avoid passing the DRM fd around all over the place and gets
us closer to layer utopia.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-18 11:29:00 -05:00
Jason Ekstrand
7c65cf9844 vulkan/wsi: Implement GetPhysicalDevicePresentRectanglesKHR
This got missed during 1.1 enabling because it was defined as an
interaction between device groups and WSI and it wasn't obvious it was
in the delta.

The idea behind it is that it's supposed to provide a hint to the
application in a multi-GPU setup to indicate which regions of the screen
are being scanned out by which GPU so a multi-device split-screen
rendering application can render each part of the screen on the GPU that
will be presenting it and avoid extra bus traffic between GPUs.  On a
single-GPU setup or one which doesn't support this present mode, we need
to do something.  We choose to return the window size (or a max-size
rect) if the compositor, X server, or crtc is associated with the given
physical device and zero rectangles otherwise.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-18 09:17:39 -05:00
Jason Ekstrand
7629c00557 vulkan/wsi: Store the instance allocator in wsi_device
We already have wsi_device and we know the instance allocator at
wsi_device_init time so there's no need to pass it into the physical
device queries.  This also fixes a memory allocation domain bug that can
occur if CreateSwapchain gets called prior to any queries (not likely)
in which case the cached connection gets allocated off the device
instead of the instance.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-18 09:17:39 -05:00
Timothy Arceri
3a95396f3c radv: use nir_shrink_vec_array_vars()
Totals from affected shaders:
SGPRS: 1096 -> 1096 (0.00 %)
VGPRS: 1192 -> 1056 (-11.41 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 100940 -> 94384 (-6.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 100 -> 112 (12.00 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from Batman Arkham City.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Timothy Arceri
8086fa1bcd radv: use nir_split_array_vars()
We call in the opt loop in case another pass results in an
array with indirect access being turned into direct access.

Totals from affected shaders:
SGPRS: 512 -> 496 (-3.12 %)
VGPRS: 456 -> 452 (-0.88 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 40040 -> 39664 (-0.94 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 41 -> 43 (4.88 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from Batman Arkham City.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Timothy Arceri
06675711e7 radv: use nir_opt_find_array_copies()
Totals from affected shaders:
SGPRS: 1112 -> 1112 (0.00 %)
VGPRS: 1492 -> 1196 (-19.84 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 112172 -> 101316 (-9.68 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 93 -> 98 (5.38 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from "Batman: Arkham City" over DXVK.

The pass detects that the temporary array created by DXVK for
storing TCS inputs is a copy of the input arrays and allows
us to avoid copying all of the input data and then indirecting
on it with if-ladders, instead we just do indirect indexing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Timothy Arceri
9d5b106b2e radv: use nir_opt_copy_prop_vars and nir_opt_dead_write_vars
Totals from affected shaders:
SGPRS: 2856 -> 2856 (0.00 %)
VGPRS: 3236 -> 3248 (0.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 236560 -> 233548 (-1.27 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 277 -> 283 (2.17 %)
Wait states: 0 -> 0 (0.00 %)

Even in the cases were we have increased VGPR use it appears
the NIR is improved significantly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Keith Packard
67a2c1493c vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]
Offers three clocks, device, clock monotonic and clock monotonic
raw. Could use some kernel support to reduce the deviation between
clock values.

v2:
	Ensure deviation is at least as big as the GPU time interval.

v3:
	Set device->lost when returning DEVICE_LOST.
	Use MAX2 and DIV_ROUND_UP instead of open coding these.
	Delete spurious TIMESTAMP in radv version.

	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
	Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

v4:
	Add anv_gem_reg_read to anv_gem_stubs.c

	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>

v5:
	Adjust maxDeviation computation to max(sampled_clock_period) +
	sample_interval.

	Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-17 20:10:15 -07:00
Marek Olšák
bfc795670e ac: add helpers for fast integer division by a constant 2018-10-16 17:23:25 -04:00
Marek Olšák
fedc1fda30 radeonsi: save raster config in screen, add se_tile_repeat 2018-10-16 15:28:22 -04:00
Marek Olšák
0d05581578 radeonsi: rename si_gfx_* functions to si_cp_*
and write_event_eop -> release_mem
2018-10-16 15:28:22 -04:00
Marek Olšák
6e1cf6532d radeonsi: make si_gfx_write_event_eop more configurable 2018-10-16 15:28:22 -04:00
Samuel Pitoiset
647c2b90e9 radv: disable VK_SUBGROUP_FEATURE_VOTE_BIT
This feature isn't used for now, so disable it until
wwm is fixed in LLVM.

Fixes dEQP-VK.subgroups.vote.graphics.subgroupallequal*

https://bugs.freedesktop.org/show_bug.cgi?id=108115
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-16 10:24:19 +02:00
Samuel Pitoiset
593996bc02 radv: implement buffer to image operations for R32G32B32
This should fix rendering issues with Batman Arkham City.
We will probably need to implement itob and itoi at some
point, but currently nothing hits these paths.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-16 09:22:38 +02:00
Alex Smith
ca83d51cfb ac/nir: Use context-specific LLVM types
LLVMInt*Type() return types from the global context and therefore are
not safe for use in other contexts. Use types from our own context
instead.

Fixes frequent crashes seen when doing multithreaded pipeline creation.

Fixes: 4d0b02bb5a "ac: add support for 16bit load_push_constant"
Fixes: 7e7ee82698 "ac: add support for 16bit buffer loads"
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-16 08:18:24 +01:00
Samuel Pitoiset
26a2ce35ab radv: do not force the flat qualifier for clip/cull distances
This fixes some new CTS that reads clip/cull distances
from the fragment shader stage:

dEQP-VK.clipping.user_defined.clip_*

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-15 21:55:28 +02:00
Samuel Pitoiset
80c84bdba9 radv: bump discreteQueuePriorities to 2
It's the minimum value required by the spec.

This fixes dEQP-VK.api.info.device.properties.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-15 21:55:25 +02:00
Bas Nieuwenhuizen
6ed0fd24d4 radv: Implement VK_EXT_pci_bus_info.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-15 12:27:49 +02:00
Samuel Pitoiset
2c139e2cdf radv: do not support blitting surfaces for R32G32B32 formats
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108113
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-12 15:28:21 +02:00
Samuel Pitoiset
416013b4f5 radv: emit the GLC bit for SSBO loads/stores when needed
This fixes some new memory model tests:
dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.*

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108112
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-12 08:42:08 +02:00
Samuel Pitoiset
229803b66a radv: implement clear operations for R32G32B32
This fixes crashes for some CTS:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.*.linear_*_*
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.*.*_linear_*

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108113
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-11 14:49:16 +02:00
Samuel Pitoiset
c3ba3c2611 radv: disallow 3D images and mipmaps/layers for R32G32B32 linear formats
R32G32B32 are weird formats and we are only going to support
some basic operations for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-11 14:49:14 +02:00
Samuel Pitoiset
d179312b53 radv: add a workaround for a VGT hang with prim restart and strips
Otherwise, Yakuza and The Evil Within hang the GPU with DXVK.
This apparently only works on Polaris.

Suggested by Marek.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-11 10:16:11 +02:00
Dave Airlie
600d8ecb57 radv: remove unsigned comparison against 0
The value is always >= 0 here.

Found by coverity

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-11 10:19:20 +10:00
Dave Airlie
6e1d294804 radv: remove dead code for master_fd close
We have never opened master_Fd at this point, so remove code to
close it.

Found by coverity.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-11 10:19:16 +10:00
Dave Airlie
7c04b96f03 radv: don't pass shader key by copy
Coverity pointed out we were copying 168 bytes here unnecessarily.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-11 10:18:43 +10:00
Eric Engestrom
976188737d radv: add missing meson c++ visibility arguments
Fixes: 6f3aee40f9 "radv: using tls to store llvm related info
                             and speed up compiles (v10)"
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-09 14:22:24 +01:00
Samuel Pitoiset
d3682766f6 radv: tidy up radv_pipeline_init_multisample_state()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-08 14:17:43 +02:00
Samuel Pitoiset
b38228ccb0 radv: always set PA_SC_MODE_CNTL_1.OUT_OF_ORDER_WATER_MARK
It has probably no effect without out of order rasterization
anyway.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-08 14:17:40 +02:00
Samuel Pitoiset
937986ca1d radv: set DB_EQAA.INCOHERENT_EQAA_READS
My attempt was to set this field instead of duplicating one.

Fixes: 6cfa321c39 ("radv: add potential missing fields for DB_EQAA")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-08 14:17:33 +02:00
Marek Olšák
77903c8cfb ac: add ac_build_round 2018-10-06 21:50:09 -04:00
Marek Olšák
fa023f293e ac: correct PKT3_COPY_DATA definitions 2018-10-06 21:50:09 -04:00