Commit graph

105253 commits

Author SHA1 Message Date
Jason Ekstrand
ce36f412c9 nir/search_helpers: Use nir_src_is_const and friends
This not only makes them safe for more bit sizes but it also fixes a bug
in is_zero_to_one where it would return true for constant NaN.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
Jason Ekstrand
7bae7828aa nir/search: Use nir_src_is_const and friends
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
Jason Ekstrand
bca5c2c688 nir: Add some new helpers for working with const sources
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
Alyssa Rosenzweig
e0c267c752 mesa/st: Only call nir_lower_io_to_scalar_early on scalar ISAs
On scalar ISAs, nir_lower_io_to_scalar_early enables significant
optimizations. However, on vector ISAs, it is counterproductive and
impedes optimal codegen. This patch only calls
nir_lower_io_to_scalar_early for scalar ISAs. It appears that at present
there are no upstreamed drivers using Gallium, NIR, and a vector ISA, so
for existing code, this should be a no-op. However, this patch is
necessary for the upcoming Panfrost (Midgard) and Lima (Utgard)
compilers, which are vector.

With this patch, Panfrost is able to consume NIR directly, rather than
TGSI with the TGSI->NIR conversion.

For how this affects Lima, see
https://www.mail-archive.com/mesa-dev@lists.freedesktop.org/msg189216.html

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2018-10-22 20:37:07 +02:00
Dylan Baker
4e785fb383 meson: don't require libelf for r600 without LLVM
r600 doesn't have a hard requirement on LLVM, and therefore doesn't have
a hard requirement on libelf. Currently the logic doesn't allow that
however.

Distro-bug: https://bugs.gentoo.org/669058
Fixes: 5060c51b6f
       ("meson: build r600 driver")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-10-22 11:29:55 -07:00
Jason Ekstrand
ca4e465f7d anv,radv: Trivially expose two new VK_GOOGLE extensions
This patch exposes support for the following two extensions:

 * VK_GOOGLE_decorate_string
 * VK_GOOGLE_hlsl_functionality1

There's nothing for the driver to do; it's all handled in spirv_to_nir.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107971
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 10:50:20 -05:00
Jason Ekstrand
891886da2f spirv: Add no-op support for VK_GOOGLE_hlsl_functionality1
This extension adds two new decorations which carry meaning only for
HLSL shaders.  They are expected to be handled by higher level layers
and can be ignored by implementations.  However, it does save the client
a bit of work if the implementation safely ignores them instead of the
client having to strip them out of the SPIR-V in order for it to be
valid.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 10:49:53 -05:00
Jason Ekstrand
5f0322d5c3 spirv: Add support for SPV_GOOGLE_decorate_string
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 10:49:53 -05:00
Rob Herring
2bb05d70af android: Build kms_swrast for the Android platform
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-22 13:08:17 +01:00
Connor Abbott
27fe3f5b5a ac: Fix loading a dvec3 from an SSBO
The comment was wrong, since the loop above casts to a type with the
correct bitsize already.

Fixes: 7e7ee82698 ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 09:44:51 +02:00
Connor Abbott
59535b05cf ac: Introduce ac_build_expand()
And implement ac_bulid_expand_to_vec4() on top of it.

Fixes: 7e7ee82698 ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 09:44:51 +02:00
Eduardo Lima Mitev
fdd926d5b2 ir3/nir: Set up image_dims consts for image_deref_size intrinsic too
`nir_intrinsic_image_deref_size` is not being considered during scan for
driver constants, so image constants are not emitted if a shader
only ever query the size of an image (no load, store, atomic op, etc).
This is unlikely, but possible.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-10-21 21:29:18 +02:00
Karol Herbst
2d235d69c8 nv50/ir: fix ConstantFolding::createMul for 64 bit muls
Fixes: 2f52925f5c
       "nv50/ir: move a * b -> a << log2(b) code into createMul()"

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-10-20 03:00:04 +02:00
Sonny Jiang
bfb2b90246 radeonsi: Disable clear_state with radeon kernel driver
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-10-19 16:16:57 -04:00
Kenneth Graunke
f91f9bab83 meson: Add -Werror=return-type when supported.
This warning detects non-void functions with a missing return statement,
return statements with a value in void functions, and functions with an
bogus return type that ends up defaulting to int.  It's already enabled
by default with -Wall.  Generally, these are fairly serious bugs in the
code, which developers would like to notice and fix immediately.  This
patch promotes it from a warning to an error, to help developers catch
such mistakes early.

I would not expect this warning to change much based on the compiler
version, so hopefully it won't become a problem for packagers/builders.

See the GCC documentation or 'man gcc' for more details:
https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc/Warning-Options.html#index-Wreturn-type

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-19 10:16:57 -07:00
Jason Ekstrand
0d380af809 anv: Define trampolines as the weak functions
Instead of having weak references to the anv functions and separate
trampoline functions with their own dispatch table, just make the
trampoline functions weak.  This gets rid of a dispatch table and
potentially lets the compiler delete the unused weak function.  The
end result is a reduction in the .text section of 5.7K and a reduction
in the .data section of 1.4K.

Before:

   text	   data	    bss	    dec	    hex	filename
3190329	 282232	   8960	3481521	 351fb1	_install/lib64/libvulkan_intel.so

After:

   text	   data	    bss	    dec	    hex	filename
3184548	 280792	   8960	3474300	 35037c	_install/lib64/libvulkan_intel.so

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-19 11:52:00 -05:00
Juan A. Suarez Romero
f8e789d2ac docs: fix typo in 18.2.3 release notes link
Fixes: 86b4bd52dc ("docs: update calendar, add news item and link
release notes for 18.2.3")

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-10-19 18:48:12 +02:00
Juan A. Suarez Romero
86b4bd52dc docs: update calendar, add news item and link release notes for 18.2.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2018-10-19 18:45:41 +02:00
Juan A. Suarez Romero
01f5d37d3e docs: add sha256 checksums for 18.2.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 27fd12857b)
2018-10-19 18:43:49 +02:00
Juan A. Suarez Romero
e30970e2cd docs: add release notes for 18.2.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit d219361b42)
2018-10-19 18:43:48 +02:00
Jose Fonseca
45bacc4b63 scons: Remove gles option.
It's broken, and WGL state tracker is always built with GLES support
noawadays.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-19 16:50:26 +01:00
Bas Nieuwenhuizen
68c7833540 radv: Fix WSI & PCI bus info initialization order.
Trying to access the bus info before it is initialized is not going
to work.

Fixes: baa38c144f "vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108491
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2018-10-19 13:24:19 +02:00
Marek Olšák
69a87b5d47 radeonsi: fix a typo in a comment in emit_guardband 2018-10-18 18:01:22 -04:00
Marek Olšák
2a26b1c045 radeonsi: fix gnome-shell crash
I wasn't expecting to get viewports with the center having
negative coordinates.

Broken by: 6cc79e4411
2018-10-18 17:55:44 -04:00
Jason Ekstrand
8c0b9fdfa1 Revert "anv: Stop generating weak references for instance entrypoints"
This reverts commit 00bb42105d.  It was
not as well thought out as I had intended and broke the build when
VK_KHR_display is disabled in the build.
2018-10-18 15:36:26 -05:00
Marek Olšák
77bcbe712e radeonsi: clamp point size to the limit
This fixes dEQP-GLES2.functional.rasterization.limits.points.
Broken by: ea039f789d

Tested-by: Jakob Bornecrantz <jakob@collabora.com>
2018-10-18 16:08:56 -04:00
Marek Olšák
eae8f49fc6 radeonsi: fix a VGT hang with primitive restart on Polaris10 and later
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
2018-10-18 16:08:56 -04:00
Marek Olšák
165817d47f radeonsi: fix a deadlock due to partially-initialized context on CI 2018-10-18 16:08:56 -04:00
Jan Vesely
06bf56725d radeonsi: Bump number of allowed global buffers to 32
Fixes assertion failure/crash when running luxmark/luxball on clover.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108272
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-18 16:02:42 -04:00
Andres Rodriguez
e71a87775e radv: fix check for perftest options size
It was using the debug options array size.

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:42:20 -04:00
Marek Olšák
6cc79e4411 radeonsi: fix incorrect hw screen offset and guardband computation
It resulted in assertion failures or incorrect rendering.

Broken by: 9e182b8313
2018-10-18 14:42:42 -04:00
Jason Ekstrand
baa38c144f vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching
This lets us avoid passing the DRM fd around all over the place and gets
us closer to layer utopia.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-10-18 11:29:00 -05:00
Michel Dänzer
c20ba1be18 loader/dri3: Also wait for front buffer fence if we triggered it
In that case, we have to wait for the fence to synchronize with the
corresponding drawing we triggered in the X server.

Fixes incorrect display with the i965 driver and some applications, e.g.
solvespace.

Bugzilla: https://bugs.freedesktop.org/108097
Fixes: aefac10fec "loader/dri3: Only wait for back buffer fences in
                     dri3_get_buffer"
Tested-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
2018-10-18 16:52:06 +02:00
Jason Ekstrand
00bb42105d anv: Stop generating weak references for instance entrypoints
We don't need weak references to instance entrypoints because we never
have more than one of each so we don't need the NULL fall-back.  This
also helps us avoid forgetting things because we now get link errors for
missing instance entrypoints.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-18 09:17:39 -05:00
Jason Ekstrand
7c65cf9844 vulkan/wsi: Implement GetPhysicalDevicePresentRectanglesKHR
This got missed during 1.1 enabling because it was defined as an
interaction between device groups and WSI and it wasn't obvious it was
in the delta.

The idea behind it is that it's supposed to provide a hint to the
application in a multi-GPU setup to indicate which regions of the screen
are being scanned out by which GPU so a multi-device split-screen
rendering application can render each part of the screen on the GPU that
will be presenting it and avoid extra bus traffic between GPUs.  On a
single-GPU setup or one which doesn't support this present mode, we need
to do something.  We choose to return the window size (or a max-size
rect) if the compositor, X server, or crtc is associated with the given
physical device and zero rectangles otherwise.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-18 09:17:39 -05:00
Jason Ekstrand
7629c00557 vulkan/wsi: Store the instance allocator in wsi_device
We already have wsi_device and we know the instance allocator at
wsi_device_init time so there's no need to pass it into the physical
device queries.  This also fixes a memory allocation domain bug that can
occur if CreateSwapchain gets called prior to any queries (not likely)
in which case the cached connection gets allocated off the device
instead of the instance.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-18 09:17:39 -05:00
Michał Janiszewski
0ef50ecc69 st/xlib: Use more appropriate include guard
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com
2018-10-18 11:03:04 +01:00
Michał Janiszewski
bcc613acc1 gallium: Fix mismatched ifdef-guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-18 11:03:03 +01:00
Gert Wollny
74adc624b6 softpipe: dynamically allocate space for immediate constants
The number of immediate constants was fixed and the size check was
only done by means of an assertion. Given this a shader that emits
more immediate constants would result in a memory corruption when
mesa is build in release mode.

Instead of using this fixed limit allocate the space dynamically, let it
grow as needed, and also remove the unused ImmArray.

Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.1

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-18 10:59:51 +02:00
Timothy Arceri
3a95396f3c radv: use nir_shrink_vec_array_vars()
Totals from affected shaders:
SGPRS: 1096 -> 1096 (0.00 %)
VGPRS: 1192 -> 1056 (-11.41 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 100940 -> 94384 (-6.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 100 -> 112 (12.00 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from Batman Arkham City.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Timothy Arceri
8086fa1bcd radv: use nir_split_array_vars()
We call in the opt loop in case another pass results in an
array with indirect access being turned into direct access.

Totals from affected shaders:
SGPRS: 512 -> 496 (-3.12 %)
VGPRS: 456 -> 452 (-0.88 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 40040 -> 39664 (-0.94 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 41 -> 43 (4.88 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from Batman Arkham City.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Timothy Arceri
06675711e7 radv: use nir_opt_find_array_copies()
Totals from affected shaders:
SGPRS: 1112 -> 1112 (0.00 %)
VGPRS: 1492 -> 1196 (-19.84 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 112172 -> 101316 (-9.68 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 93 -> 98 (5.38 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from "Batman: Arkham City" over DXVK.

The pass detects that the temporary array created by DXVK for
storing TCS inputs is a copy of the input arrays and allows
us to avoid copying all of the input data and then indirecting
on it with if-ladders, instead we just do indirect indexing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Timothy Arceri
9d5b106b2e radv: use nir_opt_copy_prop_vars and nir_opt_dead_write_vars
Totals from affected shaders:
SGPRS: 2856 -> 2856 (0.00 %)
VGPRS: 3236 -> 3248 (0.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 236560 -> 233548 (-1.27 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 277 -> 283 (2.17 %)
Wait states: 0 -> 0 (0.00 %)

Even in the cases were we have increased VGPR use it appears
the NIR is improved significantly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-18 15:04:09 +11:00
Keith Packard
67a2c1493c vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]
Offers three clocks, device, clock monotonic and clock monotonic
raw. Could use some kernel support to reduce the deviation between
clock values.

v2:
	Ensure deviation is at least as big as the GPU time interval.

v3:
	Set device->lost when returning DEVICE_LOST.
	Use MAX2 and DIV_ROUND_UP instead of open coding these.
	Delete spurious TIMESTAMP in radv version.

	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
	Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

v4:
	Add anv_gem_reg_read to anv_gem_stubs.c

	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>

v5:
	Adjust maxDeviation computation to max(sampled_clock_period) +
	sample_interval.

	Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
	Suggested-by: Jason Ekstrand <jason@jlekstrand.net>

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-17 20:10:15 -07:00
Topi Pohjolainen
a11cafbd7a intel/compiler/icl: Use invocation id bits 22:16 instead of 23:17
Identifier bits in the dispatch header have changed. See Bspec:

SINGLE_PATCH Payload:

3D Pipeline Stages - 3D Pipeline Geometry -
Hull Shader (HS) Stage IVB+ - Payloads IVB+

Fixes: KHR-GL46.tessellation_shader.tessellation_shader_tc_barriers.barrier_guarded_read_write_calls

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-10-17 21:19:57 +03:00
Neil Roberts
a9475d9337 Fix setting indent-tabs-mode in the Emacs .dir-locals.el files
Some of the .dir-locals.el had the wrong name for the truthy value so
it wasn’t setting indent-tabs-mode.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-17 19:03:08 +02:00
Rob Clark
d27b1c83b9 freedreno/a6xx: don't allocate binning rb
Now that a single cmdstream is used for both binning and draw passes, we
can skip allocation of cmdstream buffer for binning.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:49 -04:00
Rob Clark
24d57a6d8f freedreno/a6xx: single cmdstream for draw+binning
Now that state which is different for draw vs binning pass is split out
into different state-groups with appropriate enable_mask (so the
appropriate one is chosen for draw vs binning), switch over to using a
single cmdstream for both passes.

This should significantly lower draw overhead for CPU bound benchmarks.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:49 -04:00
Rob Clark
72f6164fef freedreno/a6xx: split binning vs draw program stateobj's
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:49 -04:00
Rob Clark
3313d693af freedreno/a6xx: split VBO state into binning/draw variants
Blob seems to manage to use same input registers for BS (binning pass)
vs VS (draw pass) shaders, so it can use the same VBO state for both.
We can't quite do that yet, so split them.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:49 -04:00