Commit graph

95481 commits

Author SHA1 Message Date
Kenneth Graunke
6ec7bddb19 i965: Don't double count the batch in aperture_space.
intel_batchbuffer_reset calls add_exec_bo on the batch right away,
which adds in the batch BO size.

Fixes: 29ba502a4e ("i965: Use I915_EXEC_BATCH_FIRST when available.")

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-01 09:59:25 -07:00
Cherniak, Bruce
43145bbf09 swr: Report format max_samples=1 to maintain support for "fake" msaa.
Accompanying patch "st/mesa: only try to create 1x msaa surfaces for
'fake' msaa" requires driver to report max_samples=1 to enable "fake"
msaa. Previously, 0 and 1 were treated equivalently in st_init_extensions()
and either could enable "fake" msaa.

This patch raises the swr default msaa_max_count from 0 to 1, so that
swr_is_format_supported will report max_samples=1.

Real msaa can still be enabled by exporting SWR_MSAA_MAX_COUNT with a
pow2 value between 2 and 16.

This patch is necessary to prevent an OpenSWR regression resulting from
the st/mesa patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2017-09-01 11:23:16 -05:00
Eric Engestrom
4d6c23ee83 aubinator: remove duplicate initialisation
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-01 17:06:43 +01:00
Samuel Pitoiset
80177306d9 radv: report VM faults if detected
It's fairly simple for now, but this might be quite useful.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:46:36 +02:00
Samuel Pitoiset
12cbd9a13f radeonsi: move si_vm_fault_occured() to AMD common code
For radv, in order to report VM faults when detected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:46:32 +02:00
Samuel Pitoiset
72d9ffc72c radv: add radv_check_gpu_hangs() helper function
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:46:00 +02:00
Samuel Pitoiset
f14020c15f radv: disassemble SPIR-V binaries with RADV_DEBUG=spirv
This introduces a new separate option because the output can
be quite verbose. If spirv-dis is not found in the path, this
debug option is useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:41:54 +02:00
Samuel Pitoiset
ad42e2abb8 radv: move RADV_TRACE_FILE functions to radv_debug.c
At the moment, debugging radv is not really easy because the
driver doesn't report enough information when it hangs. This
new file will be the main location for all debug tools.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:41:54 +02:00
Samuel Pitoiset
f1f2f00f6a radv: silent a compiler warning in radv_emit_framebuffer_state()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:38:52 +02:00
Samuel Pitoiset
962fda5b90 radv: compute correct maximum wave count per SIMD
Ported from RadeonSI (original patch by Marek).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:38:50 +02:00
Brian Paul
9eca7e0ddb st/mesa: only try to create 1x msaa surfaces for "fake" msaa drivers
For software drivers where we want "fake" msaa support for GL 3.x, we
treat 1 sample as being msaa.

For drivers with real msaa support, start format probing at 2x msaa.
For drivers with fake msaa support, start format probing at 1x msaa.

This also tweaks the MaxSamples code in st_init_extensions() so that
we use MaxSamples=1 for fake msaa.  This allows the format proble loops
to run at least one iteration.

This fixes a llvmpipe/VTK regression from commit 6839d33699.
And for drivers with fake msaa support, calls such as
glTexImage2DMultisample(samples=1) will now succeed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102125
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-31 22:09:57 -06:00
Tobias Klausmann
1c4e6d7ca8 nvc0/ir: propagate immediates to CALL input MOVs
On using builtin functions we have to move the input to registers $0 and $1, if
one of the input value is an immediate, we fail to propagate the immediate:

...
mov u32 $r477 0x00000003 (0)
...
mov u32 $r0 %r473 (0)
mov u32 $r1 $r477 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...

With this patch the immediate is propagated, potentially causing the first MOV
to be superfluous, which we'd remove in that case:

...

mov u32 $r0 %r473 (0)
mov u32 $r1 0x00000003 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...

Shaderdb stats:
total instructions in shared programs : 4893460 -> 4893324 (-0.00%)
total gprs used in shared programs    : 582972 -> 582881 (-0.02%)
total local used in shared programs   : 17960 -> 17960 (0.00%)

                local        gpr       inst      bytes
    helped           0          91         112         112
      hurt           0           0           0           0

v2:
 implement some changes proposed by imirkin, the manual deletion of the dead
 mov is necessary after ea22ac23e0 ("nvc0/ir: unlink values pre- and post-call
 to division function") as the potentially dead mov is unlinked properly,
 causing later passes to not notice the mov op at all and thus not cleaning it
 up. That makes up a big chunk of the regression the above commit caused.
 Keep the deletion of the op where it is, deleting it later unnecessarily blows
 up size of the change.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-31 22:58:06 -04:00
Karol Herbst
b672c3833b nvc0: write 0 to pipeline_statistics.cs_invocations
cs_invocations are currently unsupported, but leaving the field uninitialized
is even worse.

fixes on nvc0:
 * KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values
 * KHR-GL45.pipeline_statistics_query_tests_ARB.functional_non_rendering_commands_do_not_affect_queries

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-31 22:57:22 -04:00
Ben Crocker
57c8ead0cd llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load
Fix loading of a 3x16 vector as a single 48-bit load
on big-endian systems (PPC64, S390).

Roland Scheidegger's commit e827d91756
plus Ray Strode's patch reduce pre-Roland Piglit failures from ~4000 to ~2000.  This patch fixes
three of the four regressions observed by Ray:

- draw-vertices
- draw-vertices-half-float
- draw-vertices-half-float_gles2

One regression remains:
- draw-vertices-2101010

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-09-01 01:20:07 +02:00
Ray Strode
75cb6e3617 gallivm: correct channel shift logic on big endian
lp_build_fetch_rgba_soa fetches a texel from a texture.
Part of that process involves first gathering the element
together from memory into a packed format, and then breaking
out the individual color channels into separate, parallel
arrays.

The code fails to account for endianess when reading the packed
values.

This commit attempts to correct the problem by reversing the order
the packed values are read on big endian systems.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ray Strode <rstrode@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-09-01 01:19:13 +02:00
Roland Scheidegger
c92fe8a8c5 util: only use SCHED_IDLE in pthread_setschedparam() when it's defined
Fixes build error when it's not.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-01 01:10:32 +02:00
Jason Ekstrand
242211933a anv/formats: Nicely handle unknown VkFormat enums
This fixes some crashes in the dEQP-VK.memory.requirements.core.* tests.
I'm not sure whether or not passing out-of-bound formats into the query
is supposed to be allowed but there's no harm in protecting ourselves
from it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/101956
Cc: mesa-stable@lists.freedesktop.org
2017-08-31 14:31:42 -07:00
Charmaine Lee
2d93b462b4 vbo: fix offset in minmax cache key
Instead of saving primitive offset in the minmax cache key,
save the actual buffer offset which is used in the cache lookup.

Fixes rendering artifact seen with GoogleEarth when run with
VMware driver.

v2: Per Brian's comment, initialize offset to avoid compiler warning.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-08-30 23:12:21 -07:00
Tapani Pälli
15b61dec94 anv: fix build errors on android
error: incompatible pointer to integer conversion initializing 'VkFence'
   (aka 'unsigned long long') with an expression of type 'void *' [-Werror,-Wint-conversion]

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-31 18:05:50 +03:00
Christian König
214b565bc2 winsys/amdgpu: set AMDGPU_GEM_CREATE_VM_ALWAYS_VALID if possible v2
When the kernel supports it set the local flag and
stop adding those BOs to the BO list.

Can probably be optimized much more.

v2: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-31 14:55:38 +02:00
Marek Olšák
8b3a257851 radeonsi: set a per-buffer flag that disables inter-process sharing (v4)
For lower overhead in the CS ioctl.
Winsys allocators are not used with interprocess-sharable resources.

v2: It shouldn't crash anymore, but the kernel will reject the new flag.
v3 (christian): Rename the flag, avoid sending those buffers in the BO list.
v4 (christian): Remove setting the kernel flag for now

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-31 14:55:21 +02:00
Kenneth Graunke
5ae2de81c8 i965: Use BLORP for buffer object stall avoidance blits instead of BLT.
Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2:
- Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8).
- Car Chase by 1.25607% +/- 0.291262% (n=5).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:24 -07:00
Kenneth Graunke
3efedf98e8 i965: Always flush caches after blitting to a GL buffer object.
When we blit data into a buffer object, we may need to invalidate any
caches that might contain stale data, so the new data becomes visible.

For example, if the buffer object is bound as a vertex buffer, we need
to invalidate the vertex fetch cache.

While this flushing was missing, it usually happened implicitly for
non-obvious reasons: we're usually on the render ring, and calling
intel_emit_linear_blit() would require switching to the BLT ring,
causing an implicit flush.  This likely provoked the kernel to do
PIPE_CONTROLs on our behalf.  Although, Gen4-5 wouldn't have this
behavior.  At any rate, we should do it ourselves.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:23 -07:00
Kenneth Graunke
df8f4bfc02 i965: Add PIPE_CONTRTOL_DATA_CACHE flush to brw_emit_mi_flush().
Although we're phasing out brw_emit_mi_flush(), we still use it in some
places in order to "flush everything".  In a number of those places, we
write data to a buffer that we may then bind as an image surface, SSBO,
or atomic buffer.  Those usages require us to flush the data cache.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:22 -07:00
Kenneth Graunke
225425111f i965: Add a brw_blorp_copy_buffers() command.
This exposes the new blorp_copy_buffer() functionality to i965.
It should be a drop-in replacement for intel_emit_linear_blit()
(other than the arguments being backwards, for consistency with BLORP).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:21 -07:00
Kenneth Graunke
fc20df830c blorp: Make blorp_buffer_copy work on Gen4-6.
Gen4-6 can only handle surfaces up to 8192.  Only Gen7+ can do 16384.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:19 -07:00
Kenneth Graunke
81d5b61a19 blorp: Turn anv_CmdCopyBuffer into a blorp_buffer_copy() helper.
I want to be able to copy between buffer objects using BLORP in the i965
driver.  Anvil already had code to do this, in a reasonably efficient
manner - first using large bpp copies, then smaller bpp copies.

This patch moves that logic into BLORP as blorp_buffer_copy(), so we
can use it in both drivers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:07 -07:00
Grazvydas Ignotas
b8dd69e1b4 radv: don't assert on empty hash table
Currently if table_size is 0, it's falling through to:

unreachable("hash table should never be full");

But table_size can be 0 when RADV_DEBUG=nocache is set, or when the
table allocation fails (which is not considered an error).

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-08-31 02:47:26 +03:00
Brian Paul
5610911fed svga: include sample count in surface_size() computation
Use MAX2() because sampleCount will be zero for non-MSAA surfaces.
No Piglit regressions.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-08-30 13:59:14 -06:00
Lionel Landwerlin
350ead0f26 i965: drop unused brw->needs_unlit_centroid_workaround
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
b1c9ed25a5 i965: drop brw->has_surface_tile_offset in favor of devinfo's
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
aff1ad0798 i965: drop unused brw->no_simd8
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
6da7a00a84 i965: drop unused brw->has_pln
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
cbee3b03c9 i965: drop brw->must_use_separate_stencil in favor of devinfo's
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
40d20699b7 i965: drop unused brw->has_negative_rhw_bug
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
71493b320d i965: drop unused brw->has_compr4
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
a5f0821485 i965: drop brw->has_llc in favor of devinfo->has_llc
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
27e273578f i965: drop brw->is_broxton
We need to take some take here as brw->is_broxton has been used to
check whether the device is a low power gen9 (aka Atom gen9 platform).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
b6e783300c i965: drop brw->is_cherryview in favor of devinfo->is_cherryview
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
97e90113c6 i965: drop brw->is_haswell in favor of devinfo->is_haswell
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
d324197de9 i965: drop brw->is_baytrail in favor of devinfo->is_baytrail
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
990c24ad85 i965: drop brw->is_g4x in favor of devinfo->is_g4x
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
46213f676e i965: drop brw->gt in favor of devinfo->gt
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
b83a97a65d i965: drop brw->gen in favor of devinfo->gen
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
de9649071a anv: use device->info instead of brw->is_*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Mark Janes
8c9df0daf2 Revert "egl: Allow creation of per surface out fence"
This reverts commit 13c23b19d0.

Mesa CI was brought down by this commit, with:

mesa/drivers/dri/i965/brw_sync.c:491: brw_dri_create_fence_fd:
Assertion `brw->screen->has_exec_fence' failed.
2017-08-30 08:45:36 -07:00
Kevin Rogovin
783f2b70c0 i965: add 2xMSAA 16xMSAA modes to DRI configs.
For Gen8, add 2xMSAA. For Gen9, add 2xMSAA and 16xMSAA.
Special thanks to Eero Tamminen for reporting rasterizer
numbers being twice what it should be for 2xMSAA under
a benchmark.

V2: Make pointer name less ugly + add 2xMSAA for Gen8

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-30 08:21:38 -07:00
Kenneth Graunke
d808f44ae5 Revert "i965: add 2xMSAA and 16xMSAA to DRI configs for Gen9."
This reverts commit f6d38785e8.

Kevin's original patch accidentally didn't add 2x for Gen8; he sent
a v2 with a bunch of style fixes shortly after I pushed the original
patch, not knowing it was coming.  Let's just revert this one, apply
v2, and move on.
2017-08-30 08:21:38 -07:00
Eric Engestrom
ac0d8dc3fa mesa/st: remove unwanted backup file
Fixes: 0ac78dc925 "util: move string_to_uint_map to glsl"
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 14:52:16 +01:00
Michael Olbrich
81d5c31631 egl/dri2: only destroy created objects
dri2_display_destroy may be called by dri2_initialize_wayland_drm() if
initialization fails. In this case, these objects may not be initialized.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Michael Olbrich <m.olbrich@pengutronix.de>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 14:06:49 +01:00