Commit graph

115447 commits

Author SHA1 Message Date
Jon Turney
87173ded6e glx/windows: Fix compilation with -Werror-format
Fix compilation where the DWORD type is used with a format, after
-Werror-format added by c9c1e261.

Some Win32 API types are different fundamental types in the 32-bit and
64-bit versions. This problem is then further compounded by the fact
that whilst both 32-bit Cygwin and 32-bit MinGW use the ILP32 data
model, 64-bit MinGW uses the LLP64 data model, but 64-bit Cygwin uses
the LP64 data model. This makes it near impossible to write printf
format specifiers which are correct for all those targets.

In the Win32 API, DWORD is an unsigned, 32-bit type.  So, it is defined
in terms of an unsigned long, except in the LP64 data model used by
64-bit Cygwin, where it is an unsigned int.

It should always be safe to cast it to unsigned int and use %u or %x.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 11:28:48 -07:00
Kenneth Graunke
cd796120c9 iris: Rename bind_state to bind_shader_state.
bind_state is possibly the worst name ever.  For create, we used
create_shader_state, which is more descriptive.  Put shader in the name.
2019-06-07 11:26:20 -07:00
Kenneth Graunke
d5d2fb5c4c isl: Mark enum isl_channel_select packed so it becomes 1 byte.
I recently discovered that the following code lead to valgrind errors:

   struct isl_swizzle swizzle = ISL_SWIZZLE_IDENTITY;
   VALGRIND_CHECK_MEM_IS_DEFINED(&swizzle, sizeof(swizzle));

which is surprising, because struct isl_swizzle is simply:

   struct isl_swizzle {
      enum isl_channel_select r:4;
      enum isl_channel_select g:4;
      enum isl_channel_select b:4;
      enum isl_channel_select a:4;
   };

and the above code initializes all of them with a C99 initializer.
Iván Briano reminded me that C99 initializers don't necessarily zero
padding.  A quick inspection revealed that sizeof(struct isl_swizzle)
was 4 (rather than the expected 2).  Ian Romanick suggested changing
it to uint16_t, since this is essentially dicing up an unsigned, and
that worked.

This patch marks enum isl_channel_select packed, changing its size
from 4 bytes to 1 byte.  This then makes struct isl_swizzle 2 bytes,
with no bogus padding fields.  This eliminates valgrind undefined
memory warnings.

These isl_swizzle values become part of our BLORP blit program keys,
which are then hashed.  This undefined padding was being included in
the hashing, possibly leading to issues.  I originally saw this error
when running KHR-GL45.texture_size_promotion.functional in iris under
valgrind.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-07 11:09:44 -07:00
Alyssa Rosenzweig
e1c14b2820 panfrost/ci: Texture wrap tests are legitimately fixed
These depended on the wallpaper reload.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig
8442dde169 panfrost/midgard: Lower inot to inor with 0
We were previously lowering to inand, but the second arg was not
duplicated so inot would always return ~0. Oops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig
d415748955 panfrost/midgard: Cleanup tag fetch in disassembler
Trivial.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig
d3ad8d6b48 panfrost/midgard: Use fancy iterator
Trivial cleanup.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig
ae20bee75e panfrost/midgard: Cull dead branches
This fixes bugs with complex control flow.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig
c62f2ff852 panfrost/midgard: Add mir_print_bundle helper
This helps with debugging scheduling/emission.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig
fd6d6c1b15 panfrost/midgard/disasm: Pretty-print branch tags
Just makes it a little more obvious what's going on.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig
2ebf22c399 panfrost/ci: Note some since-fixed tests
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig
de8d49acdc panfrost/midgard: Vectorize I/O
This uses the new mesa/st functionality for NIR I/O vectorization, which
eliminates a number of corner cases (resulting in assorted dEQP
failures and regressions) and should improve performance substantial due
to lessened pressure on the load/store pipe.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig
4aced18031 panfrost/midgard: Remove varyings delay pass
This pass interfered with the more delicate path required for
non-vectorized I/O. It's also ugly and duplicating the job of an actual
honest-to-goodness scheduler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig
43568f2675 panfrost/midgard: Apply component to load_input
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-07 09:05:28 -07:00
Eric Engestrom
440fe0eb43 nir: fix s/&&/||/ typo
Fixes: cd73b6174b "nir/lower_to_source_mods: Stop turning add, sat, and neg into mov"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-07 16:06:25 +01:00
Kristian H. Kristensen
b9bbac6234 freedreno/a6xx: Drop struct stage array
This now boils down to just picking between binning or vertex shader
and dummy_fs or real fs, which we can do in a couple of lines of code
instead.  The constlen logic isn't doing what it thinks it's doing,
both constlens at this point

  MAX2(s[VS].constlen, align(state->bs->constlen, 4));

are binning shader constlens.  We'll have to revisit the constlen
logic, but this commit doesn't change how it works.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 07:33:12 -07:00
Kristian H. Kristensen
9382a3c11d freedreno/a6xx: Drop support for SS6_DIRECT shader upload
a6xx only supports indirect shaders.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 07:33:10 -07:00
Kristian H. Kristensen
0ef00ceb2e freedreno/a6xx: Share shader_t_to_opcode
We have a similar function in fd6_program.c. Move to fd6_emit.h and
share.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 07:33:03 -07:00
Kristian H. Kristensen
4552162e2d freedreno/a6xx: Consolidate more of dword 0 building in fd6_draw_vbo
There's already a bit of duplicated logic here and tessellation will
add more. Build up dword 0 in fd6_draw_vbo() and drop the a4xx in the
process.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 07:32:59 -07:00
Kristian H. Kristensen
cae6b4d741 freedreno: Move fd4_size2indextype() helper to freedreno_util.h
In preparation for refactoring fd6_draw.c a bit.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 07:32:34 -07:00
Samuel Pitoiset
0905189a25 radv: enable VK_EXT_sample_locations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:11:17 +02:00
Samuel Pitoiset
05f5fa661f radv: enable HTILE for images that might need variable sample locations
This is now supported.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:11:14 +02:00
Samuel Pitoiset
e7677a697b radv: handle sample locations during automatic layout transitions
From the Vulkan spec 1.1.109:

   "Some implementations may need to evaluate depth image values
    while performing image layout transitions. To accommodate this,
    instances of the VkSampleLocationsInfoEXT structure can be
    specified for each situation where an explicit or automatic
    layout transition has to take place. [...] and
    VkRenderPassSampleLocationsBeginInfoEXT can be chained from
    VkRenderPassBeginInfo to provide sample locations for layout
    transitions performed implicitly by a render pass instance."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:11:11 +02:00
Samuel Pitoiset
d0d41e58c3 radv: determine the first subpass id for every attachments
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:11:08 +02:00
Samuel Pitoiset
f58e9f6d69 radv: handle sample locations during explicit depth/stencil transitions
From the Vulkan spec 1.1.109,

   "Some implementations may need to evaluate depth image values
    while performing image layout transitions. To accommodate this,
    instances of the VkSampleLocationsInfoEXT structure can be
    specified for each situation where an explicit or automatic
    layout transition has to take place. VkSampleLocationsInfoEXT
    can be chained from VkImageMemoryBarrier structures to provide
    sample locations for layout transitions performed by
    vkCmdWaitEvents and vkCmdPipelineBarrier calls."

This handles explicit depth/stencil layout transitions performed
with CmdWaitEvents() or CmdPipelineBarrier().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:11:01 +02:00
Samuel Pitoiset
a20925f2a9 radv: allow the depth decompress pass to emit dynamic sample locations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:11:00 +02:00
Samuel Pitoiset
2dd8dfd913 radv: allow to set dynamic sample locations to the depth decompress pass
If VK_EXT_sample_locations is used, the driver might need to emit
the sample locations specified during layout transitions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:10:55 +02:00
Samuel Pitoiset
d78990c174 radv: allow to save/restore sample locations during meta operations
This will be used for the depth decompress pass that might need
to emit variable sample locations during layout transitions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:10:50 +02:00
Kenneth Graunke
22025595f3 iris: Sweep the NIR in iris_create_uncompiled_shader().
We run a ton of backend specific passes here (mostly brw_preprocess_nir)
and ought to sweep up any unused memory at this point, since we're going
to hang on to this NIR for as long as the linked program lives.
2019-06-07 01:29:38 -07:00
Eduardo Lima Mitev
c02ffd2700 ir3: Use the new NIR lowering pass for integer multiplication
Shader-db stats courtesy of Eric Anholt:

total instructions in shared programs: 6480215 -> 6475457 (-0.07%)
instructions in affected programs: 662105 -> 657347 (-0.72%)
helped: 1209
HURT: 13
total constlen in shared programs: 1432704 -> 1427769 (-0.34%)
constlen in affected programs: 100063 -> 95128 (-4.93%)
helped: 512
HURT: 0
total max_sun in shared programs: 875561 -> 873387 (-0.25%)
max_sun in affected programs: 46179 -> 44005 (-4.71%)
helped: 1087
HURT: 0

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev
340277ad71 ir3/nir: Add new NIR AlgebraicPass for lowering imul
Currently, ir3 backend compiler is lowering integer multiplication from:

dst = a * b

to:

dst = (al * bl) + (ah * bl << 16) + (al * bh << 16)

by emitting this code:

mull.u tmp0, a, b           ; mul low, i.e. al * bl
madsh.m16 tmp1, a, b, tmp0  ; mul-add shift high mix, i.e. ah * bl << 16
madsh.m16 dst, b, a, tmp1   ; i.e. al * bh << 16

which at that point has very low chances of being optimized.

This patch adds a new nir_algebraic.AlgebraicPass to performs this
lowering during NIR algebraic optimization passes, giving it a better
chance for optimizing the resulting code.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev
3addd7c8d9 nir_algebraic: Add basic optimizations for umul_low and imadsh_mix16
For umul_low (al * bl), zero is returned if the low 16-bits word of either
source is zero.

for imadsh_mix16 (ah * bl << 16 + c), c is returned if either 'ah' or 'bl'
is zero.

A couple of nir_search_helpers are added:

is_upper_half_zero() returns true if the highest word of all components of
an integer NIR alu src are zero.

is_lower_half_zero() returns true if the lowest word of all components of
an integer nir alu src are zero.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev
e45de3a6c3 ir3/compiler: Handle new alu opcodes 'umul_low' and 'imadsh_mix16'
They directly emit ir3_MULL_U and ir3_MADSH_M16 respectively.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev
c27b3758fa nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes
'umul_low' is the low 32-bits of unsigned integer multiply. It maps
directly to ir3's MULL_U.

'imadsh_mix16' is multiply add with shift and mix, an ir3 specific
instruction that maps directly to ir3's IMADSH_M16.

Both are necessary for the lowering of integer multiplication on
Freedreno, which will be introduced later in this series.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:45:05 +02:00
Iago Toral Quiroga
9b96ae69bc v3d: don't emit point coordinates varyings if the FS doesn't read them
We still need to emit them in V3D 3.x since there there is no mechanism to
disable them.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:29:42 +02:00
Iago Toral Quiroga
5e26e55e72 v3d: add a helper to track variables that need point coordinates
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:26:52 +02:00
Kenneth Graunke
4e3297f7d4 egl/x11: calloc dri2_surf so it's properly zeroed
Commit 2282ec0a refactored drawable creation across various platforms
into a new dri2_create_drawable helper function.

The GBM code in platform_drm.c code passed in dri2_surf->gbm_surf as the
loaderPrivate, while most other backends passed in dri2_surf directly.

To try and handle this, the patch checked if dri2_surf->gbm_surf was
non-NULL, and if so, presumed that the caller is the DRM platform and
we should use the dri2_surf->gbm_surf pointer.

This worked for most platforms, which calloc their dri2_surf structure,
zeroing the data.  Unfortunately, platform_x11.c used malloc, leaving
most of the dri2_surf as garbage.  In particular, dri2_surf->gbm_surf
was often non-NULL, causing dri2_create_drawable to try and use it,
passing a garbage pointer to the createNewDrawable hook, usually leading
to a SIGBUS or SIGSEGV when trying to dereference that bad pointer.

Since most callers calloc the data, make platform_x11.c follow suit.

Fixes crashes with i915_dri.so when running dEQP-GLES2.

Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-06-06 22:45:27 -07:00
Mark Janes
04dac69752 tests/graw: use C99 print conversion specifier for 32 bit builds
Fixes formatting errors for 32 bit compilations, eg:

  error: format specifies type 'unsigned long' but the argument has
  type 'uint64_t' (aka 'unsigned long long') [-Werror,-Wformat]
  printf("result1 = %lu result2 = %lu\n", res1.u64, res2.u64);

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-06 14:39:41 -07:00
Alyssa Rosenzweig
30adeb7a53 panfrost/midgard: Fix crash with unused SSA values
Crash introduced in "b38dab101ca7e0896255dccbd85fd510c47d84d1" but not
adding a Fixes tag since it's our bug anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-06 13:44:27 -07:00
Boris Brezillon
3d661a4ef9 panfrost: Report sRGB colorspace as not supported
The driver does not support sRGB yet, so let's report it as unsupported.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-06 13:41:54 -07:00
Erik Faye-Lund
c0dfe8c6df docs: do not use div for line-breaking
HTML has the <p>-tag for this purpose. It adds some margins, but that
just makes this read better, IMO.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-06 17:51:45 +00:00
Erik Faye-Lund
f3235cfa70 docs: fixup code-tag positioning
This reads better if we include the asterisk in the code-block, as it's
part of the function-reference, even though it's not technically
speaking code. But as the <code>-tag isn't purely for code, this should
be fine.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-06 17:51:45 +00:00
Erik Faye-Lund
205f960e08 docs: add missing code-tags
Looks like I missed a few cases when I recently added more code-tags
here. So let's add these cases as well.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-06 17:51:45 +00:00
Erik Faye-Lund
54b7a1f175 docs: add accidentally dropped "at"
When rewriting 20c56e18c2 after review, I accidentally dropped the "at"
here. Sorry for that, and let's fix it up!

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 20c56e18c2 ("docs: use proper links instead of code-tags")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-06 17:51:45 +00:00
Gurchetan Singh
110f139f98 anv: allow NV12 <--> AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420 inter-op
AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420 is an implementation defined
flexible YUV format.  Most of the times, it's NV12 or YV12.
On Intel, NV12 is preferred since it can be used by the display
engine.  

This API adds a dependency between gralloc and buffer consumers,
unfortunately.  Right now, the code seems to work for i915 gralloc,
but not cros_gralloc.  Add a preprocessor flag to fix this.

TEST=android.graphics.cts.MediaVulkanGpuTest#testMediaImportAndRendering

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-06-06 09:20:03 -07:00
Connor Abbott
9d93d2a404 ac/nir: Remove stale TODO
While we're here, copy the comment explaining this from radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-06 17:14:28 +02:00
Connor Abbott
1d55b0da59 radeonsi: Don't force dcc disable for loads
When e9d935ed0e added force_dcc_off(), we forced it off for any
preloaded image descriptor which had stores associated with them, since
the same preloaded descriptors were used for loads and stores. However,
when the preloading was removed in 16be87c904, the existing logic was
kept despite it not being necessary anymore. The comment above
force_dcc_off() only mentions stores, so only force DCC off for stores.

Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-06 17:14:28 +02:00
Gert Wollny
10895c39c3 mesa/main: Expose EXT_clip_control and related enums and the function
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-06-06 12:25:17 +02:00
Gert Wollny
f1f6228a38 mapi/glapi/registry: Update gl.xml to latest upstream version
The old copy didn't include EXT_clip_control, so update it.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-06-06 12:25:12 +02:00
Gert Wollny
8657257a6e virgl: Enable CAP_CLIP_HALFZ if host supports it
On according hosts this enables the piglits as "pass":
  arb_clip_control-*

v2: sync flag with host

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-06-06 12:24:53 +02:00