Commit graph

117072 commits

Author SHA1 Message Date
Jason Ekstrand
6c0f75c953 util/ralloc: Add helpers for growing zero-initialized memory
Unfortunately, we can't quite follow the standard C conventions for
these because ralloc doesn't know the sizes of pointers.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-05-14 12:30:22 -05:00
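The constraint described in 6c0f75c953 can be illustrated with a minimal sketch. It is built on plain realloc() rather than on ralloc, and grow_zeroed is a hypothetical name, not the helper the commit adds. Because the allocator does not report how large the old block was, the caller has to pass the old size so that only the newly added tail is cleared.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper built on realloc(), not on ralloc.  The old size
 * must come from the caller because the allocator does not expose it. */
static void *
grow_zeroed(void *ptr, size_t old_size, size_t new_size)
{
   assert(new_size > 0);
   assert(ptr != NULL || old_size == 0);

   void *grown = realloc(ptr, new_size);
   if (grown == NULL)
      return NULL; /* on failure the original block is left untouched */

   /* Zero only the bytes that did not exist before the call. */
   if (new_size > old_size)
      memset((char *)grown + old_size, 0, new_size - old_size);

   return grown;
}
```

Calling grow_zeroed(NULL, 0, n) behaves like a zeroing allocation, and later calls preserve the existing contents while clearing only the added bytes.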
Jason Ekstrand
6212326941 intel/fs: Stop doing extra RA calls
In the last phase of the schedule and RA loop, the RA call is redundant
if we spill.  Immediately afterwards, we're going to see that we
couldn't allocate without spilling and call back into RA and tell it to
go ahead and spill.  We've known about it for a while but we've always
brushed over it on the theory that, if you're going to spill, you'll be
calling RA a bunch anyway and what does one extra RA hurt?  As it turns
out, it hurts more than you'd expect.  Because the RA interference graph
gets sparser with each spill and the RA algorithm is more efficient on
sparser graphs, the RA call that we're duplicating is actually the most
expensive call in the RA-and-spill loop.

There's another extra RA call we do that's a bit harder to see which
this also removes.  If we try to compile a shader that isn't the minimum
dispatch width and it fails to allocate without spilling we call fail()
to set an error but then go ahead and do the first spilling RA pass and
only after that's complete do we detect the fail and bail out.  By
making minimum dispatch widths part of the spill condition, we side-step
this problem.

Getting rid of these extra RA calls takes the compile time of a nasty
Aztec Ruins shader from about 28 seconds to about 26 seconds on my
laptop.  It also makes shader-db 1.5% faster.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15311100 -> 15311100 (0.00%)
    instructions in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 355468050 -> 355468050 (0.00%)
    cycles in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    Total CPU time (seconds): 2524.31 -> 2486.63 (-1.49%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-14 12:30:22 -05:00
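The two redundant calls described in 6212326941 can be modelled with a toy loop. This is only a sketch under stated assumptions: toy_ra, toy_shader, NUM_SCHED_MODES and both compile functions are hypothetical stand-ins, not the code in the Intel backend. The toy allocator fails until the required number of spill rounds has been performed and counts how often it is called.

```c
#include <stdbool.h>

#define NUM_SCHED_MODES 3 /* assumed number of scheduling heuristics */

struct toy_shader {
   bool min_dispatch_width; /* is this the narrowest dispatch width? */
   unsigned spills_needed;  /* spill rounds required before RA succeeds */
   unsigned ra_calls;       /* number of RA invocations so far */
};

/* Toy allocator: succeeds once no more spills are needed; when allowed
 * to spill it performs one spill round and asks to be called again. */
static bool
toy_ra(struct toy_shader *s, bool can_spill)
{
   s->ra_calls++;
   if (s->spills_needed == 0)
      return true;
   if (can_spill)
      s->spills_needed--;
   return false;
}

/* Old flow: the last heuristic still runs RA without spilling, and a
 * failure at a non-minimum width is noticed only after one spilling pass. */
static bool
compile_old(struct toy_shader *s)
{
   bool allocated = false;
   for (unsigned i = 0; i < NUM_SCHED_MODES && !allocated; i++)
      allocated = toy_ra(s, false);

   bool failed = !allocated && !s->min_dispatch_width;

   while (!allocated) {
      allocated = toy_ra(s, true);
      if (failed)
         return false;
   }
   return true;
}

/* New flow: the last heuristic may spill right away, but only at the
 * minimum dispatch width; other widths bail out without a spilling pass. */
static bool
compile_new(struct toy_shader *s)
{
   bool allocated = false;
   for (unsigned i = 0; i < NUM_SCHED_MODES && !allocated; i++) {
      bool last = (i == NUM_SCHED_MODES - 1);
      allocated = toy_ra(s, last && s->min_dispatch_width);
   }

   if (!allocated && !s->min_dispatch_width)
      return false;

   while (!allocated)
      allocated = toy_ra(s, true);
   return true;
}
```

In this model each flow reaches the same result, and compile_new makes one fewer RA call whenever spilling is needed or the shader is not at the minimum dispatch width.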
Jason Ekstrand
41b310e219 util/ra: Improve the performance of ra_simplify
The most expensive part of register allocation is the ra_simplify step
which is a fixed-point algorithm with a worst-case complexity of O(n^2)
which adds the registers to a stack which we then use later to do the
actual allocation.  This commit uses bit sets and changes the core loop
of ra_simplify to first walk 32-node chunks and then walk each chunk.
This lets us skip whole 32-node chunks in one go based on bit operations
and compute the minimum q value potentially 32x as fast.  Of course, the
algorithm still has the same fundamental O(n^2) run-time but the
constant is now much lower.

In the nasty Aztec Ruins compute shader, this shaves a full four seconds
off the 30s compile time for a release build of mesa.  In a debug build
(needed for accurate stack traces), perf says that ra_select takes 20%
of runtime before this patch and only 5-6% of runtime after this patch.
It also makes shader-db runs faster.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15311100 -> 15311100 (0.00%)
    instructions in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 355468050 -> 355468050 (0.00%)
    cycles in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    Total CPU time (seconds): 2602.37 -> 2524.31 (-3.00%)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-05-14 12:30:22 -05:00
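The chunked walk described in 41b310e219 might look roughly like the following sketch. The names min_q_node and NO_NODE are hypothetical, and the real ra_simplify loop does more than this; the sketch only shows how a bit set lets the search skip 32 nodes at a time when none of them is a candidate.

```c
#include <limits.h>
#include <stdint.h>

#define NO_NODE UINT_MAX /* hypothetical "nothing found" marker */

/* Among the nodes whose bit is set in `candidates`, return the index of
 * the one with the smallest q value, or NO_NODE if no bit is set. */
static unsigned
min_q_node(const uint32_t *candidates, unsigned num_words, const unsigned *q)
{
   unsigned best = NO_NODE;
   unsigned best_q = UINT_MAX;

   for (unsigned w = 0; w < num_words; w++) {
      uint32_t bits = candidates[w];
      if (bits == 0)
         continue; /* one comparison rules out a whole 32-node chunk */

      for (unsigned b = 0; b < 32; b++) {
         if (!(bits & (UINT32_C(1) << b)))
            continue;

         unsigned node = w * 32 + b;
         if (best == NO_NODE || q[node] < best_q) {
            best = node;
            best_q = q[node];
         }
      }
   }
   return best;
}
```

The outer loop is still linear in the number of nodes, which matches the commit's point that the complexity is unchanged and only the constant factor drops.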
Jason Ekstrand
e1511f1d4c util/ra: Only update q_total if the reg is not assigned
We only use q_total if the reg is not assigned so there's no point in
updating it if the reg is already assigned.  This has no known perf benefit
but it will reduce churn in a future commit.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-05-14 12:30:22 -05:00
Jason Ekstrand
9d6d1f47e7 util/ra: Only update best_optimistic_node if !progress
This shaves about half a second off the 30 second compile time of one of
the compute shaders in Aztec ruins.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-05-14 12:30:22 -05:00
Jason Ekstrand
de56d3a2d1 util/ra: Make in_stack a bitset in the graph
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-05-14 12:30:22 -05:00
Jason Ekstrand
7720ad65ae util/ra: Get rid of tabs
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-05-14 12:30:22 -05:00
Chia-I Wu
34810f4237 virgl: clean up virgl_res_needs_flush
Add comments and some minor cleanups.

v2: document the function

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> (v1)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-05-14 17:00:22 +00:00
Chia-I Wu
08241624ad virgl: comment on a sync issue in transfers
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-05-14 17:00:22 +00:00
Chia-I Wu
76e45534d2 virgl: PIPE_TRANSFER_READ does not imply flush
virgl_res_needs_flush should suffice.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-05-14 17:00:22 +00:00
Chia-I Wu
9f8521882a virgl: do not skip readback because of explicit flush
Both apps and we (see virgl_buffer_transfer_flush_region) might
flush regions that are unmodified.  We have to read back for those
flushes.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-05-14 17:00:22 +00:00
Chia-I Wu
be8eeb3b59 virgl: remove unused virgl_transfer_inline_write
It currently has no user and is probably incorrect (resource_wait is
required in some more cases).  Remove it so that we can focus on
transfers first.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-05-14 17:00:22 +00:00
Nanley Chery
e81392868e iris/resource: Drop redundant checks for aux support
Drop some checks that are already done by ISL.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-14 16:23:12 +00:00
Nanley Chery
75a3947af4 iris/resource: Fall back to no aux if creation fails
No surface requires an auxiliary surface to operate correctly. Fall back
to an uncompressed surface if mesa fails to create and allocate an
auxiliary surface. This enables adding more restrictions to ISL without
having to update iris.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-14 16:23:12 +00:00
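The fallback described in 75a3947af4 can be sketched with toy types. Everything here (toy_resource, toy_alloc_aux, the `supported` flag) is a hypothetical stand-in and not the iris or ISL interface; the point is only that a failed aux allocation downgrades the resource instead of failing its creation.

```c
#include <stdbool.h>
#include <stdlib.h>

enum toy_aux_usage { TOY_AUX_NONE, TOY_AUX_CCS };

struct toy_resource {
   enum toy_aux_usage aux_usage;
   void *aux_storage;
};

/* Stand-in for aux surface creation.  `supported` models whether the
 * surface layout code accepts this surface for the requested aux type. */
static bool
toy_alloc_aux(struct toy_resource *res, bool supported)
{
   if (!supported)
      return false;

   res->aux_storage = calloc(1, 64);
   return res->aux_storage != NULL;
}

static void
toy_resource_init(struct toy_resource *res, enum toy_aux_usage wanted,
                  bool aux_supported)
{
   res->aux_usage = wanted;
   res->aux_storage = NULL;

   /* Any failure to create the aux surface falls back to no aux. */
   if (wanted != TOY_AUX_NONE && !toy_alloc_aux(res, aux_supported))
      res->aux_usage = TOY_AUX_NONE;
}
```

With this shape, a new restriction in the layout code only changes which surfaces end up uncompressed; the caller needs no matching check.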
Nanley Chery
1423b78633 i965/miptree: Refactor intel_miptree_supports_ccs_e()
Update and rename this function to format_supports_ccs_e() to better
match its behavior.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-14 16:23:12 +00:00
Nanley Chery
779bd8d332 i965/miptree: Drop intel_*_supports_hiz()
intel_tiling_supports_hiz() and intel_miptree_supports_hiz() duplicate
much of the work done by isl_surf_get_hiz_surf(). Replace them with simple
expressions.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-14 16:23:12 +00:00
Nanley Chery
29a13eb71d isl: Add restrictions to isl_surf_get_hiz_surf()
Import some restrictions from intel_tiling_supports_hiz() and
intel_miptree_supports_hiz().

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-14 16:23:12 +00:00
Nanley Chery
942755bec4 i965/miptree: Drop intel_*_supports_ccs()
intel_tiling_supports_ccs() and intel_miptree_supports_ccs() duplicate
much of the work done by isl_surf_get_ccs_surf(). Drop them both and index
a boolean array to choose CCS_D in intel_miptree_choose_aux_usage().

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-14 16:23:12 +00:00
Nanley Chery
d57242190e isl: Add restriction and comments to isl_surf_get_ccs_surf()
Import some restrictions and comments from intel_miptree_supports_ccs().

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-14 16:23:12 +00:00
Nanley Chery
91a42537d1 i965/miptree: Drop intel_miptree_supports_mcs()
This function duplicates much of the work done by isl_surf_get_mcs_surf().
Replace it with a simple expression.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-14 16:23:12 +00:00
Nanley Chery
1de089797c isl: Modify restrictions in isl_surf_get_mcs_surf()
Import some restrictions from intel_miptree_supports_mcs() and don't
assume that the caller knows which device generations are supported.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-14 16:23:12 +00:00
Nanley Chery
cf758c4182 i965/miptree: Fall back to no aux if creation fails
No surface requires an auxiliary surface to operate correctly. Fall back
to an uncompressed surface if mesa fails to create and allocate an
auxiliary surface. This enables adding more restrictions to ISL without
having to update i965.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-14 16:23:12 +00:00
Mathias Fröhlich
fc455797c1 mesa: Set _NEW_VARYING_VP_INPUTS iff varying_vp_inputs are set.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2019-05-14 18:09:49 +02:00
Mathias Fröhlich
b4b1df5a17 mesa: Avoid setting _NEW_VARYING_VP_INPUTS in non fixed function mode.
Instead of checking the API variant on entry to set_varying_vp_inputs
to decide whether we can ever be interested in fixed function processing,
we can check whether we are actually doing fixed function processing.
To check this we can use the immediately updated
gl_context::VertexProgram._VPMode value that tells us if we have a
user provided shader program or if we are in fixed function processing
either through an internal TNL shader or directly through hardware.
When doing so, we also need to recheck the varying_vp_inputs variable
at the time gl_context::VertexProgram._VPMode is set to VP_MODE_FF.
Put asserts at the consumers of gl_context::varying_vp_inputs to make
sure gl_context::VertexProgram._VPMode is set to VP_MODE_FF. At that
point gl_context::varying_vp_inputs should be up to date.

By not looking at the OpenGL API for this decision we should actually
catch more cases where we can avoid setting a state change flag, including
the ones where we cannot get into VP_MODE_FF by the choice of the API.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2019-05-14 18:09:49 +02:00
Mathias Fröhlich
663f93c869 mesa: Fix test for setting the _NEW_VARYING_VP_INPUTS flag.
The precondition stated in the comment is not true. The values mentioned are
only set from _mesa_update_state which in turn may not yet be called.
For now set the _NEW_VARYING_VP_INPUTS flag a bit more often, we will narrow
that down to a minimum again in a later patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2019-05-14 18:09:49 +02:00
Mathias Fröhlich
df50af19d3 mesa: Make _mesa_set_varying_vp_inputs static in state.c.
It is no longer used outside that file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2019-05-14 18:09:49 +02:00
Mathias Fröhlich
99952579f3 mesa: Fix old outdated variable name in a comment.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2019-05-14 18:09:49 +02:00
Mathias Fröhlich
e634ba5116 mesa/vbo: Update Comment to what is actually happening.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2019-05-14 18:09:49 +02:00
Jonas Ådahl
903ad59407 wayland/egl: Ensure correct buffer size when allocating
Whenever a buffer is allocated, e.g. by the first draw call or EGL call after a
buffer swap, make sure the size is up to date. Prior to this commit, we
failed to do so when querying the buffer age, or swapping buffers
without any prior EGL call or draw call.

Signed-off-by: Jonas Ådahl <jadahl@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-05-14 15:33:35 +00:00
Paulo Zanoni
73055ae1c9 egl: check if a window/pixmap is already used on surface creation
The spec says we can't create another surface if we already created a
surface with the given window or pixmap. Implement this check.

This behavior is exercised by piglit/egl-create-surface.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-05-14 12:41:14 +00:00
Paulo Zanoni
04ecda3b3c egl: store the native surface pointer in struct _egl_surface
Each platform stores this in a different place:
  - platform_drm uses dri2_surf->gbm_surf->base
  - platform_android uses dri2_surf->window
  - platform_wayland uses dri2_surf->wl_win
  - platform_x11 uses dri2_surf->drawable
  - platform_x11_dri3 uses dri3_surf->loader_drawable.drawable
  - haiku doesn't even store it!

We need access to the native surface since the specification asks us
to refuse creating a new surface if there's already an EGLSurface
associated with native_surface.

An alternative to this patch would be to create a new
API.GetNativeWindow callback that each platform would have to
implement. While that's something we can definitely do, I prefer
this approach.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-05-14 12:41:14 +00:00
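Taken together, the two EGL commits above (04ecda3b3c and 73055ae1c9) amount to keeping the native handle in the common surface struct and scanning existing surfaces before creating a new one. The following is a minimal sketch under that reading; toy_display, toy_surface and the linked list are hypothetical and do not mirror Mesa's internal resource lists.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_surface {
   void *native_surface;     /* native window or pixmap handle */
   struct toy_surface *next;
};

struct toy_display {
   struct toy_surface *surfaces; /* all surfaces created on this display */
};

/* Return true if some existing surface already wraps `native`. */
static bool
native_surface_in_use(const struct toy_display *disp, const void *native)
{
   for (const struct toy_surface *s = disp->surfaces; s != NULL; s = s->next) {
      if (s->native_surface == native)
         return true;
   }
   return false;
}

/* Refuse to create a second surface for the same native handle. */
static bool
toy_create_surface(struct toy_display *disp, struct toy_surface *surf,
                   void *native)
{
   if (native_surface_in_use(disp, native))
      return false; /* a real implementation would report an EGL error here */

   surf->native_surface = native;
   surf->next = disp->surfaces;
   disp->surfaces = surf;
   return true;
}
```

Storing the handle in one common place is what makes the scan platform independent; with per-platform fields, each backend would need its own lookup.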
Samuel Pitoiset
9520e7c1e9 radv: add support for VK_KHR_uniform_buffer_standard_layout
Nothing to do.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-14 09:15:28 +02:00
Gert Wollny
865b9ddae4 softpipe/buffer: load only as many components as the buffer resource type provides
Otherwise we risk reading past the end of the buffer.

In addition, change the loop counters to unsigned to be consistent
with the types.

Fixes: afa8707ba9
    softpipe: add SSBO/shader atomics support.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-05-14 06:49:43 +00:00
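The overread that 865b9ddae4 guards against can be shown with a small sketch. The function load_components is hypothetical, it assumes 32-bit components, and zero-filling the missing components is an assumption of this example rather than something the commit states.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read at most `format_components` 32-bit values starting at `offset`.
 * Components the format does not provide are zero-filled (an assumption
 * of this sketch) instead of being read from memory past the data. */
static void
load_components(const uint8_t *buf, size_t offset,
                unsigned format_components, uint32_t out[4])
{
   for (unsigned c = 0; c < 4; c++) {
      if (c < format_components)
         memcpy(&out[c], buf + offset + c * sizeof(uint32_t),
                sizeof(uint32_t));
      else
         out[c] = 0;
   }
}
```

Looping over all four components unconditionally would touch up to 16 bytes even when the resource only holds 4 or 8, which is the out-of-bounds read the commit describes. The unsigned loop counter matches the unsigned component count.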
Tomeu Vizoso
1050273094 panfrost: ci: Reduce batch size to 3000
As with the previous value of 5000 we seemed to be reaching OOM in some
circumstances.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-14 07:43:11 +02:00
Tomeu Vizoso
9beb8aedeb panfrost: ci: Update expectations
Since last Friday, these two tests have been fixed:

dEQP-GLES2.functional.shaders.functions.control_flow.return_in_nested_loop_fragment
dEQP-GLES2.functional.shaders.linkage.varying_7

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-14 07:43:06 +02:00
Eric Anholt
db329260bf freedreno: Fix warning on printing a uint64_t using %llx.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-05-13 15:37:01 -07:00
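The commit above (db329260bf) does not say which fix it uses. One portable way to print a uint64_t, shown here as a sketch with the hypothetical helper format_iova, is the PRIx64 macro from <inttypes.h>. Casting the value to unsigned long long and keeping %llx is the other common option.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit value as 16 hex digits without relying on uint64_t
 * being the same type as unsigned long long. */
static int
format_iova(char *dst, size_t len, uint64_t iova)
{
   return snprintf(dst, len, "0x%016" PRIx64, iova);
}
```

On platforms where uint64_t is unsigned long, passing it to %llx triggers a format warning even though both types are 64 bits wide; PRIx64 expands to the right length modifier in either case.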
Eric Anholt
40dd28acc3 freedreno: Silence compiler warnings about "*" in boolean context.
It sure looks like we just want both of them to be nonzero, and && is
probably going to be cheaper than * anyway.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-05-13 15:37:01 -07:00
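The change in 40dd28acc3 can be illustrated with a two-line sketch. The function has_area and its parameters are hypothetical and not taken from freedreno.

```c
#include <stdbool.h>

/* "Both values are nonzero", written with && rather than with a product
 * such as `return width * height;`, which compilers may flag as
 * "'*' in boolean context". */
static bool
has_area(unsigned width, unsigned height)
{
   return width != 0 && height != 0;
}
```

Besides silencing the warning, the && form stays correct when the product would wrap: with 32-bit unsigned, 65536 * 65536 evaluates to 0 even though both operands are nonzero.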
Eric Anholt
06168d3f6a freedreno: Silence compiler warnings about uninit 'layers'
My gcc can't see that the uninitialized value from the PIPE_BUFFER case
isn't used from the !PIPE_BUFFER cases later.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-05-13 15:37:01 -07:00
Eric Anholt
c49f0159bd freedreno: Quiet compiler warnings on 64-bit.
__u64 is a ulonglong on x86_64, not uint64_t, so my gcc was complaining
about the wrong type being passed in.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-05-13 15:37:01 -07:00
Eric Anholt
0734905d9a freedreno: Make emacs indent the way robclark's eclipse does.
The .editorconfig helps with the tabs, but we've got this
two-tabs-from-previous-indentation line continuation style that requires
whacking the c-file-offsets.  This will throw emacs warnings when first
opening a file in the directory; press '!' to shut it up for the future.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-05-13 15:37:01 -07:00
Eric Anholt
257999d9a8 freedreno: Make .editorconfig match .dir-locals.el.
The editorconfig takes precedence over dir-locals in emacs26 with
editorconfig enabled, so the /.editorconfig was affecting these
directories.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-05-13 15:37:01 -07:00
Jason Ekstrand
0745d4bd96 anv: Implement VK_KHR_uniform_buffer_standard_layout
There's no real work to do here since we already support scalar block
layout which is a direct superset of what this extension allows.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-05-13 17:20:33 -05:00
Jason Ekstrand
b464504777 vulkan: Update the XML and headers to 1.1.108
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-05-13 17:20:33 -05:00
Jason Ekstrand
072227da0a tu/entrypoints: Import copy
It's used without being imported
2019-05-13 17:20:33 -05:00
Karol Herbst
fc800af83b nv50/ir/nir: make use of SYSTEM_VALUE_MAX when iterating read sysvals
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
2019-05-13 23:40:40 +02:00
Karol Herbst
358e52383c nv50/ir/nir: prefer to shift 1ull instead of 1ll
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
2019-05-13 23:40:40 +02:00
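The commit above (358e52383c) gives no rationale, but a likely one is that shifting a signed 1ll can reach the sign bit. A minimal sketch in C, with the hypothetical helper bit64:

```c
#include <assert.h>
#include <stdint.h>

/* Build a 64-bit mask with only bit n set.  Using 1ull keeps the shift
 * in unsigned arithmetic, so n == 63 is well defined. */
static uint64_t
bit64(unsigned n)
{
   assert(n < 64);
   return 1ull << n;
}
```

In C, `1ll << 63` is undefined because the result is not representable in a signed long long, while `1ull << 63` is simply the top bit of the unsigned type.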
Bas Nieuwenhuizen
1619f20883 radv: Clean up signalled and submitted fields from winsys fences.
Other types like syncobj do not need it, so let's make things a bit more uniform.

Also reduce confusion about what signalled/submitted referred to (especially with
imported fences).

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-05-13 20:36:29 +00:00
Samuel Pitoiset
5555db103e radv: bump reported version to 1.1.107
VK_AMD_draw_indirect_count has been promoted with the suffix
changed to KHR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-13 21:38:01 +02:00
Eric Anholt
60a64f028d v3d: Use driconf to expose non-MSAA texture limits for Xorg.
The V3D 4.2 HW has a limit to MSAA texture sizes of 4096.  With non-MSAA,
we can go up to 7680 (actually probably 8138, but that hasn't been
validated by the HW team).  Exposing 7680 in X11 will allow dual 4k displays.
2019-05-13 12:03:11 -07:00
Eric Anholt
0c31fe9ee7 gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE.
The _LEVELS cap assumes that the max is always a power of two.  For V3D 4.2,
we can support non-power-of-two, non-MSAA texture sizes up to 7680, which will
let X11 support dual 4k displays on newer hardware.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-13 12:03:08 -07:00
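The reason a level count cannot express a limit such as 7680 can be seen in a short sketch. Both helper names are hypothetical and only model the arithmetic between a mip level count and a maximum size; they are not Gallium functions.

```c
#include <assert.h>

/* Largest size expressible by a level count: always a power of two. */
static unsigned
max_size_from_levels(unsigned levels)
{
   assert(levels >= 1 && levels <= 32);
   return 1u << (levels - 1);
}

/* Number of mip levels in a full chain starting at `size`
 * (halving and rounding down until the size reaches 1). */
static unsigned
levels_from_max_size(unsigned size)
{
   unsigned levels = 0;
   while (size > 0) {
      levels++;
      size >>= 1;
   }
   return levels;
}
```

A 7680-wide texture has 13 levels, and 13 levels map back to a maximum size of 4096, so a driver reporting levels would lose the range between 4096 and 7680. Reporting the size directly avoids that rounding.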