Commit graph

92136 commits

Author SHA1 Message Date
Marek Olšák
38828094e9 radeonsi: update si_ce_needed_cs_space
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
edb59ef2dc radeonsi: do only 1 big CE dump at end of IBs and one reload in the preamble
A later commit will only upload descriptors used by shaders, so we won't do
full dumps anymore, so the only way to have a complete mirror of CE RAM
in memory is to do a separate dump after the last draw call.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
06690e63f7 radeonsi: remove early return in si_upload_descriptors
All updates of descriptors_dirty also set dirty_mask, so the return is
unnecessary. The next commit will want this function to be executed
even if dirty_mask == 0.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
b8f8d9e46c radeonsi: clamp indirect index to the number of declared shader resources
We'll do partial uploads of descriptor arrays, so we need to clamp
against what shaders declare.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
f07c15ef80 radeonsi: merge sampler and image descriptor lists into one
Sampler slots: slot[8], .. slot[39] (ascending)
Image slots: slot[7], .. slot[0] (descending)

Each image occupies 1/2 of each slot, so there are 16 images in total,
therefore the layout is: slot[15], .. slot[0]. (in 1/2 slot increments)

Updating image slot 2n+i (i <= 1) also dirties and re-uploads slot 2n+!i.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
5df24c3fa6 radeonsi: merge constant and shader buffers descriptor lists into one
Constant buffers: slot[16], .. slot[31] (ascending)
Shader buffers: slot[15], .. slot[0] (descending)

The idea is that if we have 4 constant buffers and 2 shader buffers, we only
have to upload 6 slots. That optimization is left for a later commit.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
d88ca12350 gallium/u_threaded: add a fast path for unbinding shader buffers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
d4c8f429d1 gallium/u_threaded: add a fast path for unbinding shader images
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
ae7f7e8162 st/mesa: silence a valgrind warning in u_threaded_context due to st_draw_vbo
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
767868ff6d glsl_to_tgsi: declare all SSBOs and atomics when indirect indexing is used
Only the first array element was declared, so tgsi_shader_info::
shader_buffers_declared didn't match what the shader was using.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Samuel Pitoiset
1468e29e02 radeonsi: get the sampler view type from inst->Texture for TG4
This will also magically fix this special lowering for
bindless samplers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 21:48:16 +02:00
Samuel Pitoiset
5cb2eee557 tgsi: store the sampler view type directly in the instruction
RadeonSI needs to do a special lowering for Gather4 with integer
formats, but with bindless samplers we just can't access the index.

Instead, store the return type in the instruction like the target.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 21:48:16 +02:00
Samuel Pitoiset
ac3f6bf608 tgsi: remove some unused OPCODE macros
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 21:48:16 +02:00
Tom Stellard
14e525a4d7 gallivm: Make sure module has the correct data layout when pass manager runs
The datalayout for modules was purposely not being set in order to work around
the fact that the ExecutionEngine requires that the module's datalayout
matches the datalayout of the TargetMachine that the ExecutionEngine is
using.

When the pass manager runs on a module with no datalayout, it uses
the default datalayout which is little-endian.  This causes problems
on big-endian targets, because some optimizations that are legal on
little-endian or illegal on big-endian.

To resolve this, we set the datalayout prior to running the pass
manager, and then clear it before creating the ExectionEngine.

This patch fixes a lot of piglit tests on big-endian ppc64.

Cc: mesa-stable@lists.freedesktop.org
2017-05-18 17:52:47 +00:00
Chad Versace
8f62d21bd7 egl: Partially revert 23c86c74, fix eglMakeCurrent
Fixes regressions in Android CtsVerifier.apk on Intel Chrome OS devices
due to incorrect error handling in eglMakeCurrent. See below on how to
confirm the regression is fixed.

This partially reverts

    commit 23c86c74cc
    Author:  Chad Versace <chadversary@chromium.org>
    Subject: egl: Emit error when EGLSurface is lost

The problem with commit 23c86c74 is that, once an EGLSurface became
lost, the app could never unbind the bad surface. Each attempt to unbind
the bad surface with eglMakeCurrent failed with EGL_BAD_CURRENT_SURFACE.

Specificaly, the bad commit added the error handling below. #2 and #3
were right, but #1 was wrong.

    1. eglMakeCurrent emits EGL_BAD_CURRENT_SURFACE if the calling
       thread has unflushed commands and either previous surface is no
       longer valid.

    2. eglMakeCurrent emits EGL_BAD_NATIVE_WINDOW if either new surface
       is no longer valid.

    3. eglSwapBuffers emits EGL_BAD_NATIVE_WINDOW if the swapped surface
       is no longer valid.

Whe I wrote the bad commit, I misunderstood the EGL spec language
for #1. The correct behavior is, if I understand correctly now, is
below. This patch doesn't implement the correct behavior, though, it
just reverts the broken behavior.

    - Assume a bound EGLSurface is no longer valid.
    - Assume the bound EGLContext has unflushed commands.
    - The app calls eglMakeCurrent. The spec requires eglMakeCurrent to
      implicitly flush. After flushing, eglMakeCurrent emits
      EGL_BAD_CURRENT_SURFACE and does *not* alter the thread's
      current bindings.
    - If the app calls eglMakeCurrent again, and the app inserts no
      commands into the GL command stream between the two eglMakeCurrent
      calls, then this second eglMakeCurrent succeeds without emitting an
      error.

How to confirm this fixes the regression:

    Download android-cts-verifier-7.1_r5-linux_x86-x86.zip from
    source.android.com, unpack, and `adb install CtsVerifier.apk`.
    Run test "Projection Cube". Click the Pass button (a
    green checkmark). Then run test "Projection Widget". Confirm that
    widgets are visible and that logcat does not complain about
    eglMakeCurrent failure.

    Then confirm there are no regressions in the cts-traded module that
    commit 263243b1 fixed:

        cts-tf > run cts --skip-preconditions --skip-device-info \
                 -m CtsCameraTestCases \
                 -t android.hardware.camera2.cts.RobustnessTest

    Tested with Chrome OS board "reef".

Fixes: 23c86c74 (egl: Emit error when EGLSurface is lost)
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Nicolas Boichat <drinkcat@chromium.org>
Cc: Emil Velikov <emil.velikov@collabora.com>
2017-05-18 10:25:52 -07:00
Iago Toral Quiroga
2322ddf548 anv: fix multiview for clear commands
According to the VK_KHX_multiview spec:

"Multiview causes all drawing and clear commands in the subpass to
behave as if they were broadcast to each view, where each view is
represented by one layer of the framebuffer attachments."

This adds support for multiview clears, which were missing in the
initial implementation.

v2 (Jason):
  - split multiview from regular case
  - Use for_each_bit() macro

Fixes new CTS multiview tests:
dEQP-VK.multiview.clear_attachments.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-18 11:53:25 +02:00
Nicolai Hähnle
70215a23c6 ac: add missing extern "C" guards
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:53 +02:00
Nicolai Hähnle
6c01c4b907 ac: add radeon_info::num_{sdma,compute}_rings
Vulkan needs them.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:53 +02:00
Nicolai Hähnle
c488bf24ed ac: add radeon_surf::htile_slice_size
Vulkan needs it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
98a2492290 ac_surface: use radeon_info from ac_gpu_info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
988c866212 ac/radeonsi: move radeon_info initialization to amd/common
v2: update Android.common.mk (Emil)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
de9dd4f9f1 ac/radeonsi: move struct radeon_info to ac_gpu_info.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
4d6e75776d ac/radeonsi: move some aspects of sanity checking to ac_surface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
00f466bad9 ac/radeonsi: add ac_compute_surface to automatically switch gfx6 vs. gfx9
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
8aabed64c3 ac/radeonsi: move the bulk of gfx9_surface_init to ac_surface
We can now merge the two *_surface_init functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:51 +02:00
Nicolai Hähnle
db77cd879b ac/radeonsi: move the bulk of gfx6_surface_init to ac_surface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:51 +02:00
Nicolai Hähnle
f187a49322 ac/radeonsi: move amdgpu_addr_create to ac_surface
v2:
- update Android.common.mk (Emil)
- rebase on top of Raven support

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-05-18 11:48:51 +02:00
Nicolai Hähnle
15a844986a ac/radeonsi: move surface definitions to new header ac_surface.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:51 +02:00
Nicolai Hähnle
377877ff5f st/mesa: remove an incorrect assertion
There is really no reason why the current DrawBuffer needs to be complete
at this point. In particular, the assertion gets hit on the X server side
in libglx when running .../piglit/bin/glx-get-current-display-ext -auto
(which uses indirect GLX rendering).

Fixes: 19b61799e3 ("st/mesa: don't cast the incomplete framebufer to st_framebuffer")
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:47:27 +02:00
Samuel Iglesias Gonsálvez
e69e5c7006 i965/vec4: load dvec3/4 uniforms first in the push constant buffer
Reorder the uniforms to load first the dvec4-aligned variables in the
push constant buffer and then push the vec4-aligned ones. It takes
into account that the relocated uniforms should be aligned to their
channel size.

This fixes a bug were the dvec3/4 might be loaded one part on a GRF and
the rest in next GRF, so the region parameters to read that could break
the HW rules.

v2:
- Fix broken logic.
- Add a comment to explain what should be needed to optimise the usage
  of the push constant buffer slots, as this patch does not pack the
  uniforms.

v3:
- Implemented the push constant buffer usage optimization.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-05-18 06:49:54 +02:00
Samuel Iglesias Gonsálvez
8aa6ada838 i965/vec4: fix swizzle and writemask when loading an uniform with constant offset
It was setting XYWZ swizzle and writemask to all uniforms, no matter if they
were a vector or scalar, so this can lead to problems when loading them
to the push constant buffer.

Moreover, 'shift' calculation was designed to calculate the offset in
DWORDS, but it doesn't take into account DFs, so the calculated swizzle
for the later ones was wrong.

The indirect case is not changed because MOV INDIRECT will write
to all components. Added an assert to verify that these uniforms
are aligned.

v2:
- Fix 'shift' calculation (Curro)
- Set both swizzle and writemask.
- Add assert(shift == 0) for the indirect case.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-05-18 06:49:54 +02:00
Samuel Iglesias Gonsálvez
354f7f2cb9 i965/vec4/gs: restore the uniform values which was overwritten by failed vec4_gs_visitor execution
We are going to add a packing feature to reduce the usage of the push
constant buffer. One of the consequences is that 'nr_params' would be
modified by vec4_visitor's run call, so we need to restore it if one of
them failed before executing the fallback ones. Same thing happens to the
uniforms values that would be reordered afterwards.

Fixes GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 when
the dvec4 alignment and packing patch is applied.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-05-18 06:49:28 +02:00
Eric Anholt
e8ea42d245 vc4: Don't allocate new BOs to avoid synchronization when they're shared.
If X11 did a software fallback to the entire screen, we would throw out
the BO the screen is scanning out from and allocate a new one.

Cc: mesa-stable@lists.freedesktop.org
2017-05-17 14:18:29 -07:00
Eric Anholt
50e78cd04f vc4: Drop pointless indirections around BO import/export.
I've since found them to be more confusing by adding indirections than
clarifying by screening off resources from the handle/fd import/export
process.
2017-05-17 14:18:26 -07:00
Eric Anholt
76e4ab5715 vc4: Drop the u_resource_vtbl no-op layer.
We only ever attached one vtbl, so it was a waste of space and
indirections.
2017-05-17 14:18:26 -07:00
Marek Olšák
bd4b224fa6 gallium/radeon: use a top-of-pipe timestamp for the start of TIME_ELAPSED
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 20:28:44 +02:00
Marek Olšák
4f50c91c32 mesa: don't check mapped buffers in every draw call if drivers allow it
Before: DrawElements (16 VBOs) w/ no state change: 4.34 million/s
After:  DrawElements (16 VBOs) w/ no state change: 8.80 million/s

This inefficiency was uncovered by Timothy Arceri's no_error work.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 20:28:44 +02:00
Marek Olšák
d02d8ea8b6 mesa: add gl_constants::AllowMappedBuffersDuringExecution
for skipping mapped-buffer checking in every GL draw call

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 20:28:44 +02:00
Marek Olšák
50189379fa gallium: add PIPE_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTION
for skipping mapped-buffer checking in every GL draw call

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 20:28:44 +02:00
Hans de Goede
84f764a759 glxglvnddispatch: Add missing dispatch for GetDriverConfig
Together with some fixes to xdriinfo this fixes xdriinfo not working
with glvnd.

Since apps (xdriinfo) expect GetDriverConfig to work without going to
need through the dance to setup a glxcontext (which is a reasonable
expectation IMHO), the dispatch for this ends up significantly different
then any other dispatch function.

This patch gets the job done, but I'm not really happy with how this
patch turned out, suggestions for a better fix are welcome.

Cc: Kyle Brenneman <kbrenneman@nvidia.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2017-05-17 20:02:18 +02:00
Tim Rowley
dce41f7728 swr: don't use AttributeSet with llvm >= 5
This change fixes the build break with llvm-svn.

r301981 of llvm-svn made add/remove of function attributes
use AttrBuilder instead of AttributeList.

Tested with llvm-3.9, llvm-4.0, llvm-svn.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-17 11:24:46 -05:00
Chih-Wei Huang
bfc0c23843 Android: correct libz dependency
Commit 6facb0c0 ("android: fix libz dynamic library dependencies")
unconditionally adds libz as a dependency to all shared libraries.
That is unnecessary.

Commit 85a9b1b5 introduced libz as a dependency to libmesa_util.
So only the shared libraries that use libmesa_util need libz.

Fix Android Lollipop build by adding the include path of zlib to
libmesa_util explicitly instead of getting the path implicitly
from zlib since it doesn't export the include path in Lollipop.

Fixes: 6facb0c0 "android: fix libz dynamic library dependencies"

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rob Herring <robh@kernel.org>
2017-05-17 14:04:18 +01:00
Timothy Arceri
f96edf72b4 mesa: add KHR_no_error support for glDispatchCompute*()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
d1894c42ef mesa: add DispatchCompute* helpers
These will be used to add KHR_no_error support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
07e14d561c mesa: move FLUSH_CURRENT() calls out of DispatchCompute*() validation
This is required to add KHR_no_error support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
f98411eaad mesa: compute.c C99 tidy up
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
64757e73de mesa: move DispatchCompute() validation to compute.c
This is the only place it is used so there is no reason for it to be
in api_validate.c

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
25591adc28 mesa: add KHR_no_error support for glBlendEquationSeparateiARB()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
9aff3c605b mesa: add blend_equation_separatei() helper
Will be used to add KHR_no_error support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
5c8252ba6f mesa: add KHR_no_error support for glBlendFunc*iARB()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00