Commit graph

82384 commits

Author SHA1 Message Date
Ilia Mirkin
438d421f8b nvc0: avoid crashing when there are holes in vertex array bindings
When using the "shared" vertex array configuration strategy, we bind
each of the buffers as a separate array. However there can be holes in
such vertex buffer lists, so just emit a disable for those.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2016-01-29 22:10:42 -05:00
Ilia Mirkin
899b1b98a4 nvc0: enable atomic counters and ssbo
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-29 22:10:42 -05:00
Ilia Mirkin
48cf392c0e nv50/ir: handle new TGSI MEMBAR opcode
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-29 21:22:48 -05:00
Ilia Mirkin
df043f0764 nvc0/ir: fix atomic compare-and-swap arguments
Teach the emitter that the two registers are sequential, and drop the
second arg entirely, in favor of a double-wide first argument.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-29 21:22:48 -05:00
Ilia Mirkin
7b9a77b905 nv50/ir: add support for indirect buffer loading
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-29 21:22:48 -05:00
Ilia Mirkin
2c4eeb0b5c nv50/ir: add SUQ op by reading the info from driver constbuf
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-29 21:22:47 -05:00
Ilia Mirkin
c3083c7082 nv50/ir: add support for BUFFER accesses
This largely leaves the existing image logic alone. When image support
is added this will have to be harmonized somehow.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-29 21:22:47 -05:00
Ilia Mirkin
abe427ebd2 nvc0: handle shader buffer memory barrier
Issue a MEM_BARRIER. No idea if this is sufficient. As there are no
tests for this, it'll have to do for now.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-29 21:22:38 -05:00
Ilia Mirkin
fe01be4ad5 nvc0: add state management for shader buffers
(address, length) pairs are uploaded to the driver constbuf as well to
make these values available to the shaders.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-29 21:06:07 -05:00
Ilia Mirkin
b4688c4615 nvc0: double per-shader stage driver constants area
We need to store a lot more info now with per-buffer address/size.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-29 21:06:06 -05:00
Ilia Mirkin
ae725d5746 trace: add support for set_shader_buffers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
v1 -> v2: add arg_begin/arg_end around buffer array
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-01-29 21:05:47 -05:00
Ilia Mirkin
fea25db925 st/mesa: enable ARB_shader_storage_buffer_object when supported
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-29 21:05:47 -05:00
Ilia Mirkin
6fb8fac853 st/mesa: add shader buffer barrier bit
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-29 21:05:47 -05:00
Ilia Mirkin
792bab24ac st/mesa: add support for memory barrier intrinsics
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)

v1 -> v2: use TGSI_MEMBAR defines
2016-01-29 21:05:47 -05:00
Ilia Mirkin
c0e1c54a4f st/mesa: use RESQ to find buffer size
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-01-29 21:05:47 -05:00
Ilia Mirkin
6880036694 st/mesa: add support for SSBO binding and GLSL intrinsics
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

v1 -> v2: some 80 char reformatting
2016-01-29 21:05:46 -05:00
Ilia Mirkin
9d6f9ccf6b st/mesa: add atomic counter support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-29 21:05:46 -05:00
Ilia Mirkin
0fddb677e6 mesa: add PROGRAM_IMMEDIATE, PROGRAM_BUFFER
This makes PROGRAM_IMMEDIATE a first-class gl_register_file type, and
adds PROGRAM_BUFFER to the list. These are used purely inside
glsl_to_tgsi conversion.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-29 21:05:35 -05:00
Ilia Mirkin
35f8488668 glsl: keep track of ssbo variable being accessed, add access params
Currently any access params (coherent/volatile/restrict) are being lost
when lowering to the ssbo load/store intrinsics. Keep track of the
variable being used, and bake its access params in as the last arg of
the load/store intrinsics.

If the variable is accessed via an instance block, then 'variable'
points to the instance block variable and not the field inside the
instance block that we are accessing. In order to check access
parameters for the field itself we need to detect this case and keep
track of the corresponding field struct so we can extract the specific
field access information from there instead.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
v1 -> v2: add tracking of struct field
v2 -> v3: minor adjustments based on Iago's feedback
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-01-29 21:05:08 -05:00
Ilia Mirkin
2b089c7ffe glsl: always initialize image_* fields, copy them on interface init
Interfaces can have image properties set in case they are buffer
interfaces. Make sure not to lose this information.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-29 21:04:56 -05:00
Ilia Mirkin
2ccc42fd2c tgsi: add MEMBAR opcode to handle memoryBarrier* GLSL intrinsics
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
v1 -> v2: add defines for the various bits
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-01-29 21:04:36 -05:00
Kristian Høgsberg Kristensen
f28645f71c anv: Don't disable snooping for mempools
There's an intermittent flushing problem with VkEvent that we need to
root cause. For now, using the snooping feature keeps the memory pools
up to date with GPU writes and fixes the problem.
2016-01-29 17:19:51 -08:00
Kristian Høgsberg Kristensen
0c4ef36360 anv: clflush is only orderered against mfence
We can't use the more fine-grained load and store fence commands (lfence
and mfence), since clflush is only guaranteed to be ordered with respect
to mfence.
2016-01-29 14:56:41 -08:00
Kristian Høgsberg Kristensen
31d3486bd2 anv: Limit flushing to the range of mapped memory 2016-01-29 14:56:41 -08:00
Ben Widawsky
89ec36f221 anv/cmd_buffer: Emit gen9 style SF state for CHV
The state for line width changes on Cherryview to use the GEN9 bits (for extra
precision).
2016-01-29 14:12:32 -08:00
Ben Widawsky
31508bd0ce anv/gen8: Extract SF state
For upcoming patch to address difference in Cherryview.
2016-01-29 14:11:53 -08:00
Michel Dänzer
30fcf241e1 winsys/amdgpu: Process RADEON_FLAG_* independently from RADEON_DOMAIN_*
In particular, AMDGPU_GEM_CREATE_CPU_GTT_USWC can affect even BOs created
in VRAM if they get evicted to GTT. In general there's no need to
restrict any of the flags to any particular domains.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-01-29 16:06:06 +09:00
Michel Dänzer
62f837e2ea winsys/amdgpu: Handle RADEON_FLAG_NO_CPU_ACCESS
Failing to do this was resulting in the kernel driver unnecessarily
leaving open the possibility of CPU access to tiled BOs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93862

(This change shouldn't be backported to stable branches, because
released versions of xf86-video-amdgpu unnecessarily try to map the
front buffer)

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-01-29 16:06:06 +09:00
Karol Herbst
29d09f8747 nv50/ir: optimize mad/fma with third argument 0 to mul
Very modest effect, but it's clearly the right thing to do.

total instructions in shared programs : 6131491 -> 6131398 (-0.00%)
total gprs used in shared programs    : 910157 -> 910131 (-0.00%)
total local used in shared programs   : 15328 -> 15328 (0.00%)

                local        gpr       inst      bytes
    helped           0          55          85          85
      hurt           0          26          20          20

Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-28 15:59:41 -05:00
Karol Herbst
3aa681449e nv50/ir: run DCE backwards
Reduces calls up to 50%

Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-28 15:34:29 -05:00
Karol Herbst
978ae28ca2 nv50/ir: optimize shl(shr(a, c), c) to and(a, ~((1 << c) - 1))
Following shader-db results on GK110:

total instructions in shared programs : 6141510 -> 6131491 (-0.16%)
total gprs used in shared programs    : 910187 -> 910157 (-0.00%)
total local used in shared programs   : 15328 -> 15328 (0.00%)

                local        gpr       inst      bytes
    helped           0          18         821         821
      hurt           0           0           0           0

Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-01-28 15:34:22 -05:00
Chad Versace
f8a4abcd15 anv: Do resolves at end of subpass 2016-01-28 10:49:50 -08:00
Chad Versace
bef8456ede anv/meta: Remove unneeded resolve pipeline
Vulkan does not allow resolving a single-sample image. So remove that
pipeline from anv_meta_state::resolve::pipelines.
2016-01-28 10:45:11 -08:00
Chad Versace
ac5594fa71 anv/meta_resolve: Remove redundant initialization params 2016-01-28 10:14:39 -08:00
Chad Versace
142da00486 anv: Drop const on anv_framebuffer::attachments
The attachments should be const, but the driver's function signatures
are generally not const-friendly.

Drop the const because it conflicts with upcoming
anv_cmd_buffer_resolve_subpass().
2016-01-28 10:03:00 -08:00
Chad Versace
22258e279d anv: Add anv_subpass::has_resolve
Indicates that the subpass has at least one resolve attachment.
2016-01-28 10:03:00 -08:00
Chad Versace
3d863e8dad anv/meta_resolve: Save/Restore viewport and scissor 2016-01-28 10:03:00 -08:00
Chad Versace
8487569fa7 anv/meta_resolve: Begin pass outside emit_resolve()
This refactor is preparation for handling subpass resolve attachments.
2016-01-28 10:03:00 -08:00
Chad Versace
2bab3cd681 anv/image: Update usage flags for multisample images
Meta resolves multisample images by binding them as textures. Therefore
we must add VK_IMAGE_USAGE_SAMPLED_BIT.
2016-01-28 10:03:00 -08:00
Ilia Mirkin
089f605439 glsl: disallow implicit conversions in ESSL shaders
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-01-28 11:31:19 -05:00
Axel Davy
dda7a84986 radeonsi: Add option for SI scheduler
Add a debug option to select the LLVM SI Machine Scheduler.
R600_DEBUG=sisched

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-01-28 17:22:44 +01:00
Jason Ekstrand
608b411e9f anv/device: Add a better version check.
We now check that the requested version is precicely within the range of
versions that we support.
2016-01-28 08:19:40 -08:00
Samuel Iglesias Gonsálvez
f9c43dd22f glsl: double-precision values don't support interpolation
ARB_gpu_shader_fp64 spec says:

  "This extension does not support interpolation of double-precision
  values; doubles used as fragment shader inputs must be qualified as
  "flat"."

Fixes the regressions added by commit 781d278:

arb_gpu_shader_fp64-double-gettransformfeedbackvarying
arb_gpu_shader_fp64-tf-interleaved
arb_gpu_shader_fp64-tf-interleaved-aligned
arb_gpu_shader_fp64-tf-separate

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93878
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2016-01-28 11:35:03 +01:00
Jason Ekstrand
6286a74f6b anv/device: Advertise 1.0.2 2016-01-27 22:02:51 -08:00
Jason Ekstrand
ec80d6388a anv/formats: Properly set FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT
This was added last minute and the API bumped to 1.0.2.
2016-01-27 22:02:06 -08:00
Jason Ekstrand
ac75746448 vulkan.h: Update to 1.0.2 2016-01-27 21:59:00 -08:00
Jason Ekstrand
c64bc5463d anv/device: Improve the api version check to allow 1.0.X 2016-01-27 21:56:46 -08:00
Eric Anholt
3fba517bdd vc4: Throttle outstanding rendering after submission.
Just make sure that after we've submitted, we get to at least 5
(global) submits ago before we go on to do more.  Prevents up to
seconds of lag with window movement in X with xcompmgr -c.  There may
be useful tuning to do in the future, but for now this gets us
usability.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
2016-01-27 20:05:37 -08:00
Eric Anholt
2a449ce7c9 vc4: Don't record the seqno of a failed job submit.
On an error return, the returned seqno will probably be unset, so we'd
lose track of what we've submitted so far for waiting on in the
future.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
2016-01-27 20:05:37 -08:00
Francisco Jerez
4604b2871a vtn: Improve accuracy of acos approximation.
The adjusted polynomial coefficients come from the numerical
minimization of the L2 norm of the relative error.  The old
coefficients would give a maximum relative error of about 15000 ULP in
the neighborhood around acos(x) = 0, the new ones give a relative
error bounded by less than 2000 ULP in the same neighborhood.
2016-01-27 19:55:21 -08:00