Commit graph

85595 commits

Author SHA1 Message Date
Marek Olšák
a077185ea9 gallium: add PIPE_RESOURCE_FLAG_TEXTURING_MORE_LIKELY
For performance tuning in drivers. It filters out window system
framebuffers and OpenGL renderbuffers.

radeonsi will use this to guess whether a depth buffer will be read
by a shader. There is no guarantee about what will actually happen.

This is a departure from PIPE_BIND flags which are defined to be strict
but they are useless in practice.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-13 19:00:51 +02:00
Nicolai Hähnle
761388a0eb radeonsi: fix regression in image atomics
Caused by a bad rebase when pushing commit 76a940893.
2016-10-13 16:04:16 +02:00
Nicolai Hähnle
d413fbb159 st/mesa: fix vertex elements setup for doubles
Whether one or two slots are taken up by one API array depends on the
vertex shader, not on how the array is configured. When an array is
set up with fewer components than the shader expects, the high components
are undefined.

Fixes GL45-CTS.vertex_attrib_binding.basic-inputL-case1.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-13 15:41:36 +02:00
Nicolai Hähnle
15fc74905b st/glsl_to_tgsi: remove unnecessary ir_instruction argument from get_opcode
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-13 15:41:33 +02:00
Nicolai Hähnle
1d7685e52c st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-13 15:41:29 +02:00
Nicolai Hähnle
b234e37765 st/glsl_to_tgsi: simplify translate_tex_offset
This fixes a bug with offsets from uniforms which seems to have only been
noticed as a crash in piglit's
arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag
on radeonsi.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-13 15:41:11 +02:00
Nicolai Hähnle
76a940893d radeonsi: fix the coordinate overloading of llvm.amdgcn.image.atomic.cmpswap.*
Fixes GL45-CTS.shader_image_load_store.basic-allTargets-atomic*

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-13 10:17:42 +02:00
Nicolas Koch
35e2bfa6d9 radv: Return correct result in EnumeratePhysicalDevices
If pPhysicalDevices is too small for all physical devices,
the driver must return VK_INCOMPLETE. Since only a single
physical device is supported, this is only the case when
pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-13 09:11:13 +10:00
Ilia Mirkin
e6a693c447 st/mesa: only flip stipple pattern for winsys fbo's
Gallium is completely oblivious to whether the fbo is flipped or not.
Only flip the stipple pattern when the fbo is flipped as well. Otherwise
the driver has no idea when to unflip the pattern.

Fixes bin/gl-2.1-polygon-stipple-fs -fbo

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-12 17:04:16 -04:00
Emil Velikov
a4622305e6 swr: automake: add ar_eventhandlerfile_h.template to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-12 18:55:22 +01:00
Emil Velikov
3c419a941a radv: add all headers to the sources list
Otherwise they'll be missing from the tarball and the build will fail.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-12 18:55:20 +01:00
Ilia Mirkin
a48a343c29 nvc0/ir: fix textureGather with a single offset
Recent fix for non-const offsets broke the case of a single offset (vs 4
offsets). The later code relies on the offs array to contain null values
to tell whether they should be added onto the srcs list.

Fixes: 5239bd592 ("nvc0/ir: fix overwriting of value backing non-constant gather offset")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2016-10-12 13:18:14 -04:00
Ilia Mirkin
300b5ad023 nv50/ir: copy over value's register id when resolving merge of a phi
The offset needs to be properly copied over to the phi value, otherwise
it will get assigned to the base of the merge instead of the proper
location.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2016-10-12 13:18:14 -04:00
Nicolai Hähnle
789119d212 st/mesa: enable ARB_enhanced_layouts and turn the cap on
v2: mark llvmpipe & softpipe properly as well (Jason Wood)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
b5b4aa42ba st/glsl_to_tgsi: adjust swizzles and writemasks for explicit components
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
777dcf81b9 st/glsl_to_tgsi: explicitly track all input and output declaration
In order to be able to emit overlapping input and output array
declarations, we flip the logic of emitting those declarations on its
head: rather than iterating over slots and emitting the corresponding
declarations, we iterate over the declarations from GLSL and emit those.

v2: fix some regressions related to structs
v3: fix a regression in geometry and tessellation shader array handling

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v2)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v2)
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
2299a9940c st/glsl_to_tgsi: mark "gaps" in input/output arrays as used
In some cases, a shader may have an input/output array but not use some
entries in the middle. This happens with eON games, for example.

We emit declarations that cover the entire array range even if there are
some unused gaps. This patch now reflects that in the InputsRead etc.
fields to ensure the various input/outputMapping arrays are actually
correct, which will be important when we re-jiggle the way declarations
are emitted.

v2: fix a typo (Edward O'Callaghan)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
63193b9cde st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations
This optimization is incorrect with 64-bit operations, because the
channel-splitting logic in emit_asm ends up being applied twice to
the source operands.

A lucky coincidence of how the writemask test works resulted in this
optimization basically never being applied anyway. As far as I can tell,
the only case where it would (incorrectly) have been applied is something
like

    dvec2 d;
    float x = (float)d.y;

which nobody seems to have ever done. But the moral equivalent does occur
in one of the component layout piglit test.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
f5f3cadca3 st/glsl_to_tgsi: simpler fixup of empty writemasks
Empty writemasks mean "copy everything", so we can always just use the number
of vector elements (which uses the GLSL meaning here, i.e. each double is a
single element/writemask bit).

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
957d541089 st/glsl_to_tgsi: explicit handling of writemask for depth/stencil export
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
14aaaa1b4b glsl: dump explicit location when printing IR
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
2b460c750a tgsi/ureg: add ureg_DECL_output_layout
For specifying an exact location/component.

v2: change the order of parameters (Dave)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
047a7c7a0b tgsi/ureg: add layout/component input declarations
v2: change the order of parameters (Dave)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
f9a01f3872 tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays
v2: remove a tautological left-over assert (Marek)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2016-10-12 18:50:10 +02:00
Nicolai Hähnle
700a571f89 gallium: add PIPE_CAP_TGSI_ARRAY_COMPONENTS
This is a screen cap because drivers are expected to support it either
for all shader types or for none of them.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-10-12 18:50:10 +02:00
Tom Stellard
b33cb709fd radeonsi: Use the new image load/store intrinsic signatures
This patch requires LLVM r284024 or newer.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-12 16:42:43 +00:00
Tom Stellard
ff0df66e10 radeonsi: Add function for converting LLVM type to intrinsic string
The existing function only worked for integer types.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-12 16:42:07 +00:00
Tom Stellard
a96a7eae04 radeonsi: Refactor image store/load intrinsic name creation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-12 16:42:07 +00:00
Marek Olšák
d7e74b52bb winsys/amdgpu: fix infinite loop w/ RADEON_NOOP=1 caused by unsubmitted fences
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-12 18:29:40 +02:00
Marek Olšák
e4bbab9022 radeonsi: fix R600_DEBUG=precompile for shader-db
radeonsi no longer supports pixel shaders without interpolation optimizations,
which led to assertion failures in si_shader_ps when running shader-db.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-12 18:29:40 +02:00
Marek Olšák
40e1f7e09b radeonsi: use TC write-back instead of full cache invalidation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-12 18:29:40 +02:00
Marek Olšák
8cdce30cc2 radeonsi: implement TC L2 write-back (flush) without cache invalidation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-12 18:29:40 +02:00
Marek Olšák
65a4d55a9f radeonsi: don't invalidate VMEM L1 for memory barriers for index buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-12 18:29:40 +02:00
Samuel Pitoiset
87b06cab14 nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)
total instructions in shared programs :2286901 -> 2284473 (-0.11%)
total gprs used in shared programs    :335256 -> 335273 (0.01%)
total local used in shared programs   :31968 -> 31968 (0.00%)

                local        gpr       inst      bytes
    helped           0          41         852         852
      hurt           0          44          23          23

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-10-12 17:46:03 +02:00
Nicolai Hähnle
85ba409967 mapi: fix out-of-tree build dependencies
We shouldn't be using wildcard here in the first place, but changing that
is some effort. As it stands, make -p confirms that glapi_gen_mapi_deps only
contains mapi_abi.py when building outside the Mesa tree.

As a result, only some of the tables were updated when XML files change, but
not the tables for shared glapi. This change ensures that we pick up the
XML files and scripts from the source tree as dependencies also for shared
glapi.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-10-12 17:36:35 +02:00
Roland Scheidegger
7e86b2ddae draw: initialize shader inputs
This should make the code more robust if a shader tries to use inputs which
aren't defined by the vertex element layout (which usually shouldn't happen).

No piglit change.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-10-12 15:05:44 +02:00
Edward O'Callaghan
cfbf956dfd radv: trivial case stmt style fixups
Relocate a 'default:' to the end of a case stmt and fix an
indent issue.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2016-10-12 20:12:43 +11:00
Nicolas Koch
fd27d5fd92 anv: Return correct result in EnumeratePhysicalDevices
If pPhysicalDevices is too small for all physical devices,
the driver must return VK_INCOMPLETE.
Since only a single physical device is supported, this is only the case
when pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-11 22:58:27 -07:00
Kenneth Graunke
2871d4d687 anv: Allow vp_info to be NULL in 3DSTATE_CLIP code.
pViewportState may be NULL if rasterization is disabled.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-11 22:50:19 -07:00
Kenneth Graunke
ba38a9d380 anv: Fix anv_pipeline_validate_create_info assertions.
Many of these can be "NULL if the pipeline has rasterization disabled."
Also, we should assert that pMultisampleState exists.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-11 22:50:09 -07:00
Ilia Mirkin
389d6dedbe trace: add invalidate_resource callback
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-11 20:47:54 -04:00
Gustaw Smolarczyk
c3f3c6b0e8 radv/winsys: Fix radv_amdgpu_cs_grow min_size argument. (v2)
It's supposed to be how much at least we want to grow the cs, not the
minimum size of the cs after growth.

v2: Unbreak use_ib_bos.
    Don't mask the ib_size when !use_ib_bos, since it's not needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-12 09:06:30 +10:00
Grigori Goronzy
a22b5f28fb radv: fix strict aliasing violation
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-12 09:00:22 +10:00
Grigori Goronzy
0b539abcf4 radv: fix uninitialized variables
This gets rid of "may be used uninitialized" compiler warnings.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-12 09:00:22 +10:00
Grigori Goronzy
7ca44f8a33 radv: add missing unreachable
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-12 09:00:22 +10:00
Dave Airlie
8cc9f89d26 radv: remove the validation layer and some related bits.
As pointed out by Emil this isn't used in anv anymore,
and it was totally unused in radv anyways.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-12 08:57:09 +10:00
Dave Airlie
014ec78fb2 radv: drop entrypoint split out.
radv really doesn't need different dispatch per gen yet,
there really isn't that many differences yet.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-12 08:56:41 +10:00
Dave Airlie
12301c5418 radv: drop the RADV_CALL macro.
This is leftover from anv, and we really never needed it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-12 08:56:41 +10:00
Dave Airlie
fc28f89157 radv: check driver name before calling amdgpu.
This checks the kernel driver name is amdgpu before calling
libdrm_amdgpu.

This avoids the following error:
amdgpu_device_initialize: DRM version is 1.6.0 but this driver is only compatible with 3.x.x

when run on a machine with i915 graphics as well as amdgpu.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-12 08:56:41 +10:00
Dave Airlie
6215b47648 radv: fix memory leak from physical device if wsi fails
Inspired by patch from Edward O'Callaghan <funfunctor@folklore1984.net>
which didn't do it right.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-12 08:53:44 +10:00