Commit graph

100390 commits

Author SHA1 Message Date
Glenn Kennard
77b1b33724 r600: clean up initial shader register setup
This is taken from Glenn Kennards scratch series, but separated
out as a cleanup by me.

Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 16:21:35 +10:00
Roland Scheidegger
b936f4d1ca r600: partly fix sampleMaskIn value
The hw gives us coverage for pixel, not for individual fragment shader
invocations, in case execution isn't per pixel (eg, unlike cm, actually
cannot do "real" minSampleShading, it's either per-pixel or per-fragment,
but it doesn't really make a difference here).
Also, with msaa disabled, the hw still gives us a mask corresponding to
the number of samples, where GL requires this to be 1.
Fix this up by masking the sampleMaskIn bits with the bit corresponding to
the sampleID, if we know this shader is always executed at per-sample
granularity. (In case of a per-sample frequency shader and msaa disabled,
the sampleID will always be 0, so this works just fine there.)
Fixing this for the minSampleShading case will need a shader key (radeonsi
uses the prolog part for) (for eg, could get away with a single bit, cm
would need more bits depending on sample/invocation ratio, or read the
bits from a uniform), unless we'd want to always use a sample mask uniform
(which is probably not a good idea, as it would make the ordinary common
msaa case slower for no good reason).
This fixes some parts of piglit arb_sample_shading-samplemask (with fixed
test), in particular those which use a sampleID, still failing others
as expected.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
07d724326a r600: clean up fragment shader input scan code
For some reason, we were iterating through the code twice (first just for
instructions needing barycentrics, then for instructions and input dcls).
Move things around slightly so this is no longer necessary.
There also was a unnedeed enabling of the fixed_pt_position_gpr - this is only
needed if the per-sample interpolation comes from an input, not from an
instruction (just move the assert where it belongs) (since the sample id to
sample from comes from a tgsi src in this case, and isn't sampleID).
Otherwise there should be no functional change.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
6fd3c39590 mesa: (trivial) remove unused ignore_sample_qualifier_parameter
This parameter for _mesa_get_min_incations_per_fragment() was once used
by the intel driver, but it's long gone.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@vmware.com>
2018-02-08 04:07:52 +01:00
Roland Scheidegger
becc7faae2 r600/cm: (trivial) code cleanup for emitting msaa state
No functional change (compile tested only).

Reviewed-by: Dave Airlie <airlied@redhate.com>
2018-02-08 04:07:52 +01:00
Brian Paul
b99cb13002 tgsi: use tgsi_semantic enum type in ureg code
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:43:01 -07:00
Brian Paul
174f3a4ab7 st/mesa: use tgsi_semantic enum type
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:43:01 -07:00
Brian Paul
0f7be4fc16 tgsi: use TGSI enum types in ureg code
v2: fix enum tgsi_interpolate_mode/loc typo.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:42:39 -07:00
Brian Paul
9f9ce1625f st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Brian Paul
6321b1bd40 gallium/util: replace uint with tgsi enum types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Brian Paul
15874338ff gallium/util: replace unsigned with tgsi enum types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-02-07 18:38:04 -07:00
Fredrik Höglund
5a38d8f103 radv: implement VK_EXT_external_memory_host
Ported from the radeonsi GL_AMD_pinned_memory implementation.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-08 00:46:07 +01:00
Dave Airlie
5dd385f378 r600: fix rendering regression on r6/7 gpus
Fixes: 2d5b5d267e (r600: work out target mask at framebuffer bind.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104989

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-08 09:37:09 +10:00
Grazvydas Ignotas
f91aa68ac6 radeonsi: avoid int-to-pointer-cast warnings on 32bit
I hope the actual dropping of MSB is ok, but that's what's already
happened before this change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 01:13:58 +02:00
Grazvydas Ignotas
13ada91740 gallium/hud: update some query functions
It seems these were missed when struct pipe_context * argument was
added to hud_graph::query_new_value.

Fixes: 3132afdf4c "gallium/hud: pass pipe_context explicitly to most functions"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 01:12:07 +02:00
Roland Scheidegger
09f49b9e50 Revert "gallium: build ddebug, noop, rbug, trace as part of auxiliary"
This reverts commit 6f82b8d8d0.

This broke scons build, and reportedly clover with autotools/meson too.
2018-02-07 23:47:39 +01:00
Marek Olšák
6f82b8d8d0 gallium: build ddebug, noop, rbug, trace as part of auxiliary
Building gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU.
(gallium build time is reduced by 15% when building only radeonsi)

Non-recursive makefiles are great!
2018-02-07 22:08:34 +01:00
Roland Scheidegger
def09f8db0 u_blit: (trivial) fix bogus argument order for set_fragment_shader
Amazingly this still worked sometimes, albeit I'm not even sure why...
This fixes d7bec6f7a6.
2018-02-07 22:03:18 +01:00
Andres Rodriguez
83990dd529 mesa: fix incorrect type when allocating arrays
The array members are have type 'struct gl_buffer_object *'

Found by coverity.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2018-02-07 14:50:21 -05:00
Roland Scheidegger
d7bec6f7a6 u_blit,u_simple_shaders: add shader to convert from xrbias format
We need this to handle some oddball dx10 format
(DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM). What you can do with this
format is very limited, hence we don't want to add it as a gallium
format (we could not express the properties of this format as
ordinary format properties neither, so like all special formats
it would need specific code for handling it in any case).
While here, also nuke the array for different shaders for different
writemasks, as it was not actually used (always full masks are
passed in for generating shaders).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-02-07 17:09:37 +01:00
Roland Scheidegger
afd1e9be17 u_simple_shaders: fix mask handling in util_make_fragment_tex_shader_writemask
The writemask handling was busted, since writing defaults to output
meant they got overwritten by the tex sampling anyway. Albeit the
affected components were undefined, so maybe with some luck it
still would have worked with some drivers - if not could as well
kill it... (This would have affected u_blitter but not u_blit since
the latter always used xyzw mask.)

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-07 17:08:24 +01:00
Bas Nieuwenhuizen
5d754872b5 autotools: Only build libmesa-st-tests-common.a for tests.
We don't need the library if we don't build tests, and building
it adds a dependency on gtest which adds a dependency on cxxabi.h.

Fixes: 6569b33b6e "mesa/st/tests: unify MockCodeLine* classes"
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2018-02-07 14:04:04 +01:00
Tapani Pälli
9d322fde97 i965: add __DRI2_BLOB support and set cache functions
v2: adjust to change that moved cache from ctx to screen

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
ae00ef2702 disk cache: add callback functionality
v2: add disk_cache_has_key, disk_cache_put_key support
    using blob cache (Nicolai, Jordan)

v3: rename set_cb as put_cb to match existing naming (Timothy)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
6a651b6b77 disk cache: initialize cache path and index only when used
This patch makes disk_cache initialize path and index lazily so
that we can utilize disk_cache without a path using callback
functionality introduced by next patch.

v2: unmap mmap and destroy queue only if index_mmap exists

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
e8495646af glsl/tests: changes to test_disk_cache_create test
Next patch will allow disk_cache instance to be created without
path set for it, modify some test cases that assume disk_cache
creation to fail with invalid path. Creation should succeed but
simple put/get test fail.

v2: leave tests as is but check that both cache struct exists
    and try simple put/get that should fail with invalid path set
    (Emil)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
83c81b6cce glsl/tests: move utility functions in cache_test
Patch moves functions higher so that we can utilize them from
test_disk_cache_create which is modified by next patch.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
6f5b57093b egl: add support for EGL_ANDROID_blob_cache
v2: cleanup, move callbacks to _egl_display struct (Emil Velikov)
    adapt to earlier ctx->screen changes

v3: remove useless checking, add _eglSetFuncName (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Tapani Pälli
cf4569da6b dri: add interface for EGL_ANDROID_blob_cache extension
v2: move from __DRIcontext to __DRIscreen (Emil Velikov)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-07 14:45:34 +02:00
Samuel Pitoiset
757d36ee70 ac/nir: use new pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics
Ported from RadeonSI.

Only one F1 2017 shader is affected, code size decreased
from 532 to 488 on both Polaris10 and Vega10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:13 +01:00
Samuel Pitoiset
2f54d7382d ac/nir: avoid loading unused VS input components
Polaris10:
Totals from affected shaders:
SGPRS: 122840 -> 120984 (-1.51 %)
VGPRS: 78812 -> 78440 (-0.47 %)
Spilled SGPRs: 177 -> 129 (-27.12 %)
Code Size: 2950028 -> 2941276 (-0.30 %) bytes
Max Waves: 17899 -> 17976 (0.43 %)

Vega10:
Totals from affected shaders:
SGPRS: 117144 -> 115776 (-1.17 %)
VGPRS: 77580 -> 77532 (-0.06 %)
Spilled SGPRs: 0 -> 152 (0.00 %)
Code Size: 3352656 -> 3347860 (-0.14 %) bytes
Max Waves: 19756 -> 19866 (0.56 %)

This increases SGPRs spilling a bit with Talos, but I have
some other ideas that might reduce it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:09 +01:00
Samuel Pitoiset
1c57a6da5e ac/shader: scan vertex inputs usage mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-07 12:42:07 +01:00
Iago Toral Quiroga
f474b19875 i965: allocate a SGVS element when VertexID or InstanceID are read
Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS
to put these beyond the last vertex element it seems that we still
need to allocate the SVGS element, otherwise we have observed cases
where we end up reading garbage. Specifically, the CTS test mentioned
below was flaky with a fail rate of ~1% on some gen9+ platforms caused
by reading garbage for the gl_InstanceID value. The flakyness goes
away as soon as we start allocating the SVGS element.

v2:
  - Do this for gen8+, not just gen9+, and pull the boolean
    outside the #if block (Jason)

Fixes flaky test:
KHR-GL45.vertex_attrib_64bit.limits_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-07 11:11:16 +01:00
Dylan Baker
c74719cf4a glapi: fix check_table test for non-shared glapi with meson
v2: - Add glapitable_h generated source to requirements

Fixes: 3218056e0e ("meson: Build i965 and dri stack")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2018-02-06 15:00:17 -08:00
Dylan Baker
002fbde71e glapi: Don't search through subdirs from glapitable.h
Because meson won't put it in that folder.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
aac3d01178 state_tracker: Don't build st-renumerate-test without shared glapi
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
0316aa432d glapi: remove APPLE extensions from test
Fixes: 7009955281 ("mesa: Remove GL_APPLE_vertex_array_object stubs")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
a4f1fc5dd1 glapi/check_table: Remove 'extern "C"' block
Using 'extern "C"' around includes is always incorrect, as the header may
contain C++ symbols (as it does in this case), which means it cannot use
C linkage. In this case the header has a template in it, which obviously
cannot be linked with C linkage rules.

Fixes: a29ad2b421 ("mesa/tests: Add tests for the generated dispatch table")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
105178db8f meson: fix test source name for static glapi
fixes: 43a6e84927 ("meson: build mesa test.")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Dylan Baker
9be7487f30 glapi: don't walk backwards for includes
Instead just set the proper -I flags and include it from a more standard
path. In this case we'll add -Isrc/mesa (which is common), and #include
main/foo.h.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-06 15:00:17 -08:00
Brian Paul
e7a4536e64 mesa: rename gl_vertex_array_object::_VertexAttrib -> _VertexArray
Since the type is gl_vertex_array.  Update comment to explain that
these arrays are only used by the VBO module.

Also rename some local variables in _mesa_update_vao_derived_arrays().

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
2018-02-06 15:36:47 -07:00
Brian Paul
d9ab39ea65 mesa: minor whitespace fixes, line wrapping in texcompress.c
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:23:26 -07:00
Brian Paul
b38196b452 mesa: simplify _mesa_get_compressed_formats()
Instead of testing for formats==NULL everywhere, just point formats at
a dummy array which will be discarded.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-06 15:23:26 -07:00
Vlad Golovkin
d919ff0f27 util: remove redundant check for the __clang__ macro
Clang defines __GNUC__ macro, so one doesn't need to check __clang__
macro in this particular case.

v2: added comment as per Brian Paul's suggestion

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-02-06 15:23:26 -07:00
Brian Paul
77bc74e674 st/mesa: use st_access_flags_to_transfer_flags() helper in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Brian Paul
1852a2e1a2 st/mesa: refactor st_bufferobj_map_range()
Use a new helper function, st_access_flags_to_transfer_flags(), to
convert the GL_MAP_x flags to PIPE_TRANSFER_x flags.

We'll be able to use this function in a couple other places.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Brian Paul
8a32dd2ec9 st/mesa: refactor bufferobj_data()
Split out some of the code into three new helper functions:
buffer_target_to_bind_flags(), storage_flags_to_buffer_flags(),
buffer_usage() to make the code more managable.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-06 15:23:26 -07:00
Samuel Pitoiset
3488a3f033 radv: run nir_opt_shrink_load
LLVM can't shrink loads.

Polaris10:
Totals from affected shaders:
SGPRS: 62528 -> 59955 (-4.11 %)
VGPRS: 44708 -> 44616 (-0.21 %)
Spilled SGPRs: 16 -> 8 (-50.00 %)
Code Size: 1355504 -> 1355172 (-0.02 %) bytes
Max Waves: 11710 -> 11670 (-0.34 %)

Vega10:
Totals from affected shaders:
SGPRS: 51448 -> 50371 (-2.09 %)
VGPRS: 39140 -> 39048 (-0.24 %)
Spilled SGPRs: 16 -> 16 (0.00 %)
Code Size: 1307188 -> 1304296 (-0.22 %) bytes
Max Waves: 11312 -> 11292 (-0.18 %)

This reduces SGPRs spilling in MadMax, and it also reduces
number of SGPRs in DOW3 and F12017. The number of waves slightly
decreases in F1 but I don't see any performance changes after
benchmarking it. Talos and Serious Sam are not affected because
they don't use any push constants.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-06 23:08:44 +01:00
Samuel Pitoiset
e68562b94b nir: add nir_opt_shrink_load pass
This is a very simple pass that just shrinks load_push_constant
intrinsics when some components are unused. For now, it can just
shrink vec4 to vec3, vec3 to vec2 and so on.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-06 23:08:39 +01:00
Timothy Arceri
e2ea9e1191 radeonsi/nir: add nir support for compiling compute shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00