Commit graph

41348 commits

Author SHA1 Message Date
Dave Airlie
48ab21109c gallivm: fix find lsb
the GLSL return value is different than the llvm intrinsic.

Fixes arb gpu shader5 tests

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>
2020-01-23 13:48:16 +10:00
Dave Airlie
1e433c398e galllivm: fix gather offset casting
cast texture offsets to 32-bit integers

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>
2020-01-23 13:48:16 +10:00
Dave Airlie
fc9d67394d llvmpipe: fix some integer instruction lowering.
We want to lower to shifts for bitfields, and lower ifind_msb.

Fixes a bunch of gpu shader5 tests.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>
2020-01-23 13:48:16 +10:00
Dave Airlie
6c88c81df9 gallivm: fix gather component handling.
Fixes the extended gather test for gpu shader5

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>
2020-01-23 13:48:16 +10:00
Kenneth Graunke
8dc0540a17 intel: Fix aux map alignments on 32-bit builds.
ALIGN() brilliantly uses uintptr_t, making it unsafe for use with 64-bit
GPU addresses in 32-bit builds of the driver.  Use align64() instead,
which uses uint64_t.

Fixes assertion failures when running any 32-bit program on Tigerlake.

Fixes: 2e6a7ced4d ("iris/gen12: Write GFX_AUX_TABLE base address register")
Fixes: 0d0290bb3f ("intel/common: Add surface to aux map translation table support")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3507>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3507>
2020-01-23 02:16:50 +00:00
Matt Turner
4413537c80 util: Remove tmp argument from BITSET_FOREACH_SET macro
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
2020-01-23 01:52:43 +00:00
Vasily Khoruzhick
60f9b45802 lima: implement invalidate_resource()
We don't need to resolve invalidated resources, so it should
improve performance for applications that are doing this hint.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3476>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3476>
2020-01-23 01:26:23 +00:00
Ian Romanick
76970940a6 iris: Enable INTEL_shader_integer_functions2
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
b14e718e68 gallium: Add a cap bit for integer multiplication between 32-bit and 16-bit
Driver supports integer multiplication between a 32-bit integer and a
16-bit integer.  If the second operand is 32-bits, the upper 16-bits are
ignored, and the low 16-bits are possibly sign extended as necessary.

Iris will eventually enable this.  Not sure about other drivers.

v2: Add default value to u_screen.c.  Suggested by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Ian Romanick
9db20748fd gallium: Add a cap bit for OpenCL-style extended integer functions
Iris will eventually enable this.  Looking at the header files, it looks
like Midgard could also enable it.  Basically, any GPU that fully
supports OpenCL can.

v2: Add default value to u_screen.c.  Suggested by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
James Xiong
ac0219cc5b gallium: dmabuf support for yuv formats that are not natively supported
V2 (Kenneth Graunke):
   added a helper function to check if every format is supported

Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2846>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2846>
2020-01-22 21:18:49 +00:00
Eric Engestrom
8f140422ed llvmpipe: drop LLVM < 3.4 support
We don't support < 3.9 anymore, so this code is dead.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2760>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2760>
2020-01-22 11:21:13 -08:00
Alexander von Gluck IV
49d2a066c2 haiku/hgl: Fix build via header reordering 2020-01-22 16:21:54 +00:00
Alyssa Rosenzweig
4936120230 panfrost: Fix crash in compute variant allocation
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: d8a3501f1b ("panfrost: Dynamically allocate shader variants")
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3515>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3515>
2020-01-22 13:48:24 +00:00
Timur Kristóf
8e22df3aec gallium: Fix a couple of multiple definition warnings.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>
2020-01-22 09:36:25 +01:00
Timur Kristóf
a134ac5ee9 r600: Move get_pic_param to radeon_vce.c
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>
2020-01-22 09:36:23 +01:00
Timur Kristóf
b7f9759809 radeon: Move si_get_pic_param to radeon_vce.c
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>
2020-01-22 09:36:16 +01:00
Dongwon Kim
9d964da19f gallium: check all planes' pipe formats in case of multi-samplers
Current code only checks whether first plane's format is supported
in case YUV format sampling is done by sampling each plane separately.
It would be safer to check other planes' as well.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2863>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2863>
2020-01-21 23:04:33 +00:00
Kenneth Graunke
311cab27e2 iris: Drop some workarounds which are no longer necessary
These workarounds are no longer required by 10th Gen hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3495>
2020-01-21 13:58:40 -08:00
Eric Anholt
6c10af95c7 radeonsi: Drop PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS.
Now that we don't expose TGSI, we can stop exposing the flag.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>
2020-01-21 19:04:22 +00:00
Eric Anholt
609a67461d r300: Remove a bunch of default handling of pipe caps.
u_screen will return 0 for all of these, which means that this is one
less driver to see in git grep when I'm checking who exposes a cap.
The exception is the texel/gather offsets and stream output
components, which will not be exposed since we don't expose the
corresponding GLSL version.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>
2020-01-21 19:04:22 +00:00
Eric Anholt
e7e034e1de r600: Remove a bunch of default handling of pipe caps.
u_screen will return 0 for all of these, which means that this is one
less driver to see in git grep when I'm checking who exposes a cap.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>
2020-01-21 19:04:22 +00:00
Eric Anholt
3e1dd99adc radeonsi: Remove a bunch of default handling of pipe caps.
u_screen will return 0 for all of these, which means that this is one
less driver to see in git grep when I'm checking who exposes a cap.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>
2020-01-21 19:04:22 +00:00
Eric Anholt
fb6fca0037 freedreno: Stop scattered remapping of SSBOs/images to IBOs.
Just make it be all SSBOs then all storage images.  The remapping table
was there to make it so that the big gap present from gallium's atomic
lowering would get cleaned up, but that's no longer case.  The table has
made it very hard to support Vulkan storage images, so it's time for it to
go.

This does mean that an SSBO/IBO that is only loaded (or size-queried) will
now occupy a slot in the table where it wouldn't before.  This seems like
a minor cost compared to being able to drop this much logic.

With the remapping table gone, SSBO array handling for turnip just falls
out.

Fixes many array cases of
dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.*

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jonathan Marek <jonathan@marek.ca> (turnip)
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
2020-01-21 10:06:23 -08:00
Eric Anholt
d0975bfc4a nir: Drop the ssbo_offset to atomic lowering.
The arguments passed in were:
- prog->info.num_ssbos
- prog->nir->info.num_ssbos
- arbitrary values for standalone compilers

The num_ssbos should match between the prog's info and prog->nir's info
until this lowering happens.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
2020-01-21 10:06:23 -08:00
Timur Kristóf
28eb481bc2 nouveau/nvc0: add extern keyword to nvc0_miptree_vtbl.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2020-01-21 17:36:36 +01:00
Icecream95
31bd3b5279 panfrost: Add ASTC texture formats
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
2020-01-21 08:35:23 -05:00
Icecream95
960fe9daea panfrost: Add ETC1/ETC2 texture formats
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
2020-01-21 08:35:23 -05:00
Alyssa Rosenzweig
2091d311c9 panfrost: Rework linear<--->tiled conversions
There's a lot going on here (it's a ton of commits squashed together
since otherwise this would be impossible to review...)

1. We have a fast path for linear->tiled for whole (aligned) tiles, but we
have to use a slow path for unaligned accesses. We can get a pretty
major win for partial updates by using this slow path simply on the
borders of the update region, and then hit the fast path for the
tile-aligned interior. This does require some shuffling.

2. Mark the LUTs constant, which allows the compiler to inline them,
which pairs well with loop unrolling (eliminating the memory accesses
and just becoming some immediates.. which are not as immediate on
aarch64 as I'd like..)

3. Add fast path for bpp1/2/8/16. These use the same algorithm and we
have native types for them, so may as well get the fast path.

4. Drop generic path for bpp != 1/2/8/16, since these formats are
generally awful and there's no way to tile them efficienctly and
honestly there's not a good reason too either. Lima doesn't support any
of these formats; Panfrost can make the opinionated choice to make them
linear.

5. Specialize the unaligned routines. They don't have to be fully
generic, they just can't assume alignment. So now they should be nearly
as fast as the aligned versions (which get some extra tricks to be even
faster but the difference might be neglible on some workloads).

6. Specialize also for the size of the tile, to allow 4x4 tiling as well
as 16x16 tiling. This allows compressed textures to be efficiently tiled
with the same routines (so we add support for tiling ASTC/ETC textures
while we're at it)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> #lima on Mali400
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
2020-01-21 08:35:19 -05:00
Alyssa Rosenzweig
f2d876b2b2 panfrost,lima: De-Galliumize tiling routines
There's an implicit dependence on Gallium here that will add more
complexity than needed when testing/optimizing out of driver as well as
potentially Vulkanizing. We don't need a full pipe_box, just the x/y/w/h
properties directly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> #lima on Mali400
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
2020-01-21 08:35:16 -05:00
Jan Zielinski
a24b3b228a gallium/gallivm: enable linking lp_bld_printf function with C++ code
To enable linking functions declared in lp_bld_printf.h file with C++,
we need to add appropriate macros to the header.

Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3470>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3470>
2020-01-21 11:00:18 +00:00
Danylo Piliaiev
3f9a6011a6 iris: Fix value of out-of-bounds accesses for vertex attributes
Having VERTEX_BUFFER_STATE.BufferSize greater than the size of
a bound vertex buffer allows shader to read uninitialized vertex
attributes from BO, instead of allowing hardware to return zeroes
on out-of-bounds access.

OpenGL spec "6.4 Effects of Accessing Outside Buffer Bounds" says:

"Robust buffer access can be enabled by creating a context with robust access
 enabled through the window system binding APIs. When enabled, any command
 unable to generate a GL error as described above, such as buffer object accesses
 from the active program, will not read or modify memory outside of the data
 store of the buffer object and will not result in GL interruption or termination.
 Out-of-bounds reads may return values from within the buffer object or zero
 values."

Fixes three webgl tests:
 conformance/rendering/out-of-bounds-array-buffers.html
 conformance2/rendering/out-of-bounds-index-buffers-after-copying.html
 conformance2/rendering/element-index-uint.html

See #1996

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3427>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3427>
2020-01-21 09:52:40 +00:00
Jan Vesely
87e1f8eca5 clover: Initialize Asm Parsers
Fixes piglits that use ADMGCN inline assembly:
	program@execute@calls
	program@execute@amdgcn-mubuf-negative-vaddr

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2020-01-21 01:39:08 +00:00
Marek Olšák
735a3ba007 radeonsi/gfx10: enable GS fast launch for triangles and strips with NGG culling
Only non-indexed triangle lists and strips are supported. This increases
performance if there is something to cull.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
c377f45c18 radeonsi/gfx10: rewrite late alloc computation
- Use conservative late alloc when the number of CUs <= 6.
- Move the late alloc GS register to the GS shader state, so that it can be
  tuned for NGG culling.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
4e4b2d13f0 ac: add helper ac_build_triangle_strip_indices_to_triangle
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
8db00a51f8 radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
aa2d846604 radeonsi/gfx10: move GE_PC_ALLOC setting to shader states
The value is not changed. I just use a different way to compute it.

The value will vary with NGG culling.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
41fef6fc09 radeonsi/gfx10: don't initialize VGPRs not used by NGG passthrough
v2: TES doesn't use the GS PrimitiveID

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
943d131e7d radeonsi/gfx10: merge main and pos/param export IF blocks into one if possible
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
a966729c84 radeonsi/gfx10: export primitives at the beginning of VS/TES
This decreases VGPR usage and will allow us to merge some IF blocks
in shaders.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
5a0fcf11f0 radeonsi/gfx10: move s_sendmsg gs_alloc_req to the beginning of shaders
This will allow us to merge some IF blocks in shaders.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
cf9f8d1ea2 radeonsi/gfx10: correct VS PrimitiveID implementation for NGG
We didn't use the correct LDS pointer, though it probably doesn't matter,
because I think that nothing else is using LDS here.

This commit makes it consistent with all other esgs_ring use.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
b2326a7549 radeonsi/gfx10: update comments and remove invalid TODOs
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
679b6244e1 radeonsi: turn an assertion into return in si_nir_store_output_tcs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 15:40:13 -05:00
Marek Olšák
27cc7703d3 radeonsi: fix doubles and int64
Fixes: 57bd73e229 - radeonsi: remove llvm_type_is_64bit

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 15:40:10 -05:00
Marek Olšák
df34fa14bb radeonsi: don't invoke decompression inside internal launch_grid
Decompress resources properly but don't do it inside launch_grid
to prevent recursion.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
2020-01-20 15:40:08 -05:00
Marek Olšák
58c929be0d radeonsi: clean up how internal compute dispatches are handled
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
2020-01-20 15:40:07 -05:00
Marek Olšák
d69483270e Revert "radeonsi: unbind image before compute clear"
This reverts commit 3a527eda7c.

It's incorrect.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 15:40:05 -05:00
Daniel Stone
cf5fccb0d9 Revert "gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES"
This reverts commit bec9c90b5e.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>
2020-01-20 12:33:29 +00:00