Commit graph

39615 commits

Author SHA1 Message Date
Alyssa Rosenzweig
f5c293425f panfrost: Correct polygon size computations
While the algorithm for computing the header size has been correct for a
while, we used a major hack to conservatively guess the body size. Let's
scrap that and figure out the algorithm we actually need to use to be
bit-identical with what the hardware expects.

We do have to be careful to add the header size to total comptued BO
size.

It's not clear how big the polygon list needs to be in practice -- but
it has to be somewhat bigger than the polygon list itself. This needs
more investigation. If we size the polygon list exactly based on the
polygon_list_size field, we get faults like:

[ 1224.219886] panfrost ff9a0000.gpu: Unhandled Page fault in AS0 at VA 0x000000001BDE8000
               Reason: TODO
               raw fault status: 0x660003C3
               decoded fault status: SLAVE FAULT
               exception type 0xC3: TRANSLATION_FAULT_LEVEL3
               access type 0x3: WRITE
               source id 0x6600

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
f6e41f30d0 panfrost: Remove DRY_RUN
Nobody uses this anymore anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig
b45eb2775e panfrost: Move pan_tiler.c outside of Gallium
The routines in this file may be shared with Vulkan.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig
fb56a162a9 panfrost: Set workgroups z to 32 for non-instanced graphics
This is a blob quirk; in so much as I know, the hardware doesn't care.
But we're trying to be bit-identical to take as much entropy out of
traces as possible, so let's introduce the quirk.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig
39b226cfb3 panfrost: Move pan_invocation to shared panfrost/
The routines in this file have no dependency on Gallium. Let's share
them so they can be used for a theoretical future Vulkan driver or, more
immediately, consulted when tracing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 08:40:51 -07:00
Tomeu Vizoso
4109a2f612 panfrost/ci: Print load stats
To help make sure we are running tests in the ideal number of threads,
print load stats to make obvious when there's a problem with
utilization.

This will be specially useful when we run tests on a wider variety of
devices.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso
3794652385 panfrost/ci: Install qemu-arm-static into chroot
Some runners may be configured such that the qemu binary might not be
available by the time we need to start running commands within the
chroot.

So make sure that it's there to avoid suprising problems in that case.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso
8496045adc panfrost/ci: Build kernel with CONFIG_DETECT_HUNG_TASK
There's lots of locking changes going into the Panfrost kernel driver,
so better be prepared.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso
a074513dc2 panfrost/ci: Print bootstrap log
A number of things can go wrong when building the rootfs from within a
non-native chroot, so make sure to print the bootstrap.log so we can
tell what's going on.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Tomeu Vizoso
76af465e57 panfrost/ci: Use Volt-based runner for dEQP tests
It's able to run tests in parallel, fully utilizing the HW and
shortening considerable the time it takes.

Needed to disable tests in RK3288 for now because Volt doesn't support
armhf yet, though this should be fixed soon.

Tests are now run with --deqp-gl-config-name=rgba8888d24s8ms0, so we are
hitting a few more failures in tests that previously were being skipped.

The time to run the tests decreases from around 8 minutes to 1:45
minutes, allowing for extending coverage without increasing CI times too
much.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 16:41:56 +02:00
Lionel Landwerlin
8a2465e3f3 radeonsi: take reference glsl types for compile threads
An application quitting before the destroying its GL context and
binding a NULL context might still have a radeonsi compiler thread
running and potentially still accessing the types.

Therefore take a reference for the duration of the threads' lifetime.

v2: Only ref the glsl types, the builtins should be used by the time
    shader data gets to a gallium driver.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-21 09:44:10 +02:00
Ilia Mirkin
958390a9bf gallium/vl: use compute preference for all multimedia, not just blit
The compute paths in vl are a bit AMD-specific. For example, they (on
nouveau), try to use a BGRX8 image format, which is not supported.
Fixing all this is probably possible, but since the compute paths aren't
in any way better, it's difficult to care.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Fixes: 9364d66cb7 (gallium/auxiliary/vl: Add video compositor compute shader render)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-20 23:51:39 -04:00
Erico Nunes
71fb721ca5 lima/ppir: use ra_get_best_spill_node to select spill node
ra_get_best_spill_node is what other users of the mesa register
allocator use.
Switching to it now also fixes an infinite loop issue with ppir regalloc
with the ppir control flow patchset, and also provides a small gain over
the previous herusitic on number of spilled nodes testing with
shader-db.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-20 21:16:02 +00:00
Eric Anholt
c1dc84e71d tgsi: Remove unused tgsi_check_soa_dependencies().
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-08-20 13:31:13 -07:00
Eric Anholt
4ebe6b2e72 tgsi: Drop the SSE2 constants setup that's been dead code since 2011.
The SSE2 executor was removed in 4eb3225b38 ("Remove tgsi_sse2.")

Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-08-20 13:31:13 -07:00
Eric Anholt
98c58355d3 tgsi: drop a stale comment
This was fixed in 912ed84f83 ("tgsi: move to using vector for system
values.")

Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-08-20 13:31:13 -07:00
Jose Maria Casanova Crespo
7c56a68c8b tgsi_to_nir: only update TGSI properties of the current shader stage
The implementation introduced in "tgsi_to_nir: be careful about not
losing any TGSI properties silently (v2)" updates all the TGSI properties,
but it didn't take into account that the shader_info structure uses a union
to store the different attributes for each shader stage.

Now we only update the attributes if they affect current shader stage,
avoiding to overwrite members of the union that should be overwritten.
This has created hundreds of regressions in v3d.

For example the TGSI_PROPERTY_VS_BLIT_SGPRS_AMD was overwritting the
same position used by TGSI_PROPERY_CS_FIXED_BLOCK_DEPTH.

Fixes: e300365197 ("tgsi_to_nir: be careful about not losing any TGSI properties silently (v2)")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-20 10:30:21 +00:00
Sagar Ghuge
fe0e9db797 iris: Enable non coherent framebuffer fetch on broadwell
v2: Use GEN_GEN in iris_state (Kenneth Graunke)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:58 -07:00
Sagar Ghuge
57ce422e20 iris: Free resource if failed to allocate surface state
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:55 -07:00
Sagar Ghuge
02244bc515 iris: Pass isl_surf to fill_surface_state
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:45 -07:00
Sagar Ghuge
638a157e02 iris: Add infrastructure to support non coherent framebuffer fetch
Create separate SURFACE_STATE for render target read in order to support
non coherent framebuffer fetch on broadwell.

Also we need to resolve framebuffer in order to support CCS_D.

v2: Add outputs_read check (Kenneth Graunke)

v3: 1) Import Curro's comment from get_isl_surf
    2) Rename get_isl_surf method
    3) Clean up allocation in case of failure

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:44 -07:00
Sagar Ghuge
61c0637afb iris: Add helper functions to get tile offset
All helper functions are ported from i965 driver.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:43 -07:00
Sagar Ghuge
7e816991cc iris: Add helper function to get isl dim layout
v2: Add missing space (Caio)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:41 -07:00
Sagar Ghuge
58471e20d2 iris: Add render target read entry in binding table
This will be used in next patches for supporting non coherent
framebuffer fetch on Broadwell.

v2: Fix comment (Kenneth Graunke)

v3: 1) Fix a few nits (Caio)
    2) Add comment (Caio)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-20 00:50:31 -07:00
Kai Wasserbäch
1abe87383e build: Bump C++ standard requirement to C++14 to fix FTBFS with LLVM 10
When building Mesa against a recent LLVM 10 with C++11, the build fails
if the AMD common code is built as well due to "std::index_sequence"
being undeclared.

LLVM requires a minimum of C++14.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-08-20 05:39:19 +00:00
Rob Herring
d0ec5d38f6 panfrost: Add madvise support to BO cache
The kernel now supports madvise ioctl to indicate which BOs can be freed
when there is memory pressure. Mark BOs purgeable when they are in the
BO cache. The BOs must also be munmapped when they are in the cache or
they cannot be purged.

We could optimize avoiding the madvise ioctl on older kernels once the
driver version bump lands, but probably not worth it given the other
driver features also being added.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2019-08-19 19:33:20 -05:00
Marek Olšák
223b3174bd radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it
This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks.

This solution is better, because the IR isn't dependent on wave32.
2019-08-19 17:23:38 -04:00
Marek Olšák
5d37194d43 radeonsi: remove the unsafemath debug option
unlikely to be used in the future

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
5586411de4 radeonsi/nir: fix counting shader inputs & outputs 2019-08-19 17:23:38 -04:00
Marek Olšák
452cb7055f radeonsi/nir: fix assertion in si_nir_load_sampler_desc 2019-08-19 17:23:38 -04:00
Marek Olšák
1f8a661748 radeonsi: clean up si_llvm_context_set_tgsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
43f8b5642b radeonsi: allocate and resize global_buffers as needed
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
c315cb509d radeonsi/gfx10: don't set PA_SC_TILE_STEERING_OVERRIDE if CLEAR_STATE sets it
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
5a2e65be89 radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
8d0d753bd0 radeonsi: fix an assertion failure: assert(!res->b.is_shared)
This only appears to happen on Raven2.

Possible way to reproduce:

resource_get_handle(WINSYS_HANDLE_TYPE_KMS) --> sets is_shared = true
resource_get_handle(WINSYS_HANDLE_TYPE_DMABUF) --> fail

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
2019-08-19 17:23:38 -04:00
Marek Olšák
bdcbac9459 radeonsi: handle the use_ngg_streamout flag in si_update_ngg 2019-08-19 17:23:38 -04:00
Marek Olšák
a6b3ca1c70 radeonsi: move the tess factor ring size assertion to a place where it matters
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Christian Gmeiner
f52b9218ff etnaviv: rs: add support for 64bpp clears
Starting with HALTI2 the RS supports 64bpp clears.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-19 22:36:45 +02:00
Christian GMEINER
7492685b1b etnaviv: update headers from rnndb
Update to etna_viv commit c51353e.

Signed-off-by: Christian GMEINER <christian.GMEINER@bachmann.info>
Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-19 22:36:45 +02:00
Eric Anholt
c45c33a5a2 gallium: Remove manual defining of PIPE_FORMAT enum values.
Now that SVGA doesn't have a table that has to be in PIPE_FORMAT
order, we can let the enums have whatever values they naturally would
without worrying about holes.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-08-19 11:48:01 -07:00
Eric Anholt
84db6ba740 svga: Drop unsupported formats from the format table.
Now that we're using the array initializers, we don't need to manually
fill out all these stub entries.

Produced with "sed -i '/.*INVALID.*INVALID.*INVALID/d'
src/gallium/drivers/svga/svga_format.c"

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-08-19 11:43:02 -07:00
Eric Anholt
ef37da52c0 svga: Remove duplication in the format table.
By using the [ ] = {} array initializer syntax, we no longer need the
entries to be listed in PIPE_FORMAT_* value order.  This means that
people adding new gallium formats don't need to cargo-cult changes to
this driver or regress that non-unit-tested requirement.

While I'm here, drop the lines for formats that no longer exist (the
numbered ones in the table).

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-08-19 11:42:55 -07:00
Eric Anholt
42efa789b5 svga: Factor out the format conversion table entry lookup.
Seemed like a sensible cleanup, while I was looking at whether I could
make the table sparse.

To make the svga table not require fixups on every new gallium format,
we may want to change how it's populated.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-08-19 11:42:36 -07:00
Jason Ekstrand
16edd02bfa iris: Only request an input mask if the shader needs it
Fixes: aebca3961b "iris: Fix handling of SIMD32 fragment shaders"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-16 19:59:42 -05:00
Xiong, James
dcad15ff54 gallium: add back YVU support
PIPE_FORMAT_YV12 is not handled so switching to PIPE_FORMAT_IYUV and
adding back YVU support.

Signed-off-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-16 13:24:49 -07:00
Erico Nunes
7a51abab42 lima: actually wait for bo in lima_bo_wait
PIPE_TIMEOUT_INFINITE is unsigned and gets assigned to signed fields
where it ends up as -1. When this reaches the kernel as a timeout it
gets translated as no timeout, which cause the waiting functions to
return immediately and not actually wait for a completion.

This seems to cause unstable results with lima where even piglit tests
randomly fail.

Handle this by setting the signed max value in case of infinite timeout.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-08-16 16:31:29 +02:00
Vasily Khoruzhick
861c2b8d31 lima: fix compilation of standalone compiler
Fixes: e0aeee946004("lima: add summary report for shader-db")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-15 16:59:51 -07:00
Erik Faye-Lund
18ab42644b gallium/util: widen type before multiplication
This method returns size_t, but the multiplication multiplies two
integers, leading to overflow rather than type widening.

Noticed by compiling with MSVC, which emits a warning.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:53 +02:00
Erik Faye-Lund
0091f62978 mesa: avoid warning on Windows
On Windows, p_atomic_inc_return returns an unsigned long long rather
than the type the pointer refers to, so let's make sure we cast the
result to the right type. Otherwise, we'll trigger a warning about
the wrong format-string for the type.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:49 +02:00
Erik Faye-Lund
544b088616 win32: unify strcasecmp definitions
There was two incompatible definitions of strcasecmp, which lead to a
compiler warning. Let's clean this up by only leaving one of them, and
using that one all the time.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:44 +02:00