Specifically the optimization of a conditional BREAK + WHILE sequence
into a conditional WHILE seems pretty broken. The list of successors
of "earlier_block" (where the conditional BREAK was found) is emptied
and then re-created with the same edges for no apparent reason. On
top of that the list of predecessors of the block immediately after
the WHILE loop is emptied, but only one of the original edges will be
added back, which means that potentially several blocks that still
have it on their list of successors won't be on its list of
predecessors anymore, causing all sorts of hilarity due to the
inconsistency in the control flow graph.
The solution is to remove the code that's removing valid edges from
the CFG. cfg_t::remove_block() will already clean up after itself.
The assert in bblock_t::combine_with() also needs to be removed since
we will be merging a block with multiple children into the first one
of them.
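To make the inconsistency described above concrete, here is a minimal sketch of the invariant a CFG has to keep: every successor edge needs a matching predecessor edge, and vice versa. The toy_block structure and toy_cfg_is_consistent() are hypothetical stand-ins for illustration only; they are not the bblock_t/cfg_t code from the compiler.

```c
#include <stdbool.h>

#define TOY_MAX_EDGES 4

/* Hypothetical, simplified block: fixed-size edge arrays instead of lists. */
struct toy_block {
   int num_children;
   int num_parents;
   struct toy_block *children[TOY_MAX_EDGES];
   struct toy_block *parents[TOY_MAX_EDGES];
};

/* Returns true if 'b' appears among the first 'n' entries of 'list'. */
static bool
toy_has_edge(struct toy_block **list, int n, struct toy_block *b)
{
   for (int i = 0; i < n; i++) {
      if (list[i] == b)
         return true;
   }
   return false;
}

/* Checks that every child lists the block as a parent and every parent
 * lists the block as a child.
 */
static bool
toy_cfg_is_consistent(struct toy_block *blocks, int num_blocks)
{
   for (int i = 0; i < num_blocks; i++) {
      struct toy_block *b = &blocks[i];

      for (int c = 0; c < b->num_children; c++) {
         struct toy_block *child = b->children[c];
         if (!toy_has_edge(child->parents, child->num_parents, b))
            return false;
      }

      for (int p = 0; p < b->num_parents; p++) {
         struct toy_block *parent = b->parents[p];
         if (!toy_has_edge(parent->children, parent->num_children, b))
            return false;
      }
   }
   return true;
}
```

If two blocks branch to the block after the loop but only one of them is re-added to its predecessor list, a check like this fails, which is the situation the removed code could create.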
Found the issue on a hardware enabling branch originally, but
apparently somebody reproduced the same problem independently on
master in the meantime.
Fixes: d13bcdb3a9 ("i965/fs: Extend predicated break pass to predicate WHILE.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111009
Cc: jiradet.jd@gmail.com
Cc: Sergii Romantsov <sergii.romantsov@globallogic.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Paul Chelombitko <qamonstergl@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The device info initializer makes several functions internal:
- handling of device override
- updating topology from kernel information
The implementation file is slightly reordered due to the renamed
functions being static.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Rename the original device info initialization routine so callers
don't mistakenly call the wrong one:
gen_get_device_info_from_fd:
Queries kernel for full device info, including topology
details.
gen_get_device_info_from_pci_id:
Partially initializes device info based on PCI ID lookup, when
the kernel is not available.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
With perf queries, initializing the device info is much more complex
than just getting a PCI ID and calling gen_get_device_info. This commit
adds a new gen_get_device_info_from_fd helper in common code which does
all of the requisite kernel queries to get device info including all of
the topology information.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
When gen_device_info updates the topology in its initializer, the
kernel queries will fail silently. Iris and anv have minimum
kernel requirements that support the queries. i965 must verify kernel
support before reporting OA metrics.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
i965 links against libdrm for drmIoctl, but anv and iris both
re-implement this routine to avoid the dependency.
intel/dev also needs an ioctl wrapper, so let's share the same
implementation everywhere.
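For reference, a libdrm-free ioctl wrapper of this kind is usually just a retry loop around ioctl(2). The sketch below assumes that shape; the name retrying_ioctl is hypothetical and is not the name of the shared helper.

```c
#include <errno.h>
#include <sys/ioctl.h>

/* Hypothetical name. Retries the ioctl when it is interrupted by a signal
 * (EINTR) or asked to try again (EAGAIN), the same cases drmIoctl handles.
 */
static inline int
retrying_ioctl(int fd, unsigned long request, void *arg)
{
   int ret;

   do {
      ret = ioctl(fd, request, arg);
   } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

   return ret;
}
```

Any other failure is returned to the caller with errno left untouched, so callers can still tell an unsupported query from a bad file descriptor.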
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes a hang (and abort) on empty shaders, which you shouldn't have
anyway but better safe than sorry. DCE going on the fritz is no reason
to freeze the system.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Important fields relating to shader state and UBOs are filled out from
this (misnamed) function.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The path for compute shader compiles resembles the graphics shader
compile path, although it is substantially simpler as we don't need any
shader keying.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We want this routine to be generic across graphics and compute, so let
the caller deal with the typing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We already have helpers for packing invocations (due to their role in
instanced vertex shaders), so we can reuse them as a drop-in for compute
shaders.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Squint at it hard enough and you realize it's the beginning of an
SFBD... I guess...
A compute shader with register spilling would be able to confirm this,
but we would expect to see the first field | 1 and an address splattered
later, setting up TLS.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
It's a little verbose, but this way we can support other shader stages
without too much contortion.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
It's still incomplete, but we're able to hook into launch_grid to
create a stub COMPUTE job.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Rather than disparate variables, let's use an array of payloads indexed
by the shader stage.
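A rough sketch of the idea, assuming a small stage enum; all names here (toy_shader_stage, toy_payload, toy_context) are hypothetical and do not mirror the driver's actual structures:

```c
/* Hypothetical stage enum; the last entry doubles as the array size. */
enum toy_shader_stage {
   TOY_STAGE_VERTEX,
   TOY_STAGE_FRAGMENT,
   TOY_STAGE_COMPUTE,
   TOY_STAGE_COUNT
};

/* Hypothetical per-stage payload; the real one carries far more state. */
struct toy_payload {
   unsigned job_index;
};

struct toy_context {
   /* One payload per stage, replacing separate per-stage variables. */
   struct toy_payload payloads[TOY_STAGE_COUNT];
};

/* Generic lookup, so callers do not need stage-specific code paths. */
static struct toy_payload *
toy_payload_for_stage(struct toy_context *ctx, enum toy_shader_stage stage)
{
   return &ctx->payloads[stage];
}
```

With this layout, adding a stage means adding an enum entry rather than another member and another code path.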
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
last_tiler.gpu may be NULL at flush time despite no clear and existing
jobs -- if we executed a compute-only workload.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We *could* expose TGSI as well -- we pipe it through tgsi_to_nir for
Gallium-internal shaders anyway -- but we'd rather not.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Values reported here aren't remotely correct, but it's a start to just
get the entrypoint stubbed out.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Interestingly, this requires no compiler changes. It's just exposed as a
special varying.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This allows us to decode asymmetric varyings correctly, which occur
with e.g. gl_FrontFacing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We have a cap bit for gallium and a GLSL compiler flag to control this.
Just trust what GLSL gives us and stop forcing it. In order for this to
be safe, we have to advertise another cap in some of the gallium
drivers.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Conceptually follows util_set_vertex_buffers_mask but for SSBOs.
v2: Fix missing ~ when clearing mask. Adjust mask behaviour to match
freedreno/v3d when buffer == NULL.
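The mask handling can be sketched roughly as below. This is a standalone illustration with hypothetical types and names (toy_buffer_binding, toy_set_buffers_mask), not the helper's actual signature; it assumes that a NULL buffer clears the slot's bit, and that slot numbers stay below 32.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a shader buffer binding. */
struct toy_buffer_binding {
   const void *buffer;
   unsigned offset;
   unsigned size;
};

/* Copies 'count' bindings into dst[start_slot..] and keeps a bitmask of the
 * slots that currently hold a buffer. A NULL 'src' unbinds the whole range.
 */
static void
toy_set_buffers_mask(struct toy_buffer_binding *dst, uint32_t *enabled_mask,
                     const struct toy_buffer_binding *src,
                     unsigned start_slot, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      unsigned slot = start_slot + i;
      uint32_t bit = UINT32_C(1) << slot; /* assumes slot < 32 */

      if (src && src[i].buffer) {
         dst[slot] = src[i];
         *enabled_mask |= bit;
      } else {
         memset(&dst[slot], 0, sizeof(dst[slot]));
         *enabled_mask &= ~bit; /* the ~ matters: without it, other bits are cleared */
      }
   }
}
```

Dropping the ~ in the clearing branch would keep only the slot's own bit and wipe every other slot from the mask, which is the kind of mistake the v2 note refers to.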
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
These drivers are kmsro drivers, so they should be part of the kmsro #if.
This fixes the missing imx_drm driver when building with only freedreno+kmsro.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Currently we use the python package to manage repositories. At the same
time we also do that by hand - since it's a trivial echo to a file.
Stay consistent, remove the package and manage things manually.
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Use value "2" to signal that lowering is needed and supported and enable
it accordingly.
v2: - Note in CAP description that this lowering currently requires TGSI
- use "true" instead of GL_TRUE (both Erik)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>