Rather than copying an ever growing list of params from fd_dev_info to
ir3_compiler, just store the info pointer in the compiler and use that
directly.
Mechanical change. But deletes code and removes an extra step from
adding compiler related dev info props.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39442>
In software scoreboard (Gfx12+) use information from previous
instructions to trim out-of-order dependencies. For example, in
send g1, g2 ($1)
mov g3, g1 ($1.dst) // Depends on g1 (destination of $1)
mov g4, g2 ($1.src) // Depends on g2 (source of $1)
mov g5, g1 ($1.dst) // Depends on g1 (destination of $1)
only the first `mov` needs to be annotated, because the execution will
stall until that dependency is fulfilled, which in this case means the
`send` is done and `g1` was already written.
Note that while `$x.dst` implies `$x.src`, the reverse is not true, so
if the first `mov` did not exist, both second and third `mov` in the
example would have to keep their annotations.
This patch add resolution of implicit out-of-order dependencies that are
visible inside a block.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3526>
There's agreement now these are helpful and widely supported. We can
always fallback to a custom vector class later if necessary.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3526>
Fixes: 6bda88bfdb ("pvr: copy WSI can_present_on_device function from PanVK")
Signed-off-by: Kitlith <kitlith@kitl.pw>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39415>
On the gallium side, set the NIR option to leave IO lowered after
linking when using GLSL. On the turnip side, move up nir_lower_io to
as early as currently possible. Further turnip passes will have to be
converted to intrinsics before we can switch to using the new linker.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39328>
pipe_stream_output_info is considered deprecated, and not filled out for
drivers using I/O intrinsics. Switch to using NIR instead, which should
be more reliable and allows us to stop having separate paths for
gathering this info in turnip and freedreno.
Since we don't have a map of location -> driver_location before
compiling, we have to switch to using locations in the streamout info.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39328>
Other parts of NIR, such as unlower_io_to_variables, expect i/o
variables to be gone, and the NIR linker removes them. Now that we've
removed all the remaining bits relying on variables we can start
deleting them.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39328>
In gallium, the GLSL linker already lowers variables so this was mostly
a no-op. However in order to use common passes that operate on
intrinsics in turnip, and eventually convert turnip passes to use
intrinsics, we need to split out nir_lower_io so that we can call it
earlier. Split out the call to nir_lower_io to a separate function, and
move it up in turnip. In the future we will move it up even further
until we can use the new NIR linking passes that operate on intrinsics.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39328>
Lowering FP variants calls a number of passes that aren't aware of
driver locations. For drivers relying on them, make sure that they are
still correct by recomputing them rather than adding awareness to every
pass.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39328>
The offset for the dynamic buffers needs to be computed with the currently
bound pipeline layout. This change fixes incorrectly selecting the offset
for a dynamic buffer if a descriptor with a lower index than the currently
being bound contains a dynamic buffer but said descriptor hasn't being
bound yet. It also prevents the binding to override the dynamic buffers in
order to preserve the already bound dynamic descriptors.
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39203>
The offset for the dynamic buffers needs to be computed with the currently
bound pipeline layout. This change fixes incorrectly selecting the offset
for a dynamic buffer if a descriptor with a lower index than the currently
being bound contains a dynamic buffer but said descriptor hasn't being
bound yet. It also prevents the binding to override the dynamic buffers in
order to preserve the already bound dynamic descriptors.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39203>
Now that all larger workgroup sizes are lowered to 256,
the regalloc hang cannot mess up the compute queues anymore.
Still don't allow compute queues on GFX6 though,
those have never been enabled ever since RadeonSI started using
the compute queue in a1378639ab - let's keep it that way.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>
Now that all larger workgroup sizes are lowered to 256,
the regalloc hang cannot mess up the compute queues anymore.
Still don't allow compute queues on GFX6 though,
they are prone to hangs. Needs further investigation.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>
Even though radeonsi may not use compute queues, other processes
might run compute jobs in the background, so radeonsi must make
sure not to use larger than 256 sized workgroups on GPUs that
are affected by the regalloc hang.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>
Even though radeonsi may not use compute queues, other processes
might run compute jobs in the background, so radeonsi must make
sure not to use larger than 256 sized workgroups on GPUs that
are affected by the regalloc hang.
Unfortunately that means that for now RadeonSI won't be able to
support ARB_compute_variable_group_size on these GPUs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>