DIV_ROUND_UP is the correct replacement for ALIGN_TO.
Fixes: ba83c1e2
Signed-off-by: David Rosca <nowrep@gmail.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24127>
Now that all our age tracking is moved to etna_resource_level we can unlock
some more optimizations in the transfers by skipping copies or flushes when
the whole level is discarded.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
Now that we track the age at the resource level we can optimize
the render surface update by only copying the single level we are
going to render to.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
If the blit was just a resource flush on a uncompressed buffer we can
keep the tile status as valid, as in that case only clear tiles are filled
in the target buffer, but it doesn't hurt to look at the TS buffer when
fetching from this resource as the tile status matches the content of the
buffer. For compressed formats we can't do the same, as the compressed
tiles are uncompressed when flushing the resource, so the compression tags
don't match the buffer content anymore.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
As long as the TS is valid we can use the tile status to optimize the
sample fetch, even if the resource has been flushed for any reason.
Do the check for valid TS first when checking whether to enable sampler
TS to avoid all the other checks when TS isn't usable.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
Now that we track age at the resource level we can optimize
the sampler source update by only copying/flushing the levels
that are actually used by the sampler.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
When the TS is shared all sharing instances must see the same status
information about the resource TS buffer. Add this information to the
shared TS metadata and make it take precedence over the internal
status tracking when the TS is shared.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
Handle imports/exports always deal with one specific level of
a resource, so the shared TS metadata should also be per level.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
Wrap the setting of the resource level TS validity into a
helper function to allow the implementation to change later.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
Add a small helper to get the validity of the TS buffer for
a resource level. We can drop the ts_size check in several
places, as we never set ts_valid to true if there is no TS
buffer.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
Add a small helper to transfer the age (seqno) from one resource level
to another.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
Add a small helper to mark a resource level as changed so the
seqno handling is hidden.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
Add a small helper to mark a resource level as flushed so the
seqno handling is hidden.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
If we sync/flush a full resource we can skip any level where the
target is of the same age as the source.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
A blit into a render target may destroy valid TS information, as the
destination TS isn't updated. Flush the blit destination when necessary
to make sure that all pending TS is resolved into the destination before
the blit is executed.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
Resource maps, blits and surfaces all target a specific level of a
resource, so they can have different ages. Move the seqnos tracking
the age to etna_resource_level.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19964>
Thanks to our SSA-based RA, we only use nir_register for arrays, and we only
access array registers with dedicated moves anyway. So there's no reason to need
any fancy coalescing... we can just switch to register access intrinsics and
translate them to moves exactly like we would've done when getting srcs/dests
before.
This addresses the ir3 portion of #9051.
No shader-db changes with a (significant subset of) Rob's shader-db. (Some
shaders are affected by this change but not in any way that shows up in the
stats.)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24126>
We expect that these will be lowered in NIR now.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24005>
This is beneficial for intrinsics that do an algebraic
instruction such as bitfield extract on shader arguments,
because it allows NIR to be aware of these instructions and
optimize them together with other algebraic instructions in
the shader.
Currently, just handle subgroup_id and num_subgroups intrinsics.
More will be added here in the future.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24005>
This is so we can just use the same function when it's zero.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24005>
This commit adds a partial render command to job submission.
For geom only jobs we must always submit a pr command in case we
enter SPM. For now, for geom+frag jobs, we'll also always submit
a pr command event.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24138>
Now things are structured in sections, like the other xml files.
And elements within a section are sorted alphabetically.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24138>
- Update and add csbgen definitions to make the content of the
geom and frag km stream more obvious.
- Replace some of the hard coded constants with defines.
- Adds some static assert to make the provenance of definitions
more clear as well as making sure things fit properly.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24138>
Remove the mrt setup stuff since the EOT program only support
output registers for now. When implementing the tile buffer
support this change can be reverted, or things could be changed
to better fit with how the compiler wants things.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24138>
If the *first* BO is not marked as "written", but the same BO is marked
as "written" later, then it will be added twice, because the first
instance will have index_for_handle equal to 0.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24108>
This is a temporary measure until the zeroed shaders are replaced with the real
ones. This avoids a VK_ERROR_OUT_OF_DEVICE_MEMORY error due to a zero sized
allocation.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Fixes: 1dfd535124 ("pvr: Setup SPM background object")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24139>
Use all the nir_legacy.h features to transition away from the deprecated
structures. shader-db is quite happy. I assume that's a mix of more aggressive
source mod usage and better coalescing (nir-to-tgsi doesn't copyprop).
total instructions in shared programs: 900179 -> 887156 (-1.45%)
instructions in affected programs: 562077 -> 549054 (-2.32%)
helped: 5198
HURT: 470
Instructions are helped.
total temps in shared programs: 91165 -> 91162 (<.01%)
temps in affected programs: 398 -> 395 (-0.75%)
helped: 21
HURT: 18
Inconclusive result (value mean confidence interval includes 0).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Tested-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24116>
Consider code like:
32x4 %2 = @load_interpolated_input (%1, %0 (0x0)) (base=0, component=0, dest_type=float32, io location=VARYING_SLOT_VAR0 slots=1 mediump) // Color
32x4 %3 = fabs %2
32x4 %4 = fsat %3
32x4 %5 = fsin %4
The existing logic would incorrectly tell the backend that both fabs and fsat
could be folded, and then half the shader disappears. Whoops. Fix by stopping
the folding in this case. I choose to do this check in the fsat rather than the
fabs because it's more straightforward (1 source vs N uses) but it's somewhat
arbitrary.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24116>
Consider the IR:
%0 = load_reg
%1 = fneg %0
%2 = ffloor %1
%3 = bcsel .., .., %1
Because the fneg has both foldable and non-foldable users, nir/legacy does not
fold the fneg into the load_reg. This ensures that the backend correctly emits a
dedicated fneg instruction (with the load_reg folded in) for the bcsel to use.
However, because the chasing helpers did not previously take other uses of a
modifier into account, the helpers would fuse in the fneg to the ffloor. Except
that doesn't work, because the load_reg instruction is supposed to be
eliminated. So we end up with broken chased IR:
1 = fneg r0
2 = ffloor -NULL
3 = bcsel, ..., 1
The fix is easy: only fold modifiers into ALU instructions if the modifiers can
be folded away. If we can't eliminate the modifier instruction altogether, it's
not necessarily beneficial to fold it anyway from a register pressure
perspective. So this is probably ok. With that check in place we get correct IR
1 = fneg r0
2 = ffloor 1
3 = bcsel, ..., 1
Fixes carchase/230.shader_test under softpipe.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24116>
This removes all the users of the compiler enums, and is a lot more natural now
that nir_lower_blend speaks PIPE_BLEND enums.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24076>