The checked_shr wasn't returning the correct value if .wrap was not set.
We also weren't checking this case in the unit tests so we missed it.
While we're here, get rid of a bunch of pointless `as u64` casts as well.
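As a rough illustration of the two cases, here is a minimal sketch. It is not the actual NAK code: the helper name shr_u32 is hypothetical, and it assumes that with wrap set the shift count is masked to the bit width, and that without wrap a shift of 32 or more yields 0.

```rust
/// Hypothetical helper (not the actual NAK code): logical right shift of a
/// 32-bit value with an optional wrap flag.
fn shr_u32(x: u32, shift: u32, wrap: bool) -> u32 {
    if wrap {
        // Assumed semantics: only the low 5 bits of the shift count are used.
        x.wrapping_shr(shift)
    } else {
        // Assumed semantics: a shift of 32 or more moves every bit out, so
        // the result is 0 rather than an error or a wrapped shift.
        x.checked_shr(shift).unwrap_or(0)
    }
}
```

A unit test covering both settings of the flag with a shift count at or above 32 would catch the kind of case that was missed.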
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34678>
The other video memory buffers for AV1 are already allocated at their
maximum sizes, so do the same for the MV buffers.
This fixes a number of artifacts during AV1 playback.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34866>
Reduction ops don't return anything, including predicates. On Turing
through Hopper, this doesn't matter because these bits are ignored.
However, Blackwell uses those bits to adjust address calculations for
reduction ops.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
We had a bool for most of them and an enum for OpTld4. Now we have an
enum for all of them and we just reserve PerPx for OpTld4. While we're
here, rework printing to put the "." in the enum display method.
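The shape of the change can be sketched roughly as follows. This is an illustration only: the type name OffsetMode, the AddOffI variant, the helper print_op, and the printed suffixes are assumptions, not the actual NAK definitions. Only PerPx and the idea of printing the "." from the Display impl come from the description above.

```rust
use std::fmt;

/// Hypothetical stand-in for the shared offset-mode enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum OffsetMode {
    None,
    AddOffI, // assumed variant name
    PerPx,   // reserved for OpTld4
}

impl fmt::Display for OffsetMode {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The leading "." lives here, so callers never print it themselves.
        match self {
            OffsetMode::None => Ok(()),
            OffsetMode::AddOffI => write!(f, ".aoffi"), // assumed suffix
            OffsetMode::PerPx => write!(f, ".ptp"),     // assumed suffix
        }
    }
}

/// Hypothetical printer: the op name followed directly by the mode.
fn print_op(name: &str, mode: OffsetMode) -> String {
    format!("{}{}", name, mode)
}
```

With the separator in the Display method, the None case prints nothing at all and each op's printing code does not need its own conditional.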
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
Pre-Blackwell, it's ignored so we can set it to anything. On Blackwell+, it
seems to be taken into account somehow (more RE needed?) so we need to
set it to rZ to get the old behavior.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
I can't see this in traces on Blackwell and it causes hangs.
These regs are in the Hopper class headers, so they should be fine there.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34910>
Now that it is possible to have more than one initrd, let's switch to
the common b2c kernel which requires two additional initrds:
* The GPU initrd which contains amdgpu, i915, nouveau, radeon, and xe,
along with their necessary firmware
* The depmod initrd which contains what's necessary to modprobe the
modules of the GPU initrd
Since the GPU initrd is huge (73 MB), let's reduce the size by dropping
all the firmware that is not related to AMD.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34881>
This saves a bit of bandwidth when we're not going to use the value.
Improves renderpass times across the 4 affected traces I tested (bioshock,
stranded deep, transport fever, and godot material testers) on sysmem by
0.3% +/- 0.1%.
A similar change for avoiding stencil reads showed no change on the one
app affected among all of our renderdoc traces, so that's left out.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34964>
The driconf options were leaked when the panvk instance was destroyed.
Fixes: aa8fec638f ("panvk: add basic driconf infrastructure")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34986>
si_get_shader_variant_info doesn't need to check the kill flags because
killed stores are removed from NIR before that.
Only shader variants need to clear the writes_* flags if the epilog kills
them.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
Just use nir_def_bits_used() instead of the manually written conditions.
These are gathered from bits of load_scalar_arg(vs_state_bits):
- uses_vs_state_indexed
- uses_gs_state_provoking_vtx_first
- uses_gs_state_outprim
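The idea of deriving these flags from a used-bits mask can be sketched as below. This is a hedged illustration, not the radeonsi code: the bit positions, the StateUses struct, and gather_state_uses are all hypothetical, and the bits_used argument stands in for the mask that nir_def_bits_used() would report for the loaded value.

```rust
// Hypothetical bit layout of vs_state_bits; the real positions are defined
// by the driver and are not reproduced here.
const VS_STATE_INDEXED: u32 = 1 << 0;
const GS_STATE_PROVOKING_VTX_FIRST: u32 = 1 << 1;
const GS_STATE_OUTPRIM: u32 = 0b11 << 2;

/// Hypothetical container for the three gathered flags.
#[derive(Debug, Default, PartialEq, Eq)]
struct StateUses {
    uses_vs_state_indexed: bool,
    uses_gs_state_provoking_vtx_first: bool,
    uses_gs_state_outprim: bool,
}

/// Set each flag if any bit of its field is consumed by the shader.
fn gather_state_uses(bits_used: u32) -> StateUses {
    StateUses {
        uses_vs_state_indexed: bits_used & VS_STATE_INDEXED != 0,
        uses_gs_state_provoking_vtx_first: bits_used & GS_STATE_PROVOKING_VTX_FIRST != 0,
        uses_gs_state_outprim: bits_used & GS_STATE_OUTPRIM != 0,
    }
}
```

Testing each field against a single mask replaces per-instruction conditions, so a new field only needs one more constant and one more line.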
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
This is a step towards gathering shader info from shader variants instead of
input NIR.
uses_fbfetch_output can be ignored because it's already lowered to image
loads.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>
It shouldn't affect uses_vmem_load_other because divergent descriptors
are loaded with SMEM in waterfall loops.
Also, all of the removed code is highly questionable. Indirect access doesn't
matter for anything. Divergent access does, and that's handled correctly.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34492>