This seems to be faster.
ministat (nir_analyze_fp_range):
Difference at 95.0% confidence
-592900 +/- 2302.24
-27.6432% +/- 0.0998961%
(Student's t, pooled s = 2719.05)
ministat (overall):
Difference at 95.0% confidence
-76.8333 +/- 27.2345
-0.632558% +/- 0.223407%
(Student's t, pooled s = 46.867)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
If (uintptr_t)&deleted_key is small enough, inserting entries into the
hash table might not work correctly.
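
A minimal sketch of one way such a collision could arise, assuming keys are
small integers stored directly as pointer values. All names below are
hypothetical and are not taken from the actual hash table code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel object; its address marks a deleted slot. */
static const char sketch_deleted_storage = 0;
static const void *const sketch_deleted_key = &sketch_deleted_storage;

/* Assumption: an integer id is stored directly as the pointer-sized key. */
static const void *sketch_key_from_id(uintptr_t id)
{
    return (const void *)id;
}

/* The table treats any key equal to the sentinel address as "deleted". */
static bool sketch_key_is_deleted(const void *key)
{
    return key == sketch_deleted_key;
}

/* True when an integer id would be mistaken for the deleted marker. */
static bool sketch_id_collides(uintptr_t id)
{
    return sketch_key_is_deleted(sketch_key_from_id(id));
}
```

If the sentinel's address falls inside the range of valid ids, one id becomes
indistinguishable from a deleted entry.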
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
The samples=2 variant also flakes, matching the RPi4 pattern which
covers all sample counts. Broaden the entry to match all variants.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
The previous commit added a v3d_get_rt_format() check to reject
fast TLB blits when the job's RT format differs from the blit
destination. Since each RT format maps to a unique (internal_type,
bpp) pair via get_internal_type_bpp_for_output_format(), the
rt_format equality check is strictly stronger than the previous
internal_type/bpp comparison.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
v3d_tlb_blit_fast includes the blit onto a pending job that writes
to the source resource. The TLB data is already unpacked according to
the job's RT format, so storing it with a different RT format performs
a channel reinterpretation rather than a raw byte copy, corrupting the
data.
So when copying from RGB10_A2UI to RG16UI with glCopyImageSubData,
the copy_image path remaps both formats to R16G16_UNORM for a raw
32-bit copy. The fast TLB blit found the pending clear job
(RGB10_A2UI, 4 channels: 10-10-10-2) and stored its TLB data as RG16UI
(2 channels: 16-16), writing the unpacked 10-bit R and G channel values
into 16-bit fields instead of preserving the raw packed bits.
The previous internal_type/bpp check was insufficient: both RGB10_A2UI
and RG16UI share internal_type=16UI and the source bpp (64) exceeds
the destination bpp (32), but their channel layouts are different.
Add a check that the job's source surface RT format matches the blit
destination RT format before allowing the fast path.
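
The difference between a raw copy and a channel reinterpretation can be
sketched as follows. The bit layout (R in the low bits) and the helper names
are assumptions for illustration only, not the hardware's TLB behaviour:

```c
#include <stdint.h>

/* Pack four unsigned channels as 10-10-10-2, R in the low bits (assumed). */
static uint32_t sketch_pack_rgb10_a2ui(uint32_t r, uint32_t g,
                                       uint32_t b, uint32_t a)
{
    return (r & 0x3ffu) | ((g & 0x3ffu) << 10) |
           ((b & 0x3ffu) << 20) | ((a & 0x3u) << 30);
}

/* Pack two unsigned channels as 16-16, R in the low bits (assumed). */
static uint32_t sketch_pack_rg16ui(uint32_t r, uint32_t g)
{
    return (r & 0xffffu) | ((g & 0xffffu) << 16);
}

/* What a raw 32-bit copy is expected to produce: the bits unchanged. */
static uint32_t sketch_raw_copy(uint32_t texel)
{
    return texel;
}

/* What happens when already-unpacked 10-bit R and G values are stored
 * through the 16-16 format instead. */
static uint32_t sketch_store_as_rg16ui(uint32_t texel)
{
    uint32_t r = texel & 0x3ffu;
    uint32_t g = (texel >> 10) & 0x3ffu;
    return sketch_pack_rg16ui(r, g);
}
```

For a texel packed from (1, 2, 3, 1), the raw copy keeps 0x40300801 while the
reinterpreting store yields 0x00020001, dropping the B and A bits.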
Fixes: 66de8b4b5c ("v3d: add a faster TLB blit path")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
There's little point in having two unreachable blocks here. Yeah, sure,
in theory we are a little bit safer against forgetting to add a case for
newly introduced enum values here. But the UNREACHABLE macro should
already tell us when we trigger such cases anyway, and the cost here is
really readability.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40115>
The compiler seems to fail to see that all cases are handled here,
producing a warning thinking "val" can be undefined. So let's make
that very obvious, by replacing the _COUNT-case with a default
block.
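
A sketch of the pattern, using a hypothetical enum and abort() as a stand-in
for the UNREACHABLE macro. With a default block that never returns, the
compiler can see that "val" is assigned on every path that reaches the
return:

```c
#include <stdlib.h>

/* Hypothetical enum; the trailing _COUNT entry is not a real value. */
enum sketch_kind {
    SKETCH_KIND_A,
    SKETCH_KIND_B,
    SKETCH_KIND_COUNT,
};

static int sketch_kind_value(enum sketch_kind kind)
{
    int val;

    switch (kind) {
    case SKETCH_KIND_A:
        val = 10;
        break;
    case SKETCH_KIND_B:
        val = 20;
        break;
    default:
        /* Replaces "case SKETCH_KIND_COUNT:"; also catches any value
         * outside the enum. abort() stands in for UNREACHABLE here. */
        abort();
    }

    return val;
}
```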
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40115>
panvk_image.c isn't a per-arch file, so the PAN_ARCH macro doesn't exist
here. We need to do a run-time check here instead.
Fixes: 01ba87a7fc ("panvk: Relax ms2ss afbc disablement")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40220>
In case the FS only writes one output.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40005>
When this is the case, we shouldn't hang or crash.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40005>
This should enable fast-clear for color images with the GENERAL layout
on GFX10-10.3. This seems important because DXVK tends to use that
layout more often now.
There are still issues with MSAA images, so it's only enabled for
single-sampled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40145>
This is actually not needed because nobody is using storage with
depth-only formats and compression doesn't work at all anyways.
PAL and native don't allow this either.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40214>
Like accumulators and ARF address registers, the virtual address
registers are not tracked in a way the defs analysis can know
about. This could actually be fixed, but that is future work.
Fixes: b110b06447 ("brw: introduce a new register type for the address register")
Suggested-by: Lionel
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>
brw_reg::nr encodes both which ARF it is and which instance of that
ARF. In other words, nr for acc0 and acc2 have some bits that say
BRW_ARF_ACCUMULATOR and some bits that say 0 vs 2. The previous test
would only detect acc0.
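
A sketch of the difference, assuming for illustration that the high nibble of
nr selects the ARF and the low nibble selects the instance. The constant and
function names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: high nibble = which ARF, low nibble = instance. */
#define SKETCH_ARF_ACCUMULATOR 0x20u

/* Exact comparison: only matches instance 0 (acc0). */
static bool sketch_is_acc_exact(uint8_t nr)
{
    return nr == SKETCH_ARF_ACCUMULATOR;
}

/* Masked comparison: ignores the instance bits, so it matches every
 * accumulator instance. */
static bool sketch_is_acc_masked(uint8_t nr)
{
    return (nr & 0xF0u) == SKETCH_ARF_ACCUMULATOR;
}
```

Under this assumed layout, acc2 is 0x22: the exact test rejects it and the
masked test accepts it.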
Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>
This can occur if NULL or an accumulator is an explicit destination.
update_for_reads still needs to process the sources.
v2: Pass a brw_reg to ::mark_invalid, and do the VGRF check in that one
place.
Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40083>
Add a method for determining which MESA_MAP_ACCESS_* flag would be
appropriate for a given OwnedDescriptor, based on both access flags and
write seals (since access mode can be RDWR despite the seals!)
This is useful for virtgpu implementations when mapping incoming buffers
from host software into the guest's address space. Previously Rutabaga
relied on basic heuristics like "SHM is always R/O", but with upcoming
extra protocols to be forwarded over virtgpu channels (like PipeWire)
those assumptions no longer hold true.
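
The decision logic can be sketched in C as below. Every enum, flag, and
function name here is hypothetical; the values do not correspond to the real
MESA_MAP_ACCESS_* flags or to the kernel's seal constants:

```c
/* Hypothetical access modes, mirroring O_RDONLY / O_WRONLY / O_RDWR. */
enum sketch_access_mode {
    SKETCH_RDONLY,
    SKETCH_WRONLY,
    SKETCH_RDWR,
};

/* Hypothetical seal bits (a write seal and a future-write seal). */
#define SKETCH_SEAL_WRITE        0x1u
#define SKETCH_SEAL_FUTURE_WRITE 0x2u

/* Hypothetical mapping flags. */
#define SKETCH_MAP_READ  0x1u
#define SKETCH_MAP_WRITE 0x2u

/* Combine the access mode with the seals: a descriptor opened RDWR is
 * still mapped read-only when a write seal is present. */
static unsigned sketch_map_access(enum sketch_access_mode mode,
                                  unsigned seals)
{
    unsigned flags = 0;

    if (mode != SKETCH_WRONLY)
        flags |= SKETCH_MAP_READ;

    if (mode != SKETCH_RDONLY &&
        !(seals & (SKETCH_SEAL_WRITE | SKETCH_SEAL_FUTURE_WRITE)))
        flags |= SKETCH_MAP_WRITE;

    return flags;
}
```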
Signed-off-by: Val Packett <val@packett.cool>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40132>
Change the TYPE_FLOAT display format from %f to %g with sufficient
significant digits (%.5g for f16, %.9g for f32), so that float
immediates round-trip correctly through disassembly and assembly.
The %f format loses precision for small values: f16 0x0001 (denormal
~5.96e-8) displays as 0.000000, which parses back as 0x0000. The %g
format uses the minimum significant digits per IEEE 754 and strips
trailing zeros, using scientific notation when needed. Whole-number
values use %.1f to keep them unambiguously float (e.g. "1.0").
Update the etnaviv PEST grammar and the freedreno ir3 lexer/parser to
accept the new output formats (scientific notation, stripped zeros).
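
The round-trip property can be checked with a small standalone sketch for
f32. The helper names are hypothetical and this is not the disassembler's
code; f16 and the whole-number %.1f case are not covered:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the raw bit pattern of a float so comparisons are exact. */
static uint32_t sketch_f32_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Print with %f, parse back, and compare bit patterns. */
static bool sketch_round_trips_fixed(float value)
{
    char buf[64];
    snprintf(buf, sizeof(buf), "%f", (double)value);
    return sketch_f32_bits(strtof(buf, NULL)) == sketch_f32_bits(value);
}

/* Print with %.9g (9 significant digits), parse back, and compare. */
static bool sketch_round_trips_general(float value)
{
    char buf[64];
    snprintf(buf, sizeof(buf), "%.9g", (double)value);
    return sketch_f32_bits(strtof(buf, NULL)) == sketch_f32_bits(value);
}
```

A small value such as 1e-10f prints as 0.000000 with %f and parses back as
zero, while the %.9g form survives the round trip.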
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40207>
They were already being handled explicitly in vtn_alu, so just handle
them directly for spec constants too -- that has to do special work
for conversions anyway. Remove the bit-size parameters from the function.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40157>
There was an assumption that if the instruction had non-native float
as a source, the first source would have such type. This doesn't
hold for Select, and the code failed in two ways:
- The boolean source of Select was being converted to the non-native
float type.
- The loop that resolves the bit-size for unsized operands would
trip at `assert(i == 0)` because Select has more than one source.
Re-organize the code to track the types of the sources independently,
and fix both issues above.
Fixes: 90e1b12890 ("spirv: Add bfloat16 support to SpecConstantOp")
Fixes: 51d3c4c889 ("spirv: support float8 spec constant op")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40157>
Only used by Convert operations, so just pass 0 from callers that
are not Convert and clarify that in the code.
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40157>
Set max outstanding ray queries to 1024. This value can be tuned later
specific to apps.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40182>
- don't set fields that don't exist on some generations
- add gfx_level checks for MEM_ORDERED even when it's technically not needed
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40022>
Some of these should check has_fmask, others should check < GFX11.
v2: move to ac_cu_info
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40022>
This is an old driver bug that could cause Z corruption on gfx8-11.5.
v2: handle allow_expclear differently
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40022>
Applications often miss emitting barriers between a shader
initializing data & another shader writing data in the same location
afterward. This is very common for UAVs (see vkd3d-proton).
Vkd3d-proton does a pretty good job at inserting missing barriers
between UAV clears & writes. But some applications also have similar
issues with custom shaders. Here we introduce an analysis pass that
recognizes shaders doing clear/initialization. We'll use that
information in the following commit to insert barriers after those
shaders.
Since Gfx12.5 our HW has become a lot more sensitive to those issues
due to the introduction of an L1 untyped data cache that is not
coherent across the shader units. On Gfx20+, typed data is also L1
cacheable exposing even more issues.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40187>