Commit graph

522 commits

Author SHA1 Message Date
Alyssa Rosenzweig
1d4a59448c treewide: Remove use_scoped_barrier
It is now set by all relevant drivers and not checked anywhere.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Jesse Natalie
082eba6165 nir_lower_mem_access_bit_sizes: Move options into a struct
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Jesse Natalie
4217353e2d nir_lower_mem_access_bit_sizes: Add a bit_size input to the callback
We'd like to use this callback to adjust loads and stores from things
that are unsupported to things that are supported, but if the input
is already supported, we'd prefer not to change it. Rather than making
up a bit size that'd work and doing a bunch of pack/unpack bit math,
only return a different bit size if the input one doesn't work for us
(i.e. can't load enough memory or just an unsupported size entirely).

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
2023-06-13 00:43:36 +00:00
Alyssa Rosenzweig
176c3a2ab7 agx: Use common nir_steal_tex_src
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23513>
2023-06-12 20:09:53 +00:00
Alyssa Rosenzweig
ba27071c8b agx: Fold addressing math into atomics
Like our loads and stores, our global atomics support indexing with a 64-bit
base plus a 32-bit element index, zero- or sign-extended and multiplied by the
word size. Unlike the loads and stores, they do not support additional shifting
(it's not too useful), so that needs an explicit lowering.

Switch to using AGX variants of the atomics, running our address pattern
matching on global atomics in order to delete some ALU.

This cleans up the image atomic lowering nicely, since we get to take full
advantage of the shift + zero-extend + add on the atomic... The shift comes from
multiplying by the bytes per pixel.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23529>
2023-06-09 12:06:00 +00:00
Alyssa Rosenzweig
13535d3f9d agx: Refactor expressions in agx_nir_lower_address
So we can add more instructions without duplication.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23529>
2023-06-09 12:06:00 +00:00
Alyssa Rosenzweig
3a0d1f83d5 agx: Stop bit-inexact conversion propagation
Despite being mathematically equivalent, the following code sequences are not
bit-identical under IEEE 754 rules due to differing internal precision:

   fadd16 r0l, r2, 0.0              z = f2f16 x
   fadd16 r1h, r0l, r0h             w = fadd z, y

versus

   fadd32 r1h, r2, r0h              f2f16(w) = fadd x, f2f32(y)

This is probably fine under GL's relaxed floating point precision rules, but
it's definitely not ok with the more strict OpenCL or Vulkan. It also is a
potential problem with GL invariance rules, if we get different results for the
same shader depending whether we did a monolithic compile or a fast link. The
place for doing inexact transformations is NIR, when we have the information
available to do so correctly. By the time we get to the backend, everything we
do needs to be bit-exact to preserve sanity.

Fixes dEQP-GLES2.functional.shaders.algorithm.rgb_to_hsl_vertex. We believe that
this is a CTS bug, but it's a useful one since it uncovered a serious driver bug
that would bite us in the much less friendly Vulkan (or god forbid OpenCL) CTS
later. It also seems like a magnet for GL app bugs, the fp16 support we do now
is uncovering bad enough bugs as it is.

shader-db results are pretty abysmal, though :|

total instructions in shared programs: 1537964 -> 1571328 (2.17%)
instructions in affected programs: 670231 -> 703595 (4.98%)

total bytes in shared programs: 10533984 -> 10732316 (1.88%)
bytes in affected programs: 4662414 -> 4860746 (4.25%)

total halfregs in shared programs: 483448 -> 474541 (-1.84%)
halfregs in affected programs: 58867 -> 49960 (-15.13%)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
8d019125a0 agx: Emit shader info late
So we can take into account program transformations for the final info. This
reports more accurate metadata.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
1dd513727d agx: Handle centroid and sample interpolation
Works great now that all the infrastructure is wired up.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
b7f130fbbc agx: Model interpolation for iter instructions
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
2548293e8b agx: Split iter and iterproj instructions
These are different (though related) instructions. I've split them in applegpu,
let's mirror that here. This simplifies the IR a bit.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
b9b71bcae6 asahi,agx: Call lower_discard_zs_emit in the driver
The driver needs to lower MSAA (because only it knows the sample count). MSAA
lowering depends on discards getting lowered (in order to get sample masks on the
discards for sample shading to work properly). Discard lowering depends on all
discards emitted. But the driver needs to lower clip planes which generates
discards. To break the circular dependency, we have the driver call the
discard lowering pass itself (in between lowering clip planes and lowering
MSAA). Technically, this is probably a layering violation but it's the least
gross solution I see.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
398851ca53 agx: Lower discard in NIR
We already lower discard in NIR when depth/stencil writes are used in the
shader. In this patch, we extend that lowering for when depth/stencil writes are
not used, in which case the discard is lowered to a sample_mask instruction.
This is a step towards multisampling, since the old lowering assumed
single-sample and there's no way to express a sample mask with a standard NIR
discard instructions so we need to lower in NIR anyway for sample shading (i.e.
if a discard_if diverges between samples in a pixel).

This changes the lowering for discard_if to be free of control flow (instead
executing a sample mask instruction unconditionally). This seems to be slightly
faster in SuperTuxKart and slightly slower in Dolphin, but I'm not too worried
right now.

To make this work, we do need some extra lowering to ensure we always execute a
sample_mask instruction, in case a discard_if is buried in other control flow
(as occurs with Dolphin's ubershaders). So that's added too. We need that for
MSAA anyway, so pardon the line count.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
989d6fd378 agx: Enable tag writes when sample mask written
Including indirectly via discard/demote.

Fixes graphical artefacts in Chromium when API sample masks are hooked up, which
will result in fragment programs that do not write colour/depth but do a lone
sample mask write. These need tag writes enabled (according to a trace from
Metal for a case constructed to test this scenario).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
f514d49ae2 agx: Handle sample_mask_agx
1:1 translation.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
73bbf43bc0 agx: Plumb in nir_intrinsic_load_sample_mask_in
We have a special register for this, although this will need some lowering for
glSampleMask.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
6fd16dd7c9 agx: Model both sources of sample_mask
We need to control both sources to implement multisampling properly. The
semantic is something like:

   foreach sample in the first mask {
      if correspond bit in second bit set {
         make sample live
      } else {
         make sample dead
      }
   }

But I'm reticent to document more formally until the details are really
understood and properly tested.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:49 +00:00
Alyssa Rosenzweig
0b95d81150 agx: Assert that sample shading is lowered
Lest someone mess this up later and then try to "implement" these intrinsics in
the backend.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:48 +00:00
Alyssa Rosenzweig
70b8babe3c agx: Use textures_used, not num_textures
The latter doesn't account for holes. Fixes regression in Neverball on Asahi.

Fixes: e607a89f ("mesa/main: ff-fragshader to nir")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:48 +00:00
Alyssa Rosenzweig
f1c2ea99e2 agx: Constant fold when optimizing int64
Otherwise we can get bcsel(false, ...) in the final optimized code, which isn't
great.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:48 +00:00
Alyssa Rosenzweig
9641fba9ba agx: Set support_16bit_alu
Allows some more optimizations.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
2023-06-07 03:21:48 +00:00
Konstantin Seurer
13c9b490a7 asahi: Reformat using the new style
Now, that the foreach macro list is complete (I hope), let's reformat
drivers that enforce correct formatting in CI.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23275>
2023-05-29 21:06:12 +00:00
Alyssa Rosenzweig
d6b8acbee9 agx: Use common combine_all_barriers callback
This contains a bugfix: execution scopes are now respected when combining
barriers. Otherwise control barriers can disappear during combining, which is
wrong.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23181>
2023-05-24 17:30:03 +00:00
Alyssa Rosenzweig
ecd295bb8b treewide: Avoid nir_lower_regs_to_ssa calls
nir_registers are only supposed to be used temporarily. They may be created by a
producer, but then must be immediately lowered prior to optimizing the produced
shader. They may be created internally by an optimization pass that doesn't want
to deal with phis, but that pass needs to lower them back to phis immediately.
Finally they may be created when going out-of-SSA if a backend chooses, but that
has to happen late.

Regardless, there should be no case where a backend sees a shader that comes in
with nir_registers needing to be lowered. The two frontend producers of
registers (tgsi_to_nir and mesa/st) both call nir_lower_regs_to_ssa to clean up
as they should. Some backend (like intel) already depend on this behaviour.
There's no need for other backends to call nir_lower_regs_to_ssa too.

Drop the pointless calls as a baby step towards replacing nir_register.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23181>
2023-05-24 17:30:03 +00:00
Alyssa Rosenzweig
01e9ee79f7 nir: Drop unused name from nir_ssa_dest_init
Since 624e799cc3 ("nir: Drop nir_ssa_def::name and nir_register::name"), SSA
defs don't have names, making the name argument unused. Drop it from the
signature and fix the call sites. This was done with the help of the following
Coccinelle semantic patch:

    @@
    expression A, B, C, D, E;
    @@

    -nir_ssa_dest_init(A, B, C, D, E);
    +nir_ssa_dest_init(A, B, C, D);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23078>
2023-05-17 23:46:16 +00:00
Alyssa Rosenzweig
c323762f9f treewide: Stop lowering legacy atomics
There are no more producers of legacy atomics so these calls are inert.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23036>
2023-05-16 22:36:21 +00:00
Alyssa Rosenzweig
65469d6b23 agx: Lower legacy atomics sooner
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23036>
2023-05-16 22:36:21 +00:00
Alyssa Rosenzweig
f5d73a9989 agx: Use unified atomics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22914>
2023-05-12 20:39:46 +00:00
Asahi Lina
64a595291e asahi: Add some more system registers
Core and opfifo stuff from the compute helper blob, vm_slot because it
was the only one changing when I poked around yesterday and it hit me
what it was ^^

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22971>
2023-05-11 23:24:48 +00:00
Alyssa Rosenzweig
5a80bf2eb0 agx: Optimize multiplies
We have an imad instruction and our iadd has a small immediate shift on the
second source. Together, these allow expressing lots of integer multiplies more
efficiently. Add some rules to optimize these now that the backend compiler can
ingest the optimized forms.

Half-register changes are from load_const scheduling changing in some vertex
shaders.

   total instructions in shared programs: 1539092 -> 1537949 (-0.07%)
   instructions in affected programs: 167896 -> 166753 (-0.68%)

   total bytes in shared programs: 10543012 -> 10533866 (-0.09%)
   bytes in affected programs: 1218068 -> 1208922 (-0.75%)

   total halfregs in shared programs: 483180 -> 483448 (0.06%)
   halfregs in affected programs: 1942 -> 2210 (13.80%)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22695>
2023-05-11 09:23:23 -04:00
Alyssa Rosenzweig
c2793a304d agx: Fix packing of imsub instructions
The negate for imad is on the third source (a * b - c), not the second source.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22695>
2023-05-11 09:23:23 -04:00
Alyssa Rosenzweig
8289fa253b agx: Handle imadshl_agx, imsubshl_agx
Same hardware instructions as iadd/isub/imad/imsub, just with the extra input
represented in NIR as required.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22695>
2023-05-11 09:23:23 -04:00
Alyssa Rosenzweig
3df4ae3334 agx: Use nir_alu_src_as_uint
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22695>
2023-05-11 09:23:04 -04:00
Alyssa Rosenzweig
e79e743674 agx: Lower I/O to scalar later
This lets us preserve vectorized stores for transform feedback shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:10:36 -04:00
Alyssa Rosenzweig
a561a6c468 agx: Validate that collect sources are the same size
RA asserts this, but by then if you've messed it up, the failure is inscrutable.
Let's check it in the validator for more pleasant debugging.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:10:36 -04:00
Alyssa Rosenzweig
9337f6a865 agx: Rework z/s emit
We were being sloppy with the sizes before. It mostly worked out, but there were
some corner cases where we would end up with mixed sized collects and that won't
end well for us. Let's rework the logic to make all the sizes explicit in NIR --
32-bit for depth and 16-bit stencil -- and then do the needed promotions to make
it happen in the AGX IR side.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:10:36 -04:00
Alyssa Rosenzweig
f4f9269b66 agx: Ensure load_frag_coord has the right sizes
In case .x isn't read, it'll be null which has the wrong size and will fail
the validation added later in this series. We fix this by padding with sized
undefs (something that exists of defined size but undefined value) rather than
nothingness (of undefined size).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:10:36 -04:00
Alyssa Rosenzweig
7f71e1bc2d agx/lower_address: Match multiplies, not only shifts
Sometimes a shader might index with a non-power-of-two stride. For example, if
it's indexing into an array of structures where the structure size is not a
power of two, we'll get a multiply with a constant as opposed to a shift. We
want to handle these cases, too. To do so, we generalize our pattern matching to
look for any kind of multiply (with our new helper), rather than hardcoding
logic for ishl. This eliminates right-shifts in a pile of compute shaders, which
makes me happy from a "I read lots of shader assembly when debugging"
perspective.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:10:36 -04:00
Alyssa Rosenzweig
032d7bd302 agx/lower_address: Add helper to match multiplies
Currently, we hardcode logic in the addressing chasing code to look for ishl
instructions that shift by constants. We can generalize this to looking for
integer multiplies by constants to optimize more addressing patterns. Add a
helper to do so.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:10:36 -04:00
Alyssa Rosenzweig
31c805d0aa agx: Don't wait at the end of the shader
This is totally pointless. This saves some waits at the ends of compute kernels
(waiting for stores to complete before terminating the thread). I don't know
how much this would matter for performance, since the hardware may have to do
these waits internally, but it makes the generated code less silly which is
always nice.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:05:39 -04:00
Alyssa Rosenzweig
87e57eae09 asahi: Rename no colour output to tag write disable
Comparison with PowerVR's XML shows that this is the actual name... And it needs
to be set a bit more carefully than "no colour output" in order to get correct
behaviour for depth-only passes that use sample mask / discard. Fix the name
first, the extra conditions will come when they're needed for multisampling.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:05:39 -04:00
Alyssa Rosenzweig
e13f9caa25 agx: Fix packing for iadd with shift
Wrong bit pattern was packed, oops.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:05:39 -04:00
Alyssa Rosenzweig
bd9c33e16a agx: Defeature fsub
All has_fsub does is fuse fsubs (they're unfused otherwise), no point doing that
if we're going to just going to lower.

shader-db is mostly noise.

total instructions in shared programs: 1487217 -> 1487035 (-0.01%)
instructions in affected programs: 22658 -> 22476 (-0.80%)
helped: 85
HURT: 2
helped stats (abs) min: 1.0 max: 12.0 x̄: 2.19 x̃: 1
helped stats (rel) min: 0.38% max: 2.46% x̄: 0.87% x̃: 0.65%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.58% max: 1.08% x̄: 0.83% x̃: 0.83%
95% mean confidence interval for instructions value: -2.51 -1.67
95% mean confidence interval for instructions %-change: -0.97% -0.70%
Instructions are helped.

total bytes in shared programs: 10189996 -> 10189288 (<.01%)
bytes in affected programs: 158132 -> 157424 (-0.45%)
helped: 85
HURT: 2
helped stats (abs) min: 4.0 max: 48.0 x̄: 8.75 x̃: 4
helped stats (rel) min: 0.22% max: 1.44% x̄: 0.51% x̃: 0.38%
HURT stats (abs)   min: 6.0 max: 30.0 x̄: 18.00 x̃: 18
HURT stats (rel)   min: 0.90% max: 0.91% x̄: 0.91% x̃: 0.91%
95% mean confidence interval for bytes value: -9.98 -6.30
95% mean confidence interval for bytes %-change: -0.56% -0.39%
Bytes are helped.

total halfregs in shared programs: 462536 -> 462556 (<.01%)
halfregs in affected programs: 131 -> 151 (15.27%)
helped: 1
HURT: 4
helped stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
helped stats (rel) min: 28.57% max: 28.57% x̄: 28.57% x̃: 28.57%
HURT stats (abs)   min: 4.0 max: 8.0 x̄: 5.50 x̃: 5
HURT stats (rel)   min: 12.77% max: 36.36% x̄: 25.01% x̃: 25.45%
95% mean confidence interval for halfregs value: -0.65 8.65
95% mean confidence interval for halfregs %-change: -18.64% 47.23%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:05:39 -04:00
Alyssa Rosenzweig
1185ac931f agx: Remove bogus assert
I->mask isn't even valid for iter instructions.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:00:59 -04:00
Alyssa Rosenzweig
fc88876329 agx: Handle linear 2D array textureSize()
We handle linear 2D arrays internally for blit shaders, so we need textureSize
to work for these. That requires some special casing, because there's a line
stride where the layer count would otherwise be. But it's not too bad.

Fixes
dEQP-GLES3.functional.shaders.texture_functions.texturesize.sampler2darray_*
when forcing linear textures.

Since we clamp array access to the maximum layer, we need textureSize() to work
for even the most basic array texturing. So this should fix blits from linear 2D
arrays as well, which finally unlocks support for compressed arrays/cubes/3D
textures.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:00:59 -04:00
Alyssa Rosenzweig
21d7049925 agx/lower_zs_emit: Fix progress returning
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:00:56 -04:00
Alyssa Rosenzweig
c8e331bf72 agx: Fix abs/neg propagation into fcmpsel
The first two sources are floats, the latter two sources and destination (and
hence the opcode) are not. Reflect that when packing and optimizing. Noticed
while debugging a silly dEQP test.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:00:56 -04:00
Alyssa Rosenzweig
632014ece0 agx: Handle splits of uniforms
This is straightforward, and can happen with certain u2u16 patterns.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:00:56 -04:00
Alyssa Rosenzweig
e9b471d1b3 asahi: Fix disk cache disable with AGX_MESA_DEBUG
We go to initialize the disk cache before we've compiled any shaders so
agx_compiler_debug is 0 at this point. Don't try to read it, instead go through
sa safe getter that will do the right thing.

Fixes: 5e9538c12e ("agx: isolate compiler debug flags")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22891>
2023-05-07 09:00:40 -04:00
Asahi Lina
00064ba4e3 asahi: Fix style nits
Found with a grep abomination which is probably too broken/silly to
actually implement in CI... but hey, at least it found some.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
2023-04-07 03:23:04 +00:00