Commit graph

197539 commits

Author SHA1 Message Date
Zan Dobersek
fae4a23ab1 fd/pps: specify counter group for each countable
For each countable that's being set up, the specific counter group is now
also required. This way on a7xx it will be possible to differentiate
between countables that have the same name but can be used through counter
groups for rendering bin or for visibility bin (e.g. CP and BV_CP).

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Acked-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29677>
2024-11-11 13:39:39 +00:00
Danylo Piliaiev
21359417ba ir3/parser: Print the line where parsing error occurred
Super useful with rddecompiler, otherwise it's impossible to
determine the instruction which is failed to be parsed.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31954>
2024-11-11 11:38:17 +00:00
Samuel Pitoiset
30d9166d80 radv: dump the trap handler shader with RADV_DEBUG=dump_trap_handler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32031>
2024-11-11 09:34:05 +00:00
Samuel Pitoiset
4d50691ae9 radv: remove unused parameter to radv_fill_nir_compiler_options()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32031>
2024-11-11 09:34:05 +00:00
Samuel Pitoiset
fb5a3cca7a docs: add missing documentation for RADV_DEBUG=psocachestats
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32077>
2024-11-11 09:26:28 +00:00
Konstantin Seurer
e3cf6290e0 radv: Add RADV_DEBUG=nirdebuginfo
Annotates the shader with source locations into the nir shader.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:14 +00:00
Konstantin Seurer
cf447c5da1 nir: Do not gather source locations for phis
Phi instructions are expected to be the first instructions in a block.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:14 +00:00
Konstantin Seurer
f2c204daf0 nir: Add a first_line parameter to gather_debug_info
Useful when the file contains multiple shaders.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:14 +00:00
Konstantin Seurer
736c8c6f23 radv: Dump nir shaders before compiling
It will allow adding source locations that point to the nir_string to
the shader.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:14 +00:00
Konstantin Seurer
aaf65d6219 radv: Store debug info inside radv_shader
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:14 +00:00
Konstantin Seurer
54c22656b8 radv: Add a helper for accessing the shader binary
Use pointers into the blob instead of hardcoding the layout everywhere.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:13 +00:00
Konstantin Seurer
69ebba82d4 aco: Pass debug information to the driver
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:13 +00:00
Konstantin Seurer
f8ef1afec8 aco: Handle nir_debug_info_instr
Propagated debug info using p_debug_info and Program::debug_info.
Offsets into the shader binary are gathered during assembly.
This will be usefull for mapping back the disassembled shader to
nir, glsl or spirv.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:13 +00:00
Konstantin Seurer
7dd9840128 amd: Add ac_shader_debug_info
This is very similar to nir_debug_info_instr but it can exist outside of
a nir shader.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:13 +00:00
Konstantin
4d09cd7fa5 nir/lower_non_uniform_access: Group accesses using the same resource
Avoids emitting the waterfall loop for every access if they use the same
resource:

waterfall_loop {
   access
}
waterfall_loop {
   access
}

->

waterfall_loop {
   access
   access
}

Totals from 276 (0.33% of 84770) affected shaders:
MaxWaves: 3360 -> 3356 (-0.12%)
Instrs: 3759927 -> 3730650 (-0.78%)
CodeSize: 21125784 -> 20899580 (-1.07%)
VGPRs: 23096 -> 23104 (+0.03%)
Latency: 35593716 -> 35315455 (-0.78%); split: -0.78%, +0.00%
InvThroughput: 7353071 -> 7297309 (-0.76%); split: -0.76%, +0.00%
VClause: 120983 -> 118579 (-1.99%)
SClause: 113073 -> 110671 (-2.12%)
Copies: 358272 -> 348686 (-2.68%)
Branches: 166706 -> 159500 (-4.32%)
PreSGPRs: 18598 -> 18596 (-0.01%)
PreVGPRs: 21417 -> 21424 (+0.03%); split: -0.01%, +0.04%
VALU: 2354862 -> 2350053 (-0.20%)
SALU: 582291 -> 567638 (-2.52%)
SMEM: 139875 -> 137473 (-1.72%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>
2024-11-11 07:53:13 +00:00
Konstantin Seurer
c5e40a60f8 radv: Lower non-uniform access after vectorization
Scalar access can make nir_lower_non_uniform_access emit a lot of
waterfall loops.

Totals from 83 (0.10% of 84770) affected shaders:
Instrs: 2747926 -> 2745959 (-0.07%); split: -0.07%, +0.00%
CodeSize: 15022460 -> 14998240 (-0.16%); split: -0.16%, +0.00%
Latency: 18602932 -> 18404976 (-1.06%); split: -1.18%, +0.12%
InvThroughput: 4500730 -> 4450364 (-1.12%); split: -1.18%, +0.06%
VClause: 93651 -> 91848 (-1.93%); split: -1.93%, +0.00%
SClause: 63672 -> 63595 (-0.12%); split: -0.13%, +0.00%
Copies: 229377 -> 229896 (+0.23%); split: -0.04%, +0.27%
Branches: 107630 -> 107627 (-0.00%); split: -0.01%, +0.00%
PreSGPRs: 5247 -> 5253 (+0.11%)
PreVGPRs: 5911 -> 5903 (-0.14%); split: -0.29%, +0.15%
VALU: 1761158 -> 1761540 (+0.02%); split: -0.01%, +0.03%
SALU: 419743 -> 419783 (+0.01%); split: -0.01%, +0.02%
VMEM: 152142 -> 150208 (-1.27%)
SMEM: 80251 -> 80244 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>
2024-11-11 07:53:13 +00:00
Konstantin Seurer
d44f74896e nir: Add missing access flags to print_access
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>
2024-11-11 07:53:13 +00:00
Konstantin Seurer
01ca436263 util: Fix some brackets in util_dynarray_.*_ptr
Fixes a compiler error when directly accessing members of the returned
pointer.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>
2024-11-11 07:53:13 +00:00
Visan, Tiberiu
d379a3a428 amd/vpelib: remove luma offset (#459)
\[WHY\]
Shader and VPE does not apply brightness adjs in the same manner

\[HOW\]
Removed luma offset added in VPE

\[TESTING\]
Tested on real time video rendering

Co-authored-by: Tiberiu Visan <tvisan@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Reviewed-by: Navid Assadian <Navid.Assadian@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075>
2024-11-11 13:00:54 +08:00
Visan, Tiberiu
2172ab2c2a amd/vpelib: patch to match shader (#456)
\[WHY\]
Shader and VPE had different behavior while adjusting the brightness

\[HOW\]
Apply the same normalization factor

\[TESTING\]
Tested on real video outputs

Co-authored-by: Tiberiu Visan <tvisan@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075>
2024-11-11 13:00:44 +08:00
Leder, Brendan Steve
891c4694ba amd/vpelib: Refactor OCSC and update missing check
Missing check for 601 in limited format check, updated that.
Refactored OCSC to use specific limited depths.
Cleaned up general color processing.

Co-authored-by: Brendan <breleder@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075>
2024-11-11 13:00:29 +08:00
Martin Roukala (né Peres)
dc1fe83aa5 zink/ci: document new-ish vangogh flakes
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32071>
2024-11-10 07:21:41 +02:00
Marek Olšák
1299f5c50a gallium/radeon: import libdrm_radeon source code, drop the dependency
Only radeon_surface.h/c is used from libdrm and radeon_drm.h is imported
too. This code doesn't change anymore. We don't need the dependency.

Acked-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31827>
2024-11-10 00:52:18 +00:00
Russell Greene
ae9d365686 perfetto: fix macos compile
On macos, <sys/types.h> does not declare clockid_t,
but it's instead in <time.h>, which also includes
<sys/types.h> on Linux, so just include <time.h> on
all UNIX platforms.

Fixes: a871eabc
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12064
Tested-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31881>
2024-11-09 09:23:22 +00:00
Deborah Brouwer
276447ef81 ci/b2c: update RESULTS_DIR for .b2c-test jobs
Since $RESULTS_DIR is now centrally defined in setup-test-env.sh it's no
longer necessary to manually add a hard-coded results directory for the
b2b-test job results.

This keeps the results directory consistent between b2c-test jobs and lava.

Fixes: 9b6d14aed1 ("ci: Always create results dir from init")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32051>
2024-11-09 08:40:48 +00:00
Deborah Brouwer
b5b2515f86 ci: Remove duplicate slash before $RESULTS_DIR
The RESULTS_DIR variable is defined by reference to the present
working directory, but if the pwd is the root directory then the
$RESULTS_DIR begins with two slashes instead of one like this: //results.

This is harmless but not necessary, so remove it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32051>
2024-11-09 08:40:48 +00:00
Deborah Brouwer
e368623fff freedreno/ci: add prefix for a630-vk-asan tests
Currently a630-vk-asan has separate files for its expected failures and
skips, but by using the deqp-runner prefix option, the job can use the
common a630 expectation files. This simplifies `a630-vk-asan` without any
substantive changes to the ci job.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31970>
2024-11-09 08:15:36 +00:00
Alyssa Rosenzweig
0a81434adf agx: rewrite address mode lowering
AGX load/stores supports a single family of addressing modes:

   64-bit base + sign/zero-extend(32-bit) << (format shift + optional shift)

This is a base-index addressing mode, where the index is minimally in elements
(never bytes, unless we're doing 8-bit load/stores). Both base and the resulting
address must be aligned to the format size; the mandatory shift means that
alignment of base is equivalent to alignment of the final address, which is
taken care of by lower_mem_access_bit_size anyhow.

The other key thing to note is that this is a 64-bit shift, after the sign- or
zero-extension of the 32-bit index. That means that AGX does NOT implement

   64-bit base + sign/zero-extend(32-bit << shift)

This has sweeping implications.

For addressing math from C-based languages (including OpenCL C), the AGX mode is
more helpful, since we tend to get 64-bit shifts instead of 32-bit shifts.
However, for addressing math coming from GLSL, the AGX mode is rather annoying
since we know UBOs/SSBOs are at most 4GB so nir_lower_io & friends are all
32-bit byte indexing. It's tricky to teach them to do otherwise, and would not
be optimal either since 64-bit adds&shifts are *usually* much more expensive
than 32-bit on AGX *except* for when fused into the load/store.

So we don't want 32-bit NIR, since then we can't use the hardware addressing
mode at all. We also don't want 64-bit NIR, since then we have excessive 64-bit
math resulting from deep deref chains from complex struct/array cases. Instead,
we want a middle ground: 32-bit operations that are guaranteed not to overflow
32-bit and can therefore be losslessly promoted to 64-bit.

We can make that no-overflow guarantee as a consequence of the maximum UBO/SSBO
size, and indeed Mesa relies on this already all over the place. So, in this
series, we use relaxed amul opcodes for addressing
math. Then, we rewrite our address mode pattern matching to fuse AGX address
modes.

The actual pattern matching is rewritten. The old code was brittle handwritten
nir_scalar chasing, based on a faulty model of the hardware (with the 32-bit
shift). We delete it all, it's broken. In the new approach, we add some NIR
pseudo-opcodes for address math (ulea_agx/ilea_agx) which we pattern match with
NIR algebraic rules. Then the chasing required to fuse LEA's into load/stores is
trivial because we never go deeper than 1 level. After fusing, we then lower the
leftover lea/amul opcodes and let regular nir_opt_algebraic take it from
here.

We do need to be very careful around pass order to make sure things like
load/store vectorization still happen. Some passes are shuffled in this commit
to make this work. We also need to cleanup amul before fusing since we
specifically do not have nir_opt_algebraic do so - the entire point of the
pseudo-opcodes is to make nir_opt_algebraic ignore the opcodes until we've had a
chance to fuse. If we simply used the .nuw bit on iadd/imul, nir_opt_algebraic
would "optimize" things and lose the bit and then we would fail to fuse
addressing modes, which is a much more expensive failure case than anything
nir_opt_algebraic can do for us. I don't know what the "optimal" pass order for
AGX would look like at this point, but what we have here is good enough for now
and is a net positive for shader-db.

That all ends up being much less code and much simpler code, while fixing the
soundness holes in the old code, and also optimizing a significantly richer set
of addressing calculations. Now we don't juts optimize GL/VK modes, but also CL.
This is crucial even for GL/VK performance, since we rely on CL via libagx even
in graphics shaders.

Terraintessellation is up 10% to ~310fps, which is quite nice.

The following stats are for the end of the series together, including this
change + libagx change + the NIR changes building up to this... but not
including the SSBO vectorizer stats or the IC modelling fix. In other words,
these are the stats for "rewriting address mode handling". This is on OpenGL,
and since the old code was targeted at GL, anything that's not a loss is good
enough - we need this for the soundness fix regardless.

total instructions in shared programs: 2751356 -> 2750518 (-0.03%)
instructions in affected programs: 372143 -> 371305 (-0.23%)
helped: 715
HURT: 75
Instructions are helped.

total alu in shared programs: 2279559 -> 2278721 (-0.04%)
alu in affected programs: 304170 -> 303332 (-0.28%)
helped: 715
HURT: 75
Alu are helped.

total fscib in shared programs: 2277843 -> 2277008 (-0.04%)
fscib in affected programs: 304167 -> 303332 (-0.27%)
helped: 715
HURT: 75
Fscib are helped.

total ic in shared programs: 632686 -> 621886 (-1.71%)
ic in affected programs: 113078 -> 102278 (-9.55%)
helped: 1159
HURT: 82
Ic are helped.

total bytes in shared programs: 21489034 -> 21477530 (-0.05%)
bytes in affected programs: 3018456 -> 3006952 (-0.38%)
helped: 751
HURT: 107
Bytes are helped.

total regs in shared programs: 865148 -> 865114 (<.01%)
regs in affected programs: 1603 -> 1569 (-2.12%)
helped: 10
HURT: 9
Inconclusive result (value mean confidence interval includes 0).

total uniforms in shared programs: 2120735 -> 2120792 (<.01%)
uniforms in affected programs: 22752 -> 22809 (0.25%)
helped: 76
HURT: 49
Inconclusive result (value mean confidence interval includes 0).

total threads in shared programs: 27613312 -> 27613504 (<.01%)
threads in affected programs: 1536 -> 1728 (12.50%)
helped: 3
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
d466ccc6bd libagx: promote math to use AGX address mode
we want to fit into the 64 + ext() << #n pattern to let us fuse address
arithmetic into our loads, so rework some libagx addressing to better match that

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
77ce91e99b hk: reduce max SSBO size
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
01d2aa1d53 agx: fix bfeil timing
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
db8d467ec6 agx: model IC dispatch
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
3c222da6c0 agx: vectorize SSBOs
this was missed due to the lowering, and mitigates a lot of stats weirdness with
the address mode rework.

total instructions in shared programs: 2755170 -> 2751399 (-0.14%)
instructions in affected programs: 16323 -> 12552 (-23.10%)
helped: 71
HURT: 0
helped stats (abs) min: 10 max: 178 x̄: 53.11 x̃: 42
helped stats (rel) min: 2.04% max: 50.00% x̄: 34.73% x̃: 40.79%
95% mean confidence interval for instructions value: -60.94 -45.28
95% mean confidence interval for instructions %-change: -37.81% -31.65%
Instructions are helped.

total alu in shared programs: 2169888 -> 2168281 (-0.07%)
alu in affected programs: 9547 -> 7940 (-16.83%)
helped: 71
HURT: 0
helped stats (abs) min: 5 max: 90 x̄: 22.63 x̃: 16
helped stats (rel) min: 1.02% max: 43.33% x̄: 25.39% x̃: 29.41%
95% mean confidence interval for alu value: -26.33 -18.93
95% mean confidence interval for alu %-change: -27.91% -22.87%
Alu are helped.

total fscib in shared programs: 2165597 -> 2163990 (-0.07%)
fscib in affected programs: 9547 -> 7940 (-16.83%)
helped: 71
HURT: 0
helped stats (abs) min: 5 max: 90 x̄: 22.63 x̃: 16
helped stats (rel) min: 1.02% max: 43.33% x̄: 25.39% x̃: 29.41%
95% mean confidence interval for fscib value: -26.33 -18.93
95% mean confidence interval for fscib %-change: -27.91% -22.87%
Fscib are helped.

total bytes in shared programs: 21517750 -> 21489352 (-0.13%)
bytes in affected programs: 126270 -> 97872 (-22.49%)
helped: 71
HURT: 0
helped stats (abs) min: 80 max: 1084 x̄: 399.97 x̃: 324
helped stats (rel) min: 1.77% max: 50.57% x̄: 35.07% x̃: 42.31%
95% mean confidence interval for bytes value: -455.66 -344.28
95% mean confidence interval for bytes %-change: -38.34% -31.79%
Bytes are helped.

total regs in shared programs: 864490 -> 865162 (0.08%)
regs in affected programs: 4567 -> 5239 (14.71%)
helped: 4
HURT: 61
helped stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6
helped stats (rel) min: 4.51% max: 5.13% x̄: 4.82% x̃: 4.82%
HURT stats (abs)   min: 2 max: 24 x̄: 11.41 x̃: 12
HURT stats (rel)   min: 1.98% max: 82.35% x̄: 21.05% x̃: 16.00%
95% mean confidence interval for regs value: 8.52 12.16
95% mean confidence interval for regs %-change: 14.91% 24.00%
Regs are HURT.

total threads in shared programs: 27613056 -> 27613312 (<.01%)
threads in affected programs: 3200 -> 3456 (8.00%)
helped: 4
HURT: 0
helped stats (abs) min: 64 max: 64 x̄: 64.00 x̃: 64
helped stats (rel) min: 7.69% max: 8.33% x̄: 8.01% x̃: 8.01%
95% mean confidence interval for threads value: 64.00 64.00
95% mean confidence interval for threads %-change: 7.42% 8.60%
Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
b593a6aa98 rusticl: respect late_lower_int64
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
5c73a8af44 nir/lower_uniforms_to_ubo: use amul
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
fc460e7f20 nir/opt_algebraic: don't lower amul if requested
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
1f3c97547a nir/builder: use amul over ishl on agx
ishl can wrap, amul cannot. so we need amul in the backend, or otherwise we
would need to introduce an ashl opcode instead. that doesn't seem better.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
9ab8d70fa6 nir: add ilea_agx/ulea_agx opcodes
to facilitate address mode lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
23afe968ad nir: add late_lower_int64 option
Some drivers generally need int64 lowered, but prefer to do this lowering
themselves late, to have a chance to optimize targeted int64 patterns before
lowering the rest. This isn't currently possible since nir_lower_int64 takes no
options except what's const* in the shader, and frontends call nir_lower_int64
before passing the shader off to the driver. Add an option to defer int64
lowering. This is a bit ugly but the alternative is replumbing nir_lower_int64's
option handling cross-tree and no-thank-you-not-right-now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
eaf75169ee nir: add amul flag
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
227026b7ad nir/opt_algebraic: add another 64-bit pattern
clpeak

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig
2a3f133fd0 nir/opt_algebraic: add more 64-bit patterns
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:41 -04:00
Alyssa Rosenzweig
a4a3487aae nir/opt_algebraic: optimize patterns from Skia
shaders/skia/1567.shader_test relies on algebraic + constant folding, subtle
changes in the input compiling flow can cause it to baloon. these patterns fix
that. annoying!

shader-db results aren't amazing, but they avert a major stats regression for
that one Skia shader.

total instructions in shared programs: 2751399 -> 2751295 (<.01%)
instructions in affected programs: 6509 -> 6405 (-1.60%)
helped: 21
HURT: 1
helped stats (abs) min: 1 max: 14 x̄: 5.62 x̃: 6
helped stats (rel) min: 0.53% max: 13.73% x̄: 3.57% x̃: 1.62%
HURT stats (abs)   min: 14 max: 14 x̄: 14.00 x̃: 14
HURT stats (rel)   min: 2.45% max: 2.45% x̄: 2.45% x̃: 2.45%
95% mean confidence interval for instructions value: -7.09 -2.36
95% mean confidence interval for instructions %-change: -5.14% -1.45%
Instructions are helped.

total alu in shared programs: 2274577 -> 2274468 (<.01%)
alu in affected programs: 6178 -> 6069 (-1.76%)
helped: 21
HURT: 1
helped stats (abs) min: 1 max: 14 x̄: 5.86 x̃: 7
helped stats (rel) min: 0.55% max: 16.47% x̄: 3.93% x̃: 1.72%
HURT stats (abs)   min: 14 max: 14 x̄: 14.00 x̃: 14
HURT stats (rel)   min: 2.83% max: 2.83% x̄: 2.83% x̃: 2.83%
95% mean confidence interval for alu value: -7.35 -2.56
95% mean confidence interval for alu %-change: -5.67% -1.57%
Alu are helped.

total fscib in shared programs: 2272894 -> 2272785 (<.01%)
fscib in affected programs: 6178 -> 6069 (-1.76%)
helped: 21
HURT: 1
helped stats (abs) min: 1 max: 14 x̄: 5.86 x̃: 7
helped stats (rel) min: 0.55% max: 16.47% x̄: 3.93% x̃: 1.72%
HURT stats (abs)   min: 14 max: 14 x̄: 14.00 x̃: 14
HURT stats (rel)   min: 2.83% max: 2.83% x̄: 2.83% x̃: 2.83%
95% mean confidence interval for fscib value: -7.35 -2.56
95% mean confidence interval for fscib %-change: -5.67% -1.57%
Fscib are helped.

total bytes in shared programs: 21489352 -> 21488668 (<.01%)
bytes in affected programs: 53362 -> 52678 (-1.28%)
helped: 21
HURT: 2
helped stats (abs) min: 6 max: 98 x̄: 35.52 x̃: 40
helped stats (rel) min: 0.39% max: 10.63% x̄: 2.27% x̃: 1.27%
HURT stats (abs)   min: 2 max: 60 x̄: 31.00 x̃: 31
HURT stats (rel)   min: 0.08% max: 1.40% x̄: 0.74% x̃: 0.74%
95% mean confidence interval for bytes value: -42.73 -16.74
95% mean confidence interval for bytes %-change: -3.13% -0.89%
Bytes are helped.

total regs in shared programs: 865162 -> 865148 (<.01%)
regs in affected programs: 509 -> 495 (-2.75%)
helped: 4
HURT: 5
helped stats (abs) min: 2 max: 14 x̄: 6.00 x̃: 4
helped stats (rel) min: 3.17% max: 35.90% x̄: 14.01% x̃: 8.48%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 3.17% max: 3.17% x̄: 3.17% x̃: 3.17%
95% mean confidence interval for regs value: -5.75 2.64
95% mean confidence interval for regs %-change: -14.31% 5.39%
Inconclusive result (value mean confidence interval includes 0).

total uniforms in shared programs: 2120731 -> 2120735 (<.01%)
uniforms in affected programs: 358 -> 362 (1.12%)
helped: 1
HURT: 2
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 2.94% x̄: 2.94% x̃: 2.94%
HURT stats (abs)   min: 2 max: 4 x̄: 3.00 x̃: 3
HURT stats (rel)   min: 1.05% max: 4.00% x̄: 2.53% x̃: 2.53%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>
2024-11-08 21:15:41 -04:00
Chia-I Wu
015f6a7aff panvk: ensure res table is restored after meta
Set res_table to 0 to ensure that the res table is re-emitted.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Fixes: 5067921349 ("panvk: Switch to vk_meta")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32044>
2024-11-09 00:48:47 +00:00
Jianxun Zhang
8906816f49 anv,hasvk,genxml: Rename genxml files using verx10
It could be confusing that a newer platform named with a smaller
number than a half-generation of an older platform like 'gfx20' and
'gfx75' in xml files.

Down the road, it can be a little worse once we pass something like
'gfx40' when there is already a gfx45.xml for the oldest platform.

Unify naming xml files with verx10 numbers to resolve the issue.

Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31943>
2024-11-09 00:04:47 +00:00
Eric Engestrom
7e0e433482 radv+zink/ci: add flakes seen recently
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32066>
2024-11-08 22:49:21 +00:00
Eric Engestrom
66df09ffda nvk+zink/ci: add flakes seen recently
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32066>
2024-11-08 22:49:21 +00:00
Eric Engestrom
f9593d9eb5 freedreno/ci: add flakes seen recently
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32066>
2024-11-08 22:49:21 +00:00
Eric Engestrom
4ab210f588 broadcom/ci: add flakes seen recently
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32066>
2024-11-08 22:49:21 +00:00
Eric Engestrom
8d2620569c ci: make error handling quieter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32054>
2024-11-08 20:56:46 +00:00