Daniel Schürmann
eb8ec12b23
aco/ra: Fix potential out-of-bounds array accesses.
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12748>
2021-09-10 19:39:18 +00:00
Timur Kristóf
536580b139
aco: Add some useful info to the README for debugging.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12748>
2021-09-10 19:39:18 +00:00
Marek Olšák
69e96cfc0d
ac,radv: remove unused inputs array and VS input code
...
The previous commit stopped using "inputs".
"load_layer_id" has always been broken and it was probably unused anyway.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12570>
2021-09-07 17:51:41 +00:00
Marek Olšák
3fb229e010
ac,radeonsi: load VS inputs at the call site of nir_intrinsic_load_input
...
to match ACO
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12570>
2021-09-07 17:51:41 +00:00
Marek Olšák
bce7c7f3fc
ac/llvm: implement nir_intrinsic_elect
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12570>
2021-09-07 17:51:41 +00:00
Marek Olšák
e0f07483d0
ac/llvm: implement nir_intrinsic_overwrite_*_arguments_amd
...
This should work if the intrinsics are not called from conditional blocks.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12570>
2021-09-07 17:51:41 +00:00
Marek Olšák
1e178f7a37
ac: make ac_shader_abi::inputs an array instead of a pointer
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12570>
2021-09-07 17:51:41 +00:00
Marek Olšák
6df5f268db
ac: remove needless parameters from ac_shader_abi::emit_outputs
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12570>
2021-09-07 17:51:41 +00:00
Marek Olšák
2e95ad1433
ac/llvm: implement a bunch of NIR AMD intrinsics for NGG
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12570>
2021-09-07 17:51:41 +00:00
Marek Olšák
a33602b1f9
ac/llvm: remove load_tess_coord callback
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12570>
2021-09-07 17:51:41 +00:00
Rhys Perry
c1e668d5d1
aco/ra: don't use ds_write_b8_d16_hi/ds_write_b16_d16_hi on GFX8
...
GFX8 doesn't support these opcodes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c75138ed64 ("aco/ra: refactor subdword definition info")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12721>
2021-09-06 15:10:26 +00:00
Timur Kristóf
268158a758
aco/optimize_postRA: Use iterators instead of operator[] of std::array.
...
Also add a few more assertions to make sure the registers are
within the bounds of the array.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12682>
2021-09-03 15:00:55 +00:00
Timur Kristóf
bb956464cb
aco: Skip code paths to emit copies when there are no copies.
...
Found while running with libstdc++ debug mode.
Fixes the following:
Error: attempt to advance a dereferenceable (start-of-sequence)
iterator -1 steps, which falls outside its valid range.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12682>
2021-09-03 15:00:55 +00:00
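A minimal sketch of the kind of guard the commit above describes, assuming the out-of-range step came from computing an insertion point one element before the end of an empty sequence. The names `InstrList` and `insert_copies_before_last` are hypothetical and do not appear in ACO; the real code path may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a block's instruction list; not an ACO type.
using InstrList = std::vector<int>;

// Inserts `copies` in front of the last element of `block` (the last element
// stands in for a block-ending branch). Returns how many elements were added.
inline std::size_t insert_copies_before_last(InstrList& block, const InstrList& copies)
{
   // Early-out: with nothing to insert, no insertion point is computed, so an
   // empty block can never reach std::prev(block.begin()).
   if (copies.empty())
      return 0;

   assert(!block.empty() && "a block that receives copies must not be empty");
   auto insert_pos = std::prev(block.end());
   block.insert(insert_pos, copies.begin(), copies.end());
   return copies.size();
}
```

Building with `-D_GLIBCXX_DEBUG` makes libstdc++ check iterator arithmetic of this kind at run time, which is how this class of bug is reported.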
Timur Kristóf
728ed892df
aco: Use Builder reference in emit_copies_block.
...
Found while running with libstdc++ debug mode.
Fixes the following:
Error: attempt to copy-construct an iterator from a singular iterator.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12682>
2021-09-03 15:00:55 +00:00
Rhys Perry
522f135d06
radv: expose VK_KHR_shader_integer_dot_product
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:28 +00:00
Rhys Perry
8037b21573
aco/ra: allow v1b operands with 16-bit instructions
...
Instruction selection can create these.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: ec1bbfa608 ("aco/ra: refactor subdword operand stride")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:28 +00:00
Rhys Perry
2a7fa132be
aco: implement udot_4x8/sdot_4x8/udot_2x16/sdot_2x16 opcodes
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:28 +00:00
Rhys Perry
e0d232c2fc
aco: implement nir_op_pack_32_4x8
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:28 +00:00
Rhys Perry
4dd420f76d
radv,aco: implement iadd_sat
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:28 +00:00
Rhys Perry
44be450dc1
radv: refactor handling of nir_options
...
Make it easier to change them depending on chip_class and family.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:28 +00:00
Rhys Perry
859790ba54
ac/llvm: implement udot_4x8/sdot_4x8/udot_2x16/sdot_2x16 opcodes
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:27 +00:00
Rhys Perry
d6619d0a01
ac/llvm,radv: implement uadd_sat/iadd_sat
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:27 +00:00
Rhys Perry
f7cdd49a09
ac/llvm: implement nir_op_pack_32_4x8
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:27 +00:00
Rhys Perry
40a0935899
ac/gpu_info: add has_accelerated_dot_product
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:27 +00:00
Rhys Perry
54f83d718a
aco/spill: add temporary operands of exec phis to next_use_distances_end
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: dfb10e4f4b ("aco/spill: don't count phis as variable access")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12702>
2021-09-03 14:01:27 +01:00
Rhys Perry
f241bd3749
aco: don't coalesce constant copies into non-power-of-two sizes
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12702>
2021-09-03 14:01:27 +01:00
Samuel Pitoiset
ad878856e6
radv/llvm: rework VS input loads and implement the callback
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12693>
2021-09-03 08:14:51 +00:00
Samuel Pitoiset
1402c17e4f
radv: optimize VRS when no depth stencil attachment is bound
...
This is allowed by the Vulkan spec and we have to handle this situation
internally. We used to create and bind a 4096x4096 image to copy the
VRS rates but this wasted too much VRAM (~33MiB). Now, the driver only
allocates a HTILE buffer (~1MiB) and binds it to the framebuffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12243>
2021-09-02 19:39:04 +00:00
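The two sizes quoted in the commit above can be reproduced with a short calculation. This is a sketch under assumptions the commit message does not state: a 16-bit depth format (2 bytes per pixel), one 32-bit HTILE word per 8x8 pixel tile, and no extra alignment or padding. The function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kMiB = 1024ull * 1024ull;

// Assumption: tightly packed surface with no alignment padding.
constexpr uint64_t depth_image_bytes(uint64_t width, uint64_t height, uint64_t bytes_per_pixel)
{
   return width * height * bytes_per_pixel;
}

// Assumption: one 4-byte HTILE word covers an 8x8 block of pixels.
constexpr uint64_t htile_bytes(uint64_t width, uint64_t height)
{
   return ((width + 7) / 8) * ((height + 7) / 8) * 4;
}

// HTILE alone for 4096x4096: 512 * 512 * 4 bytes = 1 MiB.
static_assert(htile_bytes(4096, 4096) == 1 * kMiB, "HTILE-only allocation");

// Image plus HTILE under the 2-bytes-per-pixel assumption: 32 MiB + 1 MiB.
static_assert(depth_image_bytes(4096, 4096, 2) + htile_bytes(4096, 4096) == 33 * kMiB,
              "image-backed allocation");
```

Under these assumptions the saving is the 32 MiB of pixel data that no longer has to be allocated.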
Samuel Pitoiset
ab635b024b
radv: pass the HTILE buffer to radv_copy_vrs_htile()
...
This will allow using a global HTILE buffer without an image.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12243>
2021-09-02 19:39:04 +00:00
Samuel Pitoiset
ad60354a92
radv: optimize copying VRS rates to the global HTILE buffer
...
By skipping the unnecessary read operation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12243>
2021-09-02 19:39:04 +00:00
Samuel Pitoiset
0fd40af59f
radv: allow to conditionally read HTILE value when copying VRS rates
...
When a subpass is bound without a VRS attachment, the driver has to
create one internally and the copy can be a write only operation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12243>
2021-09-02 19:39:04 +00:00
Daniel Schürmann
8bd7e2392b
aco: preserve subdword RC when lowering p_insert/p_extract
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
73481338fe
aco/print_ir: always print SDWA dst & src selections
...
This way, it becomes more apparent how SDWA behaves.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
0988f7b9ba
aco: remove explicit dst_preserve flag
...
Instead, we can rely on the fact that subdword definitions
must preserve the unused bits while dword definitions either
pad or sign-extend.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
9e3ff06c38
aco: rewrite SDWA selector
...
This commit introduces a new struct SubdwordSel
in order to ease and clean up the usage of SDWA
selections. This includes removing the distinction
between register-allocated and fixed SDWA selections.
Instead, SDWA selections can now also access the high
bits of subdword variables. Alignment and sizes are
validated accordingly. Size, offset and sign_extend
can be evaluated via helper methods.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
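To illustrate what a sub-dword selection with size, offset and sign_extend helpers might look like, here is a hypothetical sketch. It is not ACO's SubdwordSel: the struct name, the field layout, `valid()` and `extract()` are all assumptions made for this example, and the alignment rule (a selection must be aligned to its own size) is likewise assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration only; not the definition used in ACO.
struct SubdwordSelSketch {
   uint8_t size_bytes;   // 1 (byte), 2 (word) or 4 (dword)
   uint8_t offset_bytes; // byte offset of the selection inside the dword
   bool sext;            // sign-extend the selected bits to 32 bits

   unsigned size() const { return size_bytes; }
   unsigned offset() const { return offset_bytes; }
   bool sign_extend() const { return sext; }

   // Assumed validation: legal size, aligned to that size, inside the dword.
   bool valid() const
   {
      return (size_bytes == 1 || size_bytes == 2 || size_bytes == 4) &&
             offset_bytes % size_bytes == 0 && offset_bytes + size_bytes <= 4;
   }

   // Applies the selection to a 32-bit value.
   uint32_t extract(uint32_t dword) const
   {
      assert(valid());
      if (size_bytes == 4)
         return dword;
      const unsigned bits = size_bytes * 8u;
      const uint32_t mask = (1u << bits) - 1u;
      uint32_t value = (dword >> (offset_bytes * 8u)) & mask;
      if (sext && (value >> (bits - 1u)) != 0)
         value |= ~mask; // replicate the sign bit into the upper bits
      return value;
   }
};
```

For example, a selection of size 1 at offset 2 picks the third byte of a dword, and a sign-extending selection of size 2 at offset 2 reads the high word as a signed 16-bit value.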
Daniel Schürmann
cc4682ed47
aco: fix p_insert lowering with 16bit sources
...
The previous lowering only wrote a single byte.
Fixes: 2f94353735 ('aco: add p_extract/p_insert')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Samuel Pitoiset
607a14b870
radv: remove NGG streamout support in LLVM
...
It has never really been used due to various issues with GDS in the
past and it will be lowered in NIR at some point.
The driver support is still there because it can likely be re-used.
This implementation can also be used as a reference point.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12695>
2021-09-02 17:58:51 +02:00
Samuel Pitoiset
b31994cf67
radv: fix determining the maximum number of waves that can use scratch
...
This estimation was incorrect: the number of waves doesn't depend
only on the number of VGPRs.
{SPI,COMPUTE}_TMPRING_SIZE.WAVES should already limit the number of
scratch waves in flight, but it is unclear whether that limit really works.
This fixes a GPU hang with an upcoming game, and it might also
help resolve some spurious random GPU hangs.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12700>
2021-09-02 15:09:07 +00:00
Daniel Schürmann
ef7d840a70
aco/lower_phis: optimize loop exit phis
...
This optimization works by ensuring that disabled lanes
are zeroed before any merge sequence.
Totals from 6075 (4.05% of 150170) affected shaders: (GFX10.3)
CodeSize: 57913908 -> 57913212 (-0.00%)
Instrs: 11055852 -> 11055678 (-0.00%)
Latency: 438705219 -> 438534652 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 125284101 -> 125251397 (-0.03%); split: -0.03%, +0.00%
Copies: 807388 -> 821035 (+1.69%); split: -0.00%, +1.69%
Branches: 391827 -> 391782 (-0.01%)
PreSGPRs: 574841 -> 574838 (-0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11659>
2021-09-02 16:41:52 +02:00
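The percentages in the statistics above are consistent with a plain relative change, (after - before) / before, rounded to two decimals. The helper below is a sketch for checking such lines by hand; `percent_change` is a name chosen for this example and is not part of any Mesa or fossil-db tooling.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Relative change from `before` to `after` in percent, rounded to two
// decimal places as in the report lines.
inline double percent_change(uint64_t before, uint64_t after)
{
   assert(before != 0);
   const double delta = static_cast<double>(after) - static_cast<double>(before);
   return std::round(delta / static_cast<double>(before) * 10000.0) / 100.0;
}
```

With the Copies line, 807388 -> 821035 is an increase of 13647, and 13647 / 807388 rounds to +1.69%. With the Latency line, 438705219 -> 438534652 is a decrease of 170567, which rounds to -0.04%.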
Daniel Schürmann
207249d2b2
aco/lower_phis: propagate constants before emitting merge code
...
This generalizes a previous optimization.
Totals from 521 (0.35% of 150170) affected shaders: (GFX10.3)
CodeSize: 1680348 -> 1678884 (-0.09%)
Instrs: 307994 -> 307628 (-0.12%)
Latency: 5799845 -> 5792655 (-0.12%)
InvThroughput: 994859 -> 994030 (-0.08%)
Copies: 18992 -> 18767 (-1.18%)
Branches: 10143 -> 10037 (-1.05%)
PreSGPRs: 21904 -> 21853 (-0.23%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11659>
2021-09-02 16:41:52 +02:00
Daniel Schürmann
aed7b7d185
aco/lower_bool_phis: avoid creating trivial phis
...
For this purpose, get_ssa() is also refactored slightly.
Totals from 4 (0.00% of 150170) affected shaders: (GFX10.3)
CodeSize: 15504 -> 15376 (-0.83%)
Instrs: 2942 -> 2910 (-1.09%)
Latency: 292444 -> 291642 (-0.27%)
InvThroughput: 30842 -> 30770 (-0.23%)
Copies: 164 -> 150 (-8.54%)
Branches: 96 -> 82 (-14.58%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11659>
2021-09-02 16:41:52 +02:00
Daniel Schürmann
6d8ea54654
aco: refactor lower_phis()
...
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11659>
2021-09-02 16:41:52 +02:00
Daniel Schürmann
e4c5062fb7
aco: fix init_any_pred_defined() for loop header phis
...
This includes setting the correct end point of the propagation and
not propagating the incoming values after the loop header.
This patch also changes the propagation to a single iteration for
loop exit phis.
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
aco: don't propagate incoming value in init_any_pred_defined()
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11659>
2021-09-02 16:41:52 +02:00
Samuel Pitoiset
da9f1a7340
radv: use common vkGet{Buffer,Image}MemoryRequirements()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12416>
2021-09-02 10:56:39 +00:00
Samuel Pitoiset
48cae114c2
radv: use common vkBind{Buffer,Image}Memory()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12416>
2021-09-02 10:56:39 +00:00
Samuel Pitoiset
757545b90e
radv: use common vkGetDeviceQueue()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12416>
2021-09-02 10:56:39 +00:00
Samuel Pitoiset
9fc16b66d0
radv: use common vkGetPhysicalDevice{Image}FormatProperties()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12416>
2021-09-02 10:56:39 +00:00
Samuel Pitoiset
8a9c17bf4e
radv: use common entrypoints for sparse image requirements/properties
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12416>
2021-09-02 10:56:39 +00:00
Samuel Pitoiset
e62d3db64b
radv: do not disable DCC for storage images if atomics aren't enabled
...
VK_FORMAT_R32_SFLOAT seems pretty common and it seems we can be a
little smarter when shader image 32-bit float atomics aren't enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12406>
2021-09-02 10:40:43 +02:00
Samuel Pitoiset
a17756c865
radv: track if shader image 32-bit float atomics are enabled
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12406>
2021-09-02 10:40:42 +02:00