Commit graph

178009 commits

Author SHA1 Message Date
Jordan Justen
0495f952d4 intel/genxml: Add genxml_import.py script
This script can:
 * validate that genxml files do not duplicate imported items
 * add imports to genxml files and optimize the file by dropping
   duplicate items
 * reverse the import operation by flattening genxml files

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
2023-09-14 11:05:16 -07:00
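
Note on 0495f952d4: the "flatten" operation listed above reverses an import by copying the imported items back into the importing file. The following is only a minimal sketch of that idea, not the code in genxml_import.py. The function name flatten, the excluded parameter, and the assumption that every item is a direct child element identified by its tag and "name" attribute are all hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def flatten(child_root, parent_root, excluded=()):
    """Return a new root holding the parent's items plus the child's own.

    Hypothetical model: an item is a direct child element keyed by
    (tag, name). Items defined in the child override imported ones,
    and names listed in `excluded` are never imported.
    """
    own = {(e.tag, e.get("name")) for e in child_root}
    flat = ET.Element(child_root.tag, child_root.attrib)
    # Imported items come first, skipping overridden or excluded ones.
    for item in parent_root:
        key = (item.tag, item.get("name"))
        if key not in own and item.get("name") not in excluded:
            flat.append(copy.deepcopy(item))
    # The child's own items follow unchanged.
    for item in child_root:
        flat.append(copy.deepcopy(item))
    return flat
```

Copying the elements keeps both input trees untouched, so the same parent can be flattened into several importing files.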
Jordan Justen
6ad2f39bab intel/genxml: Add GenXml.flatten_xml() method
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
2023-09-14 11:05:16 -07:00
Jordan Justen
c0f7feb239 intel/genxml: Add GenXml.add_xml_imports method
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
2023-09-14 11:05:16 -07:00
Jordan Justen
9e5190ad1f intel/genxml: Drop assertion to allow for importing
For example, gen11.xml will import the HEVC_ARBITRATION_PRIORITY
struct from gen9.xml.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
2023-09-14 11:05:16 -07:00
Jordan Justen
614aa2e62b intel/genxml: Add GenXml.optimize_xml_import()
This function drops duplicated items from a genxml file when they are
equivalent to the same item imported from another genxml file.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
2023-09-14 11:05:16 -07:00
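
Note on 614aa2e62b: dropping items that are equivalent to the imported ones can be sketched as below. This is an illustration under stated assumptions, not the GenXml.optimize_xml_import() implementation. The helper names canonical and drop_imported_duplicates are hypothetical, and "equivalent" is taken here to mean identical after ignoring whitespace-only formatting.

```python
import copy
import xml.etree.ElementTree as ET


def canonical(elem):
    """Serialize an element with surrounding whitespace stripped.

    Two items that differ only in indentation serialize identically.
    Attribute order is still significant in this simplified form.
    """
    clone = copy.deepcopy(elem)
    for node in clone.iter():
        node.text = (node.text or "").strip()
        node.tail = ""
    return ET.tostring(clone)


def drop_imported_duplicates(child_root, parent_root):
    """Remove child items equal to the same-named parent item.

    Returns the (tag, name) keys of the items that were dropped.
    """
    parent_items = {(e.tag, e.get("name")): canonical(e) for e in parent_root}
    dropped = []
    for item in list(child_root):
        key = (item.tag, item.get("name"))
        if parent_items.get(key) == canonical(item):
            child_root.remove(item)
            dropped.append(key)
    return dropped
```

Items that share a name but differ in content are kept, since in that case the importing file is overriding the imported definition.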
Jordan Justen
1285337218 intel/genxml: Add all xml files as pack dependencies
Since the output can now depend on other imported xml files, we need
to add them all as dependencies to ensure that if any xml file is
changed, then all pack files are rebuilt.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
2023-09-14 11:05:16 -07:00
Jordan Justen
b076b4f99b intel/genxml: Add support for excluding items when importing
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
2023-09-14 11:05:15 -07:00
Jordan Justen
6cc21dc8b5 intel/genxml: Support importing from another genxml file
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
2023-09-14 11:05:15 -07:00
Daniel van Vugt
bb06db5a62 glx: Increment dpy->request before issuing an error that had no request
This ensures the sequence number is unique and recent enough for callers
of `glXQueryDrawable` using `XNextRequest` to selectively trap errors.
The same approach is already used in `glXCreateContextAttribsARB`.

Suggested-by: Sebastian Keller <skeller@gnome.org>
Related-to: https://gitlab.gnome.org/GNOME/mutter/-/issues/3007
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25173>
2023-09-14 16:33:29 +00:00
Konstantin Seurer
73fec95358 radv: Remove ray tracing shader module identifier skips
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25142>
2023-09-14 16:07:46 +00:00
Konstantin Seurer
28dcc5959d radv/rt: Handle stages without nir properly
Fixes: e039e3cd76 ('radv/rt: Store NIR shaders separately')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25142>
2023-09-14 16:07:46 +00:00
Konstantin Seurer
3fd9894e3a radv: Update navi21 llvm fails
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25010>
2023-09-14 15:39:39 +00:00
Konstantin Seurer
77bf1408f3 radv: Don't advertise features requiring PS epilogs with LLVM
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25010>
2023-09-14 15:39:39 +00:00
Konstantin Seurer
4c168635f8 ac/llvm: Use float types for float atomics
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25010>
2023-09-14 15:39:39 +00:00
Konstantin Seurer
60e7b1c69c ac/llvm: Use the correct return type for uadd_carry and usub_borrow
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25010>
2023-09-14 15:39:39 +00:00
Konstantin Seurer
3ae0562c23 ac/llvm: Fix typed loads with 16bit formats
For some reason, LLVM can't handle those. Emit a 32bit load and type
conversion instead.

Fixes: 22ca8c8 ("ac/llvm: Implement typed buffer load intrinsic.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25010>
2023-09-14 15:39:38 +00:00
Konstantin Seurer
0cada27826 radv/ci: Improve ray tracing skips
I didn't know they were regexes. This also excludes all "1048576" tests.
They build an acceleration structure with 1 primitive 1048576 times
which only warms up the Valve farm and doesn't accomplish anything else.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24720>
2023-09-14 15:12:44 +00:00
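
Note on 0cada27826: the difference between a literal test name and a regex skip entry can be illustrated as follows. This is a sketch only. The test names are made up, and treating every entry as a regular expression that must match the whole test name is an assumption about the CI runner, not something stated in the commit.

```python
import re


def is_skipped(test_name, skip_patterns):
    """Return True if any skip entry, read as a regex, matches the full name."""
    return any(re.fullmatch(p, test_name) for p in skip_patterns)


# A single pattern covers every test whose name contains "1048576".
patterns = [r".*1048576.*"]
print(is_skipped("suite.accel.build_1048576_1", patterns))
print(is_skipped("suite.accel.build_1024_1", patterns))

# A name pasted in literally can fail to match itself, because
# "[0]" is a character class rather than literal brackets.
literal = "suite.rt.case[0]"
print(is_skipped(literal, [literal]))
print(is_skipped(literal, [re.escape(literal)]))
```

Escaping with re.escape is one way to keep an entry literal when it contains regex metacharacters.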
Konstantin Seurer
97b1caf9f6 radv: Perform multiple sorts in parallel
This was the last part that didn't scale with multiple infos. Reducing
the number of barriers in this case improves DOOM Eternal performance by
50%. (Running with low resolution)

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24720>
2023-09-14 15:12:44 +00:00
Konstantin Seurer
44c47054bc radv/radix_sort: Vendor the radix sort dispatch code
This needs to be done so we can optimize it for occupancy when building
multiple acceleration structures in parallel. Changes to the original
code:

- Change // to /* */
- clang-format
- Replace vkCmd calls with calls to the driver entrypoints
- Add a lightweight info struct
- Use radv_fill_buffer directly

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24720>
2023-09-14 15:12:44 +00:00
Konstantin Seurer
1cacc64ea7 radv: Remove dead radix_sort_vk_get_memory_requirements call
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24720>
2023-09-14 15:12:43 +00:00
Ruijing Dong
fb0f51bc64 radeonsi/vcn: change max_poc to fixed value for hevc encoder.
problem: max_poc means the number of bits used in poc lsb
         in slice header, and it should not be related to GOP
         size. When a large GOP size is used, it could generate
         corrupted video, as the POC could not be correctly
         decoded.

solution: use fixed value of max_poc (16) for now.

Cc: mesa-stable
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25214>
2023-09-14 14:43:15 +00:00
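
Note on fb0f51bc64: the reason too few POC LSB bits corrupt decoding can be shown with a small sketch. It follows the general shape of the HEVC picture order count derivation (the decoder rebuilds the full POC from the signalled LSBs and the previous reference POC), simplified for illustration. The function names and the example values are assumptions, not encoder code.

```python
def encode_poc_lsb(poc, lsb_bits):
    """Only the low lsb_bits bits of the POC are written to the slice header."""
    return poc % (1 << lsb_bits)


def decode_poc(poc_lsb, prev_poc, lsb_bits):
    """Rebuild the full POC from its LSBs and the previous reference POC."""
    max_lsb = 1 << lsb_bits
    prev_lsb = prev_poc % max_lsb
    prev_msb = prev_poc - prev_lsb
    if poc_lsb < prev_lsb and prev_lsb - poc_lsb >= max_lsb // 2:
        msb = prev_msb + max_lsb  # LSBs wrapped around going forward
    elif poc_lsb > prev_lsb and poc_lsb - prev_lsb > max_lsb // 2:
        msb = prev_msb - max_lsb  # LSBs wrapped around going backward
    else:
        msb = prev_msb
    return msb + poc_lsb


# A jump of 20 from the previous reference picture, as a long GOP might need.
for bits in (4, 16):
    lsb = encode_poc_lsb(20, bits)
    print(bits, decode_poc(lsb, prev_poc=0, lsb_bits=bits))
```

With 4 bits the decoder reconstructs 4 instead of 20, because the jump exceeds what the LSB range can express. With 16 bits, the fixed value chosen in the commit, the same jump decodes correctly.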
Samuel Pitoiset
84390c5c98 ac/spm: initialize and set instance mapping for counters
This configures global, per-SE and per-SA counters with different
indexes. This is still unused because only the first instance is
used by RADV/RadeonSI, but this will be changed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>
2023-09-14 14:17:19 +00:00
Samuel Pitoiset
0864a7dfa9 ac/spm: rework how segment muxsel RAM are filled
This is closer to PAL and it will be easier to add GFX11 support
on top of it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>
2023-09-14 14:17:19 +00:00
Samuel Pitoiset
6ae64900e2 ac/spm: fix checking if the counter instance is valid
This should be compared against the number of global instances, and
there is also an off-by-one error.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>
2023-09-14 14:17:19 +00:00
Samuel Pitoiset
90d9406436 ac/perfcounter: compute the number of global instances of TCP,SQ,GL1C and GL2C
This will be used by SPM.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>
2023-09-14 14:17:19 +00:00
Samuel Pitoiset
60cb257d26 ac/perfcounter: set the number of instances of GL1C to 4
According to PAL there are 4 GL1C quadrants. This will also be used
by SPM.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>
2023-09-14 14:17:18 +00:00
Samuel Pitoiset
10dc97b20f ac/gpu_info: init num_cu_per_sh from the kernel
This will be used to configure the number of instances of TCP.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>
2023-09-14 14:17:18 +00:00
Samuel Pitoiset
9552716208 ac/spm: add SPM block definition for GFX10-GFX10.3
Instead of using magic values.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>
2023-09-14 14:17:18 +00:00
Samuel Pitoiset
b1ce30539b ac/spm: remove useless SPM block setting for GFX9 and older GPUs
SPM is only implemented for GFX10+ on RADV/RadeonSI. It is technically
possible on GFX9 but unused by RGP, so the setting isn't needed there.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>
2023-09-14 14:17:18 +00:00
Samuel Pitoiset
303184e4e5 radv,radeonsi: use AC_SPM_SEGMENT_TYPE_xxx instead of magic values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>
2023-09-14 14:17:18 +00:00
Samuel Pitoiset
db6e16a515 radv: enable the PKT3 CAM bit for some SPM register writes
PAL does that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>
2023-09-14 14:17:18 +00:00
David Rosca
d57241d290 radeonsi/vcn: Set H264/HEVC chroma sample location in bitstream
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25078>
2023-09-14 13:39:59 +00:00
David Rosca
8e76b8fb35 frontends/va: Parse chroma sample location in H264/HEVC SPS
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25078>
2023-09-14 13:39:59 +00:00
Samuel Pitoiset
aca2adc36c ac/spm: add SPM counters configuration for GFX11
All SQ counters changed to SQ_WGP and the L2 miss changed too.
Sourced from PAL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25175>
2023-09-14 12:30:52 +00:00
Samuel Pitoiset
42d67183e7 ac/perfcounter: add new SQ_WGP block for GFX11+
According to PAL, these SQ counters can be sampled at WGP granularity.
Some SPM counters captured for RGP are using this GPU block.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25175>
2023-09-14 12:30:52 +00:00
Samuel Pitoiset
31e6c05527 ac,radv,radeonsi: rework SPM counters configuration and share it
This should make it easier to add GFX11 support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25175>
2023-09-14 12:30:52 +00:00
Samuel Pitoiset
f88338f801 issue_templates/Bug Report: fix outdated URL for GFXReconstruct
The URL moved.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25218>
2023-09-14 10:22:39 +00:00
Daniel Schürmann
6eaf416f35 aco/insert_exec_mask: Simplify WQM handling (2/2)
by calculating WQM requirements on demand.

No fossil-db changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:23 +00:00
Daniel Schürmann
5f66723188 aco/insert_exec_mask: Simplify WQM handling (1/2)
by using p_end_wqm as indicator for when to end WQM mode.

Totals from 10049 (13.12% of 76572) affected shaders: (GFX11)

MaxWaves: 301126 -> 301136 (+0.00%)
Instrs: 7061909 -> 7049272 (-0.18%); split: -0.21%, +0.03%
CodeSize: 37720684 -> 37664244 (-0.15%); split: -0.18%, +0.03%
VGPRs: 357204 -> 357180 (-0.01%); split: -0.13%, +0.12%
Latency: 62757830 -> 62827080 (+0.11%); split: -0.06%, +0.17%
InvThroughput: 8589248 -> 8589963 (+0.01%); split: -0.02%, +0.02%
VClause: 132541 -> 132547 (+0.00%); split: -0.03%, +0.03%
SClause: 322916 -> 322964 (+0.01%); split: -0.04%, +0.05%
Copies: 546446 -> 547657 (+0.22%); split: -0.13%, +0.35%
Branches: 189527 -> 188293 (-0.65%)
PreSGPRs: 332792 -> 332529 (-0.08%); split: -0.08%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:23 +00:00
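
Note on 5f66723188: the percentages in the totals above can be reproduced from the raw numbers. The sketch below only recomputes the overall change per statistic and the share of affected shaders. The helper name percent_change is hypothetical, and the "split" columns are not derived here because they need per-shader data.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Values copied from the totals of commit 5f66723188.
print(round(percent_change(7061909, 7049272), 2))  # Instrs
print(round(percent_change(189527, 188293), 2))    # Branches
print(round(10049 / 76572 * 100.0, 2))             # affected shaders
```

A negative value means the statistic went down, which is an improvement for counts such as Instrs, CodeSize and Branches.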
Daniel Schürmann
45f6d38a76 aco: insert a single p_end_wqm after the last derivative calculation
This new instruction replaces p_wqm.

Totals from 28065 (36.65% of 76572) affected shaders: (GFX11)
MaxWaves: 823922 -> 823952 (+0.00%); split: +0.01%, -0.01%
Instrs: 22221375 -> 22180465 (-0.18%); split: -0.26%, +0.08%
CodeSize: 117310676 -> 117040684 (-0.23%); split: -0.30%, +0.07%
VGPRs: 1183476 -> 1186656 (+0.27%); split: -0.19%, +0.46%
SpillSGPRs: 2305 -> 2302 (-0.13%)
Latency: 176559310 -> 176427793 (-0.07%); split: -0.21%, +0.14%
InvThroughput: 26245204 -> 26195550 (-0.19%); split: -0.26%, +0.07%
VClause: 368058 -> 369460 (+0.38%); split: -0.21%, +0.59%
SClause: 857077 -> 842588 (-1.69%); split: -2.06%, +0.37%
Copies: 1245650 -> 1249434 (+0.30%); split: -0.33%, +0.63%
Branches: 394837 -> 396070 (+0.31%); split: -0.01%, +0.32%
PreSGPRs: 1019139 -> 1019567 (+0.04%); split: -0.02%, +0.06%
PreVGPRs: 925739 -> 931860 (+0.66%); split: -0.00%, +0.66%

Changes are due to scheduling and re-enabling cross-lane optimizations.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:23 +00:00
Daniel Schürmann
28904839da aco: don't insert a copy when emitting p_wqm
Totals from 351 (0.46% of 76572) affected shaders: (GFX11)

Instrs: 709202 -> 709600 (+0.06%); split: -0.02%, +0.08%
CodeSize: 3606364 -> 3608040 (+0.05%); split: -0.01%, +0.06%
Latency: 3589841 -> 3590756 (+0.03%); split: -0.01%, +0.03%
InvThroughput: 463303 -> 463324 (+0.00%)
SClause: 28147 -> 28201 (+0.19%); split: -0.02%, +0.22%
Copies: 43243 -> 43204 (-0.09%); split: -0.24%, +0.15%
PreSGPRs: 21028 -> 21042 (+0.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Daniel Schürmann
040142684c aco: make p_wqm a marker instruction without Operands/Definitions
Totals from 28277 (36.93% of 76572) affected shaders: (GFX11)

MaxWaves: 833930 -> 833898 (-0.00%); split: +0.01%, -0.01%
Instrs: 21366950 -> 21353346 (-0.06%); split: -0.11%, +0.05%
CodeSize: 112855368 -> 112610508 (-0.22%); split: -0.24%, +0.03%
VGPRs: 1157748 -> 1158540 (+0.07%); split: -0.10%, +0.17%
SpillSGPRs: 2465 -> 2463 (-0.08%); split: -0.16%, +0.08%
Latency: 168339886 -> 168383646 (+0.03%); split: -0.10%, +0.12%
InvThroughput: 25164895 -> 25158376 (-0.03%); split: -0.08%, +0.06%
VClause: 347660 -> 346256 (-0.40%); split: -0.55%, +0.15%
SClause: 794460 -> 799521 (+0.64%); split: -0.33%, +0.97%
Copies: 1151908 -> 1148370 (-0.31%); split: -0.54%, +0.23%
Branches: 359447 -> 359437 (-0.00%); split: -0.01%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Daniel Schürmann
1275981df8 aco: don't optimize cross-lane instructions across p_wqm
We will use p_wqm as a marker in the next step.

Totals from 8846 (11.55% of 76572) affected shaders: (GFX11)

Instrs: 7031274 -> 7072729 (+0.59%); split: -0.02%, +0.60%
CodeSize: 37060272 -> 37355244 (+0.80%); split: -0.01%, +0.80%
VGPRs: 402660 -> 398724 (-0.98%); split: -0.99%, +0.01%
Latency: 62231926 -> 62322311 (+0.15%); split: -0.01%, +0.15%
InvThroughput: 10341361 -> 10392589 (+0.50%); split: -0.00%, +0.50%
VClause: 105344 -> 105368 (+0.02%); split: -0.03%, +0.05%
SClause: 218330 -> 218469 (+0.06%); split: -0.07%, +0.14%
Copies: 378609 -> 377644 (-0.25%); split: -0.42%, +0.17%
Branches: 97218 -> 97207 (-0.01%); split: -0.01%, +0.00%
PreSGPRs: 307654 -> 307644 (-0.00%); split: -0.08%, +0.08%
PreVGPRs: 314744 -> 308650 (-1.94%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Daniel Schürmann
0907b53740 aco/insert_exec_mask: set Exact mode after p_discard_if when necessary
Fixes: 5e9df85b1a ('aco: optimize discard_if when WQM is not needed afterwards')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Rhys Perry
41b6020ff3 aco: remove fast path in insert_exec_mask's process_instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Daniel Schürmann
0e8192a76b aco: append p_logical_end after monolithic RT shaders
Fixes: bdec044c88 ('aco: Do not fixup registers if there are no shader calls')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Dave Airlie
c5fb2fff18 ac,radeonsi: move vcn enc av1 default cdf file to common
This can be used by radv.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25196>
2023-09-14 07:51:24 +00:00
Dave Airlie
daa01703cc ac,radeonsi: move vcn enc structs to common
This just moves the header to make it easier to share with radv.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25196>
2023-09-14 07:51:24 +00:00
Samuel Pitoiset
f8a7c8edd1 radv: emit relocation for mesh/task shaders
RGP requires shaders to be contiguous in memory.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25144>
2023-09-14 07:19:26 +00:00
Samuel Pitoiset
312103e0ff radv: set THREAD_TRACE_MARKER_ENABLE for mesh/task draws
PAL does that.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25144>
2023-09-14 07:19:25 +00:00