Rewrite the code to use linear_asprintf and always flip the
dimensions in place if the element type is an array. The new
code will now (correctly) flip even in the case of unsized arrays.
The flipping is done by swapping the ranges [a, b) and [b, c), as
shown below, with element type int[...] and an array of length 4.
```
+--------------- a: first bracket in the name
| +---------- b: end of the element name
| | +------- c: end of the array name
| | |
int[...][4]$
will be transformed into
int[4][...]$
```
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23278>
And fix all the errors it found.
Note that for the unsized array error, we will print the
toplevel type -- so that the fact that an inner array is
unsized can be seen.
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25200>
since the dependencies were not locked, they got updated and generating
a new container is throwing errors like the following:
error: failed to compile `bindgen-cli v0.62.0`, intermediate artifacts can be found at `/tmp/cargo-installcP54m7`
Caused by:
package `memchr v2.6.3` cannot be built because it requires rustc 1.61 or newer, while the currently active rustc version is 1.60.0
rust packages have Cargo.lock file from when they were released, so add
--locked flag to use it.
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25226>
This is kept as a separate commit because the change looks like a lot
more than it it. The order of the two loops is swapped, then the two
loops are merged.
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
Using a single data structure seems better. There's no appreciable
performance change. On batman_arkham_city_goty.foz, the difference
reported was 0.48%±0.36% (n=20). Several commits in the MR, including
some that should have no effect at all, reported similar changes. I
attribute this primarily changing of loop alignments and similar.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
On batman_arkham_city_goty.foz, this improves fossil-db time by
-3.83%±0.24% (n=20). This fossil takes the longest time of any in my
database.
v2: Add some comments for cmp_entry_src_entry_src and
cmp_entry_src_nr. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
This annoyed me durning development of this MR. Every time I changed the
parameters to this internal function, I had to modify a public header
file... and trigger a much large rebuild.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
The larger predicate here already requires that inst->opcode must be
BRW_OPCODE_MOV, so it can't BRW_OPCODE_SEL. With that removed, the
other simplifications are pretty straight forward.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
The caller already loops over the sources. This means that the caller
must loop over the sources in reverse because constant propagation
prefers to propagate into the last sources first.
The shader-db and fossil-db changes (below) are all due to SEL
instructions. Changing the order sources are visited changes whether a
SEL with two immediate sources is
(+f0.0) sel g12 IMM_A IMM_B
or
(-f0.0) sel g12 IMM_B IMM_A
The ordering of the sources affects the order the constant combining
encounters the values, and the determines which value is "combined"
and which value remains an immediate.
This affects the results by luck. If there are two instructions:
(+f0.0) sel g12 IMM_A IMM_B
(+f0.0) sel g13 IMM_A IMM_C
Picking IMM_A is advantageous over picking IMM_B and IMM_C. Since the
selection algorithm in constant combining is greedy, this case
requires the algorithm see the values in just the right order for the
right thing to happen.
v2: Rebase on many, many changes. Move instruction source fixup
reordering out or try_constant_propagate.
v3: Rebase on !7698.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
This annoyed me durning development of this MR. Every time I changed the
parameters to this internal function, I had to modify a public header
file... and trigger a much large rebuild.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
If the linked list structure used depended on the list head to know when
to terminate, this would be a pretty serious bug. If try_constant_propage
or try_copy_propagate make progress, inst->src[i].nr will change. This
results in the foreach_in_list using a different list header on later
iterations of the loop.
This causes two shaders in shader-db and 9 shaders in fossil-db to
change. Looking at the code changes, these are cases where there was a
copy of a copy that gets propagated. The part that confuses me is the
VGRF numbers involved should **not** hash to the same bucket, so it
should be impossible to find the original source from the intermediate
VGRF.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
Unless the change in liveout also causes livein to change, updates to
liveout cannot have any global effect. Changes to livein already flag
additional interation.
I had additional changes in this area that didn't pan out. While working
on those change, I was a little confused about this bit of code. It's
unnecessary, so it's better to delete it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
This script can:
* validate that genxml files do not duplicate imported items
* add imports to genxml files and optimize the file by dropping
duplicate items
* reverse the import operation by flattening genxml files
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
For example, gen11.xml will import the HEVC_ARBITRATION_PRIORITY
struct from gen9.xml.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
This function drops duplicated items from a genxml file when they are
equivalent to the same item imported from another genxml file.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
Since the output can now depend on other imported xml files, we need
to add them all as dependencies to ensure that if any xml file is
changed, then all pack files are rebuilt.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20593>
I didn't know they were regexes. This also excludes all "1048576" tests.
They build an acceleration structure with 1 primitive 1048576 times
which only warms up the Valve farm and doesn't accomplish anything else.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24720>
This was the last part that didn't scale with multiple infos. Reducing
the amount of barriers in this case improves DOOM Eternal performance by
50%. (Running with low resolution)
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24720>
This needs to be done so we can optimize it for occpuancy when building
multiple acceleration structures in parallel. Changes to the original
code:
- Change // to /* */
- clang-format
- Replace vkCmd calls with calls to the driver entrypoints
- Add a light weight info struct
- Use radv_fill_buffer directly
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24720>
problem: max_poc means the number of bits used in poc lsb
in slice header, and it should not be related to GOP
size. When large GOP size used, it could generate
corrupted video, as the POC could not be correctly
decoded.
solution: use fixed value of max_poc (16) for now.
Cc: mesa-stable
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25214>
This configures global, per-SE and per-SA counters with different
indexes. This is still unused because only for the first instance is
used by RADV/RadeonSI, but this will be changed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25211>