Reported by clang tools.
See: https://clangd.llvm.org/guides/include-cleaner
struct ac_cmdbuf had to be moved to ac_cmdbuf_base.h because we can't
include ac_cmdbuf.h->sid.h->amdgfxregs.h in radeon_winsys.h for r300.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41091>
BV_RB and BV_CCU are supported on some devices (knp, but not glymur or
pakala, for ex).. we don't have a way to deal with that yet.
This doesn't yet _expose_ gen8 perfcntrs. That small patch will come
after PERFCNTR_CONFIG ioctl is supported to ensure that everything gen8
and later supports the new kernel based counter collection/reservation
(so that backwards compat of old userspace on new kernel is limited to
a7xx and earlier).
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>
With concurrent binning, some counter reads or SEL reg programming needs
to happen explicitly on the BR or BV ring. For the most part if there
is a "BV_FOO" counter group that should be on the BV ring and the
corresponding "FOO" group on the BR ring. There are a few exceptions
like "CP" vs "BV_CP" which have different SEL reg offsets for BR vs BV,
rather than the same offsets that should be accessed via the appropriate
aperture.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>
1) only use "ull" for reg64, which avoids some compiler warnings on the
kernel side.
2) use "ull" for booleans as well, if reg64
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>
To generate the perfctr tables we need a bit more information than what
is in the .xml, such as which groups of SELECT regs correspond to which
sets of COUNTER regs, the enum type of the countables (ie. possible
SELECT reg values), etc.
It would be awkward to shoehorn this into an xml schema that is based on
describing registers. But json is easy to consume.
Field description:
- chip: variant enum used for generating correct reg offsets
- groups: array of entries for each group of counters/countables:
- name: group name
- num: the number of counters
- reserved: array of counter indices reserved for KMD use
- select_offset: Offset of the first selector reg, used in cases
where same bank of selectors is used for both BR and
BV
- select: the selector reg name
- counter: counter name if <reg64>, otherwise use counter_lo and
counter_hi
- countable_type: name of <enum> that defines selector reg values
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>
We could generate the rest of the tables, other than these fields. But
they are all "UINT64, AVERAGE" (for the non-derived counters), so just
drop them.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>
I did my stress testing mostly outside of north america work hours, but it
turns out once the runners have 60-70% background CPU usage, these ones
intermittently time out.
Reported-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41163>
There are less than 2^16 lanes within a threadgroup, so it is safe to do
all math at 16-bit. This allows us to use 16-bit integer division which is
much faster than 32-bit integer division (in terms of the lowerings).
In a "hello world" kernel with variable wg size, simd32 goes 72 inst -> 57
inst on jay and 82 -> 67 inst on brw.
OTOH it's a loss for non-variable wg size, so do it only there to avoid
unwelcome stats regresions on Vulkan.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41084>
This reverts commit b83a931cb1 as it causes
regressions with dirty rects enabled on some HW platforms that signal
out of order completion and require individual fence objects per slice
Fixes: b83a931cb1 ("d3d12: Video sliced encode: Use same ID3D12Fence/different per slice values as optimization")
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41160>
KHR_pipeline_library is a base extension whose semantics can only
be exercised by extensions: EXT_graphics_pipeline_library
and KHR_ray_tracing_pipeline. Both remain gated under LLVM,
so advertising the KHR base extension is inert for conformant apps.
The reason for the change is a hard requirement for KHR_pipeline_library
in DXVK 2.7+. DX games under Proton, which uses DXVK, fail adapter
creation if this extension is absent. DXVK supports scenario when
KHR_pipeline_library is available but two dependent extensions aren't.
The !use_llvm condition originated in f1095260a4 when
KHR_pipeline_library was first wired up for ray tracing only. It was
touched also in 045c96d896 when EXT_graphics_pipeline_library also took
it as a dependency.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41134>
These are a mix of fields whose last used was removed or fields that were
never used, possibly because they remained in a patch while the rest of the
code changed before landing.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41139>
A reshape operation just changes the dimensions of a tensor, but doesn't
change the data at all. So we just point the OFM to the IFM data and
we're done.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
The ethosu_lower_add() function can handle other element wise operations
such as multiply, minimum, and maximum, so rename it in preparation to
add those operations.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
Add support for fully-connected convolution. FC convolution lowering is
nearly the same, so refactor the existing convolution code to support
both.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
For axis 1 concatenation, the OFM strides need to match the IFM strides.
Presumably axis -3 can also be supported, but there haven't been any
models with -3. Not sure what axis 2 would need either.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>