Using stripes to deal with the different packet layout variants resulted
in the same "register" offsets being redefined with different values, so
use "prefix" to add a suffix that disambiguates them.
drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h:1066: warning: "REG_A6XX_CP_DRAW_INDIRECT_MULTI_INDIRECT" redefined
1066 | #define REG_A6XX_CP_DRAW_INDIRECT_MULTI_INDIRECT 0x00000006
|
drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h:1057: note: this is the location of the previous definition
1057 | #define REG_A6XX_CP_DRAW_INDIRECT_MULTI_INDIRECT 0x00000003
|
(Admittedly it isn't really a "prefix" but that was the field in the
schema available to use, and REG_INDEXED_CP_DRAW_INDIRECT_MULTI_STRIDE
sounds somewhat more funny.)
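
A minimal sketch of the disambiguation idea, not the actual header
generator: each variant contributes a suffix to the generated name, so
the two offsets no longer collide. The helper `emit_defines` and the
suffixes `VARIANT_A` / `VARIANT_B` are hypothetical; the offsets are the
ones from the warning above.

```python
def emit_defines(base_name, variants):
    """Return '#define' lines, one per (suffix, offset) variant.

    Raises ValueError if two variants would produce the same macro name,
    which is the situation the compiler warning above complains about.
    """
    seen = {}
    lines = []
    for suffix, offset in variants:
        # An empty suffix leaves the name unchanged (the old behavior).
        name = f"{base_name}_{suffix}" if suffix else base_name
        if name in seen:
            raise ValueError(
                f"{name} redefined: 0x{seen[name]:08x} vs 0x{offset:08x}")
        seen[name] = offset
        lines.append(f"#define {name} 0x{offset:08x}")
    return lines


# Hypothetical suffixes; with them the two definitions get distinct names.
defines = emit_defines(
    "REG_A6XX_CP_DRAW_INDIRECT_MULTI_INDIRECT",
    [("VARIANT_A", 0x3), ("VARIANT_B", 0x6)],
)
```

Without suffixes the same call raises, mirroring the redefinition warning.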
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>
This runs through the SQE bootstrap code to extract the packet-table,
rather than relying on heuristics. As a bonus, it can detect the start
of the LPAC fw in a660+ fw so that we can properly decode the LPAC fw
and packet-table.
Note that this decodes the jmptable as normal instructions, which is a
change in behavior from the previous heuristic-based jmptable extraction.
Not sure if that is a good or bad thing.
For a5xx, the legacy heuristic-based jmptable decoding is preserved for
now, at least until enough control regs are figured out.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>
When we start running the bootstrap code through the emulator, we will
need the packet-table loading to actually happen, so add it.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>
Run until the packet-table is populated, so the disassembler can use
this to know the offsets of various pm4 packet handlers without having
to rely on heuristics.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>
Some of the a6xx gens will require some control reg initialization, and
go into an infinite loop if they don't see the values they expect, so
we'll need to extract the compute gpu-id.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>
This is an (at least somewhat complete) logical emulator of the a6xx SQE
that lets us step through firmware execution (bootstrap, cmdstream pkt
handling, etc). It lets us poke at various fw visible state and run
through pm4 packet(s) to better understand what the fw is doing when it
handles various packets.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>
Allow for different mnemonics depending on whether they are used as
source or destination register, to better reflect what they do.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>
With disasm emulator mode, we'll start wanting some things that
duplicate what the assembler does, so just split out all the rnndb
bits into shared utils.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10944>
Going beyond 0x100000 results in hangs. However, I found that the
last (0x100000th) packet just doesn't get executed, so the real limit
is 0x0FFFFF. At least this is true for a6xx.
This can be tested by appending nops to the cmdstream and placing
e.g. CP_INTERRUPT at the end: at any position other than the
0x100000th packet, it results in a hang.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10786>
There is a limit on IB size, which on freedreno is set to 0x100000.
Going beyond it results in hangs. However, I found that the last
(0x100000th) packet just doesn't get executed, so the real limit is
0x0FFFFF.
This can be tested by appending nops to the cmdstream and placing
e.g. CP_INTERRUPT at the end: at any position other than the
0x100000th packet, it results in a hang.
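
A minimal sketch of how a caller might honor the corrected limit by
chunking a long cmdstream. This is an illustration only, not the
driver's code: `split_into_ibs` is a hypothetical helper, sizes are in
whatever unit the limit is counted in, and a real implementation would
also have to avoid splitting in the middle of a packet.

```python
MAX_IB_SIZE = 0x0FFFFF  # real limit per the text above, not 0x100000


def split_into_ibs(total_size, max_ib_size=MAX_IB_SIZE):
    """Split a cmdstream of total_size units into (offset, size) chunks,
    each no larger than max_ib_size."""
    if total_size < 0:
        raise ValueError("total_size must be non-negative")
    chunks = []
    offset = 0
    while offset < total_size:
        size = min(max_ib_size, total_size - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


# A stream of exactly 0x100000 units no longer fits in a single IB.
chunks = split_into_ibs(0x100000)
```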
Fixes:
dEQP-VK.api.command_buffers.record_many_draws_secondary_2
dEQP-VK.api.command_buffers.record_many_draws_primary_2
However, these tests could trigger hangcheck timeouts.
Also this fixes hangs when opening captures of games in RenderDoc.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10786>
It looks like this was a copy-and-paste mistake in 827e0d6654 where
the initialiser was moved from being a struct initialiser to a
standalone statement. Some of them were fixed with an unrelated change
in 187218395d but not all of them. This shouldn’t make any practical
difference to the compiled code.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11090>
There are some interactions between these two extensions that need to be
implemented when both are supported. Particularly:
1. Applications can create images that will be bound to swapchain memory
by passing a VkImageSwapchainCreateInfoKHR in the pNext chain
of VkImageCreateInfo. In this case we need to make sure that the
created image takes some of its parameters from the underlying
swapchain.
2. Applications can bind memory from a swapchain image to a VkImage
by passing a VkBindImageMemorySwapchainInfoKHR in the pNext chain
of VkBindImageMemoryInfo.
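
Both cases start with finding the extension struct in a pNext chain.
A minimal sketch of that lookup, modelled in Python rather than with the
real Vulkan C structs: the `Struct` class and `find_in_chain` helper are
hypothetical, and only the sType names are taken from Vulkan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Struct:
    """Stand-in for a Vulkan struct header (sType + pNext)."""
    s_type: str
    p_next: Optional["Struct"] = None
    swapchain: Optional[str] = None  # only meaningful for swapchain infos


def find_in_chain(head, s_type):
    """Walk the pNext chain and return the first struct of s_type, or None."""
    node = head
    while node is not None:
        if node.s_type == s_type:
            return node
        node = node.p_next
    return None


# Case 1: an image create info chained to a swapchain create info.
image_info = Struct(
    "VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO",
    p_next=Struct("VK_STRUCTURE_TYPE_IMAGE_SWAPCHAIN_CREATE_INFO_KHR",
                  swapchain="swapchain0"),
)
found = find_in_chain(
    image_info, "VK_STRUCTURE_TYPE_IMAGE_SWAPCHAIN_CREATE_INFO_KHR")
```

If the struct is found, the image parameters would then be taken from the
referenced swapchain instead of from the create info itself.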
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11037>
This was added with VK_KHR_device_group and allows users to specify
a base offset that will be automatically added to gl_WorkGroupID.
Unfortunately, V3D doesn't support this natively, so we need to manually
add the base to the workgroup id generated by hardware. For this,
we inject add instructions that source from a QUNIFORM that will
retrieve the actual dispatch base from the compute job when it is
dispatched.
Since a compute shader can be dispatched with CmdDispatch and/or
CmdDispatchBase, we always need to add these additional add
instructions and use a base of (0,0,0) for regular dispatches.
Since we don't support any version of OpenGL with this dispatch
base functionality, we can avoid the extra instructions there.
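
A minimal sketch of the arithmetic the injected add instructions perform,
assuming the hardware always counts workgroups from (0,0,0). The
`workgroup_ids` helper is hypothetical and only models the resulting
gl_WorkGroupID values, not the compiler change.

```python
def workgroup_ids(base, count):
    """Yield gl_WorkGroupID for every workgroup of a dispatch.

    The hardware-generated id starts at 0 in each dimension; the
    dispatch base is added on top of it.
    """
    bx, by, bz = base
    cx, cy, cz = count
    for z in range(cz):
        for y in range(cy):
            for x in range(cx):
                yield (x + bx, y + by, z + bz)


# Regular dispatch: base of (0,0,0), ids are what the hardware generates.
regular = list(workgroup_ids((0, 0, 0), (2, 1, 1)))
# Dispatch with a base: every id is shifted by the base.
based = list(workgroup_ids((4, 5, 6), (1, 1, 2)))
```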
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11037>
DXVK 1.8.1 marks position as always invariant but the DX12 version
of the game has the same issue and it's not yet fixed on the
vkd3d-proton side.
Fixes some Z-fighting on GFX10.3.
Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11029>
Construct a scissor descriptor corresponding to the intersection of the
framebuffer, the viewport, and the region selected for scissoring by the app.
Use the intersected scissor in the "clip tile" fields in the viewport. Select
this scissor descriptor from the command stream.
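
A minimal sketch of the intersection, assuming axis-aligned rectangles
with exclusive maxima. `Rect`, `intersect` and `effective_scissor` are
hypothetical names for illustration and do not reflect the hardware
descriptor layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    minx: int
    miny: int
    maxx: int  # exclusive
    maxy: int  # exclusive


def intersect(a, b):
    """Return the overlap of two rectangles (may be empty or inverted)."""
    return Rect(max(a.minx, b.minx), max(a.miny, b.miny),
                min(a.maxx, b.maxx), min(a.maxy, b.maxy))


def effective_scissor(framebuffer, viewport, scissor: Optional[Rect] = None):
    """Intersect framebuffer and viewport, plus the app scissor if enabled."""
    result = intersect(framebuffer, viewport)
    if scissor is not None:
        result = intersect(result, scissor)
    return result


rect = effective_scissor(Rect(0, 0, 800, 600),
                         Rect(100, 100, 900, 500),
                         Rect(0, 200, 400, 700))
```

Passing `scissor=None` models the case where the app has scissoring
disabled and only the framebuffer and viewport bound the draw.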
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11084>
Use the Zink lowering pass to handle the non-halfz case. Metal, like Vulkan,
uses half-z (and Metal is not configurable, making r/e tricky).
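
A minimal sketch of the remapping such a lowering is understood to
apply to the clip-space position, assuming the usual conversion from a
[-w, w] depth range to [0, w]. The function name is hypothetical and
this is plain Python, not the NIR pass itself.

```python
def lower_clip_halfz(position):
    """Remap clip-space z from [-w, w] to [0, w]: z' = (z + w) / 2."""
    x, y, z, w = position
    return (x, y, (z + w) * 0.5, w)


near = lower_clip_halfz((0.0, 0.0, -1.0, 1.0))  # near plane maps to z = 0
far = lower_clip_halfz((0.0, 0.0, 1.0, 1.0))    # far plane stays at z = w
```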
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11084>
Not sure what the proper data structure for this is yet, but this will
hold over until we start optimizing for memory usage.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11084>
We can't pack the scissor descriptor for these, and there would be no rendering
anyway, so detect this condition and skip the draw.
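
A minimal sketch of the early-out, assuming the scissor is given as
min/max coordinates with exclusive maxima. `scissor_is_empty` and `draw`
are hypothetical helpers; `submit` stands in for whatever actually emits
the draw.

```python
def scissor_is_empty(minx, miny, maxx, maxy):
    """True if the rectangle covers no pixels (zero or negative extent)."""
    return maxx <= minx or maxy <= miny


def draw(minx, miny, maxx, maxy, submit):
    """Call submit() unless the scissor is empty; return whether we drew."""
    if scissor_is_empty(minx, miny, maxx, maxy):
        return False  # nothing would be rendered, so skip the draw
    submit()
    return True


submitted = []
skipped = draw(0, 0, 0, 10, lambda: submitted.append("empty"))
drawn = draw(0, 0, 4, 4, lambda: submitted.append("visible"))
```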
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11084>
Although there is a scissor enable bit in the hardware rasterizer state, we
cannot rely on it alone as we also "scissor" to the viewport.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11084>