We can't query the actually required alignment from llvmpipe, but 16
bytes is insufficient. In particular, llvmpipe's pixel backend code for
depth/stencil will load/store 4 values at a time, which crashes
with d32s8x24 format if alignment is only 16 bytes (the color backend
does similar things but will use unaligned loads if alignment exceeds
16 bytes at least for now).
32 bytes would be enough, however the "ordinary" llvmpipe resource
layout code currently always aligns to at least 64 bytes, so use
this value as well.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26724>
The initial logic was to remember the place were SPI_SHADER_PGM_LO_*
are written, then assume that we can get the register offset because
the sequence would always be:
PKT3_SET_SH_REG
SPI_SHADER_PGM_LO_* register offset
VA low 32 bits value <- reg_va_low_idx
The problem is that this sequence isn't guaranteed, for instance we
can get this instead:
0 c0067600 |
1 00000046 |
2 003ffffd | SPI_SHADER_PGM_RSRC3_VS
3 00000020 | SPI_SHADER_LATE_ALLOC_VS
4 * 00002080 | SPI_SHADER_PGM_LO_VS
5 00000080 | SPI_SHADER_PGM_HI_VS
So the assert in si_state_draw.cpp would fail as well as the VA
update logic.
So instead remember which the SPI_SHADER_PGM_LO_* offset, and the low
32 bits of the VA in si_update_shaders.
Fixes: 8034a71430 ("radeonsi/sqtt: re-export shaders in a single bo")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
No hash updates as I didn't find a facility to do it in radeonsi
(even though there are flags like forcing fma32).
Note that we do this very late to avoid any optimizations that
might remove the dead stores. (Checked that LLVM doesn't remove
them, but it is admittedly potentially brittle)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26679>
RAW shader did not have dma shader upload, this commit share
the pre/post upload code with ELF, so RAW and ELF can have same
upload mechanism.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26750>
We are going to reuse this helper in anv driver and
also rename it.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Acked-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25651>
In the past, multiple writes to a single register were pretty common,
but since we've transitioned to NIR, and leave the IR in SSA form for
everything not captured in a phi-web, the pattern of generating new
temporary registers at each step is a lot more common.
This pass isn't nearly as useful now. Across fossil-db on Alchemist,
this affects only 0.55% of shaders, which fall into two cases:
- Coarse pixel shading pixel-X/Y setup. There are a few cases where
we write a partial calculation into a register, then have a second
instruction read that as a source and overwrite it as a destination.
While we could use a temporary here, it doesn't actually help with
register pressure at all, since there's the same amount of values
live at both instructions regardless. So while this pass kicks in,
it doesn't do anything useful.
- Geometry shader control data bits (5 shaders total). We track masks
for handling EndPrimitive in a single register across the program,
and apparently in some cases can split the live range. However, it's
a single register...only in geometry shaders...which use EndPrimitive.
None of them appear to be in danger of spilling, either. So this tiny
benefit doesn't seem to justify the cost of running the pass.
So, just throw it out. It's not worth keeping.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26343>
Use the CustomLogger class and CLI tool to create strutured logs
for poe scripts which are used by broadcom and nouveau jobs.
Renamed stage lint to code-validation and added python-test job
which runs the tests for structured and customer logger to ci.
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25179>
The xe2 xml will be updated in following commits. Commit message
has been updated by Jianxun.
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26600>
This was missing from the FS stage primitive-id check. Also add usage
of macro to avoid running any extra code on platforms where this WA
would not apply.
Fixes: 0f14724039 ("iris: Implement Wa_14015297576")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26709>
The clear color BO is sometimes getting unnecessarily zeroed. For
example, if the resource will be fast-cleared, the app may pick a
non-zero color, causing the initial memset to be unneeded. So, skip the
memset and mark the clear color as unknown if it has not been freshly
allocated. For now, we leave the memsets on imported dmabufs alone for
simplicity.
On non-small BAR ACM systems, this also allows internal resources using
compression to be created without executing an IOCTL for memory mapping.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26675>
On small-BAR ACM systems, we don't initialize the clear color at
resource creation time. This could cause an issue with FCV_CCS_E.
Because the rendering and sampling fields may not be in agreement, FCV
may generate fast-cleared blocks during rendering and the sampler may
interpret them with the wrong values.
Thankfully, the register bit which enables FCV on ACM is disabled by
default, so there is not an actually issue today. Regardless, this patch
implements the fix for this issue now. The next patch will actually skip
explicit clear color initialization at resource creation time. So, this
fix will be useful for other systems that do have FCV enabled today.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26675>
Fresh suballocations from fresh allocations have already been zeroed by
the kernel. So, make BO_ALLOC_ZEROED a no-op for them.
This introduces a new field, iris_bo::zeroed, which will be reused for
similar optimizations in the future.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26675>
Sync xe_drm.h with commit a8ff56e160bb ("drm/xe/uapi: Remove reset uevent for now").
This is the last xe_drm.h uAPI break.
The only relevant change for ANV and Iris is that now VM bind uAPI
is asynchronous only so I had to bring back the syncobj creation, wait
and destruction.
Is still in the Xe port TODO list to make VM binds truly asynchronous.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26699>
ROI (region of interest) feature implementation
in va.
It does not support ROI priority, and supports
qp delta, and the maximum number of supported
region is defined as 32, the region sequence implies
the priority, the lower the sequence number, the higher
the region priority, when region overlapping happened,
the higher priority region overwrites the lower
priority one.
And specifically for AV1, the adjust step will be
rounded by 5 when rate control is used, for example,
if qp_delta (q index) is 6, it will use 5, if
qp_delta is 8, it will use 10.
For AVC/HEVC (RC/CQP) and AV1 CQP mode, the
qp_delta granularity is 1.
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26659>