If the clear color wasn't 0 or 1, we used a slow clear. This adds a new
DCC clear where the DCC buffer is cleared to a special value and the clear
color is stored at the beginning of each 256B block in the image.
It can be very fast, but it's not always faster than a slow clear.
There is a heuristic that determines whether this new fast clear is
better.
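
The layout described above can be pictured with a small CPU-side sketch.
It is only an illustration under assumptions: the helper name and the
linear buffer are hypothetical, and the actual clear is performed by the
driver on the GPU, not by a memcpy loop like this one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 256 /* block granularity named in the commit message */

/* Hypothetical helper: copy the clear color to the beginning of every
 * 256B block of a linear, CPU-visible buffer. The rest of each block is
 * left untouched. */
static void
write_clear_color_per_block(uint8_t *image, size_t image_size,
                            const void *color, size_t color_size)
{
   assert(color_size <= BLOCK_SIZE);
   assert(image_size % BLOCK_SIZE == 0);

   for (size_t offset = 0; offset < image_size; offset += BLOCK_SIZE)
      memcpy(image + offset, color, color_size);
}
```

Only a few bytes per block are written, which is why such a clear can be
much cheaper than writing every pixel.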
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
Since the TCS epilog is no more, this is required to apply those bits
to monolithic shaders.
tessfactors_are_def_in_all_invocs was unused.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
otherwise the options would be ignored if the shader cache had already
cached the same shader with the option inverted.
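
A minimal sketch of the idea, assuming a hash-based cache key: the
option bits are hashed together with the shader, so flipping an option
yields a different key and cannot hit the stale entry. The struct, the
option names and the FNV-1a hash are illustrative assumptions, not the
driver's actual cache key code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option set; the names are illustrative only. */
struct shader_options {
   bool opt_a;
   bool opt_b;
};

/* FNV-1a over a byte range, continuing from a previous hash value. */
static uint64_t
fnv1a(uint64_t hash, const void *data, size_t size)
{
   const uint8_t *bytes = data;

   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= 0x100000001b3ull; /* 64-bit FNV prime */
   }
   return hash;
}

/* Hypothetical key function: hash the shader bytes, then the options. */
static uint64_t
shader_cache_key(const void *ir, size_t ir_size,
                 const struct shader_options *opts)
{
   assert(ir && opts);

   /* Pack the options into one byte so struct padding is never hashed. */
   uint8_t option_bits = (opts->opt_a ? 1 : 0) | (opts->opt_b ? 2 : 0);

   uint64_t hash = 0xcbf29ce484222325ull; /* 64-bit FNV offset basis */
   hash = fnv1a(hash, ir, ir_size);
   hash = fnv1a(hash, &option_bits, sizeof(option_bits));
   return hash;
}
```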
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
Starting with MTL, there are registers in HW to read the IP version of
the graphics, media and display IPs; these registers are called GMD.
IPs can be used in any combination to form a SoC/platform and each IP
has its own stepping/revision, making it complex to track each IP
stepping using just the PCI revision.
Since MTL is supported by default by the i915 KMD, which doesn't have
a uAPI to fetch IP versions, this feature will only be supported on LNL
and newer, which are backed by the Xe KMD.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26908>
Sync xe_drm.h with 31ced035ecde ("drm/xe/uapi: Restore flags VM_BIND_FLAG_READONLY and VM_BIND_FLAG_IMMEDIATE").
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26908>
In some (probably malformed) cases, even weights BOs for strided or depthwise
convolutions can become bigger when using ZRL compression.
To avoid running out of space in the BO, play safe and calculate the
actual optimum ZRL bit count. This does slow compilation quite a
bit, though (2x slower for MobileNetV1).
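
The approach can be sketched as a brute-force search: encode the weights
once per candidate ZRL bit count and keep the smallest result. The
encoding model below is a simplified assumption (each token is a zero-run
field of zrl_bits bits followed by one 8-bit literal), not the hardware
format, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size in bits of the stream under the simplified model: every token
 * stores a run of up to (2^zrl_bits - 1) zeros plus one literal byte. */
static size_t
zrl_encoded_bits(const uint8_t *weights, size_t count, unsigned zrl_bits)
{
   const unsigned max_run = (1u << zrl_bits) - 1;
   size_t tokens = 0;
   unsigned run = 0;

   for (size_t i = 0; i < count; i++) {
      if (weights[i] == 0 && run < max_run) {
         run++;      /* zero absorbed into the current run */
      } else {
         tokens++;   /* literal (or overflowing zero) closes the token */
         run = 0;
      }
   }
   if (run > 0)
      tokens++;      /* trailing zeros still need one token */

   return tokens * (zrl_bits + 8);
}

/* Try every bit count from 0 to max_zrl_bits and return the cheapest. */
static unsigned
best_zrl_bits(const uint8_t *weights, size_t count, unsigned max_zrl_bits)
{
   assert(max_zrl_bits < 16);

   unsigned best = 0;
   size_t best_size = zrl_encoded_bits(weights, count, 0);

   for (unsigned bits = 1; bits <= max_zrl_bits; bits++) {
      size_t size = zrl_encoded_bits(weights, count, bits);
      if (size < best_size) {
         best_size = size;
         best = bits;
      }
   }
   return best;
}
```

Under this model, dense weights get larger with any non-zero bit count,
which matches the observation that compression can grow the BO. The
search costs one full pass over the weights per candidate, consistent
with the compile-time slowdown noted above.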
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28879>
By using the on-chip SRAM to cache the input image we can save some more
bandwidth and increase the utilization of the NN cores, with the
following improvements:
MobileNetV1: 9.991ms -> 6.2ms
SSDLite MobileDet: 27ms -> 24.3ms
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28879>
We were wrongly counting the remaining number of output channels in the
last superblock, when the former isn't divisible by the latter.
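
As a sketch of the arithmetic involved, assuming a superblock is a
fixed-size group of output channels (an interpretation, not something
the message states), the last group only holds the remainder. The
function name and parameters are hypothetical.

```c
#include <assert.h>

/* Number of channels in the group at block_index when total_channels
 * are split into groups of block_size; the last group may be short. */
static unsigned
channels_in_block(unsigned total_channels, unsigned block_size,
                  unsigned block_index)
{
   assert(block_size > 0);

   unsigned block_count = (total_channels + block_size - 1) / block_size;
   assert(block_index < block_count);

   unsigned remaining = total_channels - block_index * block_size;
   return remaining < block_size ? remaining : block_size;
}
```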
MobileNetV1: 9.991ms -> 9.991ms
SSDLite MobileDet: 32.692ms -> 27ms
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28879>
This function has 2 additional parameters to set the spacing before
printing the register group dword or individual registers.
intel_print_group() is kept with the same spacing as before, so no
changes to the decoder output are expected here.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28722>
Xe parser will also need to use the option_color parameter.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28722>
Initial work by Rafael Antognolli <rafael.antognolli@intel.com>
Reworks:
- Rebase to main
- Emit the right hiz op for higher mip levels when transitioning the
  depth buffer
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28629>
The post-multiplier was extended by 8 bits for improved precision.
The shift offset appears to have changed as well.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28878>
in the case where multiple variables get merged into one, try to use
all the names when creating new vars
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28814>