Now that we have intrinsics which map directly to the hardware opcodes,
we can lower PLS inside the gallium driver instead of the back-end
compiler having to know anything about it. This simplifies the back-end
and is less code, if you ignore the new copyright header.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
This is a little more manual (though it's actually less code) but it
gives us a lot more control and makes the whole flow nicer.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
Like we just did with load_tile_pan, this maps directly to ST_TILE in
the hardware. This is more versatile and lets us do more of our
lowering in NIR.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
Instead of making it explicitly about outputs, this switchies it to
being a NIR version of LD_TILE. It means we have to do a bit of work in
NIR and add a builder helper but the end result is something much more
versatile.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
Previously common wsi has a special submission to install implicit fence
to wsi memory directly, which has been deprecated in favor of bonfire
implicit fencing (implicit fencing has been turned into explicit fencing
within vulkan since then). The virtgpu backend is fine but the vtest
backend has been regressed since then, only relying on renderer side hw
driver doing implicit fencing.
With async present landed earlier, we can directly tell which submission
is done by common wsi, and can revive the idle waiting accordingly.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39377>
Gfx 12.5 struct has only one major difference with gfx9, that is OaCntr lenght,
while on gfx 9 it is 36 uint64_t long on gfx 12.5 it is 38 uint64_t long.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
We are missing handling for gfx12.5 so to add it we will need a switch case over
verx.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
Looking at the reference code, there is no new struct for Xe3 so it should
use the same struct as Xe2.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
With no more users of intel_perf_load_configuration() it can be
removed with other i915 functions around it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
We have no usage of the information returned by
intel_perf_load_configuration(). It is only used to add a copy of the
configuration so we have the metric id but we could instead get the
metric id from sysfs, that is added by mdapi.
Xe KMD don't have a uAPI to query the metrics configuration, so
using sysfs also fixes the integration of mdapi with Xe KMD.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
There is no usage for register_config outside of
anv_AcquirePerformanceConfigurationINTEL(), so we don't need to store
it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
a longstanding issue in zink has been the scenario where a dmabuf is
created for e.g., RGBA8888, then the app tries to do SRGB, but the driver
doesn't support mutable formats with the dmabuf modifier. in this scenario, the app
would either crash or break unpredictably
by reusing the existing transient mechanism (previously only for msrtss emulation),
these dmabufs can instead have a shadow image which handles mutable formats and
then syncs back to the main image when necessary
this should greatly improve the situation on e.g., Intel
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39336>
Not setting this results in an extra bitcast to float for u/int target types as
outputs are always floats, which could in turn also result in HW-dependent
behavior with having float outputs getting written to u/int targets. Whether
this ends up being handled correctly is also driver and HW specific. Zink does
not handle this, which causes problems as having FS float outputs writing to
an int attachment is undefined behavior.
Since we know the texture type already, let's use that.
Explicitly setting this to dest_type gets Zink to behave better, and could also
help other drivers by dropping the cast and marking the output type as u/int.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39352>