This allows, for example, fs_inst::components_read() without passing
devinfo as extra argument.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
For SIMD8 half float payload, each component takes a full register, so
we can use existing LOAD_PAYLOAD infrastruture for required padding by
alternating plain 8-wide half float vector and null vector.
Also this patch removes an unwanted assertion from
opt_copy_propagation_local for LOAD_PAYLOAD.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
To support SIMD8 half float payloads, each component takes one full
32bit wide register in both SIMD8H and SIMD16H mode. So we can make use
of existing LOAD_PAYLOAD infrastructure alternating a half float vector
and a null vector, in order to handle required padding.
v2: (Francisco)
- Skip header sources
- Fix comparision units
- Don't allocate VGRF for padded source
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
We are always aligning to REG_SIZE but when we have payload sources less
than REG_SIZE, size written is miscalculated.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
We can use LOAD_PAYLOAD infrastructure in order to handle 16bit float
payload. Let's rely on source type for padding sources, if not set
previously then default one would be 32-bit.
This patch will be used later in the series to handle 16-bit float
payloads.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
Update return format field and add SIMD Mode [2] field in sampler
descriptor. Now we can tell sampler to return data in either 32/16 bit
format precision.
v1:
- Drop unnecessary descriptor fields (Jason)
- Handle return format (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
on GFX8 onwards, we have only single bit to determine correct return
format.
v2:
- Define macro and use it instead of hardcoded value. (Lionel)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
At high frequency sampling, this generates a lot of messages.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13831>
Otherwise we need to include intel headers in generic code.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13831>
On newer HW it will require more work than just reading a dword. It
could also vary depending on the report format.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13831>
For this each driver must :
- report its clock_id (if no particular clock just default to cpu
boottime one)
- be able to sample its clock (gpu_timestamp())
The PPSDataSource will then emit timestamp correlation events in the
trace ensuring perfetto is able to display GPU & CPU events
appropriately on its timeline.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13831>
Now that we removed the intel intrinsic and just use the generic one,
we can skip it in the intel call lowering pass and just deal with it
in the intel rt intrinsic lowering.
v2: rewrite with nir_shader_instructions_pass() (Jason)
v3: handle everything in switch (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 423c47de99 ("nir: drop the btd_resume_intel intrinsic")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12113>
Instead of having a bunch of const vk_sync_type for each permutation of
vk_drm_syncobj capabilities, have a vk_drm_syncobj_get_type helper which
auto-detects features. If a driver can't support a feature for some
reason (i915 got timeline support very late, for instance), they can
always mask off feature bits they don't want.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
BO waits aren't going away any time soon so using a syncobj here doesn't
really gain us anything. It just makes it more complicated.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
These are entirely unused except for the syncobj in submit_simple_batch.
We can use the libdrm wrappers for that as they're basically equivalent
and the core Vulkan sync code already depends on them.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This is, unfortunately, a large flag-day mega-commit. However, any
other approach would likely be fragile and involve a lot more churn as
we try to plumb the new vk_fence and vk_semaphore primitives into ANV's
submit code before we delete it all. Instead, we do it all in one go
and accept the consequences.
While this should be mostly functionally equivalent to the previous
code, there is one potential perf-affecting change. The command buffer
chaining optimization no longer works across VkSubmitInfo structs.
Within a single VkSubmitInfo, we will attempt to chain all the command
buffers together but we no longer try to chain across a VkSubmitInfo
boundary. Hopefully, this isn't a significant perf problem. If it ever
is, we'll have to teach the core runtime code how to combine two or more
VkSubmitInfos into a single vk_queue_submit.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
These are about to be the only use of anv_gettime_ns() so lets switch
them over so the next patch can delete the helper outright.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This gets rid of the bespoke vfunc interface in the WSI code.
Eventually, I'd love to get rid of wsi_display_fence entirely but
this at least makes it a lot more palatable.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This effectively partially reverts 13fe43714c ("anv: Add helpers in
anv_allocator for mapping BOs") where we both added helpers and reworked
memory mapping to stash the maps on the BO. The problem comes with
external memory. Due to GEM rules, if a memory object is exported and
then imported or imported twice, we have to deduplicate the anv_bo
struct but, according to Vulkan rules, they are separate VkDeviceMemory
objects. This means we either need to always map whole objects and
reference-count the map or we need to handle maps separately for
separate VkDeviceMemory objects. For now, take the later path.
Fixes: 13fe43714c ("anv: Add helpers in anv_allocator for mapping BOs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5612
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13795>
We prefix names with an underscore to make them "safe" C identifiers
when necessary. For example, a value of "32x32" would become "_32x32".
However, when specifying something like
<field ... prefix="BLOCK_SIZE">
<value name="32x32" value="0"/>
</field>
we already have a prefix that makes the field name safe. We'd rather
generate a name with a single underscore, i.e.
#define BLOCK_SIZE_32x32 0
rather than
#define BLOCK_SIZE__32x32 0
This also fixes up affected defines in crocus.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13809>
When a <field> tag has multiple <value> children, listing symbolic names
for possible field values, we generate #defines for each value, with an
optional prefix. I don't know why, but this code was checking whether
self.default is None. We want to generate the same list of #defines,
with a prefix, regardless of whether the field has a default value
specified or not.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13809>
When you enable video genxml, lots of warnings about redefined things
appear, just clean those up before things get started.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13788>
We need to guarantee that when vkQueueSubmit() returns the application
can actually wait on a signaled semaphore/syncobj.
When using a thread to do the submission to i915, this gets a bit
tricky in the following case :
A syncobj is used both as a wait & signal semaphore and has been
signaled once already. It contains a fence before entering
vkQueueSubmit().
This means we need to reset the syncobj to ensure when we return
from vkQueueSubmit(), the syncobj contains no stale fence.
Currently in the Anv, the submission thread is in charge of putting
the new fence in the syncobj and also picks up the wait fence
directly from the syncobj. This means we can't reset the syncobj
from vkQueueSubmit().
The solution to this has been pointed by Bas & Jason :
In vkQueueSubmit(), clone the wait syncobj fence into a new
temporary syncobj that will be destroy after submission and use
this temporary syncobj as a wait fence for i915. This allows us to
reset the original syncobj in vkQueueSubmit().
For this to work with wait_before_signal behavior, we also need to
do a wait-on-materialize on binary semaphores from vkQueueSubmit().
Otherwise the application thread calling vkQueueSubmit() could race
the submission thread and pick up the wrong fence when cloing.
v2: Use copy semantic for clone_syncobj_dma_fence() (Jason)
Do the cloning prior to adding the syncobj to anv_queue_submit so
that if the cloning fails don't have an invalid syncobj in
anv_queue_submit (Jason)
v3: Fix another syncobj leak (Jason)
v4: Fix invalid argument order (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4945
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11474>