[WHY]
There is a potential memory leak from vpe_alloc_segment_ctx.
The leak occurs only in multi-frame VPE tests where, between
vpe_create and vpe_destroy, multiple calls are made to
vpe_check_support, each of which allocates a new segment context
without releasing the old one.
[HOW]
Allocate segment_ctx only when it is not already allocated. If it is
already allocated, check whether re-allocation is needed. If not, skip
the allocation.
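As a rough illustration of the [HOW] above, the Rust sketch below reuses an
existing segment context and re-allocates only when the current one is too
small. The VPE code itself is C; the names SegmentCtx, VpePriv and
ensure_segment_ctx, and the size-based re-allocation test, are assumptions
made for this example and not the actual vpe_alloc_segment_ctx logic.

```rust
/// Hypothetical stand-in for one segment context.
#[derive(Clone, Default)]
pub struct SegmentCtx {
    pub stream_idx: u32,
}

/// Hypothetical per-instance state that lives between create and destroy.
#[derive(Default)]
pub struct VpePriv {
    segment_ctx: Option<Vec<SegmentCtx>>,
    /// Counts real allocations so the reuse behavior can be observed.
    pub alloc_count: u32,
}

impl VpePriv {
    /// Return a segment context array with at least `needed` entries,
    /// allocating only if none exists yet or the existing one is too small.
    pub fn ensure_segment_ctx(&mut self, needed: usize) -> &mut [SegmentCtx] {
        let must_alloc = match &self.segment_ctx {
            None => true,
            Some(ctx) => ctx.len() < needed,
        };
        if must_alloc {
            // Assigning drops the previous buffer, so nothing is leaked.
            self.segment_ctx = Some(vec![SegmentCtx::default(); needed]);
            self.alloc_count += 1;
        }
        self.segment_ctx.as_mut().unwrap().as_mut_slice()
    }
}
```

Calling ensure_segment_ctx repeatedly with the same or a smaller size leaves
alloc_count unchanged, which is the "skip the allocation" case.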
Signed-off-by: Phoebe Chen <phoebe.chen@amd.com>
Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Acked-by: ChuanYu Tseng <ChuanYu.Tseng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35012>
Update BKGR API
Change bg generation code so bg gen isn't hard-coded to stream 0, as
certain cases result in bg being generated at a different stream.
Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Reviewed-by: Evan Damphousse <Evan.Damphousse@amd.com>
Acked-by: ChuanYu Tseng <ChuanYu.Tseng@amd.com>
Signed-off-by: Leder, Brendan Steve <BrendanSteve.Leder@amd.co>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35012>
Instead of passing raw u32s around, this adds a new Phi wrapper struct
which is treated as opaque by most of the rest of the compiler. This is
similar to what we're already doing with Label and SSAValue. This also
gives us the opportunity to properly document NAK's phi model.
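A minimal sketch of the wrapper idea, for illustration only. The field name,
the idx() accessor and the PhiAllocator helper are assumptions here and may
not match what NAK actually defines.

```rust
/// Opaque handle naming one phi. The index is private, so code outside
/// this module cannot build a Phi from an arbitrary u32.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Phi {
    idx: u32,
}

impl Phi {
    /// Read-only access to the underlying index, e.g. for printing.
    pub fn idx(self) -> u32 {
        self.idx
    }
}

/// Hypothetical allocator that hands out fresh, unique Phi values.
#[derive(Default)]
pub struct PhiAllocator {
    count: u32,
}

impl PhiAllocator {
    pub fn new() -> Self {
        Self::default()
    }

    /// Allocate the next unused Phi.
    pub fn alloc(&mut self) -> Phi {
        let phi = Phi { idx: self.count };
        self.count += 1;
        phi
    }
}
```

With this shape, a function that takes a Phi cannot be handed an SSA index
or a label index by mistake, which is the same benefit Label and SSAValue
provide.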
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34994>
We shouldn't need to worry about HashDoS attacks because:
1. All of these keys are compiler-assigned IDs which makes it very
difficult to force collisions, and
2. Anyone who can hit the NAK compiler backend can already use CPU
power by spamming shader compiles or using O(n^2) behavior in
shader opt loops
As a result, we can afford to use a weaker hash function without
randomization.
This decreases total compile times by around 12% on shaderdb
(from 8113.74 user to 7115.47 user) on my machine.
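For illustration, a non-randomized hasher can be plugged into the standard
HashMap as sketched below. This message does not say which hash function was
chosen, so the multiply-rotate IdHasher, its constant, and the IdMap alias
are assumptions made for the example.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Arbitrary odd multiplier used for mixing (illustrative choice).
const MUL: u64 = 0x9E37_79B9_7F4A_7C15;

/// Small, fast, unseeded hasher. It offers no HashDoS resistance.
#[derive(Default)]
pub struct IdHasher {
    state: u64,
}

impl IdHasher {
    fn mix(&mut self, word: u64) {
        self.state = (self.state.rotate_left(5) ^ word).wrapping_mul(MUL);
    }
}

impl Hasher for IdHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.mix(u64::from(b));
        }
    }

    // Fast path for the common case of u32 IDs.
    fn write_u32(&mut self, v: u32) {
        self.mix(u64::from(v));
    }

    fn finish(&self) -> u64 {
        self.state
    }
}

/// Map keyed by compiler-assigned IDs, hashed without a random seed.
pub type IdMap<V> = HashMap<u32, V, BuildHasherDefault<IdHasher>>;

/// Hash a single ID; the result is the same on every call and every run.
pub fn hash_id(id: u32) -> u64 {
    let mut h = IdHasher::default();
    h.write_u32(id);
    h.finish()
}
```

Because there is no per-process seed, the hash of a given ID never changes,
which is exactly the property that makes this unsuitable for
attacker-controlled keys and acceptable for compiler-assigned ones.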
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34865>
wherever this doesn't result in type inference failing.
Using default() makes it easier to swap out the underlying type.
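A short sketch of why default() helps. HashMap::new() is only defined for
the standard RandomState hasher, so it stops compiling once the hasher type
parameter changes, while Default::default() keeps working. The UseMap alias
and count_uses function are hypothetical names for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// Swapping the hasher only requires editing this alias.
pub type UseMap = HashMap<u32, usize, BuildHasherDefault<DefaultHasher>>;

/// Count how many times each ID appears.
pub fn count_uses(ids: &[u32]) -> UseMap {
    // `UseMap::new()` would not compile here, because `new()` exists only
    // for `HashMap<K, V, RandomState>`. `default()` works for any hasher
    // whose builder implements `Default`.
    let mut uses: UseMap = Default::default();
    for &id in ids {
        *uses.entry(id).or_insert(0) += 1;
    }
    uses
}
```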
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34865>
Improves performance of Phasmophobia with the "Eye Adaptation" video
setting enabled on Arc B570 by about 9.5%.
fossil-db results on Battlemage:
Totals:
Instrs: 148797922 -> 148797865 (-0.00%)
Send messages: 7066341 -> 7066317 (-0.00%)
Cycle count: 21459978352 -> 21459975048 (-0.00%)
Totals from 8 (0.00% of 574410) affected shaders:
Instrs: 4633 -> 4576 (-1.23%)
Send messages: 479 -> 455 (-5.01%)
Cycle count: 611886 -> 608582 (-0.54%)
Observed to cut 15% of sends in a Phasmophobia shader, 8.3% in a Far Cry
New Dawn shader, 7% in a Borderlands 3 DX11 shader, and 3.4-3.7% of
sends in a few Witcher 3 and Dark Souls 3 shaders.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33504>
Some shaders contain back-to-back atomic accesses in SPIR-V with
AcquireRelease semantics. In NIR, we translate these to a release
memory barrier, the atomic, then an acquire memory barrier.
This results in a lot of unnecessary memory barriers in the middle
of the sequence of atomics:
0. Release barrier
1. Atomic
2. Acquire barrier
3. Release barrier
4. Atomic
5. Acquire barrier
6. Release barrier
7. Atomic
8. Acquire barrier
In the absence of loads/stores, and when the atomic destinations are
unused, these barriers in-between atomics shouldn't be required.
This optimization pass would drop them (lines 2-3 and 5-6 above) while
leaving the first and last barriers (0 and 8), so the sequence remains
synchronized against other access elsewhere in the program.
One common example where this occurs is a sequence of min and max
atomics to clamp a certain memory location's value within a range.
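The transformation can be modeled on a flat instruction list as in the
sketch below. This is a toy model and not the NIR pass itself: the Instr
enum and the function name are assumptions, and barrier scopes, memory modes
and control flow are ignored entirely.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Instr {
    ReleaseBarrier,
    AcquireBarrier,
    /// An atomic; `dest_used` says whether its result is read anywhere.
    Atomic { dest_used: bool },
    LoadOrStore,
}

fn is_unused_atomic(instr: Option<&Instr>) -> bool {
    matches!(instr, Some(Instr::Atomic { dest_used: false }))
}

/// Drop each acquire/release barrier pair that sits directly between two
/// atomics whose results are unused. Barriers before the first atomic and
/// after the last one are kept.
pub fn drop_barriers_between_atomics(instrs: &[Instr]) -> Vec<Instr> {
    let mut out = Vec::with_capacity(instrs.len());
    let mut i = 0;
    while i < instrs.len() {
        let is_inner_pair = instrs[i] == Instr::AcquireBarrier
            && instrs.get(i + 1) == Some(&Instr::ReleaseBarrier)
            && is_unused_atomic(out.last())
            && is_unused_atomic(instrs.get(i + 2));
        if is_inner_pair {
            // Skip both the acquire and the release that follows it.
            i += 2;
        } else {
            out.push(instrs[i]);
            i += 1;
        }
    }
    out
}
```

Applied to the nine-instruction sequence above, this leaves the release
barrier, the three atomics, and the final acquire barrier. Any load or store
between the barriers, or an atomic whose result is used, blocks the removal.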
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33504>
By convention, a struct with a `new()` method which has no parameters
should have a `Default` impl which calls `new()`.
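The convention looks like the sketch below (Clippy's new_without_default
lint checks for it). RegTracker and its field are hypothetical names used
only for this example.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct RegTracker {
    next_free: u32,
}

impl RegTracker {
    /// Parameterless constructor; the single source of truth for the
    /// initial state.
    pub fn new() -> Self {
        Self { next_free: 16 }
    }
}

impl Default for RegTracker {
    /// Forward to `new()` so both entry points always agree.
    fn default() -> Self {
        Self::new()
    }
}
```

Forwarding to new() rather than deriving Default matters whenever the
initial state is not all zeros, as with next_free here.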
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34372>
It was being applied even to platforms that don't require it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
The issues that prevented it from being enabled have been fixed, so we
can enable it now, but we also need to re-enable workaround 16013994831.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
The description of this workaround is not clear, but judging from the
Iris implementation we need to emit every
3DSTATE_PUSH_CONSTANT_ALLOC_XS whenever any
3DSTATE_PUSH_CONSTANT_ALLOC_XS is emitted.
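The all-or-nothing rule can be expressed as a small mask computation, as in
the sketch below. The bit assignments and the function name are assumptions
for illustration; the real driver tracks dirty state differently.

```rust
// Hypothetical per-stage bits, one per 3DSTATE_PUSH_CONSTANT_ALLOC_XS.
pub const STAGE_VS: u8 = 1 << 0;
pub const STAGE_HS: u8 = 1 << 1;
pub const STAGE_DS: u8 = 1 << 2;
pub const STAGE_GS: u8 = 1 << 3;
pub const STAGE_PS: u8 = 1 << 4;
pub const ALL_STAGES: u8 = STAGE_VS | STAGE_HS | STAGE_DS | STAGE_GS | STAGE_PS;

/// Given the stages whose push constant allocation changed, return the
/// stages whose allocation packet must be emitted: none if nothing
/// changed, otherwise all of them.
pub fn push_constant_allocs_to_emit(dirty: u8) -> u8 {
    if dirty & ALL_STAGES != 0 {
        ALL_STAGES
    } else {
        0
    }
}
```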
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
When a resource is un-referenced, the reference count is decremented,
and intentionally no lock is acquired. This can result in the following
race condition when a resource is created from a handle:
```
[Thread] Operation
[0] Create resource from handle for the first time, refcount set to 1
[0] resource is unreferenced, refcount is decremented to 0 (intentionally
no mutex is locked)
[0] before entering virgl_hw_res_destroy to lock
virgl_drm_winsys::bo_handles_mutex the thread yields
[1] Create resource from handle pulls the resource from
virgl_drm_winsys::bo_handles, refcount is incremented to 1
[1] resource is unreferenced, refcount is decremented to 0
[1] Enter virgl_hw_res_destroy,
[1] acquire the lock on virgl_drm_winsys::bo_handles_mutex
[1] check reference count to be 0, yes -> the resource is destroyed
[1] release the lock on virgl_drm_winsys::bo_handles_mutex
[0] Enter virgl_hw_res_destroy,
[0] acquire the lock on virgl_drm_winsys::bo_handles_mutex
[0] Here the res pointer already points to freed memory
[0] check reference count to be 0, yes -> the resource is destroyed (again!)
double free or corruption (!prev)
```
To work around this race condition, keep track of the number of times
the resource was pulled from virgl_drm_winsys::bo_handles to see whether
it has to be kept alive despite the reference count being zero.
This can be reproduced with the `spec@ext_image_dma_buf_import@ext_image_dma_buf_import-refcount-multithread`
piglit test.
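The bookkeeping can be modeled single-threaded by replaying the interleaving
above step by step, as in the Rust sketch below. This is not the virgl code:
the Winsys and Res types, the `revived` counter, and the choice to count only
pulls that find the reference count at zero are assumptions made so the
model stays small and exact. Each method call stands for a step that the
real code performs under bo_handles_mutex (or, for unreference, without it).

```rust
use std::collections::HashMap;

struct Res {
    refcnt: u32,
    /// Pulls from the table that found refcnt == 0. Each one implies that
    /// an earlier destroy call is still pending.
    revived: u32,
}

#[derive(Default)]
pub struct Winsys {
    bo_handles: HashMap<u32, Res>,
    /// Number of resources actually freed.
    pub destroyed: u32,
}

impl Winsys {
    /// Create the resource for `handle`, or pull the existing one.
    pub fn resource_from_handle(&mut self, handle: u32) {
        if let Some(res) = self.bo_handles.get_mut(&handle) {
            if res.refcnt == 0 {
                res.revived += 1;
            }
            res.refcnt += 1;
        } else {
            self.bo_handles.insert(handle, Res { refcnt: 1, revived: 0 });
        }
    }

    /// Drop one reference. Returns true if the count reached zero, in
    /// which case the caller is expected to call `destroy` later.
    pub fn unreference(&mut self, handle: u32) -> bool {
        let res = self.bo_handles.get_mut(&handle).expect("unknown handle");
        res.refcnt -= 1;
        res.refcnt == 0
    }

    /// Free the resource unless it was revived in the meantime.
    pub fn destroy(&mut self, handle: u32) {
        let keep_alive = match self.bo_handles.get_mut(&handle) {
            None => return,
            Some(res) => {
                if res.revived > 0 {
                    // A later destroy call will do the real free.
                    res.revived -= 1;
                    true
                } else {
                    res.refcnt > 0
                }
            }
        };
        if !keep_alive {
            self.bo_handles.remove(&handle);
            self.destroyed += 1;
        }
    }
}
```

Replaying the trace (create, unreference, second create, unreference,
destroy from thread 1, destroy from thread 0) frees the resource exactly
once in this model, on the last destroy call.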
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34809>
Move pan_raw_format_mask_midgard to pan_format.c instead, so that
pan_shader.c does not depend on any GENX.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34895>
Also move bifrost_blend_type_from_nir to pan_blend.c, rename it, and
make it non-GENX.
This part is related to blend so it makes more sense to have it there
and this will allow us to make pan_shader.c not GENX.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34895>
This was added in v6+ and never changed.
This will allow us to remove GENX code logic that is identical.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34895>