[00000001x_00000000x] nop ; dontcare bits in nop: 0000000100000000
[00000002x_00000000x] nop ; dontcare bits in nop: 0000000200000000
Doesn't seem to make them different from ordinary nops.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21498>
Only apply the clamp in multi patch mode (where the input vertices
vary between [1, 32]).
The clamp NIR pass operates on lowered intrinsics so we need to call
it after the inputs have been lowered.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e25e17dd0c ("intel/fs: clamp per vertex input accesses to patchControlPoints")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8912
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22701>
if the size of the constant buffer + stride overflows UINT32_MAX,
DIV_ROUND_UP will return 0, which is, in some sense, extremely robust,
but for general functionality it's not actually very robust
cc: mesa-stable
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22720>
Commit f9a074dd55 ("dri2/android: Bypass throttling") dropped
unnecessary throtting in the SwapBuffers() path for android. But
unfortunately MSAA resolve got tangled up in the throttle reason
flag. So add a new flag that indicates "no throttingling, but yes
please do MSAA resolve".
Fixes: f9a074dd55 ("dri2/android: Bypass throttling")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22719>
by having va av1 caps enabled, av1 vaapi encoding
is enabled.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22585>
Merge PIPE_H265_ENC_FEATURE into PIPE_ENC_FEATURE enum
because those are common flags, and it will be
used in AV1 encoder as well.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22585>
reason:
extra bytes are not needed and not necessary
in h264/h265 bitstreams, because they are in
between NALs, the only problem is they consumed
extra bits. And for av1 streams, that could be
explained to something else, especially in
multi-layer cases, that can cause syntax errors.
ptr[6] represents the bitstream size,
ptr[8] represents the extra zero bytes.
The total number of bytes of the output
should be ptr[6] - ptr[8]
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22585>
reason:
so far, the output_format_param function can be shared
by different encoders, and just for h264 encoder, there
is no 10bit encoding supported. This is to reduce
the repeated code before having av1 encoder.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22585>
swizzle mode in ref frames could potentially
improve encoding performance, the main reason
is just because linear mapping is the worst mode
for reference frames comparing to block level
mapping.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22585>
2 pass search map is a feature supported by VCN,
the main purpose is to enlarge motion search
range that in pre-encoding path the center global
motion vectors could be obtained and used in the
final path as a block center base. When 2pass is
used, this feature will be automatically enabled.
2 pass feature can be enabled by ffmpeg command
line "-compression_level 1"
and also correct some typos and move quality
package from vcn3.0 to vcn2.0 since it is availabe
in vcn2.0 and vcn3.0 can use it directly. Correct
vcn3.0 hevc spec misc IB package.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22585>
Not sure if this is possible, but we should avoid it anyway.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22714>
If Driver.GetGraphicsResetStatus exists for one context, other contexts
will be able to use it; so there's no need to inherit reset status from
the other contexts.
This also prevented implementing the spec correctly: we're supposed to
report GL_NO_ERROR when the reset is completed (after reporting GL_*_RESET
at least once):
If a reset status other than NO_ERROR is returned and subsequent
calls return NO_ERROR, the context reset was encountered and
completed. If a reset status is repeatedly returned, the context may
be in the process of resetting.
With the existing code, the contexts will report INNOCENT_CONTEXT_RESET
forever.
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22290>
On older kernel the completion of the reset isn't signalled to userspace,
yet we need it to implement the EXT_robustness extension correctly.
In this situation, try to create a new context and submit a no-op job. If
the reset isn't done the kernel will reject the submission (-ECANCELED);
otherwise the submission will go through and we'll know that the reset is
done.
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22290>
This way apps know they can recreate their contexts when
the status go back to NO_ERROR.
This depends on new UAPI in the kernel; for older kernel, radeonsi
will stop reporting a reset after 3 seconds. Apps will be able to
create new contexts but they'll have to handle not being able to
submit tasks.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7460
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22290>
RGP seems to use that to calibrate timestamps. This also introduces
a new helper to reset thread data between captures.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22594>
In order to use load_global_const_block_intel we need to ensure the
64bit address in src[0] is uniform. This is not the case in the
vkd3d-proton test_bindless_cbv tests for example.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22624>
We have been stopping as soon as we find a conflict but that doesn't
mean we can't merge it in an earlier slot, so keep going. Going by
shader-db, this sometimes allows us to merge the final thrsw a bit
earlier and avoid emitting NOP instructions at the program end to
make up for its delay slots. I have not observed cases where this
helps with regular thrsw though, but it doesn't hurt to try with
those too.
total instructions in shared programs: 11526876 -> 11526354 (<.01%)
instructions in affected programs: 10760 -> 10238 (-4.85%)
helped: 236
HURT: 0
Instructions are helped.
total max-temps in shared programs: 2231705 -> 2231677 (<.01%)
max-temps in affected programs: 276 -> 248 (-10.14%)
helped: 27
HURT: 0
Max-temps are helped.
total inst-and-stalls in shared programs: 11545177 -> 11544655 (<.01%)
inst-and-stalls in affected programs: 10777 -> 10255 (-4.84%)
helped: 236
HURT: 0
Inst-and-stalls are helped.
total nops in shared programs: 321624 -> 321152 (-0.15%)
nops in affected programs: 751 -> 279 (-62.85%)
helped: 236
HURT: 0
Nops are helped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22679>
This change updates the sqtt layer to add mesh shader specific RGP
instrumentation logic. This should allow RGP to correctly identify GPU
work derived from vkCmdDrawMeshTasksEXT, vkCmdDrawMeshTasksIndirectEXT,
and vkCmdDrawMeshTasksIndirectCountEXT API calls.
This change also updates the mesa-to-RGP shader stage translation logic
to handle the mesh & task stages.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21917>