This is slightly more accurate in the IR, and means instruction
selection picks the current 16-bit floating-point instructions when all
non-immediate operands are 16-bit.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>
A bunch of the emitted combines were unnecessary, or unnecessarily
large. Fix the accounting now that combines are variable size.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>
MTL has different CS prefetch sizes for each CS type.
So replace the cs_prefetch_size field in the intel_device_info struct
with a function that takes the i915 engine class as its argument.
Fixes:
- func.cmd-buffer.small-secondaries.q0
- dEQP-VK.multiview.secondary_cmd_buffer.*
- Several other VK CTS tests that use secondary_cmd_buffer
v2:
- renamed to intel_device_info_get_engine_prefetch() (Jordan)
v3:
- renamed to intel_device_info_calc_engine_prefetch()
- store each engine class prefetch in intel_device_info
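
A minimal sketch of the per-engine lookup described above. The enum, the
struct, the function names and the byte sizes are all hypothetical
placeholders; the real code uses the i915 engine class and values from BSpec.

```c
#include <stdint.h>

/* Hypothetical engine classes standing in for the i915 engine class enum. */
enum example_engine_class {
   EXAMPLE_ENGINE_RENDER,
   EXAMPLE_ENGINE_COPY,
   EXAMPLE_ENGINE_COMPUTE,
   EXAMPLE_ENGINE_COUNT,
};

struct example_device_info {
   /* One CS prefetch size (in bytes) per engine class, filled in at init. */
   uint32_t engine_prefetch[EXAMPLE_ENGINE_COUNT];
};

/* Placeholder sizes for illustration only, not taken from BSpec. */
static void
example_init_engine_prefetch(struct example_device_info *info)
{
   info->engine_prefetch[EXAMPLE_ENGINE_RENDER] = 1024;
   info->engine_prefetch[EXAMPLE_ENGINE_COPY] = 512;
   info->engine_prefetch[EXAMPLE_ENGINE_COMPUTE] = 1024;
}

/* Callers ask for the prefetch size of the engine they are building for,
 * instead of reading a single device-wide field. */
static uint32_t
example_engine_prefetch(const struct example_device_info *info,
                        enum example_engine_class engine)
{
   return info->engine_prefetch[engine];
}
```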
BSpec: 45718
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18597>
EXT_EGL_image_storage requires OpenGL 4.2, OpenGL ES 3.0, or
ARB_texture_storage, and EXT_direct_state_access or equivalent for
`EGLImageTargetTextureStorageEXT`.
`target` can be one of GL_TEXTURE_2D, GL_TEXTURE_2D_ARRAY, GL_TEXTURE_3D,
GL_TEXTURE_CUBE_MAP, GL_TEXTURE_CUBE_MAP_ARRAY. On non-ES GL it can also be
GL_TEXTURE_1D or GL_TEXTURE_1D_ARRAY. If OES_EGL_image_external is supported,
it can also be GL_TEXTURE_EXTERNAL_OES.
Signed-off-by: Simon Zeni <simon@bl4ckb0ne.ca>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18673>
Indicate that the env var D3D12_VIDEO_ENC_METADATA_BUFFERS_COUNT should be
increased if there are not enough buffers in the get_feedback function.
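
For illustration, one way such a count could be read from the environment.
This is a sketch using plain getenv/strtoul; the function names and the
default of 4 are assumptions, not the driver's actual helper or default.

```c
#include <stdlib.h>

/* Hypothetical default; the driver's real default is not stated here. */
#define EXAMPLE_DEFAULT_METADATA_BUFFERS 4u

/* Parse a buffer count, falling back to the default on missing,
 * malformed or zero values. */
static unsigned
example_parse_buffers_count(const char *value)
{
   if (value == NULL || value[0] < '0' || value[0] > '9')
      return EXAMPLE_DEFAULT_METADATA_BUFFERS;

   char *end = NULL;
   unsigned long parsed = strtoul(value, &end, 10);
   if (*end != '\0' || parsed == 0)
      return EXAMPLE_DEFAULT_METADATA_BUFFERS;

   return (unsigned)parsed;
}

static unsigned
example_metadata_buffers_count(void)
{
   return example_parse_buffers_count(
      getenv("D3D12_VIDEO_ENC_METADATA_BUFFERS_COUNT"));
}
```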
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18715>
Separate the d3d12_video_encode flush on end_frame (i.e. vaEndPicture) from
the fence sync in get_feedback (i.e. vaSyncSurface/Buffer).
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18715>
Keep track of previously executed encodes and their metadata, since
vaSyncSurface, which queries this, can be called with a latency of several frames.
Fixes gstreamer encoding tearing issues
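
A rough sketch of the idea: remember each submitted encode in a small ring
keyed by its fence value, so a late feedback query can still find the right
metadata. All names and the ring size here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_MAX_IN_FLIGHT 4u /* hypothetical ring size */

struct example_encode_record {
   uint64_t fence_value;   /* fence value signalled when the encode is done */
   uint32_t metadata_slot; /* which metadata buffer the encode wrote to */
   bool valid;
};

struct example_encode_tracker {
   struct example_encode_record ring[EXAMPLE_MAX_IN_FLIGHT];
};

/* Called at submit time (end of frame). Reuses the oldest slot. */
static void
example_track_encode(struct example_encode_tracker *t,
                     uint64_t fence_value, uint32_t metadata_slot)
{
   struct example_encode_record *rec =
      &t->ring[fence_value % EXAMPLE_MAX_IN_FLIGHT];
   rec->fence_value = fence_value;
   rec->metadata_slot = metadata_slot;
   rec->valid = true;
}

/* Called at feedback time, possibly several frames later. Returns false if
 * the record was already recycled, i.e. the ring is too small. */
static bool
example_find_encode(const struct example_encode_tracker *t,
                    uint64_t fence_value, uint32_t *metadata_slot)
{
   const struct example_encode_record *rec =
      &t->ring[fence_value % EXAMPLE_MAX_IN_FLIGHT];
   if (!rec->valid || rec->fence_value != fence_value)
      return false;
   *metadata_slot = rec->metadata_slot;
   return true;
}
```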
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18715>
When running recent Mesa on i855 (gen 2) without amber drivers:
error: Kernel is too old for Iris. Consider upgrading to kernel v4.16.
libGL error: glx: failed to create dri3 screen
libGL error: failed to load driver: iris
error: Kernel is too old for Iris. Consider upgrading to kernel v4.16.
libGL error: glx: failed to create dri2 screen
libGL error: failed to load driver: iris
Move the i915 feature check to after the hardware generation check,
which results in:
MESA: warning: Driver does not support the 0x3582 PCI ID.
libGL error: glx: failed to create dri3 screen
libGL error: failed to load driver: iris
MESA: warning: Driver does not support the 0x3582 PCI ID.
libGL error: glx: failed to create dri2 screen
libGL error: failed to load driver: iris
Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18563>
The nir code for AOS (aka linear) mode had a number of issues.
In some cases, the RGB->BGR swizzling wasn't happening, leading to
incorrect colors. In other cases, bad swizzling caused the first
pixel's color to be written to four adjacent pixels.
Writemasks must also be swizzled. For example, if an instruction's
writemask indicates the X component but the AOS component order is
BGRA we need to change the writemask to Z.
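
As an illustration of the writemask remapping, here is a small standalone
sketch. The function and table names are hypothetical and the writemask is
assumed to use bit 0 for X through bit 3 for W.

```c
#include <stdint.h>

/* Move each enabled writemask bit to the position given by the swizzle.
 * swizzle[chan] is the destination slot of logical component chan. */
static uint8_t
example_swizzle_writemask(uint8_t writemask, const uint8_t swizzle[4])
{
   uint8_t out = 0;
   for (unsigned chan = 0; chan < 4; chan++) {
      if (writemask & (1u << chan))
         out |= (uint8_t)(1u << swizzle[chan]);
   }
   return out;
}

/* RGBA -> BGRA: X and Z trade places, Y and W stay where they are. */
static const uint8_t example_bgra_swizzle[4] = { 2, 1, 0, 3 };
```

With this table, a writemask of X (0x1) becomes Z (0x4), matching the case
described above.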
Another issue was with constant buffer values not getting consistently
converted to BGRA order. Fixing this involves removing the
lp_nir_aos_conv_const() function and immediately converting immediate
values from 4 x f32 in [0,1] to 16 x u8 when we translate nir's
load_const so that we know the value is in the right linear/AOS layout
right away.
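
A scalar sketch of that conversion, assuming the 16 x u8 vector holds four
pixels of four channels each and the constant is replicated across the four
pixels. The names are hypothetical; the real code emits LLVM IR rather than
running on the CPU like this.

```c
#include <stdint.h>

/* Convert a float in [0,1] to an 8-bit unorm value with rounding.
 * Out-of-range inputs (and NaN) are clamped defensively. */
static uint8_t
example_unorm8(float v)
{
   if (!(v > 0.0f))
      return 0;
   if (v >= 1.0f)
      return 255;
   return (uint8_t)(v * 255.0f + 0.5f);
}

/* Expand one RGBA float constant into 16 bytes: four pixels, each stored
 * in the channel order given by swizzle (e.g. {2, 1, 0, 3} for BGRA). */
static void
example_const_to_aos(const float rgba[4], const uint8_t swizzle[4],
                     uint8_t out[16])
{
   for (unsigned pixel = 0; pixel < 4; pixel++) {
      for (unsigned chan = 0; chan < 4; chan++)
         out[pixel * 4 + swizzle[chan]] = example_unorm8(rgba[chan]);
   }
}
```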
Finally, the llvmpipe_nir_fn_is_linear_compat() function was not
checking that nir_instr_type_load_const values are in [0,1] for AOS
execution. The info.unclamped_immediates field is not needed for
the NIR path (but still used for the old TGSI path).
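
The range check amounts to something like the following sketch, where the
function name and the flat float array are assumptions made for the example:

```c
#include <stdbool.h>

/* Return true only if every constant component lies in [0,1], which is
 * what the 8-bit unorm AOS path can represent. */
static bool
example_const_is_aos_compatible(const float *values, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      /* Written as a negated range test so that NaN is rejected too. */
      if (!(values[i] >= 0.0f && values[i] <= 1.0f))
         return false;
   }
   return true;
}
```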
This fixes quite a few tests in our VMware suite.
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18213>
Move the lp_build_nir_aos_context struct declaration and
lp_nir_aos_context() cast wrapper from lp_bld_nir.h to
lp_bld_nir_aos.c and use the cast wrapper in more places.
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18213>
I previously bumped this to 32, but we need at least 64 to pass
a few other VMware tests (e.g. dx11-slots-uav-write-vs-gs-all-64).
Also update/generalize a comment.
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18213>
if the driver attempting to load is not zink and not software, then
attempt a zink fallback on failure
this conservatively handles the case of "only zink is built", though it
is going to be noticeably slower at startup than loading zink directly
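
The fallback logic can be sketched roughly as follows. The loader callback,
the is_software flag and the stub are hypothetical stand-ins for the real
loader code; the extra startup cost comes from the first, failed load attempt.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical loader callback: returns true if the driver loaded. */
typedef bool (*example_load_fn)(const char *driver_name);

/* Try the requested driver first; if that fails and the driver is neither
 * zink nor a software driver, try zink. Returns the name of the driver
 * that loaded, or NULL if nothing did. */
static const char *
example_load_with_fallback(const char *name, bool is_software,
                           example_load_fn load)
{
   if (load(name))
      return name;

   if (!is_software && strcmp(name, "zink") != 0 && load("zink"))
      return "zink";

   return NULL;
}

/* Stub modelling the "only zink is built" case. */
static bool
example_only_zink_built(const char *driver_name)
{
   return strcmp(driver_name, "zink") == 0;
}
```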
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16168>
Traces of GLES games that ANGLE has taken frequently have no-op stencil
writes, which ANGLE and Zink both pass straight through. Given that we
support dynamic stencil state updates via tu_CmdSetStencil*(), draw time
really is the time for deciding this state unfortunately.
Reuse the fancier stencil write enables check from "can we do early z?" in
"can we do LRZ?". This gets one set of draws in among_us to have LRZ, but
I don't see a detectable performance difference.
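
For reference, a simplified sketch of what a "does this draw actually write
stencil?" check can look like. The types and names are hypothetical, and the
real check may also take the compare op and other state into account.

```c
#include <stdbool.h>
#include <stdint.h>

enum example_stencil_op {
   EXAMPLE_STENCIL_OP_KEEP,
   EXAMPLE_STENCIL_OP_REPLACE,
   EXAMPLE_STENCIL_OP_INCREMENT,
};

struct example_stencil_face {
   enum example_stencil_op fail_op;
   enum example_stencil_op pass_op;
   enum example_stencil_op depth_fail_op;
   uint32_t write_mask;
};

/* A face can only modify stencil if some bits are writable and at least
 * one of its ops does something other than KEEP. */
static bool
example_face_writes_stencil(const struct example_stencil_face *face)
{
   if (face->write_mask == 0)
      return false;

   return face->fail_op != EXAMPLE_STENCIL_OP_KEEP ||
          face->pass_op != EXAMPLE_STENCIL_OP_KEEP ||
          face->depth_fail_op != EXAMPLE_STENCIL_OP_KEEP;
}

/* Evaluated at draw time, since masks and ops may be dynamic state. */
static bool
example_stencil_write_enabled(bool stencil_test_enable,
                              const struct example_stencil_face *front,
                              const struct example_stencil_face *back)
{
   return stencil_test_enable &&
          (example_face_writes_stencil(front) ||
           example_face_writes_stencil(back));
}
```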
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18691>
Change the terminology around the post-RA optimizer, primarily this
changes the use of "clobbered" to "overwritten" to avoid confusion,
and it removes some redundant states.
Proposed for backporting to stable, to make sure it is easy to
backport further fixes (if any) on top of this.
Fossil DB stats unaffected on Navi 21.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
Change reset_block() so it only considers the logical
predecessors for VGPRs. Relevant for some optimizations
across loops.
This commit fixes an assertion failure which was triggered
by Zink in a piglit test.
Fossil DB stats unaffected on Navi 21.
Fixes: 2e56e23420
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
This assumption is no longer true since the post-RA optimizer
can work across blocks. It is now possible that some control
flow paths overwrite some but not all registers of an operand.
This commit may prevent invalid optimizations and/or assertion
failures (on debug builds).
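
To illustrate why each register has to be checked on its own, here is a
small sketch in C (ACO itself is C++). The per-register "last write index"
table and the function name are hypothetical, not ACO's data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* last_write[r] holds the index of the last instruction that wrote
 * register r. An operand spanning several registers counts as overwritten
 * if ANY of them was written after `since`, even when the others were not. */
static bool
example_operand_overwritten_since(const uint32_t *last_write,
                                  unsigned first_reg, unsigned num_regs,
                                  uint32_t since)
{
   for (unsigned i = 0; i < num_regs; i++) {
      if (last_write[first_reg + i] > since)
         return true;
   }
   return false;
}
```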
Fossil DB stats unaffected on Navi 21.
Fixes: 0e4747d3fb
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>