The v_min_f64 was broken, because we need to allow 1.0 as a fract result.
Otherwise floor(-DBL_MIN) will not return -1.0.
NaN also doesn't need any special handling, because fadd(NaN, b) is NaN.
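
The rounding argument can be checked on the CPU with a minimal sketch. It assumes floor is lowered as x - fract(x) and that the removed v_min_f64 clamped fract to the largest double below 1.0. The helper names (ref_floor, model_fract, lowered_floor) are hypothetical and only model the behaviour; they are not the compiler's code.

```c
#include <float.h>
#include <stdbool.h>

/* Reference floor for finite |x| < 2^63; stands in for the exact result. */
static double ref_floor(double x)
{
   double t = (double)(long long)x; /* the cast truncates toward zero */
   return t > x ? t - 1.0 : t;
}

/* Model of a fract instruction: x - floor(x), rounded to double.
 * For a tiny negative x the rounded result is exactly 1.0. */
static double model_fract(double x)
{
   return x - ref_floor(x);
}

/* floor lowered as x - fract(x). clamp_fract models the old min that
 * forced the fract result to stay below 1.0. */
static double lowered_floor(double x, bool clamp_fract)
{
   const double below_one = 1.0 - DBL_EPSILON / 2; /* largest double < 1.0 */
   double f = model_fract(x);

   if (clamp_fract && f > below_one)
      f = below_one;

   return x - f;
}
```

With the clamp, lowered_floor(-DBL_MIN, true) ends up just above -1.0. Without it the subtraction rounds to exactly -1.0.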
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41852>
The `libanv_drirc` custom target passes `00-mesa-defaults.conf` to the
generator script via the `--validate` flag, but it was hardcoded as a
path rather than being declared as an input dependency.
When translating the build to a hermetic build system, this missing
dependency causes the build to fail because the undeclared file is not
provisioned in the sandboxed build environment.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41886>
With VK_DYNAMIC_STATE_COLOR_BLEND_ADVANCED_EXT the advanced blend op and its
parameters are only known at draw time. The static color_blend_op still
carries the pipeline's blend equation op, so lowering advanced blend in the
fragment shader already at pipeline creation used a stale op and, together with
the draw-time lowering, blended the result twice.
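
The intended split can be sketched as a single predicate, assuming that lowering must happen exactly once per draw. The enum and function names are hypothetical and do not come from lavapipe.

```c
#include <stdbool.h>

/* Hypothetical names for the two places where lowering could happen. */
enum lower_stage {
   LOWER_AT_PIPELINE_CREATE,
   LOWER_AT_DRAW,
};

/* With dynamic advanced blend state only draw time knows the real op,
 * so pipeline creation must not lower. Otherwise the static op is
 * final and pipeline creation is the only place that lowers. */
static bool should_lower_advanced_blend(enum lower_stage stage,
                                        bool advanced_blend_is_dynamic)
{
   if (advanced_blend_is_dynamic)
      return stage == LOWER_AT_DRAW;

   return stage == LOWER_AT_PIPELINE_CREATE;
}
```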
Fixes: 0444b5877f ("lavapipe: Implement VK_EXT_blend_operation_advanced")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41885>
When color blending is disabled for an attachment the fragment shader output
is written unmodified, so an advanced blend op on that attachment must have no
effect. lavapipe decided whether to lower advanced blend from the blend op
alone and ignored the blend enable, so the output got blended instead of being
left untouched.
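
A minimal sketch of the corrected decision, assuming per-attachment state reduced to two flags. The struct and function names are hypothetical, not the driver's.

```c
#include <stdbool.h>

/* Hypothetical per-attachment blend state, reduced to what matters here. */
struct attachment_blend {
   bool blend_enable;   /* color blending enabled for this attachment */
   bool op_is_advanced; /* the blend op is one of the advanced blend ops */
};

/* Advanced blend is only lowered when blending is enabled at all.
 * Looking at the op alone would blend an output that must be written
 * unmodified. */
static bool needs_advanced_blend_lowering(const struct attachment_blend *att)
{
   return att->blend_enable && att->op_is_advanced;
}
```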
Fixes: 0444b5877f ("lavapipe: Implement VK_EXT_blend_operation_advanced")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41885>
While Apple GPUs do seem to handle robustness for cases where index count
exceeds the actual `MTLBuffer` size, Metal validation complains about it.
In the interest of valid API usage we can handle this case ourselves for
now, and later Metal 4 encoders will do away with the need to handle index
robustness altogether.
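
One way to handle this is to clamp the index count to what the bound buffer can actually hold. The sketch below assumes sizes and offsets in bytes; clamp_index_count and its parameters are hypothetical and may not match how the driver does it.

```c
#include <stdint.h>

/* Returns how many indices can be read without running past the end of
 * the index buffer. All names here are illustrative. */
static uint32_t clamp_index_count(uint64_t buffer_size, uint64_t index_offset,
                                  uint32_t index_size, uint32_t index_count)
{
   if (index_size == 0 || index_offset >= buffer_size)
      return 0;

   /* Whole indices that fit between the offset and the end of the buffer. */
   uint64_t available = (buffer_size - index_offset) / index_size;

   return index_count < available ? index_count : (uint32_t)available;
}
```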
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41873>
Metal validation fails if an image-to-image copy is performed with
dimensions that are not a multiple of the block size, which we hit
for 1D compressed images. According to documentation, it should be
fine to clamp the extent to the edge of the image, but it triggers
anyway, so route these to the buffer path to pass validation.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41873>
Use `util_format_get_nblocks(x/y/z)` for safer extent division, and
make sure 1D/2D extensions have extent y/z set to 1 as appropriate.
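
For illustration, the round-up division and the 1D/2D handling can be sketched as below. nblocks is a stand-in for the util_format_get_nblocks helpers, not their actual implementation; the struct, the image_dims parameter and the function names are assumptions.

```c
#include <stdint.h>

struct extent3d {
   uint32_t width, height, depth;
};

/* Round-up division: a partially covered block still counts as a block. */
static uint32_t nblocks(uint32_t extent, uint32_t block_dim)
{
   return (extent + block_dim - 1) / block_dim;
}

/* Converts a texel extent to a block extent. Dimensions the image does
 * not have are forced to 1 instead of being divided. */
static struct extent3d extent_in_blocks(struct extent3d texels,
                                        uint32_t block_w, uint32_t block_h,
                                        uint32_t block_d, unsigned image_dims)
{
   struct extent3d blocks;

   blocks.width = nblocks(texels.width, block_w);
   blocks.height = image_dims >= 2 ? nblocks(texels.height, block_h) : 1;
   blocks.depth = image_dims >= 3 ? nblocks(texels.depth, block_d) : 1;

   return blocks;
}
```

A plain texels.width / block_w would give 1 block for a 5-texel-wide image with 4-texel blocks, while the round-up form gives 2.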
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41873>
This was originally disabled by a22ad99bdd ("pvr: set device
features/props/extensions to Vulkan 1.0 minimums (unless implemented)") in order
to concentrate efforts on passing "base" Vulkan conformance before layering on
additional functionality. The driver is now Vulkan 1.2 conformant.
As the functionality is already implemented, simply enable the extension.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41859>
The Vulkan spec allows this to be used in a mesh shader as long as
it's not accessed, so it can be eliminated.
This fixes dEQP-VK.mesh_shader.ext.misc.payload_not_accessed.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41828>
This adds an F16 struct which provides a 16-bit float type using Mesa's
existing half-precision support internally. Right now, it only contains
the basics but it could be expanded if needed.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41375>
genX(batch_emit_vertex_input) reserves 3DSTATE_VERTEX_ELEMENTS and then
writes into that reserved memory. Any later anv_batch_emit() may
allocate a new batch and finalize the previous one, running the valgrind
defined-memory check over it.
Fill the draw-parameter and dynamic VERTEX_ELEMENT_STATE entries before
emitting 3DSTATE_VF_INSTANCING.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41790>
Linux eventfds contain a 64-bit value which can be increased by arbitrary
numbers, and waiting returns a numeric value that consumers might need
to actually read.
Also, reading/waiting mutates kernel state, so make it take &mut self,
like reading from std::fs::File does.
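
The kernel semantics can be seen with a small C sketch against eventfd(2) on Linux. The helper name is hypothetical, and this is not the Rust wrapper changed here.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Adds a and b to a fresh eventfd counter and reads it back once.
 * Returns the value delivered by the read, or 0 on failure. */
static uint64_t eventfd_accumulate_and_read(uint64_t a, uint64_t b)
{
   uint64_t value = 0;
   int fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);

   if (fd < 0)
      return 0;

   /* Each write adds its 64-bit operand to the kernel counter. */
   if (write(fd, &a, sizeof(a)) == (ssize_t)sizeof(a) &&
       write(fd, &b, sizeof(b)) == (ssize_t)sizeof(b)) {
      /* The read returns the accumulated value and resets the counter
       * to zero, which is the state mutation mentioned above. */
      if (read(fd, &value, sizeof(value)) != (ssize_t)sizeof(value))
         value = 0;
   }

   close(fd);
   return value;
}
```

Writing 3 and then 4 makes a single read return 7, so a wrapper that only reports "signalled" drops information the consumer may need.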
Signed-off-by: Val Packett <val@invisiblethingslab.com>
Signed-off-by: Gurchetan Singh <gurchetan.singh.foss@gmail.com>
Reviewed-by: David Gilhooley <djgilhooley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41754>
Apparently GRAS_CL_INTERP_CNTL has two fields, FACENESS and CENTERRHW,
which allow us to not enable the IJ_LINEAR_PIXEL input, which can
improve performance in trivial cases by ~50%.
Mirrors Turnip change.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41848>
Apparently GRAS_CL_INTERP_CNTL has two fields, FACENESS and CENTERRHW,
which allow us to not enable the IJ_LINEAR_PIXEL input, which can
improve performance in trivial cases by ~50%.
Found via gpu-ratemeter bench: vk.pix.noaa.1flat.face
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41848>
Coverity notices that there is an error case where
`nir_get_io_data_src_number` could return `-1`, and that is then used to
index into an array. Given that that is an exceptional case, we can just
assert here.
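
The pattern is sketched below with hypothetical stand-ins: find_src_number plays the role of a lookup that can return -1, and read_src is an illustrative caller. Neither is the NIR code.

```c
#include <assert.h>

#define NUM_SRCS 4

/* Hypothetical lookup that returns -1 when there is no matching source. */
static int find_src_number(int kind)
{
   return (kind >= 0 && kind < NUM_SRCS) ? kind : -1;
}

static int read_src(const int srcs[NUM_SRCS], int kind)
{
   int idx = find_src_number(kind);

   /* -1 is an exceptional case for this caller, so state that before
    * the value is used as an array index. */
   assert(idx >= 0);

   return srcs[idx];
}
```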
CID: 1681480
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40146>
On Xe2 and Xe3, the flushing is necessary due to aliasing of TGM data
in L1 memory (HSD 14020414266). On newer platforms, it is necessary
for proper post-format data conversion handling (HSD 22020984324).
See the Instruction_Fence page (63969) for documentation on the fact
that the threadgroup scope ignores flushes.
Thanks to Francisco Jerez and Kenneth Graunke for their help with this
patch.
v2: restrict the flushing to TGM (Lionel).
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40732>
This register seems to be fairly critical on A7XX for vertex processing
performance, and was set to a suboptimal value for the A730/A735/A740.
Update it to a value that maximizes performance and aligns with the
proprietary driver.
Fixes #15411
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41451>
When reloading live-ins, child intervals need to be extracted to ensure
we can add live-in phi nodes for them.
Fixes asserts with spillall for a bunch of ray_query and
ray_tracing_pipeline CTS tests:
src/freedreno/ir3/ir3_spill.c: add_live_in_phi: Assertion `entry' failed.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41756>
tu6_build_depth_plane_z_mode has a dependency on
occlusion_query_may_be_running.
Fixes: 8f5d433840 ("tu: Occlusion query counting should happen after FS that kills")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41856>
Some values were wrong, so replace the whole table with corrected values.
To make it easier to read and compare, all shader stages are now listed in
XEHP_URB_MIN_MAX_ENTRIES.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41789>
Right now this value is not used, but it will be in the next patch.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41789>