From the Haswell PRM Vol. 7, "IEEE Floating Point Mode":
"Single precision (F, Float) denorms are flushed to sign-preserved
zero on input and output of any floating-point mathematical
operation."
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20232>
The rounding mode only needs to be set once, because 16-bit floats or
preserving denorms aren't supported for the platforms where vec4 is
used.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20232>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21210>
The size in src[2] is in byte and needs to cover any possible data
accessed in src[0] by the indirection. That way the register
allocation is aware of what cannot be spilled for the instruction to
execute on valid data.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 70ace2bbcd ("intel/compiler: Implement Task Output and Mesh Input")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21188>
Functions that are in hot paths will have a different treatment to
support i915 and Xe KMD.
Each KMD will have an anv_kmd_backend that will have the hot path
functions set, this way we can avoid branch prediction misses.
Other functions will gradually be moved to anv_kmd_backend.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20948>
As we continue to refactor the code base to support Xe KMD here I'm
dropping anv_gem_create() and unifying all graphics memory allocation
calls to anv_gem_create_regions().
anv_gem_create_regions() will call DRM_IOCTL_I915_GEM_CREATE_EXT
for integrated platforms too only leaving DRM_IOCTL_I915_GEM_CREATE
calls to kernel versions that do not support
DRM_IOCTL_I915_GEM_CREATE_EXT.
This can be detected by devinfo->mem.use_class_instance as
DRM_I915_QUERY_MEMORY_REGIONS uAPI landed in the same kernel version
as DRM_IOCTL_I915_GEM_CREATE_EXT.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20948>
Also using pointers to intel_device_info struct instead of replicate
the same information.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20948>
This is a KMD independent struct to hold memory class and instance
values.
drm_i915_gem_memory_class_instance usage will be gradually replaced.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20948>
These values are taking from runtime interrogation of the media driver.
It would be nice to know if they are correct, but they work.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20782>
This hardware bug is the result of a control flow optimization present
in Gfx8-9 meant to prevent the ELSE instruction from disabling all
channels and update the control flow stack only to have them
re-enabled at the ENDIF instruction executed immediately after it.
Instead, on Gfx8-9 an ELSE instruction that would normally have ended
up with all channels disabled would pop off the last element of the
stack and jump directly to JIP+1 instead of to the ENDIF at JIP,
skipping over the ENDIF instruction. In simple cases this would work
okay (though it's actual performance benefit is questionable), but in
cases where a branch instruction within the IF block (e.g. BREAK or
CONTINUE) caused all active channels to jump outside the IF
conditional, the optimization would break the JIP chain of "join"
instructions by skipping the ENDIF, causing the block of instructions
immediately after the ENDIF to execute with all channels disabled
until execution reaches the reconvergence point.
This issue was observed on SKL in the
dEQP-VK.reconvergence.subgroup_uniform_control_flow_elect.compute.nesting4.0.38
test in combination with some Vulkan binding model changes Lionel is
working on. In such cases the execution with all channels disabled
was leading to corruption of an indirect message descriptor, causing a
hang.
Unfortunately the hardware bug doesn't provide a recommended
workaround. In order to fix the problem we point the JIP of an ELSE
instruction to the instruction immediately before the ENDIF -- However
that's not expected to work due to the restriction that JIP and UIP
must be equal if and only if BranchCtrl is disabled -- So this patch
also enables BranchCtrl, which is intended to support join
instructions within the "ELSE" block, which in turn disables the
optimization described above, which in turn causes us to execute the
instruction immediately *before* the ENDIF with all channels disabled
-- So in order to avoid further fallout from executing code with all
channels disabled we need to insert a NOP before ENDIF instructions
that have a matching ELSE instruction.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20921>
Setting default expected values as default in the xml.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21105>
On integrated products this makes almost no difference but on discrete
it's pretty important.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Chuansheng Liu <chuansheng.liu@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21131>
The decoder context needs to know what engine it's associated with.
Nowadays, we have render, compute, blitter, even video engines being
used from the same driver. Rather than trying to have a single decoder
and thwacking the engine field back and forth between calls, we make
one per queue family, and stash a pointer in anv_queue for easy access.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21149>
This will be useful for RADV since it hashes the state.
v3dv changes:
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20731>
This new helper, do_emit_fb_writes() does the actual walk over all the
render targets to emit each of the different FB writes. We want this in
a helper because we're about to go a bit crazy with coarse.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
This is required by code-gen since it generates a 1-wide OR and it'll
blow up if the register width > 1. It's also way better than the "your
register is the wrong size" assert you get from the more generic
validation check.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
This allows us to communicate to the back-end that we don't actually
know if the framebuffer is multisampled or not. No drivers set anything
but ALWAYS/NEVER and we still have a few ALWAYS/NEVER assumptions but
those should be asserted.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
This allows for the possibility that we may not know at compile time if
sample shading is enabled through the API. While we're here, also
document exactly what this bit means so we don't confuse ourselves.
v2: Fixup coarse pixel values (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
Whenever one of them is BRW_SOMETIMES, we depend on dynamic flag pushed
in as a push constant. In this case, we have to often have to do the
calculation both ways and SEL the result. It's a bit more code but
decouples MSAA from the shader key.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
This is more similar to what we do for single-sample and it should be
more clear going forward once our lowering gets more complex.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>