Given an array of gen_insts representing a structured program,
fill in the missing JIPs and UIPs to follow that structure.
The input array must provide JIPs for the WHILE instructions (the
"back-edges", since there's no DO in Gfx9+). It can optionally
provide other JIPs or UIPs; those values will be used instead of
the calculated ones.
The input JIPs and UIPs are absolute index values in the array.
Once the pass finishes, they are converted into relative byte
offsets, which is what the hardware uses.
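As a rough illustration of the index-to-offset conversion described
above, here is a minimal sketch. It assumes every instruction is
uncompacted and 16 bytes long, and that offsets are measured from the
branching instruction itself; the helper name is hypothetical and is
not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: all instructions are uncompacted, 16 bytes each. */
#define INST_SIZE_BYTES 16

/* Convert an absolute target index into a byte offset relative to the
 * instruction at index `from`.  A negative result is a back-edge.
 */
static int32_t
index_to_relative_bytes(int32_t from, int32_t target)
{
   assert(from >= 0 && target >= 0);
   return (target - from) * INST_SIZE_BYTES;
}
```

For example, a WHILE at index 5 jumping back to index 2 would get a
relative offset of -48 bytes under these assumptions.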
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Port the validation rules from brw_eu_validate.cpp. This also
ports the tests of the validation, so we can check whether the
rules actually flag the cases.
Also include some new validation cases derived from asserts in
brw_eu encoding logic.
Assisted-by: Pi coding agent (gpt-5.5, opus-4.6)
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Add a new module that can produce the binary encoded representation of
the instructions. Some key differences from existing encoding logic
in brw:
- Use a struct to represent the instructions before final encoding.
This is similar to the struct we already use in validation. This
allows generator/validation code to ignore details of instruction
formatting and just "set src0 to something".
- Split the encoding logic between Pre-Xe (Gfx9 and Gfx11) and Xe (from
Gfx12 and up). They are documented differently, so splitting makes
both sides easier to deal with.
- Try to follow the bit range numbers as they are documented in the
spec, programmatically shifting them when needed. This means numbers
in code match PRMs / BSpec.
Later patches will add compaction and make use of the module in various
parts of the code.
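To illustrate the "documented bit range" idea, here is a minimal
sketch of a setter that takes the high and low bit numbers as they
would appear in the documentation and does the shifting itself. The
struct and function names are hypothetical, and the real module may
handle ranges differently (for instance, ranges that straddle the
64-bit boundary, which this sketch rejects).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit instruction stored as two 64-bit words. */
struct raw_inst {
   uint64_t qw[2];
};

/* Mask covering `width` low bits, with width in 1..64. */
static uint64_t
width_mask(unsigned width)
{
   return width == 64 ? UINT64_MAX : (UINT64_C(1) << width) - 1;
}

/* Store `value` into bits [high:low], numbered 0..127 as documented.
 * The range must stay within one 64-bit word.
 */
static void
raw_inst_set_bits(struct raw_inst *inst, unsigned high, unsigned low,
                  uint64_t value)
{
   assert(high >= low && high < 128);
   assert(high / 64 == low / 64);

   const unsigned word = low / 64;
   const unsigned shift = low % 64;
   const uint64_t mask = width_mask(high - low + 1);

   assert((value & ~mask) == 0);
   inst->qw[word] = (inst->qw[word] & ~(mask << shift)) | (value << shift);
}

/* Read back bits [high:low] using the same numbering. */
static uint64_t
raw_inst_get_bits(const struct raw_inst *inst, unsigned high, unsigned low)
{
   assert(high >= low && high < 128);
   assert(high / 64 == low / 64);

   return (inst->qw[low / 64] >> (low % 64)) & width_mask(high - low + 1);
}
```

With helpers like these, a call site can use the bit numbers straight
from the documentation, e.g. raw_inst_set_bits(&inst, 95, 64, imm).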
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
ends_program calls into nir_cf_node_get_function repeatedly to fetch the
same function and to check whether we are inside an entry point or not.
But we already have that information higher up the call chain, so use
that instead.
nir_cf_node_get_function is quite expensive, because it follows pointers
through the tree.
Speeds up compilation of more complex shaders by quite a bit. I am seeing
a 66% cut in compilation time in e.g. llama-bench.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41891>
We're required to support this extension for Android VP17.
We've tried supporting it through the use of
CMF_DISABLE_WRITE_COMPRESSION, but some regressions were measured
(-0.5% to -1.0%).
We're not aware of any application bug that using
CMF_DISABLE_WRITE_COMPRESSION would prevent, so it doesn't feel useful
to implement.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41187>
Currently, display_fd gets leaked during vulkan loader driver
probing on platforms where there's no v3dv device, as nothing
closes this fd before returning with INCOMPATIBLE_DRIVER. As
the display_fd also holds MASTER, this in turn prevents the
actual driver from becoming master on the display node.
Close the fd before returning to prevent this.
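The shape of the fix can be sketched as follows. This is a simplified
stand-in, not the v3dv code: the enum, the function name and the
found_device flag are all hypothetical, and only the "close before the
early return" pattern is the point.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical result codes standing in for the Vulkan ones. */
enum probe_result {
   PROBE_OK,
   PROBE_INCOMPATIBLE_DRIVER,
};

/* Finish probing.  If no matching device was found, release the
 * display fd before bailing out so that it (and the master status it
 * holds) is not leaked to the rest of the process.
 */
static enum probe_result
finish_probe(int display_fd, bool found_device)
{
   if (!found_device) {
      if (display_fd >= 0)
         close(display_fd);
      return PROBE_INCOMPATIBLE_DRIVER;
   }

   /* On success the caller keeps ownership of display_fd. */
   return PROBE_OK;
}
```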
Fixes: bb532a7a ("v3dv: Fix assertion failure for not-found primary_fd during enumeration.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41058>
Since the D3D11 traces were heavier, I needed to add support for traces
declaring roughly how much memory they use, so we can cut down the
parallelism and avoid hitting swap. It's a bit of a nuisance to set up,
but it halved the job time I was seeing after adding all the traces
with the necessary "singlethread" flags.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41662>
The headless backend's defaults are quite low, and it was preventing
renderdoc replays from running any larger than that, which in turn cut off
the screenshots we wanted to take.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41662>
This implements the extension on the Graphics and Compute queues using
Blorp OpenCL compute shaders. Support for the Transfer queue will come
in a later patch. We also don't support 24/48/96 bpp formats yet.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
The members are all naturally aligned to 4, but other
naturally-aligned-to-4 structs in this file still have the attribute
declared (such as VkDispatchIndirectCommand), so I'm adding the
attributes to these as well.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
This structure, despite containing 8-bit members, can be 4-byte
aligned:
"VUID-VkCopyMemoryIndirectInfoKHR-copyAddressRange-10942
copyAddressRange.address must be 4 byte aligned"
So do it like we do with the other structures.
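For reference, this is what the attribute does on a struct that only
has 8-bit members. The struct names are hypothetical, and the sketch
assumes a GCC- or Clang-compatible compiler with C11 support.

```c
#include <stdint.h>

/* Without the attribute, a struct of bytes only needs 1-byte alignment. */
struct example_plain {
   uint8_t bytes[8];
};

/* With the attribute, the compiler may assume (and will provide)
 * 4-byte alignment for every object of this type.
 */
struct example_aligned {
   uint8_t bytes[8];
} __attribute__((aligned(4)));

_Static_assert(_Alignof(struct example_plain) == 1,
               "plain byte struct is 1-byte aligned");
_Static_assert(_Alignof(struct example_aligned) == 4,
               "attribute raises the alignment to 4");
_Static_assert(sizeof(struct example_aligned) % 4 == 0,
               "size stays a multiple of the alignment");
```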
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
In the next patch we will use mi_builder.h from blorp code, so this
commit prepares the terrain for that by adding the necessary
definitions that the header requires.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
We're going to use this for indirect copies, as we need to iterate
through the indirect buffer checking the copy sizes, then pick the
maximum copy size in order to launch the indirect compute shader.
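As a CPU-side analogue of that loop (the real code runs on the command
streamer through mi_builder, which this sketch does not use), picking
the maximum copy size looks roughly like this. The struct layout and
names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of one entry in the indirect buffer. */
struct copy_cmd {
   uint64_t src_addr;
   uint64_t dst_addr;
   uint64_t size;
};

/* Walk all entries and return the largest copy size, or 0 if there
 * are no entries.  The result would size the compute dispatch.
 */
static uint64_t
max_copy_size(const struct copy_cmd *cmds, size_t count)
{
   uint64_t max = 0;

   for (size_t i = 0; i < count; i++) {
      if (cmds[i].size > max)
         max = cmds[i].size;
   }

   return max;
}
```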
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
Just like mi_ior(), but for xor. We're going to use it in one of the
next commits.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
This option is only meant for min/max, where gfx8 and older
didn't flush denorms. Fma flushes denorms on all hardware
and the optimizer removes the manual flushing again anyway.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41868>
Directories are named using the process name and PID to avoid overwriting dumps from
subsequent runs of the same application.
v2 (Caio): Use util_get_process_name(). Change to be default behavior.
Old behavior still accessible via MDA_OUTPUT_DIR="." env var.
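A minimal sketch of building such a directory name follows. The
"name.pid" format and the helper name are assumptions made for
illustration; the actual naming scheme is whatever the patch
implements, and the process name would come from
util_get_process_name() rather than a parameter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Build "<process_name>.<pid>" into buf.  Returns false if the result
 * did not fit or formatting failed.  The format is an assumption.
 */
static bool
format_dump_dir(char *buf, size_t size, const char *process_name, int pid)
{
   int n = snprintf(buf, size, "%s.%d", process_name, pid);
   return n >= 0 && (size_t)n < size;
}
```

Because the PID differs between runs, two runs of the same application
would write to different directories.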
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39125>
It's not always locked. If it's not, and we destroy its BO,
lock_front_buffer will set dri2_surf->current->locked = true but then
return NULL, so release_buffer will never be called, preventing the
buffer from being used again.
Fixes: dd7ae41091 ("egl/gbm: Destroy excess BOs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41845>
This reverts commit 97391328a3.
This broke devenv because DRIRC_CONFIGDIR doesn't point to the folder
that contains everything anymore.
DRIRC_CONFIGDIR will be modified to take the standard `:`-separated list
of paths, but until then, revert this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41890>