This implements a minimalistic algorithm that can be used to obtain an
approximate solution for the integer programming problem of finding
the optimal tile dimensions based on an estimate of the tile cache
consumption per pixel of the current graphics pipeline -- Including
the TC footprint of render targets, depth and stencil buffers and
their auxiliary surfaces. Considering other (less local) memory
accesses performed by the pipeline (like texturing and shader storage)
would be useful (and could be considered by this algorithm with little
modification), but it would be pretty difficult to estimate the L3
cache consumption per pixel of such accesses based on static analysis
of the pipeline state alone without some sort of dynamic feedback.
The present algorithm returns a config with tile area large enough to
utilize a target fraction of the L3, which can be adjusted to obtain
greater/lower utilization of the L3 at the cost of higher/lower risk
of L3 cache thrashing respectively. The aspect ratio of the tile
layout returned attempts to minimize the number of poorly utilized
tiles around the boundaries of the framebuffer (due to partial
coverage), since having the tile sequencer process additional tiles
comes at a cost due to the latency of the additional passes, even if
they're mostly empty. Finally, among the solutions with satisfactory
cache footprint and tile count, the tile aspect ratio closest to 1 is
returned where possible, since tiles with very high aspect ratios can
have a negative impact on cache locality.
The algorithm is primarily intended for TBIMR, but it could be used
for PTBR as well with little modifications, since the TBIMR-specific
assumptions are few and noted in comments below.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
Gfx12.5+ devices use special-purpose memory for the URB instead of
requiring a portion of the L3 to be carved out.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
Sync xe_drm.h with commit xxxxx ("drm/xe/uapi: Fix naming of XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY").
One not so straght forward change is that sync VM binds now don't
require a syncobj anymore, the uAPI will return as soon the VM bind
operations are done.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25300>
Make intel_aux_map_add_mapping return false if a mapping is attempted
that would conflict with an existing one. If this function doesn't
return false, it will either fail to return or return true.
The Vulkan driver will make use of this feature to opportunistically
enable CCS if a BO's VMA range has not been already mapped.
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25003>
In future patches, we will be creating a separate companion RCS engine
and each engine is created with it's own address space, and we really
don't want. CCS and RCS engine writes should be visible to each other in
order to get the wait/signal mechanism working.
v2:
- Move drm_i915_gem_context_create_ext_setparam out of if block (Lionel)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
Useful when you want to compare 2 batches with different ordering in
instruction emission. Also when the driver tries to avoid re-emitting
state.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
Rework:
* Split clflushopt into a separate file as recommended by Ken.
If we enable -mclflush on all driver source compilation, then
gcc may insert uses of it on processors that don't support it.
* Add uintptr_t casting to cpu_caps->cacheline usage
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22379>
The default batch size was increased to support large numbers of
INTEL_MEASURE snapshots for complex workloads. Some titles create
large numbers of small secondary command buffers, and quickly exhaust
memory. An example of this is Dota2, where INTEL_MEASURE increases
the memory usage by a factor of 20.
Allow the user to specify smaller batch sizes and buffer sizes.
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24860>
Fix defect reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable text_data going out of scope leaks the storage it points to.
Fixes: b4c8d2dc45 ("intel/decoder: Add intel_spec_load_common()")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24659>
MOCS = 0 is a invalid MOCS index on MTL, so it is necessary get a
valid value and set to MI_MATH instructions.
So here the mocs index is set with mi_builder_set_mocs(), it can be
always set but it is required when mi_build will emit MI_MATH
instructions.
The mocs index will only be stored and used in gfx12.5+ platforms
so no changes were are required in crocus or hasvk.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22508>
MOCS = 0 is a invalid MOCS index on MTL, so it is necessary get a
valid value and set to MI_MATH instructions.
So here the mocs index is set with mi_builder_set_mocs(), it can be
always set but it is required when mi_build will emit MI_MATH
instructions.
The mocs index will only be stored and used in gfx12.5+ platforms
so no changes were are required in crocus or hasvk.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22508>
This allow us to remove one more i915_drm.h include from code shared
by both backends.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23905>
When a default value of a struct's field, which is in the
higher half of the first dword, is specified in a gen xml
file, setting op mask makes decoder treat the field as a
header (intel_field_is_header()). As a result, it won't
output the field in batch dump. This is not a common case
but can happen once a gen xml file includes such fields.
The op mask is only meaningful to instructions, so we fix
the above issue by not setting op mask of structs (also
registers).
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24268>
Each entry is a uint64_t, L2 and L1 maps 12 bits so:
(1 << 12) = 4096
sizeof(uint64_t) = 8
4096 * 8 = 32768 = 32K
Same value but easier to understand.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24077>
The expression 'l1_gpu_addr + l1_index * sizeof(*l1_map)' could cause
bit 47 to be set so it needs to be converted to canonical.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24077>
The bits above index 47 in l1 entry are used to define format,
depth and luminance.
aux_address is formated as canonical, so bits above 47 could all be
set to 1 causing wrong values being set to format, depth and luminance.
intel_aux_get_meta_address_mask() was previously using 2 shifts to
mask out bits above index 47, what is not so obvious and are 2
operations, so here doing a AND with VALID_ADDRESS_MASK to make it
easier to understand.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24077>
No changes in behavior here, mostly doing this types of renames:
- address to main_address, to know that addresses refers to main
surface address or aux surface address
- gpu to addr
- main_map_addr to main_inc_addr
- aux_dest_addr to aux_inc_addr
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24077>
remove_mapping() duplicated almost half of get_aux_entry(), it is
only dropping the cases were entries are not alocated but during
removal it is expected that entries were already alocated so we can
reuse get_aux_entry() and drop duplicated code.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24077>
The only user of format_enum is intel_aux_map_get_alignment() that
can easily use information in format->main_page_size.
This allow us to nuke format_enum and remove duplicated information
in intel_aux_map_get_alignment().
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24077>
Instead of on Android. Which allows an end user to turn off expat
without breaking or disabling Intel support. I've additionally
refactored to separate expat and xmlconfig a bit more in the root
meson.build
This does make expat a hard dependency for building Intel tools, despite
the fact that only aubinator actually requires it. This simplifies the
build for the common case, and in the event that someone wants to build
the Intel tools and doesn't have libexpat, they can fall back to the
meson wrap for expat instead.
fixes: 75276deebc
closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8791
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23605>
With MTL onwards, creating protected contexts too early
may block for a longer period. To prevent that, use the new
kernel GET_PARAM:I915_PARAM_PXP_STATUS interface to get the
status of PXP support immediately without blocking.
Using this same interface, we can also wait for platform
dependency readiness before attempting to create a protected
context. Use a longer timeout when user explicitly requests
for protected context as the kernel assures readiness will be
achieved.
Reference to kernel change: https://patchwork.freedesktop.org/patch/533241/?series=112647&rev=8
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23382>