Many games have short periods where a certain mode might win
consistently but this trend doesn't hold after that. Only allowing
locking to occur on RPs where a certain mode consistently stays
winning for 30s allows us to partially mitigate these bad locks.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
The maximum probability was limited to 95% earlier due to the step
delta of 5% (95+5=100% which we wanted to avoid). This introduces a
new slower step delta after 95% which steps at 1% up to 99% which
is significantly better in terms of eliminating the performance loss
or stuttering from when there is a large difference between the modes.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
There are certain scenarios where even switching to another render
mode has significant negative implications for performance even
when done for a single invocation. Now we try to heuristically
pick out these cases and lock them into the optimal mode, at the
moment the heuristic is fairly conservative but it manages to lock
RPs in under a minute in most cases.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
PC games tend to almost always run far better in SYSMEM due to the
high FS complexity, and so preferring SYSMEM tends to be a winning
policy until profiled mode reaches a state where it can surpass it.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
Certain games tend to use rendering patterns that strongly prefer
one mode over the other, and thus we're better off not bothering
with profiling them.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
This introduces a new option that makes autotune optimize for low
preemption latency which is crucial to ensure responsiveness on
systems with GPU-based composition. A large enough draw can entirely
block the compositor from running with draw-level preemption, this can
be mitigated by preferring to use GMEM which breaks up the draw into
smaller pieces and generally has a lower latency for preemption.
As a further mitigation, tiles in GMEM are then divided into smaller
and smaller pieces which lowers the non-preemptible duration. There
are static checks in place to avoid doing this when it would incur a
cost that is too large.
Uses performance counters read during ambles to detect preemption
latency events while rendering in SYSMEM. This approach is superior
to using RBBM draw time thresholds which could be imprecise as only
the average was calculated rather than true maximum draw time.
However, converting the preemption latency performance counter value
from CP ticks to wall clock is based on the average GPU frequency of
the whole period from the start of the RP until the switch-away amble
while the preemption latency stars counting from the request. Thus, if
the GPU frequency shifts rapidly throughout the RP, it may cause the
estimated wall clock time to be inaccurate, but it should be good enough
in the vast majority of cases.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Co-authored-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
Tuning these small renderpasses is difficult due to their high
variability across command buffers and low impact on overall performance
in most cases. This change disables autotuning for renderpasses with 5
or fewer draw calls unless the TUNE_SMALL modifier flag is explicitly
set.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
In cases where only SW binning is possible and where there would be
a performance impact from not using HW binning (i.e. > 2 tiles), it
is preferable to default to SYSMEM as the performance impact of
using GMEM is almost definitely not going to be worth it.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
The default ROUND_DOWN_TO only handles POT alignment values, so
an additional variant was added which handles NPOT alignment too.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
This algo measures the time taken by each RP as a whole, and uses that
to move a probability distribution of whether to use GMEM or SYSMEM for
that RP. This is done with a delta of 5% per run, and the probability is
clamped to 5% and 95% to avoid getting stuck when conditions change.
Additionally, an "immediate resolve" variant which tries to work off a
single data point in SYSMEM and GMEM, then immediately resolves to the
faster path. This is useful for usage in CI which runs a single frame
multiple times where the performance isn't varying change from frame to
frame.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
The tu_autotune_end_renderpass function collects timestamp data for
the renderpass and should be called as late as possible for the most
complete data.
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37802>
Midpoint sorting is incompatible with how our traversal works.
Specifically, we change tMax when a hit is committed so we can skip over
BVH nodes that are guaranteed not to produce a closer hit. However,
changing tMax also changes the intersection interval of box nodes with
the ray, and thus, the midpoints of that interval. Stackless traversal
relies on getting nodes back in the exact same order as before, and if
that requirement is not met, traversal may incorrectly skip over nodes.
The likely benefit of midpoint sorting does not make up for the loss of
ability to skip over BVH nodes exceeding tMax, so simply disable
midpoint sorting.
This fixes geometry being visible behind other geometry when it
shouldn't be in various applications, including Half-Life 2 RTX.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40795>
Instead of loading and parsing the XML spec everytime a CLIF is created,
do it once and cache for further calls.
This also avoids leaking the spec loading.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40747>
Replace printf and nir_print_shaders by proper mesa_logX and
nir_log_shaderX functions, that provides better features (like logging
to a file, setting the logging verbosity, etc) and works better with
Android.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40434>
This will give better flexibility on how and where the dumps will be
done.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40434>
Replace printf and nir_print_shaders by proper mesa_logX and
nir_log_shaderX functions, that provides better features (like logging
to a file, setting the logging verbosity, etc) and works better with
Android.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40434>
This will give better flexibility on how and where the dumps will be
done.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40434>
Fixes the following building error happening with clang:
FAILED: src/amd/vulkan/libvulkan_radeon.so.p/radv_shader_object.c.o
...
../src/amd/vulkan/radv_shader_object.c:163:72: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info debug[MESA_VULKAN_SHADER_STAGES] = {};
^
../src/amd/vulkan/radv_shader_object.c:164:53: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info gs_copy_debug = {};
^
../src/amd/vulkan/radv_shader_object.c:192:75: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info debug[MESA_VULKAN_SHADER_STAGES] = {};
^
../src/amd/vulkan/radv_shader_object.c:193:56: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info gs_copy_debug = {};
^
../src/amd/vulkan/radv_shader_object.c:246:43: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info cs_dbg = {};
^
../src/amd/vulkan/radv_shader_object.c:465:69: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info debug[MESA_VULKAN_SHADER_STAGES] = {};
^
../src/amd/vulkan/radv_shader_object.c:468:50: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info gs_copy_debug = {};
^
7 errors generated.
Fixes: 06b9660b ("radv: move radv_shader_create out of radv_compute_pipeline_compile")
Fixes: 2260105b ("radv: move radv_shader_create out of radv_graphics_shaders_compile")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40772>
Fixes the following building error happening with clang:
FAILED: src/amd/vulkan/libvulkan_radeon.so.p/radv_pipeline_compute.c.o
...
../src/amd/vulkan/radv_pipeline_compute.c:213:43: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info cs_dbg = {};
^
1 error generated.
Fixes: 06b9660b ("radv: move radv_shader_create out of radv_compute_pipeline_compile")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40772>
Fixes the following building error happening with clang:
FAILED: src/amd/vulkan/libvulkan_radeon.so.p/radv_pipeline_rt.c.o
...
../src/amd/vulkan/radv_pipeline_rt.c:537:42: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info debug = {};
^
1 error generated.
Fixes: 4c3a74be ("radv: move radv_shader_create out of radv_rt_nir_to_asm")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40772>
Fixes the following building error happening with clang:
FAILED: src/amd/vulkan/libvulkan_radeon.so.p/radv_pipeline_graphics.c.o
...
../src/amd/vulkan/radv_pipeline_graphics.c:3199:69: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info debug[MESA_VULKAN_SHADER_STAGES] = {};
^
../src/amd/vulkan/radv_pipeline_graphics.c:3200:50: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_debug_info gs_copy_debug = {};
^
2 errors generated.
Fixes: 2260105b ("radv: move radv_shader_create out of radv_graphics_shaders_compile")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40772>
Fixes the following building error happening with clang:
FAILED: src/vulkan/runtime/libvulkan_runtime.a.p/vk_shader.c.o
...
../src/vulkan/runtime/vk_shader.c:622:63: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct vk_sampler_state_array embedded_samplers = {};
^
1 error generated.
Fixes: e8558de1 ("vulkan/shader: Call vk_nir_lower_descriptor_heaps()")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40772>
ballot_bit_count_reduce expects the ballot to have 4 components causing
validation failures on targets where 1 < ballot_components < 4. Fix this
by padding the ballot to 4 components.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: ae66bd1c00 ("nir/opt_uniform_subgroup: use ballot_bit_count")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40792>
macOS struct stat uses st_mtimespec instead of st_mtim. Use
st_mtimespec on Apple platforms to get nanosecond file modification
time.
Fixes: e68a0dfb12 ("llvmpipe: Implement manual context resets")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40791>
This could override data allocated by the application when shader code
is loaded from binary in vkCreateShaderObjectEXT().
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d39e443ef8 ("anv: add infrastructure for common vk_pipeline")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40727>
Make it possible to trigger a emulated GPU reset by creating a file at
the path of the `LP_CONTEXT_RESET_FILE` env var. All llvmpipe contexts
using the `LOSE_CONTEXT_ON_RESET` strategy that where created before
the that file will start returning `PIPE_UNKNOWN_CONTEXT_RESET` in
`get_device_reset_status()`, while newer contexts - most notably those
created by apps in reaction to the device reset - continue to return
`PIPE_NO_RESET`.
Note that, because we're dealing with files, rtime is used instead of
mtime as the later
Usage:
```
LP_CONTEXT_RESET_FILE=~/llvmpipe_reset LIBGL_ALWAYS_SOFTWARE=1
someapp
touch ~/llvmpipe_reset
```
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40681>