This reverts commit 768238fdc0 which
is not only leading to memory leaks, but also reportedly breaks KDE
pretty badly.
Fixes: #7674, #7435
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19972>
The parameter `nr` is currenlty an `int` but it only gets assigned to an
`unsigned int`. Make it clear in the function signature what's actually
required.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19423>
GCC 12.2.0 warns:
../src/intel/compiler/brw_fs.cpp: In member function ‘bool fs_visitor::
split_virtual_grfs()’:
../src/intel/compiler/brw_fs.cpp:2199:10: warning: ‘void* memset(void*, int,
size_t)’ specified size between 18446744071562067968 and 18446744073709551615
exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
2199 | memset(vgrf_has_split, 0, num_vars * sizeof(*vgrf_has_split));
`num_vars` is an `int` but gets assigned the value of `this->alloc.count`,
which is an `unsigned int`. Thus, `num_vars` will be negative if
`this->alloc.count` is larger than int max value. Converting that negative
`int` to a `size_t`, which `memset` expects, then blows it up to a huge
positive value.
Simply turning `num_vars` into an `unsigned int` would be enough to fix this
specific problem, but there are many other instances where an `unsigned int`
gets assigned to an `int` for no good reason in this function. Some of which
the compiler warns about now, some of which it doesn't warn about.
This turns all variables in `fs_visitor::split_virtual_grfs`, which should
reasonably be unsigned, into `unsigned int`s. While at it, a few now pointless
casts are removed.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19423>
This is meant to remove any integrated GPU only code paths that can't
be compiled in CPU architectures different than x86.
Discrete GPUS don't have need_clflush set to true so it was just
matter of remove some code blocks around need_clflush but was left a
check in anv_physical_device_init_heaps() to fail physical device
initialization if it ever became false.
Signed-off-by: Philippe Lecluse <philippe.lecluse@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19812>
Fixes a number of CTS patterns on DG2 :
- dEQP-VK.dynamic_rendering.primary_cmd_buff.random*
- dEQP-VK.draw.*secondary_cmd*
- dEQP-VK.dynamic_rendering.*secondary_cmd*
- dEQP-VK.geometry.*secondary_cmd_buffer
- dEQP-VK.multiview.*secondary_cmd*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9c1c1888d9 ("intel/fs: put scratch surface in the surface state heap")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19946>
We want to avoid those settings so that we do not have to emit a tile
fence to implement Wa_22013689345.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19322>
The initial implementation is a pretty big hammer. Implement the HW
recommendation to minimize cases in which we need a fence.
This improves by 10FPS on some of the Sascha Willems RT demos.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 6031ad4bf6 ("intel/fs: Add Wa_22013689345")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19322>
Fix defect with Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable pass_array going out of scope leaks the storage it points to.
Fixes: d4cbb66506 ("intel/perf: support more than 64 queries")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19888>
TGL+ specification ask the API mode to be set to DX10.1 for Vulkan API.
BSpec: 46947
Reference: TGL PRMs, Volume 2d: Command Reference: Structures: 3DSTATE_RASTER_BODY
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19934>
It is still marked as a flake (along with other copyteximage cases) on all
these boards, so this will reduce the CI IRC channel noise given that we
actually expect a Pass. I haven't found where exactly in history we went
from generally-fail to generally-pass, but it looks like around Feb 2022.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19912>
source_root() function is deprecated in Meson version 0.56.0 because
it returns the source root of the parent project if called from a
subproject.
Why would anyone need Mesa as a meson subproject?
It would be used as subproject in a project generated by command buffer
"decompiler" for Freedreno.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19901>
We need to set CPS_MODE_NONE when no per coarse pixel dispatch.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 231651fd89 ("anv: implement VK_KHR_fragment_shading_rate")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19867>
Previously only 'decode_vs_state' would dump the referenced shader,
which meant we completely ignored every other shader when decoding
the '3DSTATE_PIPELINED_POINTERS' command.
Move the program disassembly logic from 'decode_vs_state' into a
common 'disasm_program_from_group' helper and call it from every
other decode_*_state function, too.
Signed-off-by: Daniil Tatianin <99danilt@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19880>
In 4ceaed7839 we made scratch surface state allocations part of the
internal heap (mapped to STATE_BASE_ADDRESS::SurfaceStateBaseAddress)
so that it doesn't uses slots in the application's expected 1M
descriptors (especially with vkd3d-proton).
But all our compiler code relies on BSS
(STATE_BASE_ADDRESS::BindlessSurfaceStateBaseAddress).
The additional issue is that there is only 26bits of surface offset
available in CS instruction (CFE_STATE, 3DSTATE_VS, etc...) for
scratch surfaces. So we need the drivers to put the scratch surfaces
in the first chunk of STATE_BASE_ADDRESS::SurfaceStateBaseAddress
(hence all the driver changes).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4ceaed7839 ("anv: split internal surface states from descriptors")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7687
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19727>
one of those error message:
../../src/intel/compiler/brw_vec4_cmod_propagation.cpp:53:8: error: variable 'ip' set but not used [-Werror,-Wunused-but-set-variable]
int ip = block->end_ip + 1;
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19527>
When we're not using queries, all the counters from the
MI_REPORT_PERF_COUNT are available. This is the case when using
perfetto with the global pps datasource that capture global counter
values.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8750f43a90 ("intel/perf: add performance query layout using MI_SRM")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
This avoids device lost events when we replay a command buffer 1k
times on DG2.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
The intel_perf_counter_pass::pass field is actually useless and
invalid.
Once you have mapped all the counters to all the metrics, the order of
the metrics capture is dictated by intel_perf_get_n_passes().
When reading values that is the order we should follow.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2001a80d4a ("anv: Implement VK_KHR_performance_query")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
We don't need to go through all the metric sets as we're already built
a bitset matching per counter to figure out in which metric set a
particular counter is.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
This array of structure needs to be initialized to 0 as it contains a
bitset we don't explicitly clear.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3144bc1d33 ("intel/perf: move query_mask and location out of gen_perf_query_counter")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
Useful to look at the layout of the queries.
v2: Rework based on Marcin's comment
v3: Rebase
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
With DG2, the number of perf groups and metrics climbs into the
thousands. 16bit fields are not sufficient for storing metrics
indices, and the build throws warnings when compiling the generated
intel_perf_metrics.c
Use a 32bit integer for these values.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
__builtin_clz(value - 1) is undefined for with value=1 (because
__builtin_clz(0) is undefined).
Because we set rt_pipeline->stack_size = 1 when a ray tracing pipeline
doesn't need any stack allocation to differentiate from a dynamic size
(rt_pipeline->stack_size = 0) we can run into this undefinied behavior
issue.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: f68d64dac0 ("anv: Add support for vkCmdSetRayTracingPipelineStackSizeKHR")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19781>
This extension is basically a no-op exposing some new enums.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19726>
In 23c7142cd6 ("anv: disable SIMD16 for RT shaders") we were forcing the SIMD8
using the mechanism for subgroup size control, which is problematic since it has
other effects on the shader behavior.
The code was changed to select the SIMD in a different way in the previous patches,
so we can revert the behavior to the original semantics.
Fixes dEQP-VK.subgroups.builtin_var.ray_tracing.subgroupsize.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>