The last parameter of radeon_set_sh_reg_seq() is the number of
dwords to emit. We were lucky because WAVES_PER_SH(0x3) is 3 but
it was initialized to 0.
COMPUTE_RESOURCE_LIMITS is correctly set when generating
compute pipelines, so we don't need to initialize it.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Behavior wrt firstInstance got changed, and a divisor of 0 has been
disallowed.
The new version of the ext got published in specification 1.1.81.
Sending to stable since the only known user is DXVK, which needs
this for correctness.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: 18.2 <mesa-stable@lists.freedesktop.org>
In environments where we cannot cache, e.g. Android (no homedir),
ChromeOS (readonly rootfs) or sandboxes (cannot open cache), the
startup cost of creating a device in radv is rather high, due
to compiling all possible built-in pipelines up front. This meant
depending on the CPU a 1-4 sec cost of creating a Device.
For CTS this cost is unacceptable, and likely for starting random
apps too.
So if there is no cache, with this patch radv will compile shaders
on demand. Once there is a cache from the first run, even if
incomplete, the driver knows that it can likely write the cache
and precompiles everything.
Note that I did not switch the buffer and itob/btoi compute pipelines
to on-demand, since you cannot really do anything in Vulkan without
them and there are only a few.
This reduces the CTS runtime for the no caches scenario on my
threadripper from 32 minutes to 8 minutes.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Now that all the build scripts are compatible with both Python 2 and 3,
we can flip the switch and tell Meson to use the latter.
Since Meson already depends on Python 3 anyway, this means we don't need
two different Python stacks to build Mesa.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
On Python 3, executing `foo != bar` will first try to call
foo.__ne__(bar), and fallback on the opposite result of foo.__eq__(bar).
Python 2 does not do that.
As a result, those __eq__ methods were never called, when we were
testing for inequality.
Expliclty adding the __ne__ methods fixes this issue, in a way that is
compatible with both Python 2 and 3.
However, this means the __eq__ methods are now called when testing for
`foo != None`, so they need to be guarded correctly.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Python 3 doesn't call objects __cmp__() methods any more to compare
them. Instead, it requires implementing the rich comparison methods
explicitly: __eq__(), __ne(), __lt__(), __le__(), __gt__() and __ge__().
Fortunately Python 2 also supports those.
This commit only implements the comparison methods which are actually
used by the build scripts.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
radv implements the Android Vulkan HAL interface, this patch adds
Android.mk building rules by porting of radv automake rules.
vendor HAL module is installed as /vendor/lib/hw/vulkan.radv.so
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Patch changes radv entrypoints generator to not skip this extension even
though it is set as disabled in the vk.xml
Reference: 63525ba730 ("android: enable VK_ANDROID_native_buffer")
Fixes: 69f447553c ("vulkan: Drop vk_android_native_buffer.xml")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Android build system will try to compile vk_format_table.c
as a shipped source, but at compile time it will be missing,
we move it to generated source, where it belongs
Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
If we have tracing enabled we could do all the tracing emits
and overflow the precalculated cdw_max.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The code sizes return here get passed to the cache shader insert function,
which then memcpy from the code ptr, and causes all sorts of valgrind
errors like:
==6755== Invalid read of size 8
==6755== at 0x4C32FEE: memcpy@GLIBC_2.2.5 (vg_replace_strmem.c:1021)
==6755== by 0x2305D4C7: radv_pipeline_cache_insert_shaders (radv_pipeline_cache.c:416)
==6755== by 0x2305791D: radv_create_shaders (radv_pipeline.c:2158)
==6755== by 0x2305C523: radv_pipeline_init (radv_pipeline.c:3404)
==6755== by 0x2305C890: radv_graphics_pipeline_create (radv_pipeline.c:3515)
==6755== by 0x230188AB: radv_device_init_meta_blit_color (radv_meta_blit.c:871)
==6755== by 0x2301D50E: radv_device_init_meta_blit_state (radv_meta_blit.c:1278)
==6755== by 0x23011893: radv_device_init_meta (radv_meta.c:352)
==6755== by 0x2300744B: radv_CreateDevice (radv_device.c:1576)
==6755== by 0x5187D0F: ??? (in /usr/lib64/libvulkan.so.1.1.77)
==6755== by 0x518F6A3: ??? (in /usr/lib64/libvulkan.so.1.1.77)
==6755== by 0x5192A42: vkCreateDevice (in /usr/lib64/libvulkan.so.1.1.77)
==6755== Address 0x22a58548 is 4 bytes after a block of size 116 alloc'd
==6755== at 0x4C2EBAB: malloc (vg_replace_malloc.c:299)
==6755== by 0x23089DC4: ac_elf_read (ac_binary.c:144)
==6755== by 0x23090A60: ac_compile_module_to_binary (ac_llvm_helper.cpp:162)
==6755== by 0x23053F06: compile_to_memory_buffer (radv_llvm_helper.cpp:58)
==6755== by 0x23053F06: radv_compile_to_binary (radv_llvm_helper.cpp:98)
==6755== by 0x23052769: ac_llvm_compile (radv_nir_to_llvm.c:3394)
==6755== by 0x23052823: ac_compile_llvm_module (radv_nir_to_llvm.c:3418)
==6755== by 0x23053C05: radv_compile_nir_shader (radv_nir_to_llvm.c:3542)
==6755== by 0x23061B4E: shader_variant_create (radv_shader.c:580)
==6755== by 0x23061CFD: radv_shader_variant_create (radv_shader.c:634)
==6755== by 0x23057765: radv_create_shaders (radv_pipeline.c:2123)
==6755== by 0x2305C523: radv_pipeline_init (radv_pipeline.c:3404)
==6755== by 0x2305C890: radv_graphics_pipeline_create (radv_pipeline.c:3515)
Since we are just inserting the code into the cache, we can avoid these
bad reads and data in the cache by just using the binary code size here.
Fixes: 939e5a382 (radv: add padding for the UMR disassembler)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The driver might emit up to 4 dwords when RADV_TRACE_FILE is
used.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This wasn't wrong but it looks better to me like this. It's
only used for debugging purposes (ie. RADV_TRACE_FILE).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Move the integer conversion after the fixup.
This fixes some regressions with
dEQP-VK.pipeline.vertex_input.single_attribute.mat4.as_a2r10g10b10*
Fixes: b722b29f10 ("radv: add support for 16bit input/output")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Python 2 has a range() function which returns a list, and an xrange()
one which returns an iterator.
Python 3 lost the function returning a list, and renamed the function
returning an iterator as range().
As a result, using range() makes the scripts compatible with both Python
versions 2 and 3.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
In Python 2, dictionaries have 2 sets of methods to iterate over their
keys and values: keys()/values()/items() and iterkeys()/itervalues()/iteritems().
The former return lists while the latter return iterators.
Python 3 dropped the method which return lists, and renamed the methods
returning iterators to keys()/values()/items().
Using those names makes the scripts compatible with both Python 2 and 3.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
That we don't have a background disk cache does not mean we should
prevent the app caching anything.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Users shouldn't use this debugging option except when we
ask them to do!
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
modules[i] can be NULL for merged shaders but we have to
free the NIR code. radv_can_dump_shader_stats() already handles
if modules[i] is NULL, no need to check it twice.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
That shouldn't be needed because the DB state is invalid.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We already check that in radv_cmd_buffer_resolve_subpass().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The goal is to use radv_barrier()/radv_subpass_barrier() as
much as possible for further optimizations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Assignment and usage of this variable both happen inside an
if(rad_image_has_dcc()) {} blocks. It seems gcc plays it safe and
assumes that both function calls could have different return values.
But in this case we should be safe.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Inherited commands buffers are not supported.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
By default, our internal rendering commands are discarded
only if the predicate is non-zero (ie. DRAW_VISIBLE). But
VK_EXT_conditional_rendering also allows to discard commands
when the predicate is zero, which means we have to use a
different flag.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
VK_EXT_conditional_rendering allows to discard draw commands
(not only normal draws).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
VK_EXT_conditional_rendering allows to discard dispatch commands.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We don't need to emit PS_PARTIAL_FLUSH for the pre fragment shader
stages (ie. geometry/tessellation). Emitting VS_PARTIAL_FLUSH
is enough for these stages. Note that PS_PARTIAL_FLUSH also
synchronizes all vertex stages.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>