This reverts two of the vk_error changes:
reporting unsupported format is common,
and testing non-amdgpu drivers and ignoring them is also common.
Fixes: cd64a4f70 (radv: use vk_error() everywhere an error is returned)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This should reduce the overhead of adding a BO to the current
list, especially when the list is huge. Also, when a new pipeline
is bound, we only need to update the descriptor, the buffer objects
should already be in the list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
I don't think we will need a 64-bit unsigned integer for the
dirty flags in the future, and there is still 20 bits left.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
VkDeviceQueueCreateInfo::queueFamilyIndex is an unsigned 32-bit
integer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
radv_fill_buffer() ensures that the image BO is added to the list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
radv_fill_buffer() ensures that the image BO is added to the list.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
It seems safe and it improves performance by +4% (73->76).
A drirc based solution is not what we want for now, keep it
simple and improve later if it's really needed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Otherwise we can leak the old syncobj.
Fixes: eaa56eab6d "radv: initial support for shared semaphores (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This should not be needed, if the allocation fails an error is
returned and the host should handle it.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
All radv_fence fields are initialized here.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
For consistency and it might help for debugging purposes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Instead just dirty RADV_CMD_DIRTY_FRAMEBUFFER and it will be
re-emitted if necessary before the next draw.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Just after the vertex shader.
This seems to give a minor boost for, at least, Serious Sam
Fusion 2017 and Dawn of War 3. I don't see any real impacts
with The Talos Principle.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Will be used for VBO descriptors prefetching.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
For consistency because this function will also prefetch VBO
descriptors.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This register is the same on all gpus so far, so emit it in one
place and also for the pre-gfx9 gpus set the value in the pipeline
creation.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This moves some calculations of register values into the pipeline
construction, it saves looking at outinfo in the cmd buffer emit.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is derived from tgsi/radeonsi code from the GLSL intrinsics.
This should pre-fix radv for the upcoming spirv patches.
v2: actually use wait_cnt, sleep deprived dad time! (Bas)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Otherwise it will be missing from the release tarball
Fixes: 7f33e94e43 ("amd/addrlib: update to latest version")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
This uses C++11 initializer lists.
I just overwrote all Mesa files with internal addrlib and discarded
hunks that we should probably keep, but I might have missed something.
The code depending on ADDR_AM_BUILD is removed. We can add it back next
time if needed.
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
I was hacking something stupid in doom, and hit an assert for the bitcast
following this, it definitely looks like this should be the number of 32-bit
components, not the instr level ones.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We can avoid adding the buffer in the non-local case, this will
avoid all the overhead of the indirect call.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The next patch will try and avoid calling the indirect function.
v2: add a missing conversion.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The function that calls us has just added the buffer to the
list already, no need to try and add it again.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
There's no point recalculating these the whole time on descriptor
emission, just store them at pipeline creation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It appears the latest dota2 vulkan uses this,
and we get a hang in VR mode without it.
v2: remove finishme I left in after finishing.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of storing all the pointers and zeroing them all out,
just store a valid bitmask in the state. This also moves
the CmdBindPipeline path down the cpu usage path for the
multithreading demo as it no longer has to traverse MAX_SETS
to find the active descriptor sets.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This isn't required to be cleared, since buffers are only linked
by vertex elements, so if elements are clear then no buffers
should be referenced.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just removes a hole in the cmd_state and packs some bools
together.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we allocate attachments in the begin command buffer due to the
render pass continue bit, we were leaking them.
Since renderpasses inside a cmd buffer malloc/free these properly,
and set to NULL, we just need to call free at end.
Fixes a memory leak with multithreading demo.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
uint32_t data[MAX_SETS * 2] = {}; was getting executed before
the exit and took significant amounts of time. By having the
check outside the function, we skip the execution of the clear.
Reviewed-by: Dave Airlie <airlied@redhat.com>
The vram_list linked list resulted in lots of pointer chasing.
Replacing this with an array instead improves descriptor set
allocation CPU usage by 3x at least (when also considering the free),
because it had to iterate through 300-400 sets on average.
Not a huge improvement as the pre-improvement CPU usage was only
about 2.3% in the busiest thread.
Reviewed-by: Dave Airlie <airlied@redhat.com>