Commit graph

6262 commits

Author SHA1 Message Date
Bas Nieuwenhuizen
7c25578863 radv: Free temporary syncobj after waiting on it.
Otherwise we leak it.

Fixes: eaa56eab6d "radv: initial support for shared semaphores (v2)"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-11-14 10:03:02 +01:00
Bas Nieuwenhuizen
917d3b43f2 radv: Free syncobj with multiple imports.
Otherwise we can leak the old syncobj.

Fixes: eaa56eab6d "radv: initial support for shared semaphores (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-11-14 10:03:02 +01:00
Dylan Baker
46a7fdd7ca meson: Remove build_by_default from amd code
This is the same logic as the previous two patches.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-13 13:43:20 -08:00
Samuel Pitoiset
934b77f2fe radv: add unlikely() around radv_save_descriptors()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:40 +01:00
Samuel Pitoiset
305745457c radv: optimize calling radv_cmd_buffer_trace_emit()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:38 +01:00
Samuel Pitoiset
957d42271b radv: optimize calling radv_save_pipeline()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:36 +01:00
Samuel Pitoiset
ebab5c8ff4 radv: use vk_zalloc instead of vk_alloc+memset
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:35 +01:00
Samuel Pitoiset
0f68208f1d radv: remove unnecessary memset() in radv_AllocateCommandBuffers()
This should not be needed, if the allocation fails an error is
returned and the host should handle it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:32 +01:00
Samuel Pitoiset
66da4c75bc radv: remove useless initializations in radv_create_cmd_buffer()
There is a memset() above.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:30 +01:00
Samuel Pitoiset
3d95fde661 radv: remove useless memset() in radv_CreateFence()
All radv_fence fields are initialized here.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:28 +01:00
Samuel Pitoiset
cd64a4f705 radv: use vk_error() everywhere an error is returned
For consistency and it might help for debugging purposes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:26 +01:00
Samuel Pitoiset
4e16c6a41e radv: make radv_emit_framebuffer_state() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:04:25 +01:00
Samuel Pitoiset
be01197d8d radv: do not emit the framebuffer when restoring a pass
Instead just dirty RADV_CMD_DIRTY_FRAMEBUFFER and it will be
re-emitted if necessary before the next draw.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:04:22 +01:00
Samuel Pitoiset
f87c58dde3 radv: prefetch VBO descriptors at the right place
Just after the vertex shader.

This seems to give a minor boost for, at least, Serious Sam
Fusion 2017 and Dawn of War 3. I don't see any real impacts
with The Talos Principle.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:03:16 +01:00
Samuel Pitoiset
9444a34f4a radv: add radv_emit_prefetch_TC_L2_async() helper
Will be used for VBO descriptors prefetching.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:03:13 +01:00
Samuel Pitoiset
36c2e46328 radv: rename radv_emit_shaders_prefetch() to radv_emit_prefetch()
For consistency because this function will also prefetch VBO
descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:03:11 +01:00
Dave Airlie
bec716e844 radv: emit esgs ring size in one place.
This register is the same on all gpus so far, so emit it in one
place and also for the pre-gfx9 gpus set the value in the pipeline
creation.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-13 07:17:09 +00:00
Dave Airlie
031e591923 radv: move calculating vs out info regs into pipeline.
This moves some calculations of register values into the pipeline
construction, it saves looking at outinfo in the cmd buffer emit.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-13 07:16:53 +00:00
Timothy Arceri
8fe6abd964 ac: add emit_vertex to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-12 11:08:26 +11:00
Chad Versace
9ec33972cc radv: Fix architecture in radeon_icd.{arch}.json
Use the host arch, not the target arch. In Meson and in recent
Autotools, the host arch is where the binary will be used. The target
arch is useful only when compiling a compiler.

See: http://mesonbuild.com/Cross-compilation.html
See: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html
Reported-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-09 16:29:28 -08:00
Dave Airlie
6bec8bcd79 ac/nir: add support for all intrinsics. (v2)
This is derived from tgsi/radeonsi code from the GLSL intrinsics.

This should pre-fix radv for the upcoming spirv patches.

v2: actually use wait_cnt, sleep deprived dad time! (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-09 01:25:59 +00:00
Emil Velikov
01d91b3718 amd: add amdgpu_asic_addr.h to the sources list
Otherwise it will be missing from the release tarball

Fixes: 7f33e94e43 ("amd/addrlib: update to latest version")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-11-08 14:07:27 +00:00
Marek Olšák
7f33e94e43 amd/addrlib: update to latest version
This uses C++11 initializer lists.

I just overwrote all Mesa files with internal addrlib and discarded
hunks that we should probably keep, but I might have missed something.

The code depending on ADDR_AM_BUILD is removed. We can add it back next
time if needed.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 00:55:13 +01:00
Marek Olšák
cde664ab81 radeonsi: use ac_create_target_machine
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:58:38 +01:00
Marek Olšák
81f81fdb54 radeonsi: use ac_get_llvm_processor_name
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:58:36 +01:00
Marek Olšák
24e9004708 radeonsi: remove unused field in the PCI ID table
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-11-07 17:26:36 +01:00
Dave Airlie
0084f4a422 ac/nir: for ubo load use correct num_components
I was hacking something stupid in doom, and hit an assert for the bitcast
following this, it definitely looks like this should be the number of 32-bit
components, not the instr level ones.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-07 14:54:19 +10:00
Dave Airlie
201b3b8d0d radv: move is_local up to the winsys level.
We can avoid adding the buffer in the non-local case, this will
avoid all the overhead of the indirect call.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:45:59 +00:00
Dave Airlie
25660499b6 radv: wrap cs_add_buffer in an inline. (v2)
The next patch will try and avoid calling the indirect function.

v2: add a missing conversion.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:45:59 +00:00
Dave Airlie
31b5da7958 radv: when loading regs no need to add buffer
The function that calls us has just added the buffer to the
list already, no need to try and add it again.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:44:49 +00:00
Dave Airlie
3bf8be41b8 radv: pre-calculate user_data_0 registers and store in pipeline
There's no point recalculating these the whole time on descriptor
emission, just store them at pipeline creation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:44:49 +00:00
Dave Airlie
4bcb48b831 radv: add initial copy descriptor support. (v2)
It appears the latest dota2 vulkan uses this,
and we get a hang in VR mode without it.

v2: remove finishme I left in after finishing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 19:12:39 +00:00
Dave Airlie
60a9705e00 radv: move descriptor sets out of cmd_state.
Instead of storing all the pointers and zeroing them all out,
just store a valid bitmask in the state. This also moves
the CmdBindPipeline path down the cpu usage path for the
multithreading demo as it no longer has to traverse MAX_SETS
to find the active descriptor sets.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:11:03 +00:00
Dave Airlie
3a0d098252 radv: add helper for setting a descriptor.
This is just a simple refactor.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:11:00 +00:00
Dave Airlie
b48063a2f2 radv: move vertex binding out of cmd state.
This isn't required to be cleared, since buffers are only linked
by vertex elements, so if elements are clear then no buffers
should be referenced.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:10:56 +00:00
Dave Airlie
7365626d78 radv: reorder cmd_state to remove a hole.
This just removes a hole in the cmd_state and packs some bools
together.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:10:53 +00:00
Dave Airlie
f0ae06a13c radv: free attachments on end command buffer.
If we allocate attachments in the begin command buffer due to the
render pass continue bit, we were leaking them.

Since renderpasses inside a cmd buffer malloc/free these properly,
and set to NULL, we just need to call free at end.

Fixes a memory leak with multithreading demo.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:03:47 +00:00
Bas Nieuwenhuizen
608af05ffb radv: Optimize calling radv_save_descriptors.
uint32_t data[MAX_SETS * 2] = {}; was getting executed before
the exit and took significant amounts of time. By having the
check outside the function, we skip the execution of the clear.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-04 20:18:17 +01:00
Bas Nieuwenhuizen
cecbcf4b2d radv: Use an array to store descriptor sets.
The vram_list linked list resulted in lots of pointer chasing.
Replacing this with an array instead improves descriptor set
allocation CPU usage by 3x at least (when also considering the free),
because it had to iterate through 300-400 sets on average.

Not a huge improvement as the pre-improvement CPU usage was only
about 2.3% in the busiest thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-04 20:18:17 +01:00
Timothy Arceri
6e2eb96b64 ac: remove the remaining duplicate llvm types
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
e73a467005 ac: remove usused v4f32
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
7f4966731f ac: add v2f32 to the common code and make use of it
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
cd6cfd1095 ac: use the ac f16 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
8f651ae062 ac: use the ac f32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
368654a299 ac: use the ac f64 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
d927db0672 ac: use the common v8i32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
9db51b2393 ac: use the common v4i32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
ee376ac6f4 ac: add v3i32 to the common code and make use of it
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
309a51411d ac: add v2i32 to the common code and use it
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
c64cfa0392 ac: use the ac i64 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00