Commit graph

112841 commits

Author SHA1 Message Date
Chih-Wei Huang
a74285def2 android: radv: import include paths from used libraries
It's unnecessary to manually add these include paths since they could
be imported automatically.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-07-10 08:56:28 +02:00
Chih-Wei Huang
f982c6789c android: anv: import include path of libmesa_nir
Add libmesa_nir to a common LOCAL_STATIC_LIBRARIES defined by
ANV_STATIC_LIBRARIES so that its include path can be imported
automatically. Then ANV_INCLUDES is unnecessary and could be
eliminated.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-07-10 08:56:23 +02:00
Chih-Wei Huang
5cb61f27d0 android: anv: eliminate libmesa_anv_entrypoints
The dummy library libmesa_anv_entrypoints is totally unnecessary.
The four VULKAN_GENERATED_FILES could be generated and built in
libmesa_vulkan_common directly. The libraries using the generated
headers should get it via the exported include path.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-07-10 08:56:16 +02:00
Chih-Wei Huang
4338e08bd6 android: vulkan/util: fix export path
Export the correct include path so that the libraries use it can
get it automatically.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-07-10 08:56:10 +02:00
Chih-Wei Huang
e2ef281da1 android: radv: fix improper use of LOCAL_WHOLE_STATIC_LIBRARIES
The libmesa_git_sha1 is a dummy library. There is no reason to put
it into LOCAL_WHOLE_STATIC_LIBRARIES.

Move libmesa_vulkan_util to the vulkan.radv which really needs it.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-07-10 08:56:04 +02:00
Chih-Wei Huang
8ff01f0342 android: anv: fix improper use of LOCAL_WHOLE_STATIC_LIBRARIES
The libmesa_anv_entrypoints and libmesa_genxml are dummy libraries.
There is no reason to put them into LOCAL_WHOLE_STATIC_LIBRARIES.

Move libmesa_vulkan_util to the vulkan HAL which really needs it.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-07-10 08:55:59 +02:00
Chih-Wei Huang
352d91ce5b android: radv: remove unused LOCAL_EXPORT_C_INCLUDE_DIRS
The vulkan module is the final HAL. No need to export its headers
since none will import it.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-07-10 08:55:50 +02:00
Chih-Wei Huang
4fb11c01c5 android: anv: remove unused LOCAL_EXPORT_C_INCLUDE_DIRS
The vulkan module is the final HAL. No need to export its headers
since none will import it.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2019-07-10 08:55:42 +02:00
Jason Ekstrand
7e0fcea727 nir/loop_analyze: Pass nir_const_values directly to helpers
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
ff972c7a3a nir/loop_analyze: Properly handle swizzles in loop conditions
This commit re-plumbs all of nir_loop_analyze to use nir_ssa_scalar for
all intermediate values so that we can properly handle swizzles.  Even
though if conditions are required to be scalars, they may still consume
swizzles so you could have ((a.yzw < b.zzx).xz && c.xx).y == 0 as your
loop termination condition.  The old code would just bail the moment it
saw its first non-zero swizzle but we can now properly chase the scalar
from the if condition to all the way to a, b, and c.

Shader-db results on Kaby Lake:

    total loops in shared programs: 4388 -> 4364 (-0.55%)
    loops in affected programs: 29 -> 5 (-82.76%)
    helped: 29
    HURT: 5

Shader-db results on Haswell:

    total loops in shared programs: 4370 -> 4373 (0.07%)
    loops in affected programs: 2 -> 5 (150.00%)
    helped: 2
    HURT: 5

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
0333649e63 nir/loop_analyze: Refactor detection of limit vars
This commit reworks both get_induction_and_limit_vars() and
try_find_trip_count_vars_in_iand to return true on success and not
modify their output parameters on failure.  This makes their callers
significantly simpler.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
8f7405ed9d nir: Add some helpers for chasing SSA values properly
There are various cases in which we want to chase SSA values through ALU
ops ranging from hand-written optimizations to back-end translation
code.  In all these cases, it can be very tricky to do properly because
of swizzles.  This set of helpers lets you easily work with a single
component of an SSA def and chase through ALU ops safely.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
9a3cb6f5fe nir/loop_analyze: Bail if we encounter swizzles
None of the current code knows what to do with swizzles.  Take the safe
option for now and bail if we see one.  This does have a small shader-db
impact but it is at least safe.

Shader-db results on Kaby Lake:

    total loops in shared programs: 4364 -> 4388 (0.55%)
    loops in affected programs: 5 -> 29 (480.00%)
    helped: 5
    HURT: 29

Shader-db results on Haswell:

    total loops in shared programs: 4373 -> 4370 (-0.07%)
    loops in affected programs: 5 -> 2 (-60.00%)
    helped: 5
    HURT: 2

Fixes: 6772a17acc "nir: Add a loop analysis pass"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
6455fa9710 nir/loop_analyze: Use new eval_const_* helpers in test_iterations
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
268ad47c11 nir/loop_analyze: Handle bit sizes correctly in calculate_iterations
The current code assumes everything is 32-bit which is very likely true
but not guaranteed by any means.  Instead, use nir_eval_const_opcode to
do the calculations in a bit-size-agnostic way.  We also use the new
constant constructors to build the correct size constants.

Fixes: 6772a17acc "nir: Add a loop analysis pass"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
9f7ffe41dd nir/loop_analyze: Fix phi-of-identical-alu detection
One issue was that the original version didn't check that swizzles
matched when comparing ALU instructions so it could end up matching
very different instructions.  Using the nir_instrs_equal function from
nir_instr_set.c which we use for CSE should be much more reliable.
Another was that the loop assumes it will only run two iterations which
may not be true.  If there's something which guarantees that this case
only happens for phis after ifs, it wasn't documented.

Fixes: 9e6b39e1d5 "nir: detect more induction variables"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
6e984bcb92 nir/instr_set: Expose nir_instrs_equal()
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
64328f947e nir/builder: Use nir_const_value_for_* for constructing immediates
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
3acddc733f nir: Refactor nir_src_as_* constant functions
Now that we have the nir_const_value_as_* helpers, every one of these
functions is effectively the same except for the suffix they use so we
can easily define them with a repeated macro.  This also means that
they're inline and the fact that the nir_src is being passed by-value
should no longer really hurt anything.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Jason Ekstrand
ce5581e23e nir: Add more helpers for working with const values
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-10 00:20:59 +00:00
Chia-I Wu
b44bb8bded virgl: remove virgl_transfer_queue_lists
COMPLETED_LIST is always empty.  We only need one list.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-07-09 14:26:55 -07:00
Chia-I Wu
48aefcbd6b virgl: simplify virgl_transfer_queue_extend
We can reuse virgl_transfer_queue_find_pending.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-07-09 14:26:55 -07:00
Chia-I Wu
eae4527551 virgl: remove transfer after transfer_write
Now that virgl_transfer_queue_is_queued does not search
COMPLETED_LIST, we don't need to move transfers to that list.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-07-09 14:26:55 -07:00
Chia-I Wu
bec2a85c48 virgl: improve virgl_transfer_queue_is_queued
Search only the pending list and return immediately on the first
hit.

When the transfer queue was introduced, the function was used to
deal with

  write transfer -> draw -> write transfer

sequence.  It was used to tell if the second transfer intersects
with the first transfer. If yes, the transfer queue avoided
reordering the second transfer to before the draw (by flushing) in
case the draw uses the transferred data.

With the recent changes to the transfer code, the function is used
to deal with

  write transfer -> readback transfer

We want to avoid reordering the readback transfer to before the
first transfer (also by flushing).

In the old code, we needed to track the compeleted transfers as well
to avoid reordering.  But in the new code, a readback transfer is
guaranteed to see the data from the completed transfers (in other
words, it cannot be reoderered to before the already completed
transfers).  We don't need to search the COMPLETED_LIST.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-07-09 14:26:55 -07:00
Chia-I Wu
5f6aab2ee2 virgl: fix transfers_intersect for mipmaps
We never use transfers_intersect with textures, but fix it anyway to
avoid confusion.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-07-09 14:26:55 -07:00
Chia-I Wu
6ca1bbabbe virgl: fix some false positives in transfers_overlap
Rewrite the function and check z/depth more carefully.  We
intentionally avoid u_box_test_intersection_2d because it returns
true when two boxes touch but do not intersect and can be confusing.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-07-09 14:26:55 -07:00
Marek Olšák
2b2093961e radeonsi/gfx10: enable primitive binning by default
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
9f68367d19 radeonsi/gfx10: implement primitive binning
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
4e56a2aaa8 radeonsi: simplify primitive binning enablement
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
3521297251 radeonsi: set primitive binning tunables for dGPUs
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
d7e80ba1e7 radeonsi: set FLUSH_ON_BINNING_TRANSITION when needed
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
9dbe63ceea radeonsi/gfx10: use the new scan converter when binning is disabled
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
80b3f4b4bd radeonsi/gfx9: fix an oversight in primitive binning code
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
1f53a3e766 radeonsi: use BREAK_BATCH instead of FLUSH_DFSM when CB_TARGET_MASK changes
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
605900d7dd radeonsi/gfx10: don't expose unimplemented PIPE_CAP_QUERY_SO_OVERFLOW
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
270a8ab648 radeonsi/gfx10: launch 2 compute waves per CU before going onto the next CU
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
ab1f36a1d3 radeonsi/gfx10: set more registers and fields
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
9b65f6618c radeonsi/gfx10: enable LATE_ALLOC_GS
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
4985c3ee22 radeonsi/gfx10: set HS/GS/CS.WGP_MODE
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
329406ec9c radeonsi/gfx10: set GE_PC_ALLOC
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
9d1483de3b radeonsi/gfx10: enable 1D textures
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
1d3bffaf9c radeonsi/gfx10: enable image stores with DCC
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
5b50fb9b7f radeonsi/gfx10: no need to invalidate L2 for framebuffer -> texture coherency
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
fbf781e401 radeonsi/gfx10: support pixel shaders without exports
It only works if there are not color and no Z exports.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
2adc8e2736 radeonsi/gfx10: enable vertex shaders without param space allocation
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
07fe51156d radeonsi: update DCC settings from PAL
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
4002913f8d radeonsi: reorder shader IO indices for better IO space usage for tess and GS
The highest used index determines the stride for shader outputs in shaders
that use LDS or memory for outputs.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
1c99a13f89 radeonsi: decrease maximum supported GENERIC varying index from 42 to 31
This can decrease LDS and/or memory usage for shader outputs when geometry
shaders or tessellation is used.

Only PS inputs support higher indices and those aren't eliminated by
kill_outputs.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
6335cc6a58 radeonsi: cosmetic cleanup in si_shader_io_get_unique_index
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Marek Olšák
3be4ed2fe1 radeonsi: fix and clean up shader_type passing
- don't pass it via a parameter if it can be derived from other parameters
- set shader_type for ac_rtld_open
- use enum pipe_shader_type instead of unsigned

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00