It's unnecessary to manually add these include paths since they could
be imported automatically.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Add libmesa_nir to a common LOCAL_STATIC_LIBRARIES defined by
ANV_STATIC_LIBRARIES so that its include path can be imported
automatically. Then ANV_INCLUDES is unnecessary and could be
eliminated.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
The dummy library libmesa_anv_entrypoints is totally unnecessary.
The four VULKAN_GENERATED_FILES could be generated and built in
libmesa_vulkan_common directly. The libraries using the generated
headers should get it via the exported include path.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Export the correct include path so that the libraries use it can
get it automatically.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
The libmesa_git_sha1 is a dummy library. There is no reason to put
it into LOCAL_WHOLE_STATIC_LIBRARIES.
Move libmesa_vulkan_util to the vulkan.radv which really needs it.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
The libmesa_anv_entrypoints and libmesa_genxml are dummy libraries.
There is no reason to put them into LOCAL_WHOLE_STATIC_LIBRARIES.
Move libmesa_vulkan_util to the vulkan HAL which really needs it.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
The vulkan module is the final HAL. No need to export its headers
since none will import it.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
The vulkan module is the final HAL. No need to export its headers
since none will import it.
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
This commit re-plumbs all of nir_loop_analyze to use nir_ssa_scalar for
all intermediate values so that we can properly handle swizzles. Even
though if conditions are required to be scalars, they may still consume
swizzles so you could have ((a.yzw < b.zzx).xz && c.xx).y == 0 as your
loop termination condition. The old code would just bail the moment it
saw its first non-zero swizzle but we can now properly chase the scalar
from the if condition to all the way to a, b, and c.
Shader-db results on Kaby Lake:
total loops in shared programs: 4388 -> 4364 (-0.55%)
loops in affected programs: 29 -> 5 (-82.76%)
helped: 29
HURT: 5
Shader-db results on Haswell:
total loops in shared programs: 4370 -> 4373 (0.07%)
loops in affected programs: 2 -> 5 (150.00%)
helped: 2
HURT: 5
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This commit reworks both get_induction_and_limit_vars() and
try_find_trip_count_vars_in_iand to return true on success and not
modify their output parameters on failure. This makes their callers
significantly simpler.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
There are various cases in which we want to chase SSA values through ALU
ops ranging from hand-written optimizations to back-end translation
code. In all these cases, it can be very tricky to do properly because
of swizzles. This set of helpers lets you easily work with a single
component of an SSA def and chase through ALU ops safely.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
None of the current code knows what to do with swizzles. Take the safe
option for now and bail if we see one. This does have a small shader-db
impact but it is at least safe.
Shader-db results on Kaby Lake:
total loops in shared programs: 4364 -> 4388 (0.55%)
loops in affected programs: 5 -> 29 (480.00%)
helped: 5
HURT: 29
Shader-db results on Haswell:
total loops in shared programs: 4373 -> 4370 (-0.07%)
loops in affected programs: 5 -> 2 (-60.00%)
helped: 5
HURT: 2
Fixes: 6772a17acc "nir: Add a loop analysis pass"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
The current code assumes everything is 32-bit which is very likely true
but not guaranteed by any means. Instead, use nir_eval_const_opcode to
do the calculations in a bit-size-agnostic way. We also use the new
constant constructors to build the correct size constants.
Fixes: 6772a17acc "nir: Add a loop analysis pass"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
One issue was that the original version didn't check that swizzles
matched when comparing ALU instructions so it could end up matching
very different instructions. Using the nir_instrs_equal function from
nir_instr_set.c which we use for CSE should be much more reliable.
Another was that the loop assumes it will only run two iterations which
may not be true. If there's something which guarantees that this case
only happens for phis after ifs, it wasn't documented.
Fixes: 9e6b39e1d5 "nir: detect more induction variables"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Now that we have the nir_const_value_as_* helpers, every one of these
functions is effectively the same except for the suffix they use so we
can easily define them with a repeated macro. This also means that
they're inline and the fact that the nir_src is being passed by-value
should no longer really hurt anything.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
COMPLETED_LIST is always empty. We only need one list.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Now that virgl_transfer_queue_is_queued does not search
COMPLETED_LIST, we don't need to move transfers to that list.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Search only the pending list and return immediately on the first
hit.
When the transfer queue was introduced, the function was used to
deal with
write transfer -> draw -> write transfer
sequence. It was used to tell if the second transfer intersects
with the first transfer. If yes, the transfer queue avoided
reordering the second transfer to before the draw (by flushing) in
case the draw uses the transferred data.
With the recent changes to the transfer code, the function is used
to deal with
write transfer -> readback transfer
We want to avoid reordering the readback transfer to before the
first transfer (also by flushing).
In the old code, we needed to track the compeleted transfers as well
to avoid reordering. But in the new code, a readback transfer is
guaranteed to see the data from the completed transfers (in other
words, it cannot be reoderered to before the already completed
transfers). We don't need to search the COMPLETED_LIST.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
We never use transfers_intersect with textures, but fix it anyway to
avoid confusion.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Rewrite the function and check z/depth more carefully. We
intentionally avoid u_box_test_intersection_2d because it returns
true when two boxes touch but do not intersect and can be confusing.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
It only works if there are not color and no Z exports.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
The highest used index determines the stride for shader outputs in shaders
that use LDS or memory for outputs.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
This can decrease LDS and/or memory usage for shader outputs when geometry
shaders or tessellation is used.
Only PS inputs support higher indices and those aren't eliminated by
kill_outputs.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
- don't pass it via a parameter if it can be derived from other parameters
- set shader_type for ac_rtld_open
- use enum pipe_shader_type instead of unsigned
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>