This should drop the CPU overhead of processing buffers on SKL+ by
dropping some of the logic contained in anv_reloc_list_add() whenever we
have enough compile-time information to know we have softpin.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11236>
The relocation list currently serves two purposes. One is for
relocations on older non-softpin platforms. The second is to keep track
of driver-managed BOs which are used by the given command buffer. We
going to need a mechanism to add BOs to the command buffer without doing
a relocation into the batch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11236>
Whenever we have the GFX_VERx10 macro available, we can make use_softpin
a compile-time thing for everything but Broadwell and Cherryview. This
should save us some CPU cycles especially on SKL+.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11236>
Softpin was added to i915 in
commit 506a8e87d8d2746b9e9d2433503fe237c54e4750
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Dec 8 11:55:07 2015 +0000
drm/i915: Add soft-pinning API for execbuffer
which was included in Linux 4.5. It's been over 5 years so it's
probably reasonable to make it a hard requirement.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11236>
Mesh and Task shaders can use workgroup memory, so generalize its
handling in anv by moving it from anv_pipeline_compile_cs() to
anv_pipeline_lower_nir().
Update Pipeline Statistics accordingly.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11230>
Move it out the "cs" sub-struct, since the bit will be used for other
shader stages in the future.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>
This drops
-0000000000011e90 R isl_format_layouts
+0000000000008f48 R isl_format_layouts
I think that's about 36k.
Thanks to Jason for suggesting PACKED
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11232>
Be consistent with other usages in Vulkan and SPIR-V, and the recently
added workgroup_size field.
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
Since 40e1d798c6, we are now using physical register numbers for
everything which makes it all simpler. In particular, we no longer need
the special case for setting up the payload for SIMD16 on Gen4-5. This
fixes a pile of piglit tests on ILK and similar.
Fixes: 40e1d798c6 "intel/fs: Use ra_alloc_contig_reg_class()..."
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11221>
The format enum space isn't necessarily contiguous so we can't assume
that if it's in the table it's valid. We need to check something.
Fixes: ed6e586562 "intel: properly constify isl_format_layouts"
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11191>
By using the new class type, we don't need to make 1928 different
registers to represent each contigous reg size starting from the actual
128 HW register, or have a mapping between RA regs and HW base regs. With
the number of regs reduced, and the fast q computation when using the new
classes, we no longer need to compute our own q.
This drops the FS RA initialization time on my CFL system from about 1ms to
50us.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
Putting a const char * in the struct means it's a pointer that has to be
resolved at rtld time, which means it can be in .data.rel.ro but not
.rodata like you'd hope. Fix this with the usual string table trick.
Cuts about 20k (-80k read-write +60k read-only) and ~280 relocations
from the gallium driver.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11168>
This avoids multiple copies as we will need this in multiple places.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10366>
In 2db8867943, we introduced a new meta-op MOV_FOR_SCRATCH which is
identical to MOV except it lets us identify MOVs emitted during spilling
so we know not to re-spill those instructions. We emit them from
shuffle_for_64bit_data whenever the new for_scratch parameter is true.
Unfortunately, we missed the one used for resolving swizzles.
Fixes: 2db8867943 "intel/vec4: Don't spill fp64 registers more..."
Tested-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11155>
Iris only runs on BDW+ and ANV already handles this by not even trying
on anything older than HSW. The only driver benefiting from this common
check is i965. Moving it out makes the pass more generic and if some
driver comes along which can push UBOs on IVB, it should work for that.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11145>
Fixes the following building error:
FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/i965_dri_intermediates/LINKED/i965_dri.so
...
ld.lld: error: undefined symbol: brw_compile_ff_gs_prog
>>> referenced by brw_ff_gs.c:56 (external/mesa/src/mesa/drivers/dri/i965/brw_ff_gs.c:56)
Fixes: 52e426fd8b ("intel/compiler: add support for compiling fixed function gs")
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10718>
Note that CCS isn't ambiguated during a HiZ ambiguate. Dumping the CCS
surface after a HiZ ambiguate shows that the CCS is unchanged.
Fixes: 98dc7f56b7 ("intel/isl: Add a separate ISL_AUX_USAGE_HIZ_CCS_WT")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9112>
There is an effort to drop the "Gen" prefix from much of our codebase.
This just applies this to the metrics.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10930>
Adding a register, similar to what was done for RenderBasic.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10930>
The availability of those counters depends on the topology.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10930>