mesa/src/intel/Makefile.sources

BLORP_FILES = \
blorp/blorp.c \
blorp/blorp.h \
blorp/blorp_blit.c \
blorp/blorp_clear.c \
blorp/blorp_nir_builder.h \
blorp/blorp_genX_exec.h \
blorp/blorp_priv.h
COMMON_FILES = \
common/gen_clflush.h \
common/gen_debug.c \
common/gen_debug.h \
common/gen_decoder.c \
common/gen_decoder.h \
common/gen_defines.h \
common/gen_l3_config.c \
common/gen_l3_config.h \
common/gen_urb_config.c \
common/gen_sample_positions.h \
common/intel_log.c \
common/intel_log.h
COMPILER_FILES = \
compiler/brw_cfg.cpp \
compiler/brw_cfg.h \
compiler/brw_clip.h \
compiler/brw_clip_line.c \
compiler/brw_clip_point.c \
compiler/brw_clip_tri.c \
compiler/brw_clip_unfilled.c \
compiler/brw_clip_util.c \
compiler/brw_compile_clip.c \
compiler/brw_compile_sf.c \
compiler/brw_compiler.c \
compiler/brw_compiler.h \
compiler/brw_dead_control_flow.cpp \
compiler/brw_dead_control_flow.h \
compiler/brw_disasm.c \
compiler/brw_disasm_info.c \
compiler/brw_disasm_info.h \
compiler/brw_eu.c \
compiler/brw_eu_compact.c \
compiler/brw_eu_defines.h \
compiler/brw_eu_emit.c \
compiler/brw_eu.h \
compiler/brw_eu_util.c \
compiler/brw_eu_validate.c \
compiler/brw_fs_builder.h \
compiler/brw_fs_bank_conflicts.cpp \
compiler/brw_fs_cmod_propagation.cpp \
compiler/brw_fs_combine_constants.cpp \
compiler/brw_fs_copy_propagation.cpp \
compiler/brw_fs.cpp \
compiler/brw_fs_cse.cpp \
compiler/brw_fs_dead_code_eliminate.cpp \
compiler/brw_fs_generator.cpp \
compiler/brw_fs.h \
compiler/brw_fs_live_variables.cpp \
compiler/brw_fs_live_variables.h \
compiler/brw_fs_lower_conversions.cpp \
compiler/brw_fs_lower_pack.cpp \
compiler/brw_fs_nir.cpp \
compiler/brw_fs_reg_allocate.cpp \
compiler/brw_fs_register_coalesce.cpp \
compiler/brw_fs_saturate_propagation.cpp \
compiler/brw_fs_sel_peephole.cpp \
compiler/brw_fs_surface_builder.cpp \
compiler/brw_fs_surface_builder.h \
compiler/brw_fs_validate.cpp \
compiler/brw_fs_visitor.cpp \
compiler/brw_inst.h \
compiler/brw_interpolation_map.c \
compiler/brw_ir_allocator.h \
compiler/brw_ir_fs.h \
compiler/brw_ir_vec4.h \
compiler/brw_nir.h \
compiler/brw_nir.c \
compiler/brw_nir_analyze_boolean_resolves.c \
compiler/brw_nir_analyze_ubo_ranges.c \
compiler/brw_nir_attribute_workarounds.c \
compiler/brw_nir_lower_cs_intrinsics.c \
compiler/brw_nir_opt_peephole_ffma.c \
compiler/brw_nir_tcs_workarounds.c \
compiler/brw_packed_float.c \
compiler/brw_predicated_break.cpp \
compiler/brw_reg.h \
compiler/brw_reg_type.c \
compiler/brw_reg_type.h \
compiler/brw_schedule_instructions.cpp \
compiler/brw_shader.cpp \
compiler/brw_shader.h \
compiler/brw_vec4_builder.h \
compiler/brw_vec4_cmod_propagation.cpp \
compiler/brw_vec4_copy_propagation.cpp \
compiler/brw_vec4.cpp \
compiler/brw_vec4_cse.cpp \
compiler/brw_vec4_dead_code_eliminate.cpp \
compiler/brw_vec4_generator.cpp \
compiler/brw_vec4_gs_visitor.cpp \
compiler/brw_vec4_gs_visitor.h \
compiler/brw_vec4.h \
compiler/brw_vec4_live_variables.cpp \
compiler/brw_vec4_live_variables.h \
compiler/brw_vec4_nir.cpp \
compiler/brw_vec4_gs_nir.cpp \
compiler/brw_vec4_reg_allocate.cpp \
compiler/brw_vec4_surface_builder.cpp \
compiler/brw_vec4_surface_builder.h \
compiler/brw_vec4_tcs.cpp \
compiler/brw_vec4_tcs.h \
compiler/brw_vec4_tes.cpp \
compiler/brw_vec4_tes.h \
compiler/brw_vec4_visitor.cpp \
compiler/brw_vec4_vs_visitor.cpp \
compiler/brw_vec4_vs.h \
compiler/brw_vue_map.c \
compiler/brw_wm_iz.cpp \
compiler/gen6_gs_visitor.cpp \
compiler/gen6_gs_visitor.h
COMPILER_GENERATED_FILES = \
compiler/brw_nir_trig_workarounds.c
DEV_FILES = \
dev/gen_device_info.c \
dev/gen_device_info.h
GENXML_XML_FILES = \
genxml/gen4.xml \
genxml/gen45.xml \
genxml/gen5.xml \
genxml/gen6.xml \
genxml/gen7.xml \
genxml/gen75.xml \
genxml/gen8.xml \
genxml/gen9.xml \
genxml/gen10.xml \
genxml/gen11.xml
GENXML_GENERATED_PACK_FILES = \
genxml/gen4_pack.h \
genxml/gen45_pack.h \
genxml/gen5_pack.h \
genxml/gen6_pack.h \
genxml/gen7_pack.h \
genxml/gen75_pack.h \
genxml/gen8_pack.h \
genxml/gen9_pack.h \
genxml/gen10_pack.h \
genxml/gen11_pack.h
GENXML_GENERATED_FILES = \
$(GENXML_GENERATED_PACK_FILES) \
genxml/genX_bits.h \
genxml/genX_xml.h
ISL_FILES = \
isl/isl.c \
isl/isl.h \
isl/isl_drm.c \
isl/isl_format.c \
isl/isl_genX_priv.h \
isl/isl_priv.h \
isl/isl_storage_image.c
ISL_GEN4_FILES = \
isl/isl_gen4.c \
isl/isl_gen4.h \
isl/isl_emit_depth_stencil.c \
isl/isl_surface_state.c
ISL_GEN5_FILES = \
isl/isl_emit_depth_stencil.c \
isl/isl_surface_state.c
ISL_GEN6_FILES = \
isl/isl_gen6.c \
isl/isl_gen6.h \
isl/isl_emit_depth_stencil.c \
isl/isl_surface_state.c
ISL_GEN7_FILES = \
isl/isl_gen7.c \
isl/isl_gen7.h \
isl/isl_emit_depth_stencil.c \
isl/isl_surface_state.c
ISL_GEN75_FILES = \
isl/isl_emit_depth_stencil.c \
isl/isl_surface_state.c
ISL_GEN8_FILES = \
isl/isl_gen8.c \
isl/isl_gen8.h \
isl/isl_emit_depth_stencil.c \
isl/isl_surface_state.c
ISL_GEN9_FILES = \
isl/isl_gen9.c \
isl/isl_gen9.h \
isl/isl_emit_depth_stencil.c \
isl/isl_surface_state.c
ISL_GEN10_FILES = \
isl/isl_emit_depth_stencil.c \
isl/isl_surface_state.c
ISL_GEN11_FILES = \
isl/isl_emit_depth_stencil.c \
isl/isl_surface_state.c
ISL_GENERATED_FILES = \
isl/isl_format_layout.c
VULKAN_FILES := \
vulkan/anv_allocator.c \
vulkan/anv_batch_chain.c \
vulkan/anv_blorp.c \
vulkan/anv_cmd_buffer.c \
vulkan/anv_descriptor_set.c \
vulkan/anv_device.c \
vulkan/anv_dump.c \
vulkan/anv_formats.c \
vulkan/anv_genX.h \
vulkan/anv_image.c \
vulkan/anv_intel.c \
vulkan/anv_nir.h \
vulkan/anv_nir_apply_pipeline_layout.c \
vulkan/anv_nir_lower_input_attachments.c \
vulkan/anv_nir_lower_multiview.c \
vulkan/anv_nir_lower_push_constants.c \
vulkan/anv_nir_lower_ycbcr_textures.c \
vulkan/anv_pass.c \
vulkan/anv_pipeline.c \
vulkan/anv_pipeline_cache.c \
vulkan/anv_private.h \
vulkan/anv_queue.c \
vulkan/anv_util.c \
vulkan/anv_wsi.c \
vulkan/vk_format_info.h
VULKAN_ANDROID_FILES := \
vulkan/anv_android.c
VULKAN_WSI_WAYLAND_FILES := \
vulkan/anv_wsi_wayland.c
VULKAN_WSI_X11_FILES := \
vulkan/anv_wsi_x11.c
VULKAN_GEM_FILES := \
vulkan/anv_gem.c
VULKAN_GEM_STUB_FILES := \
vulkan/anv_gem_stubs.c
VULKAN_GENERATED_FILES := \
vulkan/anv_entrypoints.c \
vulkan/anv_entrypoints.h \
vulkan/anv_extensions.c \
vulkan/anv_extensions.h
VULKAN_GENX_FILES := \
vulkan/genX_blorp_exec.c \
vulkan/genX_cmd_buffer.c \
vulkan/genX_gpu_memcpy.c \
vulkan/genX_pipeline.c \
vulkan/genX_query.c \
vulkan/genX_state.c
VULKAN_GEN7_FILES := \
vulkan/gen7_cmd_buffer.c \
$(VULKAN_GENX_FILES)
VULKAN_GEN75_FILES := \
vulkan/gen7_cmd_buffer.c \
$(VULKAN_GENX_FILES)
VULKAN_GEN8_FILES := \
vulkan/gen8_cmd_buffer.c \
$(VULKAN_GENX_FILES)
VULKAN_GEN9_FILES := \
vulkan/gen8_cmd_buffer.c \
$(VULKAN_GENX_FILES)
VULKAN_GEN10_FILES := \
vulkan/gen8_cmd_buffer.c \
$(VULKAN_GENX_FILES)
VULKAN_GEN11_FILES := \
vulkan/gen8_cmd_buffer.c \
$(VULKAN_GENX_FILES)