fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 15:50:11 +01:00

Author	SHA1	Message	Date
Jason Ekstrand	ebad00d9e7	anv: Delete dead shader constant pushing code As of `2d78e55a8c`, nir_intrinsic_load_constant with a constant offset is constant-folded so we should never end up with any that trigger brw_nir_analyze_ubo_ranges. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	0709c0f6b4	anv: Flatten descriptor bindings in anv_nir_apply_pipeline_layout This lets us stop tracking the pipeline layout. It also means less indirection on a very hot path. As an extra bonus, we can make some of our data structures smaller. No measurable CPU overhead improvement. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	fa120cb31c	anv: Input attachments are always single-plane Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Rafael Antognolli	d4f628235e	anv: Use mocs settings from isl_dev. v2: Remove device->default_mocs and external_mocs (Jason). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-12 20:41:52 +00:00
Lionel Landwerlin	c1c346f166	anv: implement VK_KHR_separate_depth_stencil_layouts v2: Use ternary to simplify code (Jason) v3: Reorder switch cases to follow existing section ordering (Nanley) Add missing comment in cmd_buffer_end_subpass() about new layout (Nanley) v4: Fix layout comparison for stencil case (Nanley) Update a few more comments (Nanley) Move VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL_KHR in color attachment case for future stencil-CCS support (Nanley) v5: Missed comments update (Nanley) Updated relnotes.txt (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-11-06 20:13:30 +00:00
Jason Ekstrand	f60ef0fff4	anv: Move the RT BTI flush workaround to begin_subpass Now that we're no longer compacting binding table entries, the only time they can possibly change is when we actually switch subpasses. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-31 21:07:15 +00:00
Jason Ekstrand	853d3b59fd	anv: Allocate misc BOs from the cache Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	0a6d2593b8	anv: Allocate descriptor buffers from the BO cache Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	c4be72934e	anv: Fix a relocation race condition Previously, we would read the offset from the BO in anv_reloc_list_add to generate the presumed offset and then again in the caller to compute the 64-bit address to write into the buffer. However, if the offset somehow changed between these two points, the presumed offset would no longer match the written offset. This is unlikely to actually ever be a problem in practice because the presumed offset gets recorded first and so if the written address is wrong then the presumed offset is almost certainly wrong and the relocation will trigger. However, it's much safer to simply have anv_reloc_list_add return the 64-bit address. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Rafael Antognolli	3c317e8187	anv: Add Tile Cache Flush for Unified Cache.	2019-10-30 19:51:03 +00:00
Jason Ekstrand	beca63c6c0	anv: Avoid emitting UBO surface states that won't be used This shaves around 4-5% off of a CPU-limited example running with the Dawn WebGPU implementation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 16:05:57 +00:00
Plamena Manolova	4fe2317601	anv: Implement new way for setting streamout buffers. For gen12 we set the streamout buffers using 4 separate commands instead of 3DSTATE_SO_BUFFER. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-29 19:21:20 +00:00
Nanley Chery	6451008e8b	intel: Refactor blorp_can_hiz_clear_depth() Prepare this function to be used in iris and to handle new Gen12 behavior. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	300d77c2fa	anv/cmd_buffer: Don't assume CCS_E includes CCS_D There's no longer a clear-only compression mode of CCS on Gen12+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Jordan Justen	7737f56544	anv/gen12: Write GFX_AUX_TABLE base address register Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-28 00:09:14 -07:00
Lionel Landwerlin	2b5f30b1d9	anv: implement VK_INTEL_performance_query v2: Introduce the appropriate pipe controls Properly deal with changes in metric sets (using execbuf parameter) Record marker at query end v3: Fill out PerfCntr1&2 v4: Introduce vkUninitializePerformanceApiINTEL v5: Use new execbuf extension mechanism v6: Fix comments in genX_query.c (Rafael) Use PIPE_CONTROL workarounds (Rafael) Refactor on the last kernel series update (Lionel) v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:15 +00:00
Jordan Justen	9790cfcefa	anv,iris: L3ALLOC register replaces L3CNTLREG for gen12 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-06 13:11:25 -07:00
Francisco Jerez	c2fe7a0fb8	anv/gen9: Optimize slice and subslice load balancing behavior. See "i965/gen9: Optimize slice and subslice load balancing behavior." for the rationale. According to Jason, improves Aztec Ruins performance by 2.7%. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) v2: Undo CPU performance micro-optimization done in i965 and iris due to lack of data justifying it on anv. Use cmd_buffer_apply_pipe_flushes wrapper instead of emitting pipe control command directly. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-12 14:40:21 -07:00
Lionel Landwerlin	cefb4341b7	anv: drop unused code We stopped using this when we moved to Jason's mi_builder. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-09 17:01:38 +03:00
Jason Ekstrand	bc612536eb	anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D There is an object-level preemption workaround which requires this. However, even without object-level preemption, we seem to have issues with geometry flickering when 3D and compute are combined in the same batch and this appears to fix it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109630 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111267 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-06 05:46:28 +00:00
Eric Engestrom	e775b938b2	intel: drop incorrect MAYBE_UNUSED All these are actually always used. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Sagar Ghuge	806e5a37ed	anv: Implement VK_KHR_imageless_framebuffer v2: Pass pointer instead of struct instance (Lionel) v3: 1) Fix small nits (Jason) 2) Add way to detect anv_framebuffer don't have attachments (Jason) 3) Get rid of unncessary pNext chain walk (Jason) 4) Keep framebuffer instance in anv_cmd_state (Jason) v4: 1) Dump attachments from cmd_buffer (Jason) v5: 1) Fix condition check and add assertion (Lionel) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-23 10:01:45 -07:00
Jason Ekstrand	6a2ff217b8	anv: Set Stateless Data Port Access MOCS This is the MOCS setting used for the A64 stateless messages which we sometimes use for SSBO operations. Fixes: `48ed2a7bb0` "anv: Implement VK_EXT_buffer_device_address" Fixes: `79fb0d27f3` "anv: Implement SSBOs bindings with GPU addr..." Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-10 19:35:23 +00:00
Jason Ekstrand	9672b7044c	anv: Set STATE_BASE_ADDRESS upper bounds on gen7 This should fix floating-point border color on all gen7 HW. Integer is still thoroughly busted on gen7 because it doesn't exist on IVB and it's crazy on HSW. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-17 18:53:07 -05:00
Jason Ekstrand	f3ea0cf828	anv: Add stencil texturing support for gen7 Intel hardware didn't get support for sampling from W-tiled (required for stencil) images until Broadwell so we can't directly sample from stencil. Instead, if we want to support stencil texturing on gen7 hardware, we have to keep a texture-capable shadow copy around and use BLORP to update when stencil changes. The one thing this commit does not implement is self-dependencies with stencil input attachments. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99493 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-17 22:32:26 +00:00
Jason Ekstrand	2b736d9e6c	anv/cmd_buffer: Add a stencil transition helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-17 22:32:26 +00:00
Jason Ekstrand	86fc268142	anv/blorp: Take an aspect in anv_image_copy_to_shadow Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-17 22:32:26 +00:00
Ville Syrjälä	6230bfeb65	anv/cmd_buffer: Reuse gen8 Cmd{Set, Reset}Event on gen7 Modern DXVK requires event support [1], but looks like it only uses vkCmdSetEvent() + vkGetEventStatus(). So we can just borrow the relevant code from gen8, leaving CmdWaitEvents still unimplemented. [1] `8c3900c533` v2: Also move CmdWaitEvents into genX_cmd_buffer.c (Jason) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-11 16:25:07 -05:00
Nanley Chery	b4198e792c	anv/cmd_buffer: Initalize the clear color struct for CNL+ On CNL+, the clear color struct is composed of RGBA channel values and fields which are either reserved by the HW or used to control fast-clears. Currently anv initializes the channel values to zero and allows the other fields to be undefined. Satisfy the MBZ field requirements by removing an optimization that doesn't hold true for CNL+ and pulling in the number of dwords to initialize from ISL. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-07 18:43:06 +00:00
Caio Marcelo de Oliveira Filho	f7d53fffa2	anv: Remove special allocation for anv_push_constants The key reason for that mechanism is gone: all the extra optional data that could be in the anv_push_constants was moved elsewhere. At this point, just put anv_push_constants directly in anv_cmd_state (part of anv_cmd_buffer). v2: Remove a NULL check we don't need anymore in anv_cmd_buffer_push_constants(). (Lionel) Fix size we consider for valid push params. (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-09 19:01:14 -07:00
Jason Ekstrand	e6803f6b6f	anv: Use bindless textures and samplers This commit changes anv to put bindless handles and sampler pointers into the descriptor buffer and use those instead of bindful when we run out of binding table space. This "spilling" of descriptors allows to to advertise an almost unbounded number of images and samplers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	3b755b52e8	anv: Put image params in the descriptor set buffer on gen8 and earlier This is really where they belong; not push constants. The one downside here is that we can't push them anymore for compute shaders. However, that's a general problem and we should figure out how to push descriptor sets for compute shaders. This lets us bump MAX_IMAGES to 64 on BDW and earlier platforms because we no longer have to worry about push constant overhead limits. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	83b943cc2f	anv: Make all VkDeviceMemory BOs resident permanently We spend a lot of time in the driver adding things to hash sets to track residency. The reality is that a properly built Vulkan app uses large memory objects and sub-allocates from them. In a typical frame, most of if not all of those allocations are going to be resident for the entire frame so we're really not saving ourselves much by tracking fine-grained residency. Just throwing everything in the validation list does make it a little bit more expensive inside the kernel to walk the list and ensure that all our VA is in order. However, without relocations, the overhead of that is pretty small. If we ever do run into a memory pressure situation where the fine- grained residency could even potentially help, we would likely be swapping one page out to make room for another within the draw call and performance is totally lost at that point. We're better off swapping out other apps and just letting ours run a whole frame. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	0d6dea0ac8	anv/cmd_buffer: Use gen_mi_sub instead of gen_mi_add with a negative Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	d17dd46b09	anv: Move mi_memcpy and mi_memset to gen_mi_builder Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	48da45891e	anv: Use gen_mi_builder for conditional rendering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	a3b0894afc	anv: Use gen_mi_builder for indirect dispatch Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	b829dc30c1	anv: Use gen_mi_builder for indirect draw parameters Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	0122a6f037	anv: Use gen_mi_builder for computing resolve predicates Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Jason Ekstrand	83b46ad6d8	anv: Use gen_mi_builder for CmdDrawIndirectByteCount Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-11 18:04:09 +00:00
Danylo Piliaiev	ecb98c6898	anv: Treat zero size XFB buffer as disabled Vulkan spec doesn't explicitly forbid zero size transform feedback buffers. Having zero size xfb caused SurfaceSize overflow and triggered assert in debug build. The only way to have zero size SO_BUFFER is to disable SO_BUFFER as stated in hardware spec. From SKL PRM, Vol 2a, "3DSTATE_SO_BUFFER": "If set, stream output to SO Buffer is enabled, if 3DSTATE_STREAMOUT::SO Function ENABLE is also enabled. If clear, the SO Buffer is considered "not bound" and effectively treated as a zero- length buffer for the purposes of SO output and overflow detection. If an enabled stream's Stream to Buffer Selects includes this buffer it is by definition an overflow condition. That stream will cause no writes to occur, and only SO_PRIM_STORAGE_NEEDED[<stream>] will increment." Fixes: `36ee2fd61c` "anv: Implement the basic form of VK_EXT_transform_feedback" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-18 16:09:42 +00:00
Jason Ekstrand	c520f4dec9	anv: Add a concept of a descriptor buffer This buffer goes along side the CPU data structure and may contain pointers, bindless handles, or any other descriptor information. Currently, all descriptors are size zero and nothing goes in the buffer but this commit sets up the framework we will need later. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	4c50b7c92c	anv: Count image param entries rather than images This is what we're actually storing in the descriptor set and consuming when we bind surface states. This commit renames image_count to image_param_count a few places and moves the decision to not count image params on gen9+ into anv_descriptor_set.c when we build the layout. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Jason Ekstrand	65ee5cc0da	anv: Use an actual binding for gl_NumWorkgroups This commit moves our handling of gl_NumWorkgroups over to work like our handling of other special bindings in the Vulkan driver. We give it a magic descriptor set number and teach emit_binding_tables to handle it. This is better than the bias mechanism we were using because it allows us to do proper accounting through the bind map mechanism. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-04 23:56:40 +00:00
Jason Ekstrand	9b202239ba	anv: Silence some compiler warnings in release builds Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-14 16:04:45 -06:00
Jason Ekstrand	2be89cbd82	anv: Implement vkCmdDrawIndirectByteCountEXT Annoyingly, this requires that we implement integer division on the command streamer. Fortunately, we're only ever dividing by constants so we can use the mulh+add+shift trick and it's not as bad as it sounds. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	36ee2fd61c	anv: Implement the basic form of VK_EXT_transform_feedback Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Lionel Landwerlin	ad99c1670a	intel/genxml: add missing MI_PREDICATE compare operations Doesn't save us a great deal of lines but at least they get decoded in aubinators. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-01-19 15:47:36 +00:00
Lionel Landwerlin	3c4c18341a	anv: narrow flushing of the render target to buffer writes In commit `9a7b319903` ("anv/query: flush render target before copying results") we tracked all the render target writes to apply a flushes in the vkCopyQueryResults(). But we can narrow this down to only when we write a buffer (which is the only input of vkCopyQueryResults). v2: Drop newer render target write flags introduce by `1952fd8d2c` ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2019-01-19 15:45:41 +00:00
Danylo Piliaiev	1952fd8d2c	anv: Implement VK_EXT_conditional_rendering for gen 7.5+ Conditional rendering affects next functions: - vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect - vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR - vkCmdDispatch, vkCmdDispatchIndirect, vkCmdDispatchBase - vkCmdClearAttachments Value from conditional buffer is cached into designated register, MI_PREDICATE is emitted every time conditional rendering is enabled and command requires it. v2: by Jason Ekstrand - Use vk_find_struct_const instead of manually looping - Move draw count loading to prepare function - Zero the top 32-bits of MI_ALU_REG15 v3: Apply pipeline flush before accessing conditional buffer (The issue was found by Samuel Iglesias) v4: - Remove support of Haswell due to possible hardware bug - Made TMP_REG_PREDICATE and TMP_REG_DRAW_COUNT defines to define registers in one place. v5: thanks to Jason Ekstrand and Lionel Landwerlin - Workaround the fact that MI_PREDICATE_RESULT is not accessible on Haswell by manually calculating MI_PREDICATE_RESULT and re-emitting MI_PREDICATE when necessary. v6: suggested by Lionel Landwerlin - Instead of calculating the result of predicate once - re-emit MI_PREDICATE to make it easier to investigate error states. v7: suggested by Jason - Make anv_pipe_invalidate_bits_for_access_flag add CS_STALL if VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT is set. v8: suggested by Lionel - Precompute conditional predicate's result to support secondary command buffers. - Make prepare_for_draw_count_predicate more readable. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-18 18:31:44 +00:00

... 3 4 5 6 7 ...

551 commits