fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 13:30:12 +01:00

Author	SHA1	Message	Date
Kenneth Graunke	f7d4ebbf86	iris: add hooks to call INTEL_MEASURE These hooks were written in the initial IRIS_MEASURE implementation. Minor changes by Mark Janes <markjanes@swizzler.org> to adapt to the INTEL_MEASURE reimplementation. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>	2021-02-01 17:24:57 -08:00
Mark Janes	b338bb70e0	iris: add an iris_context reference to iris_batch This eliminates the need to use container_of in error handling code. INTEL_MEASURE will need to access the iris context from each batch. suggested-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>	2021-02-01 17:24:57 -08:00
Kenneth Graunke	9d63547f2f	iris: Properly handle new unbind_num_trailing_slots parameters Commits 0278d1fa323cf1f289..b688ea31fcf7e20436 added a new parameter to set_vertex_buffers(), set_shader_images(), and set_sampler_views() which specifies a number of trailing slots to unbind. They updated the iris functions to do the unbinding, but didn't update the code to mark which things are bound in the bitfields. This meant that later code would assume those unbound slots were bound, and crash on a NULL dereference. All that's needed is to add that slot count when unbinding things in the bitfield. Fixes: `0278d1fa32` ("gallium: add unbind_num_trailing_slots to set_vertex_buffers") Fixes: `72ff66c3d7` ("gallium: add unbind_num_trailing_slots to set_shader_images") Fixes: `b688ea31fc` ("gallium: add unbind_num_trailing_slots to set_sampler_views") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8758>	2021-01-28 00:54:22 -08:00
Marek Olšák	27dcb46629	gallium: add take_ownership param into set_vertex_buffers to eliminate atomics There are a few places (mainly u_threaded_context) that do: set_vertex_buffers(...); for (i = 0; i < count; i++) pipe_resource_reference(&buffers[i].resource.buffer, NULL); set_vertex_buffers increments the reference counts while the loop decrements them. This commit eliminates those reference count changes by adding a parameter into set_vertex_buffers that tells the callee to accept all buffers without incrementing the reference counts. AMD Zen benefits from this because it has slow atomics if they come from different CCXs. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>	2021-01-27 23:53:35 +00:00
Marek Olšák	b688ea31fc	gallium: add unbind_num_trailing_slots to set_sampler_views Instead of calling this function again to unbind trailing slots, extend it to do it when binding. This reduces CPU overhead. A lot of drivers ignore "start" and always unbind all slots after "count". Such drivers don't need any changes here. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>	2021-01-27 23:53:35 +00:00
Marek Olšák	72ff66c3d7	gallium: add unbind_num_trailing_slots to set_shader_images Instead of calling this function again to unbind trailing slots, extend it to do it when images are being set. This reduces CPU overhead. Only st/mesa benefits. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>	2021-01-27 23:53:34 +00:00
Marek Olšák	0278d1fa32	gallium: add unbind_num_trailing_slots to set_vertex_buffers Instead of calling this function again to unbind trailing slots, extend it to do it as part of the call that sets vertex buffers. This reduces CPU overhead. Only st/mesa benefits from this. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>	2021-01-27 23:53:34 +00:00
Marek Olšák	a51d4b10f1	gallium: add take_ownership param into set_constant_buffer to eliminate atomics We often do this: pipe->set_constant_buffer(pipe, shader, slot, &cb); pipe_resource_reference(&cb->buffer, NULL); That results in atomic increment in set_constant_buffer followed by atomic decrement after set_constant_buffer. This new interface eliminates those atomics. For the case above, this should be used instead: pipe->set_constant_buffer(pipe, shader, slot, true, &cb); cb->buffer = NULL; // if cb is not a local variable, else do nothing AMD Zen benefits from this. The perf improvement is ~3% for Viewperf13/Catia. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>	2021-01-27 23:53:34 +00:00
Kenneth Graunke	939bc0c588	iris: Reconfigure the URB only if it's necessary or possibly useful Reconfiguring the URB partitioning is likely to cause shader stalls, as the dividing line between each stage's section of memory is moving. (Technically, 3DSTATE_URB_* are pipelined commands, but that mostly means that the command streamer doesn't need to stall.) So it should be beneficial to update the URB configuration less often. If the previous URB configuration already has enough space for our current shader's needs, we can just continue using it, assuming we are able to allocate the maximum number of URB entries per stage. However, if we ran out of URB space and had to limit the number of URB entries for a stage, and the per-entry size is larger than we need, we should reconfigure it to try and improve concurrency. So, we begin tracking the last URB configuration in the context, and compare against that when updating shader variants. Cuts 36% of the URB reconfigurations (excluding BLORP) from a Shadow of Mordor trace, and 46% from a GFXBench Manhattan 3.0 trace. One nice thing is that this removes the need to look at the old prog_data when updating shaders, which should make it possible to unbind shader variants without causing spurious URB updates. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8721>	2021-01-27 18:30:54 +00:00
Kenneth Graunke	a710145b5b	intel: Produce a "constrained" output from gen_get_urb_config() When calculating a URB configuration, we start with a notion of how much space each stage /wants/ (to achieve the maximum amount of concurrency), but sometimes fall back to giving it less than that, because we don't have enough space. (Typically, this happens when the per-stage size is large, or there are many stages, or both.) We now output a "constrained" boolean which is true if we weren't able to satisfy all the "wants" due to a lack of space. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8721>	2021-01-27 18:30:54 +00:00
Yevhenii Kolesnikov	0c08a66ce5	iris: only set point sprite overrides if actually using points Fixes black screen in some FNA games. Cc: <mesa-stable@lists.freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3431 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7218>	2021-01-14 18:36:15 +00:00
Jason Ekstrand	f4902bb189	intel/genxml,anv,iris: Drop the legacy compute path from gen125.xml Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8342>	2021-01-13 13:10:28 -08:00
Jordan Justen	32857a6350	iris: Add support for COMPUTE_WALKER Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8342>	2021-01-13 13:10:28 -08:00
Marek Olšák	0cf5d1f226	gallium: remove PIPE_CAP_INFO_START_WITH_USER_INDICES and fix all drivers Drivers aren't allowed to ignore start with user index buffers anymore. This is required by the new fast path where mesa/main is using pipe_draw_info. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7679>	2021-01-04 19:22:34 -05:00
Marek Olšák	f2e281c231	iris: don't use index_bias if not indexed index_bias is undefined if index_size == 0. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7679>	2021-01-04 19:22:33 -05:00
Marek Olšák	912ba743b5	gallium: inline pipe_depth_state to decrease DSA state size by 4 bytes Depth and alpha states are now packed together, interleaved somewhat. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>	2020-12-22 12:01:38 +00:00
Marek Olšák	d0534cea7f	gallium: inline pipe_alpha_state to enable better DSA bitfield packing pipe_alpha_state and pipe_depth_state will be packed together because they have only a few bitfields each. This will eventually remove 4 bytes of padding in pipe_depth_stencil_alpha_state. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>	2020-12-22 12:01:38 +00:00
Marek Olšák	b7f12a0452	gallium: pass pipe_stencil_ref by value (it has only 2 bytes) This changes pipe_context::set_stencil_ref to pass the parameter by value. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>	2020-12-22 12:01:38 +00:00
Rob Clark	790144e65a	util+treewide: container_of() cleanup Replace mesa's slightly different container_of() with one more aligned to the linux kernel's version which takes a type as the 2nd param. This avoids warnings like: freedreno_context.c:396:44: warning: variable 'batch' is uninitialized when used within its own initialization [-Wuninitialized] At the same time, we can add additional build-time type-checking asserts Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7941>	2020-12-10 16:48:36 +00:00
Marek Olšák	1cd455b17b	gallium: extend draw_vbo to support multi draws Essentially rename multi_draw to draw_vbo and remove start and count from pipe_draw_info. This is only an interface change. It doesn't add multi draw support anywhere. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>	2020-11-18 01:41:25 +00:00
Marek Olšák	1a717dca04	gallium: move count_from_stream_output into pipe_draw_indirect_info This removes some overhead from tc_draw_vbo and increases the maximum number of draws per batch from 153 to 192 in u_threaded_context. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>	2020-11-18 01:41:24 +00:00
Tapani Pälli	460287adca	iris: initialize shared screen->vtbl only once Screen is shared among contexts, other context might be already using vtbl while another initializes it again. ==45872== Possible data race during write of size 8 at 0x5DDAE78 by thread #549 ==45872== Locks held: 1, at address 0x5D1B6F8 ==45872== at 0x6D66D91: gen9_init_state (iris_state.c:7816) ==45872== by 0x6BA0A31: iris_create_context (iris_context.c:342) ==45872== by 0x621F390: st_api_create_context (st_manager.c:917) ==45872== by 0x620E6F9: dri_create_context (dri_context.c:163) ==45872== by 0x6A40DB1: driCreateContextAttribs (dri_util.c:480) ==45872== by 0x540B963: dri2_create_context (egl_dri2.c:1583) ==45872== by 0x53FB84E: eglCreateContext (eglapi.c:821) ==45872== ==45872== This conflicts with a previous read of size 8 by thread #544 ==45872== Locks held: 1, at address 0x5F6E0E0 ==45872== at 0x6CB779E: blorp_alloc_binding_table (iris_blorp.c:167) ==45872== by 0x6CAEF70: blorp_emit_surface_states (blorp_genX_exec.h:1540) ==45872== by 0x6CB67F9: blorp_exec (blorp_genX_exec.h:2016) ==45872== by 0x6CB7AFE: iris_blorp_exec (iris_blorp.c:307) ==45872== by 0x70F5916: try_blorp_blit (blorp_blit.c:2145) ==45872== by 0x70F5FCA: do_blorp_blit (blorp_blit.c:2273) ==45872== by 0x70F778F: blorp_copy (blorp_blit.c:2803) ==45872== by 0x6BB9EB6: iris_copy_region (iris_blit.c:725) v2: move as genX(init_screen_state) (Lionel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7544>	2020-11-16 05:53:20 +00:00
Anuj Phogat	3c4e43e72b	intel: Pointer to SCISSOR_RECT array should be 64B aligned v2: Apply the workaround to all gen hardware Ref: GEN:BUG:1409725701 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7463>	2020-11-09 21:29:04 +00:00
Jason Ekstrand	cdc546ae7f	iris: Flush caches based on brw_compiler::indirect_ubos_use_sampler Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7230>	2020-10-20 19:54:29 +00:00
Kenneth Graunke	02fe825a61	isl, anv, iris: Add a centralized helper to select MOCS based on usage On Gen12+, we can enable additional caches in certain usage situations. This routes that decision making to a central place in ISL, based on surface usage flags, and updates both drivers to use it. (i965 doesn't need to change because it doesn't support Gen12.) We continue handling the "external" decision via an anv_mocs() wrapper for now, since we store that flag in anv_bo, which isl doesn't know about. (We could introduce an ISL_SURF_USAGE_EXTERNAL, but I'm not actually sure that would be cleaner.) This patch should not have any functional nor performance effects, as we continue selecting the exact same MOCS values for now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>	2020-10-19 19:18:11 +00:00
Kenneth Graunke	71ed8c5aa6	iris: Fix doubling of shared local memory (SLM) sizes. Commit `67ee9c5f55` added support for using the `pipe_compute_state::req_local_mem` field, because Clover can have a run-time specified size that isn't baked into the shaders. However, it started adding the static size from the shader to the dynamic state-supplied size. The Mesa state tracker fills out req_local_mem to prog->Base.info.cs.shared_size, which is exactly what we fill out prog_data->total_shared to be. Effectively, this meant that we double-counted the same SLM requirements, doubling our space requirements. Fixes a 10% performance regression in Synmark2's OglCSDof test. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7152>	2020-10-14 23:13:41 +00:00
Jason Ekstrand	9df9f940f0	iris: Add support for load_work_dim as a system value Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7047>	2020-10-07 16:01:31 -05:00
Jason Ekstrand	67ee9c5f55	iris: Handle runtime-specified local memory size The value specified in pipe_compute_state is in addition to the implicit value computed by NIR. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7047>	2020-10-07 16:01:31 -05:00
Anuj Phogat	545d852a7a	intel/gen9: Enable MSC RAW Hazard Avoidance Workaround # 22011374674 Applied to i965, iris and anv drivers No performance impact is observed with WA. Cc: mesa-stable Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-10-01 16:57:50 +00:00
Jordan Justen	20a4235c4c	anv, iris: Set MediaSamplerDOPClockGateEnable for gen12+ This has been shown to help performance on TGL and DG1. This could be applied to gen9+, but we still need to show if it helps with those platforms. Rework: * Make change in src/intel/vulkan/genX_cmd_buffer.c too. (Ken) * Keep mask as 3 for gen < 12 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6684>	2020-09-11 17:40:03 -07:00
Jason Ekstrand	bbaa62e4e1	iris: Re-emit push constants if we have a varying workgroup size Fixes: `33c61eb2f1` "iris: Implement ARB_compute_variable_group_size" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6570>	2020-09-02 20:38:22 +00:00
Jason Ekstrand	536727c465	iris: Patch constant data pointers into shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>	2020-09-02 19:48:44 +00:00
Jason Ekstrand	63dd1e980c	iris: Always re-upload sysvals when we have kernel inputs They can change on every dispatch and clover never gives us a heads up. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6405>	2020-08-21 22:49:54 +00:00
Kenneth Graunke	3fed1c75ef	iris: Fix headerless sampler messages in compute shaders with preemption We were failing to set the "Headerless Message for Preemptable Contexts" bit in SAMPLER_MODE in the compute context. Other drivers use a single hardware context, so setting it on the render engine was sufficient to flip it in both pipelines. But iris uses a separate hardware context for compute, so we were only getting these set for the render context. Thanks to Jason Ekstrand for catching this bug. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6380>	2020-08-20 14:57:40 +00:00
Jason Ekstrand	65eeb06a7f	iris: Upload kernel inputs with system values Clover doesn't upload a cbuf0 but instead provides the kernel inputs as part of the pipe_grid. The most obvious thing to do is to upload them along with system values. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>	2020-08-12 10:11:06 +00:00
Jason Ekstrand	baa4cf9b8e	iris: Implement set_global_binding All this has to do is track which globals are bound and make sure the batch references them every time. We use A64 messages to access them so there are no binding table entries to manage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>	2020-08-12 10:11:06 +00:00
Jason Ekstrand	17280a8ef1	iris: no-op implement set_compute_resources Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>	2020-08-12 10:11:06 +00:00
Jordan Justen	7f48c6b6a2	iris/compute: Split out iris_load_indirect_location Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5571>	2020-06-24 00:14:36 +00:00
Jordan Justen	6557c8294d	iris: Split walker and state update into iris_upload_gpgpu_walker Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5571>	2020-06-24 00:14:36 +00:00
Jordan Justen	e2e0521ecb	iris/l3: Enable L3 full way allocation when L3 config is NULL Reworks: * Jordan: Check for cfg == NULL rather than is_dg1 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4956>	2020-06-22 11:41:59 -07:00
Nanley Chery	f8961ea086	iris: Disable sRGB fast-clears for non-0/1 values For texturing and draw calls, HW expects the clear color to be in two different color spaces after sRGB fast-clears - sRGB in the former and linear in the latter. Up until now, iris has stored the clear color in the sRGB color space. Limit the allowable clear colors for sRGB fast-clears to 0/1 so that both color space requirements are satisfied. Makes iris pass the sRGB -> sRGB subtest of the fcc-write-after-clear piglit test on gen9+. v2: * Drop iris_context::blend_enables. (Ken) * Drop some more resolve-related blend-state-tracking code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4972>	2020-06-19 21:03:31 +00:00
Nanley Chery	fbbf79377b	iris: Remove the CCS_D fallback Remove the CCS_D fallback logic so that iris doesn't attempt to use a non-existent surface state for some renders. Also, add an assertion to catch the issue. The fallback in iris_resource_render_aux_usage can lead to this problem because it doesn't account for the fact that surface states created from resources with the Y_TILED_CCS modifier may only have CCS_E or NONE as aux usages (due to iris_resource_create_with_modifiers). Without this change, the next commit would have triggered the fallback and regressed the following tests on gen9: * dEQP-EGL.functional.wide_color.window_888_colorspace_srgb * dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb * dEQP-EGL.functional.wide_color.pbuffer_888_colorspace_srgb * dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4972>	2020-06-19 21:03:31 +00:00
Francisco Jerez	479249bce6	iris/icl+: Report same caching domain as main surface for clear color BO. Even though the clear color BO is bound as a read-only buffer, report the same caching domain as the main BO in use_surface() (typically IRIS_DOMAIN_RENDER_WRITE) in order to avoid ping-ponging back and forth between IRIS_DOMAIN_RENDER_WRITE and IRIS_DOMAIN_OTHER_READ, which leads to increased stall-at-pixel-scoreboard synchronization between draw calls. Fixes a 5%-10% FPS regression in some benchmarks spotted on ICL. Reported-by: Clayton Craft <clayton.a.craft@intel.com> Fixes: `eb5d1c2722` "iris: Annotate all BO uses with domain and sequence number information." Closes: #3097 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5411>	2020-06-11 14:00:49 -07:00
Francisco Jerez	8a6349eb86	iris: Update cache coherency matrix on PIPE_CONTROL. This introduces a batch synchronization boundary at every PIPE_CONTROL command, and updates the cache coherency status tracked during batch construction according to the specified control bits. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>	2020-06-03 23:12:22 +00:00
Francisco Jerez	4b7fd91be6	iris: Report use of any in-flight buffers on first draw call after sync boundary. This is the main performance trade-off of this cache tracking mechanism: In order for the seqno vector of buffer objects to be accurate, they need to be marked as used again every time the batch is split into a new synchronization section if they remain bound to the pipeline. This can be achieved easily by re-using iris_restore_render_saved_bos() and iris_restore_compute_saved_bos(), which currently serve a similar purpose across batch buffer boundaries. The impact on Piglit drawoverhead results seems to be within a standard deviation of the current results. XXX - It might be possible to completely remove the current iris_batch::contains_draw flag at a small additional performance cost. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>	2020-06-03 23:12:22 +00:00
Francisco Jerez	eb5d1c2722	iris: Annotate all BO uses with domain and sequence number information. Probably the most annoying patch to review from the whole series -- Mark every buffer object use as accessed through some caching domain with the sequence number of the current synchronization section of the batch. The additional argument of iris_use_pinned_bo() makes sure I'd have gotten a compile error if I had missed any buffer added to the batch validation list. There are only a few exceptions where a buffer is left untracked while adding it to the validation list, justified below: - Batch buffers: These are strictly read-only for the moment. - BLORP buffer objects: Their seqnos are bumped manually at the end of iris_blorp_exec() instead, in order to avoid plumbing domain information through BLORP address combining. - Scratch buffers: The contents of these are strictly thread-local. - Shader images and SSBOs: Accesses of these buffers are explicitly synchronized at the API level. v2: Opt out of tracking more aggressively (Ken): In addition to the above, surface states, binding tables, instructions and most dynamic states are now left untracked, which means a lot more BO uses marked IRIS_DOMAIN_NONE which need to be reviewed extremely carefully, since the cache tracker won't be able to provide any coherency guarantees for them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>	2020-06-03 23:12:22 +00:00
Francisco Jerez	e81c07de41	iris: Bracket batch operations which access memory within sync regions. This delimits all batch operations which access memory between iris_batch_sync_region_start() and iris_batch_sync_region_end() calls. This makes sure that any buffer objects accessed within the region are considered in use through the same caching domain until the end of the region. Adding any buffer to the batch validation list outside of a sync region will lead to an assertion failure in a future commit, unless the caller explicitly opted out of the cache tracking mechanism. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>	2020-06-03 23:12:22 +00:00
Francisco Jerez	46183a999b	iris: Extend iris_context dirty state flags to 128 bits. We're nearly out of dirty bits, and some patches pending review on GitLab no longer apply due to that. Make room for them by splitting off shader stage-specific bits into a separate stage_dirty mask. An alternative would be to split compute-related bits into a separate mask, but that would prevent the '<< stage' indexing done in various parts of the driver from working. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>	2020-06-03 22:22:19 +00:00
Francisco Jerez	45918e0d8c	iris: Simplify iris_batch_prepare_noop(). This makes iris_batch_prepare_noop() return a boolean instead of passing through the relevant set of dirty flags. It will make it easier to change the representation of dirty flags. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>	2020-06-03 22:22:19 +00:00
Caio Marcelo de Oliveira Filho	bccf2a25a8	intel: Add helper to calculate GPGPU_WALKER::RightExecutionMask Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5142>	2020-05-27 18:16:31 -07:00
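
The bitfield fix described in commit 9d63547f2f above (adding the trailing slot count when unbinding in the "bound" bitfield) can be sketched roughly as follows. This is an illustration only, not the iris code: update_bound_mask() and its parameters are hypothetical names, and the sketch assumes a 32-bit mask with one bit per slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not taken from iris): returns the updated "bound"
 * mask after a set_*() call that binds `count` slots starting at `start`
 * and unbinds `unbind_num_trailing_slots` slots right after them. */
static uint32_t
update_bound_mask(uint32_t bound, unsigned start, unsigned count,
                  unsigned unbind_num_trailing_slots,
                  const void *const *items)
{
   unsigned total = count + unbind_num_trailing_slots;

   /* Assumption of this sketch: the touched range fits in the 32-bit mask. */
   assert(start < 32 && total <= 32 - start);

   /* Clear every slot the call touches, including the trailing slots.
    * Leaving the trailing slots set is the kind of stale state the commit
    * message describes: later code believes they are still bound. */
   uint32_t range = (total >= 32 ? UINT32_MAX : ((1u << total) - 1u)) << start;
   bound &= ~range;

   /* Mark only the slots that actually received a non-NULL item. */
   for (unsigned i = 0; i < count; i++) {
      if (items && items[i])
         bound |= 1u << (start + i);
   }

   return bound;
}
```

With slots 0-7 bound, a call that binds two items at slot 1 and unbinds three trailing slots leaves slots 0, 1, 2, 6 and 7 marked as bound.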


689 commits