fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 17:50:12 +01:00

Author	SHA1	Message	Date
Rafael Antognolli	06438ea7fa	iris: Use 3DSTATE_CONSTANT_ALL when possible. Use this new instruction introduced in Gen12. The instruction itself is smaller, and it also allows us to emit a single instruction to all stages that have the same push constant buffers (e.g. when they don't have constant buffers). There's one restriction to use this instruction, though: the length field is only 5 bits long, so we need to check whether we can use it, and fallback to the old 3DSTATE_CONSTANT_XS if that field is >= 32. v2 (Suggestions from Caio): - use max_length instead of large_buffers. - remove UNUSED and use #if GEN_GEN >= 12 instead. - inline "buffers" and drop BITSET_RANGE() usage. - add assert(n <= max_pointers) - move emit to outside of the loop. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	1ba9a18911	iris: Rework push constants emitting code. Split into a function the logic to gather the push constant buffers, which now stores them in struct push_bos. Another function is added to emit the packet, using data from the push_bos struct. This will be useful when adding a new function for emitting push constants for newer platforms. v2 (Suggestions from Caio): - rename 'n' -> 'buffer_count' - remove large_buffers (for now) - initialize push_bos - remove assert - change for() condition (i <= 3 -> i < 4) v3: - Add comment about size limit. - Rework "shift" logic and 'for' loop. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Kenneth Graunke	3fdf2bb313	iris: Disable VF cache partial address workaround on Gen11+ The vertex cache uses the full 48-bit address on Gen11+. See the documentation for 3DSTATE_VERTEX_BUFFERS, which describes the workaround and lists it as pre-Icelake. Interestingly, the docs don't mention index buffers as needing a workaround at all. So either we've been overzealous, or the docs never got updated to record that. Which begs the question of whether the issue there was fixed, if there was one... Cuts 40% of the PIPE_CONTROLs from Civilization VI's benchmark; appears that it improves performance by about 1-2% on Icelake 8x8 (not frequency locked).	2019-11-26 12:13:34 -08:00
Kenneth Graunke	f6aa51103b	iris: Update SURFACE_STATE addresses when setting sampler views We may have replaced the backing storage for a texture buffer while it was unbound, at which point iris_rebind_buffer would not have caught it and updated it. We need to ensure that the current resource's address matches the one our SURFACE_STATE points at. If not, update addresses and re-upload the SURFACE_STATE. Shader images and buffers do not suffer from this problem because we re-stream the surface state on every set call, since there isn't a created CSO object for those with a saved SURFACE_STATE. Constant buffers are also currently re-streamed (we pitch the SURFACE_STATE on every set_constant_buffer call). Surfaces would need this treatment (as they're created CSOs) except that we never swap out their backing storage today (we only do it for buffers), so it's OK for now. Fixes misrendering in Unreal 4 demos (Elemental, Matinee Fight Scene). Huge thanks to Andrii Simiklit for tracking down the problem - it was quite difficult to find! Also fixes Andrii's new Piglit test for the bug, 'arb_texture_buffer_object-re-init'. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1365	2019-11-25 15:54:54 -08:00
Kenneth Graunke	060a2c52fa	iris: Maintain CPU-side SURFACE_STATE copies for views and surfaces. When replacing the backing storage for texture buffers, image buffers, and so on, we may need to update the "Surface Base Address" field in any corresponding SURFACE_STATE. This is easier to accomplish if we have a copy on the CPU - we can just compare the current field, update it, and re-upload. This patch adds a CPU-side copy to the new iris_surface_state wrapper struct, and reworks allocation and upload to fill things out on the CPU copy first, then upload that to the GPU when finished. This will be necessary to fix iris_invalidate_resource bugs shortly. Technically, we never replace the backing storage for pipe_surfaces (render targets), so we don't need to make this change there. However, it's nice to have surfaces, sampler views, and image views handled similarly. Plus, if we ever wanted to swap out backing storage for busy textures, we'd need this infrastructure. v2: Properly free memory (caught by Andrii Simiklit)	2019-11-25 15:54:54 -08:00
Kenneth Graunke	2b09e818dc	iris: Create an "iris_surface_state" wrapper struct Today, we only have a state reference to the GPU buffer containing our uploaded SURFACE_STATEs. However, we're going to want a CPU-side copy soon. Making a wrapper struct means we can talk about both together, and also put both in the field called "surface_state".	2019-11-25 15:54:54 -08:00
Kenneth Graunke	4c1f81ad62	iris: Drop 'old_address' parameter from iris_rebind_buffer We can just compare the VERTEX_BUFFER_STATE address field to the current BO's address. When calling rebind, we've already updated the resource to the new buffer, but the state will have the old address.	2019-11-25 15:54:54 -08:00
Kenneth Graunke	518be59c1a	iris: Stop mutating the resource in get_rt_read_isl_surf(). Mutating fields of global resources is generally not safe, and the only reason we were doing it was to avoid passing an extra parameter to the fill_surface_state helper.	2019-11-25 15:54:54 -08:00
Rafael Antognolli	dadb6ebbd1	intel: Add workaround for stencil state. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-11-19 21:43:09 +00:00
Eric Anholt	882ca6dfb0	util: Move gallium's PIPE_FORMAT utils to /util/format/ To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to move their helpers out of gallium. Since u_format used util_copy_rect(), I moved that in there, too. I've put it in a separate directory in util/ because it's a big chunk of related code, and it's not clear to me whether we might want it as a separate library from libmesa_util at some point. Closes: #1905 Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-14 10:47:20 -08:00
Rafael Antognolli	a4da6008b6	iris: Use mocs from isl_dev. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-12 20:41:52 +00:00
Kenneth Graunke	fc7b748086	iris: Fix "Force Zero RTA Index Enable" setting again In `2ca0d913ea`, we began updating cso_fb->layers to the actual layer count, rather than 0. This fixed cases where we were setting "Force Zero RTA Index Enable" even when doing layered rendering. Sadly, it also broke the check entirely: cso_fb->layers is now 1 for non-layered cases, but the Force Zero RTA Index check was still comparing for 0. Fixes: `2ca0d913ea` ("iris: Fix framebuffer layer count")	2019-11-04 08:57:37 -08:00
Jordan Justen	bb0c5c487e	iris/gen11+: Move flush for render target change When starting a BLORP operation, we do the BTI-change flush. However, when ending it and transitioning back to regular drawing, we change the render target again - without a set_framebuffer_state() call. We need to do the BTI flush there too. BLORP flags IRIS_DIRTY_RENDER_BUFFER now, which will cause the next draw to get the BTI flush again. (explanation of fix by Ken) Fixes: `2b956a093a` ("iris: totally untested icelake support") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-31 00:24:25 -07:00
Rafael Antognolli	d3995c19eb	iris: Add Tile Cache Flush for Unified Cache.	2019-10-30 19:51:03 +00:00
Jordan Justen	b529db00ee	iris: Set MOCS for external surfaces to uncached Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 12:42:54 -07:00
Sagar Ghuge	c401186762	intel: Track stencil aux usage on Gen12+ Enable stencil compression enable and control surface enable bit if stencil buffer lossless compression is enabled. v2: Remove unnecessary GEN_GEN check (Nanley Chery) v3: (Nanley Chery) - Change commit subject tag from intel/isl to intel - Keep assignment order correct Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Plamena Manolova	0f610e17bc	iris: Implement new way for setting streamout buffers. For gen12 we set the streamout buffers using 4 separate commands instead of 3DSTATE_SO_BUFFER. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-29 19:20:25 +00:00
Nanley Chery	6020ebf799	iris: Enable HIZ_CCS in depth buffer instructions Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Plamena Manolova	1df871f8ff	iris: Add support for depth bounds testing. In gen12 we use the 3DSTATE_DEPTH_BOUNDS instruction to enable depth bounds testing. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 13:46:06 +00:00
Jordan Justen	2e6a7ced4d	iris/gen12: Write GFX_AUX_TABLE base address register Rework: * Move last_aux_map_state to iris_batch. (Nanley, Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:14 -07:00
Kenneth Graunke	0b7ecfdda5	iris: Implement the Broadwell NP Z PMA Stall Fix This should help avoid stalls in the pixel mask array in certain non-promoted depth cases. It especially helps for Z16, as each bit in the PMA corresponds to two pixels when using Z16, as opposed to the usual one pixel. Improves performance in GFXBench5 TRex by 22% (n=1).	2019-10-08 21:53:12 -07:00
Kenneth Graunke	face221283	iris: Properly unreference extra VBOs for draw parameters bound_vertex_buffers doesn't include extra draw parameters buffers. Tracking this correctly is kind of complicated, and iris_destroy_state isn't exactly in a hot path, so just loop over all VBO bindings. Fixes: `4122665dd9` (iris: Enable ARB_shader_draw_parameters support) Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>	2019-10-08 11:14:21 -07:00
Marek Olšák	732ea0b213	gallium: add PIPE_RESOURCE_FLAG_SINGLE_THREAD_USE to skip util_range lock u_upload_mgr sets it, so that util_range_add can skip the lock. The time spent in tc_transfer_flush_region decreases from 0.8% to 0.2% in torcs on radeonsi. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-07 20:05:00 -04:00
Kenneth Graunke	6d9c1f30e4	iris: Drop vtbl usage for some load_register calls We can just call the actual functions directly.	2019-10-07 14:10:33 -07:00
Jordan Justen	ae9c311b9a	iris/state: Move reg/mem load/store functions earlier in file Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-07 14:10:33 -07:00
Kenneth Graunke	90a35752b4	iris: Drop bonus parameters from iris_init_*_context() Nothing uses vtbl or dbg, and screen is available from the batch.	2019-10-07 13:15:56 -07:00
Kenneth Graunke	bd46dfa889	Revert "iris: Hack up a SKL/Gen9LP PS push constant fifo depth workaround" This reverts commit `4f857423b3`. It caused GPU hangs on all affected platforms, in e.g. Piglit bin/stencil-twoside -auto -fbo.	2019-10-07 09:08:41 -07:00
Kenneth Graunke	4f857423b3	iris: Hack up a SKL/Gen9LP PS push constant fifo depth workaround This is a port of Nanley's `904c2a617d` from i965 to iris. One concern is that iris uses larger batches, and also emits far fewer commands, so we may come closer to the 500 limit within a batch, and could need to supplement this with actual counting. Manhattan 3.0 had 239 3DSTATE_CONSTANT_PS packets in a batch, Unigine Valley had 155. So it seems like we're still in the realm of safety.	2019-10-05 17:18:45 -04:00
Kenneth Graunke	f1bba22f69	iris: Refactor push constant allocation so we can reuse it We'll need this for a workaround shortly. While refactoring, also improve the comment slightly.	2019-10-05 17:18:44 -04:00
Kenneth Graunke	309924c3c9	iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets. We can't just check for the BO base address, we need to check for the full address including any offset we may have applied. When updating the address, we need to include the offset again. Fixes: `5ad0c88dbe` ("iris: Replace buffer backing storage and rebind to update addresses.")	2019-09-30 12:41:03 -07:00
Kenneth Graunke	50c0dd8621	Revert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM" This reverts commit `729de1488f`. It turns out that, although the register is in the logical context, it isn't whitelisted, so we can't actually write it from userspace batch buffers. The write just becomes a noop, which is why we saw no performance changes. I manually whitelisted it, and still observed no performance gains, but it did regress KHR-GL46.texture_cube_map_array.color_depth_attachments on the iris driver. So we might need to fix something before enabling this. To prevent it randomly getting turned on should the kernel ever whitelist this register, we revert the patch for now.	2019-09-23 16:31:23 -07:00
Kenneth Graunke	a16975e615	iris: Rework iris_update_draw_parameters to be more efficient This improves a couple of things: 1. We now only update anything if the shader actually cares. Previously, is_indexed_draw was causing us to flag dirty vertex buffers, elements, and SGVs every time the shader switched between indexed and non-indexed draws. This is a very common situation, but we only need that information if the shader uses gl_BaseVertex. We were also flagging things when switching between indirect/direct draws as well, and now we only bother if it matters. 2. We upload new draw parameters only when necessary. When we detect that the draw parameters have changed, we upload a new copy, and use that. Previously we were uploading it every time the vertex buffers were dirty (for possibly unrelated reasons) and the shader needed that info. Tying these together also makes the code a bit easier to follow. In Civilization VI's benchmark, this code was flagging dirty state many times per frame (49 average, 16 median, 614 maximum). Now it occurs exactly once for the entire run.	2019-09-18 22:50:52 -07:00
Kenneth Graunke	6841f11d14	iris: Use state_refs for draw parameters. iris_state_ref is a <resource, offset> tuple, which is exactly what we need here.	2019-09-18 22:50:52 -07:00
Kenneth Graunke	3da8a8a3d6	iris: Avoid uploading SURFACE_STATE descriptors for UBOs if possible If we can entirely push uniform data, we don't need a SURFACE_STATE descriptor for pulling data. Since constant uploads are a very common operation, and being able to push all data is also very common, we would like to avoid the overhead in this case. This patch defers uploading new descriptors. Instead of handling that at iris_set_constant_buffer, we do it at iris_update_compiled_shaders, where we can see the currently bound shader variants. If any need pull descriptors, and descriptors are missing, we update them and flag that the binding table also needs to be refreshed. Improves performance in GFXBench5 gl_driver2 on an i7-6770HQ by 31.9774% +/- 1.12947% (n=15). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Kenneth Graunke	dd83ef0d1a	iris: Track per-stage bind history, reduce work accordingly We now track per-stage bind history for constant and shader buffers, shader images, and sampler views by adding an extra res->bind_stages field to go with res->bind_history. This lets us flag IRIS_DIRTY_CONSTANTS for only the specific stages involved, and also skip some CPU overhead in iris_rebind_buffer. Cuts 4% of 3DSTATE_CONSTANT_XS packets in a Shadow of Mordor trace on Icelake. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Kenneth Graunke	e7db3577f8	iris: Explicitly emit 3DSTATE_BTP_XS on Gen9 with DIRTY_CONSTANTS_XS Right now, we usually flag both IRIS_DIRTY_{CONSTANTS,BINDINGS}_XS, because we have SURFACE_STATE for constant buffers in case the shaders access them via pull mode. But this flagging is overkill in many cases. Gen8 and Gen11 don't need it at all. Gen9 doesn't need that large of a hammer in all cases. Just handle it explicitly so the right thing happens. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Kenneth Graunke	caa0aebd01	iris: Flag IRIS_DIRTY_BINDINGS_XS on constant buffer rebinds We upload a new SURFACE_STATE for the UBO/SSBO in question, which means that we need new binding tables as well. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Kenneth Graunke	f8c44e4ed7	iris: Skip allocating a null surface when there are 0 color regions. The compiler now sets the "Null Render Target" bit in the RT write extended message descriptor, causing it to write to an implicit null surface without us needing to set one up in the binding table. Together with the last patch, this improves performance in Car Chase on an Icelake 8x8 (locked to 700 MHz) by 0.0445526% +/- 0.0132736% (n=832). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 14:27:51 -07:00
Kenneth Graunke	c9fb704f72	iris: Initialize ice->state.prim_mode to an invalid value It was calloc'd to 0 which is PIPE_PRIM_POINTS, which means that we fail to notice an initial primitive of points being new, and fail at updating the "primitive is points or lines" field. We do not need to reset this on device loss because we're tracking the last primitive mode sent to us on the CPU via draw_vbo, not the last primitive mode sent to the GPU. Fixes several tests: - dEQP-GLES3.functional.clipping.point.wide_point_clip - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner Fixes: `dcfca0af7c` ("iris: Set XY Clipping correctly.")	2019-09-13 16:31:29 -07:00
Anuj Phogat	729de1488f	intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM Initial benchmarking didn't show any performance benefits. But it might eventually. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-11 11:29:37 -07:00
Kenneth Graunke	077a1952cc	iris: Fix constant buffer sizes for non-UBOs Since the system value refactor, we've accidentally only been setting cbuf->buffer_size in the UBO case, and not in the uploaded-constants case. We use cbuf->buffer_size to fill out the SURFACE_STATE entry, so it needs to be initialized in both cases. Fixes: `3b6d787e40` ("iris: move sysvals to their own constant buffer")	2019-09-10 10:53:15 -07:00
Kenneth Graunke	7d28e9ddd6	iris: Optimize out redundant sampler state binds This cuts roughly 85% of the 3DSTATE_SAMPLER_STATE_POINTERS_PS calls in the J2DBench images test. For some reason, the state tracker is calling bind_sampler_state with the same sampler state in a bunch of cases.	2019-09-09 11:55:27 -07:00
Kenneth Graunke	9173459b95	iris: Ignore line stipple information if it's disabled The line stipple pattern and factor only matter if line stippling is actually enabled. Otherwise, we can safely ignore it. PBO upload may give us zero for line stipple information, while normal drawing tends to give us an actual stipple pattern such as 0xffff. This was causing us to flag IRIS_DIRTY_LINE_STIPPLE way too often, leading to useless 3DSTATE_LINE_STIPPLE commands, which are non-pipelined and thus very expensive. Improves performance in Manhattan 3.0 on Skylake GT4e by 0.149261% +/- 0.0380796% (n=210). On an Icelake 8x8 with the GPU frequency locked at 700 MHz, improves by 0.423756% +/- 0.222843% (n=3).	2019-09-09 10:55:20 -07:00
Jordan Justen	9790cfcefa	anv,iris: L3ALLOC register replaces L3CNTLREG for gen12 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-06 13:11:25 -07:00
Kenneth Graunke	0d0ae16e8f	intel: Stop redirecting state cache to command streamer cache section This bit redirects the state cache from the unified/RO sections of the L3 cache to the "CS command buffer" section of the cache, which would be set up via TCCNTLREG. The documentation says: "Additionaly, this redirection should be enabled only if there is a non-zero allocation for the CS command buffer section." We don't allocate any cache to the CS command buffer section, so enabling this redirection effectively disabled the state cache. The Windows driver only sets up that section when using POSH, which we do not currently use. So, leave it unallocated and disable the redirection to get a functional state cache again. Improves performance in Civilization VI by 18%, Manhattan 3.0 by 6%, and Car Chase by 2%.	2019-09-06 10:57:55 -07:00
Kenneth Graunke	68be5ff8d0	iris: Invalidate state/texture/constant caches after STATE_BASE_ADDRESS Jason pointed out that the caches likely refer to offsets from dynamic and surface state base addresses, so when we change those, we need to invalidate the caches. Comment borrowed from src/intel/vulkan/genX_cmd_buffer.c.	2019-09-06 10:57:55 -07:00
Caio Marcelo de Oliveira Filho	63f0259aeb	iris: Guard GEN9-only function in Iris state to avoid warning Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-23 13:25:27 -07:00
Kenneth Graunke	9310ae6f68	iris: Set MOCS in all STATE_BASE_ADDRESS commands Rafael Antognolli tracked down a performance gap between i965 and iris in Synmark2's OglCSDof microbenchmark, noting that iris was performing substantially more memory reads and writes, with substantially fewer L3 hits. He suggested that something might be wrong with MOCS, or L3 configs, at which point I came up with a theory... It would appear that the STATE_BASE_ADDRESS command updates the MOCS settings for various base addresses even if you don't specify the "Modify Enable" bit for that address. Until now, we had been setting only the MOCS for bases we intended to change, leaving the others "blank" which is MOCS table entry 0, which is uncached. Most data access has a more specific MOCS (e.g. in SURFACE_STATE), but scratch access uses the Stateless Data Port Access MOCS from STATE_BASE_ADDRESS. So this meant all scratch access was uncached. Improves performance in Synmark2's OglCSDof by 2x, bringing iris on par with the existing i965 driver. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-23 10:21:48 -07:00
Kenneth Graunke	1cd13ccee7	iris: Update fast clear colors on Gen9 with direct immediate writes. Gen11 stores the fast clear color in an "indirect clear buffer", as a packed pixel value. Gen9 hardware stores it as a float or integer value, which is interpreted via the format. We were trying to store that in a buffer, for similarity with Icelake, and MI_COPY_MEM_MEM it from there to the actual SURFACE_STATE bytes where it's stored. This unfortunately doesn't work for blorp_copy(), which does bit-for-bit copies, and overrides the format to a CCS-compatible UINT format. This causes the clear color to be interpreted in the overridden format. Normally, we provide the clear color on the CPU, and blorp_blit.c:2611 converts it to a packed pixel value in the original format, then unpacks it in the overridden format, so the clear color we use expands to the bits we originally desired. However, BLORP doesn't support this pack/unpack with an indirect clear buffer, as it would need to do the math on the GPU. On Gen11+, it isn't necessary, as the hardware does the right thing. This patch changes Gen9 to stop using an indirect clear buffer and simply do PIPE_CONTROLs with post-sync write immediate operations to store the new color over the surface states for regular drawing. BLORP continues streaming out surface states, and handles fast clear colors on the CPU. Fixes: `53c484ba8a` ("iris: blorp using resolve hooks") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-08-22 18:31:14 -07:00
Kenneth Graunke	117a0368b0	iris: Fix broken aux.possible/sampler_usages bitmask handling For renderable surfaces, we allocate SURFACE_STATEs for each bit in res->aux.possible_usages. Sampler views use res->aux.sampler_usages. When pinning buffers, we call surf_state_offset_for_aux() to calculate the offset to the desired surface state. surf_state_offset_for_aux() took an aux_modes parameter, which should be one of those two fields. However...it was not using that parameter. It always used the broader res->aux.possible_usages field directly. One of the callers, update_clear_value(), was passing incorrect masks for this parameter. It iterated through the bits in order, using u_bit_scan(), which destructively modifies the mask. So each time we called it, the count of bits before our selected mode was 0, which would cause us to always update the SURFACE_STATE for ISL_AUX_USAGE_NONE, rather than updating each in turn. This was hidden by the earlier bug where surf_state_offset_for_aux() ignored the parameter. Fixes: `7339660e80` ("iris: Add aux.sampler_usages.") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-08-22 18:31:14 -07:00
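
As an illustration of the last entry above (117a0368b0), here is a minimal C sketch of how passing a mask that is being destructively scanned makes every computed offset collapse to the first slot. All names in it (sketch_bit_scan, sketch_offset_for_mode, SKETCH_SURF_STATE_SIZE) and the 64-byte entry size are hypothetical stand-ins rather than the actual iris code, and the "fixed" variant only shows one way to avoid the problem, not necessarily what the commit does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-entry SURFACE_STATE size; the real value is
 * hardware-specific and is not given in the commit message. */
#define SKETCH_SURF_STATE_SIZE 64u

/* Return the index of the lowest set bit and clear it in *mask.
 * This mirrors the destructive behaviour the commit attributes to
 * u_bit_scan(). The caller must pass a non-zero mask. */
static int sketch_bit_scan(uint32_t *mask)
{
   int i = 0;
   while (!(*mask & (1u << i)))
      i++;
   *mask &= ~(1u << i);
   return i;
}

/* Count the set bits of 'mask' strictly below position 'bit'. */
static unsigned sketch_count_bits_below(uint32_t mask, int bit)
{
   unsigned n = 0;
   for (int i = 0; i < bit; i++)
      n += (mask >> i) & 1u;
   return n;
}

/* One state entry is assumed to exist per set bit in 'modes', packed in
 * bit order, so the offset for 'mode' is (set bits below it) * size. */
static unsigned sketch_offset_for_mode(uint32_t modes, int mode)
{
   return sketch_count_bits_below(modes, mode) * SKETCH_SURF_STATE_SIZE;
}

/* Buggy pattern: the mask handed to the offset helper is the one the scan
 * is consuming, so no lower bits remain and every offset comes out as 0. */
static void sketch_offsets_buggy(uint32_t usages, unsigned out[], unsigned *count)
{
   uint32_t remaining = usages;
   *count = 0;
   while (remaining) {
      int mode = sketch_bit_scan(&remaining);
      out[(*count)++] = sketch_offset_for_mode(remaining, mode);
   }
}

/* One possible fix: scan a scratch copy, but compute offsets against the
 * original, unmodified mask. */
static void sketch_offsets_fixed(uint32_t usages, unsigned out[], unsigned *count)
{
   uint32_t remaining = usages;
   *count = 0;
   while (remaining) {
      int mode = sketch_bit_scan(&remaining);
      out[(*count)++] = sketch_offset_for_mode(usages, mode);
   }
}
```

With a usage mask of 0xB (bits 0, 1 and 3 set), the buggy variant yields offsets 0, 0, 0, while the variant that keeps the original mask yields 0, 64, 128.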

587 commits