fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-25 21:18:26 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	5abac85177	intel/fs: Rework scratch handling on Gen9+ The current scratch mechanism uses an MRF hack where we reserve a few GRF registers to treat like the MRF and we collect the data into that MRF region before doing a scratch write. We also use that region for the header for scratch reads. This commit changes things and gets rid of the MRF hack. Instead, we reserve a single register (which RA is free to pick) for the scratch header and use split sends for scratch writes to avoid having to do the copy. This should provide RA with more freedom in the presence of spilling as well as avoid some unnecessary data moves. In the future, the new GEN9_SCRATCH_HEADER opcode gives us a place where we can do our own per-thread scratch base address calculations rather than depending on the scratch base address that gets pushed into g0. Having an opcode for this lets us do it once at the top of the shader rather than repeating it at every read/write. One other noticeable difference is the use of SHADER_OPCODE_SEND. We can get away with this thanks to the fact that we're now using a set to track which instructions are generated by spills and don't rely on the opcodes to find spill/fill instructions. This allows us to avoid adding more virtual opcodes and lets the normal code paths handle things like scoreboard dependencies between header setup and the SEND. It also means that post-RA scheduling may be able to space out the header setup MOV and the SEND for better latency hiding.
Shader-db results on Skylake: total spills in shared programs: 12137 -> 10604 (-12.63%) spills in affected programs: 6685 -> 5152 (-22.93%) helped: 274 HURT: 2 total fills in shared programs: 13065 -> 11515 (-11.86%) fills in affected programs: 9007 -> 7457 (-17.21%) helped: 275 HURT: 1 Shader-db results on Ice Lake: total spills in shared programs: 12482 -> 10953 (-12.25%) spills in affected programs: 6586 -> 5057 (-23.22%) helped: 275 HURT: 0 total fills in shared programs: 12819 -> 11234 (-12.36%) fills in affected programs: 7867 -> 6282 (-20.15%) helped: 274 HURT: 0 Shader-db results on Tigerlake: total spills in shared programs: 11689 -> 10233 (-12.46%) spills in affected programs: 4740 -> 3284 (-30.72%) helped: 259 HURT: 0 total fills in shared programs: 10840 -> 9443 (-12.89%) fills in affected programs: 6244 -> 4847 (-22.37%) helped: 259 HURT: 0 Fossil-db results on Ice Lake: Spills in all programs: 245249 -> 201633 (-17.8%) Fills in all programs: 366066 -> 314368 (-14.1%) More practically, this seems to give about a 0.5-1% perf boost in Witcher 3 (DXVK) and Shadow of the Tomb Raider (Vulkan native). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	e557af9781	intel/fs/ra: Use a set to track added spill/fill instructions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	f650c4c0c6	intel/fs/ra: Sanity-check our IP counts Starting with `e99081e76d`, we don't re-construct liveness information every time we spill a register. Instead, we're very careful to track which instructions are spill instructions and to exclude them from the IP count so that we can continue to use the old liveness information even though instructions have been added. This commit adds an assert that sanity-checks that we count the same number of instructions as our liveness information is based on. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	d80d0a6ced	intel/fs/ra: Store the last non-spill VGRF node Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	2af6528c33	intel/fs/ra: Refactor handling of Gen7 scratch reads The attempt at de-duplication with the gen7_read Boolean wasn't actually saving us anything. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	74a1843ca0	intel/fs/ra: Increment spill_offset as part of the emit_spill loop This makes it consistent with our handling of src.offset and with our handling of spill_offset in emit_unspill. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	06ebf23283	intel/fs: Add a SCRATCH_HEADER opcode This opcode is responsible for setting up the buffer base address and per-thread scratch space fields of a scratch message header. For the most part, it's a copy of g0 but some messages need us to zero out g0.2 and the bottom bits of g0.5. This may actually fix a bug when nir_load/store_scratch is used. The docs say that the DWORD scattered messages respect the per-thread scratch size specified in gN.3[3:0] in the message header but we've been leaving it zero. This may mean that we've been ignoring any scratch reads/writes from a load/store_scratch intrinsic above the 1KB mark. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	24b64c8408	intel/fs: Copy the PTSS from g0 for scratch reads/writes In theory, this fixes a bug where we were dropping the PTSS bound on the floor. The hardware docs claim that the A32 DWORD and BYTE scattered read/write messages do a PTSS bounds check. However, in practice, it seems that the hardware ignores the bounds check so this doesn't actually matter. I verified this with the following couple of piglit tests: https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/399 In practice, this prevents the next commit from making a subtle behavioral change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	f103012fad	intel/batch_decoder: Don't claim vec4 vs/gs/tcs shaders on Gen11+ Because we hard-coded the default to vec4, any platform where it doesn't have a "Dispatch Mode" field gets vec4 by default. This includes Gen11+ where vec4 is no longer a thing. Change the default so it works on newer hardware. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Rhys Perry	8850a63161	radv/aco,nir/lower_subgroups: don't lower elect ACO can implement this better. fossil-db (Navi): Totals from 33 (0.02% of 135946) affected shaders: SGPRs: 1736 -> 1744 (+0.46%) VGPRs: 1680 -> 1656 (-1.43%) CodeSize: 246160 -> 245916 (-0.10%); split: -0.14%, +0.04% MaxWaves: 449 -> 461 (+2.67%) Instrs: 48301 -> 48266 (-0.07%); split: -0.12%, +0.05% Cycles: 469740 -> 469240 (-0.11%); split: -0.18%, +0.08% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:20 +00:00
Timur Kristóf	f11f4a2a4d	nir: Add ability to count primitives per stream. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	aac5adc3c2	nir: Count vertices per stream. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	2be99012e9	nir: Add ability to count emitted GS primitives. Add an option to nir_lower_gs_intrinsics which tells it to track the number of emitted primitives, not just vertices. Additionally, also make it per-stream. Also rename the set_vertex_count intrinsic to set_vertex_and_primitive_count. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Greg V	73dd86c421	radv,anv: use CLOCK_MONOTONIC_FAST when CLOCK_MONOTONIC_RAW is undefined CLOCK_MONOTONIC_FAST is a similar clock from FreeBSD. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6995>	2020-10-09 09:49:20 +00:00
Nanley Chery	290f3fe897	Revert "anv: Add driconf option to disable compression for 16bpp format" This reverts commit `bcfec61d1e`. The previous patch fixed the underlying issue that the above commit was actually working around. It turns out that the previously observed performance regression was due to invalid aux-map entries for multi-layer HiZ+CCS buffers. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7046>	2020-10-08 20:47:24 +00:00
Nanley Chery	cce6fc3b5c	anv: Enable multi-layer aux-map init for HIZ+CCS Fixes rendering corruption in the shadowmappingcascade Sascha Willems Vulkan demo. To see the corruption, I adjusted the demo options as follows: 1. Enable "Display depth map" 2. Set "Split lambda" to 0.100 3. Make "Cascade" non-zero. Fixes: `80ffbe915f` ("anv: Add support for HiZ+CCS") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7046>	2020-10-08 20:47:24 +00:00
Jason Ekstrand	b54d37a867	anv: Use the data cache for indirect UBO pulls on Gen8+ On Gen7, the data cache is pretty terrible so we'd rather avoid it there. On Gen8+, it should be fine and is less likely to conflict with texturing so we should get less cache thrashing there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>	2020-10-08 01:17:11 -05:00
Jason Ekstrand	89f3d116a8	anv: Plumb the device into *bits_for_access_flags Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>	2020-10-08 01:17:11 -05:00
Jason Ekstrand	3a33560681	anv: Use format_for_descriptor_type for descriptor buffers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>	2020-10-08 01:17:11 -05:00
Jason Ekstrand	d2185f0c3f	anv: Add a device parameter to format_for_descriptor_type Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>	2020-10-08 01:17:11 -05:00
Jason Ekstrand	3d22de05ca	intel/fs: Add an option to use dataport messages for UBOs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>	2020-10-08 01:17:06 -05:00
Jason Ekstrand	0d462dbee5	intel/fs: Add an alignment to VARYING_PULL_CONSTANT_LOAD_LOGICAL Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>	2020-10-08 01:14:46 -05:00
Lionel Landwerlin	caea5a6a20	intel/dev: fix 32bit build issue Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7049>	2020-10-08 05:42:31 +00:00
Jason Ekstrand	dd9c34a907	intel/nir: Lower load_global_constant in lower_mem_access_bit_sizes It's identical to nir_intrinsic_load_global except that it works on data that's guaranteed to be constant throughout the shader invocation. Fixes: `ff2f44d865` "intel/fs: Implement nir_intrinsic_load_global_constant" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6872>	2020-10-08 03:56:01 +00:00
Jason Ekstrand	fd04f858b0	intel/nir: Don't try to emit vector load_scratch instructions In `53bfcdeecf`, we added load/store_scratch instructions which deviate a little bit from most memory load/store instructions in that we can't use the normal untyped read/write instructions which can read and write up to a vec4 at a time. Instead, we have to use the DWORD scattered read/write instructions which are scalar. To handle this, we added code to brw_nir_lower_mem_access_bit_sizes to cause them to be scalarized. However, one case was missing: the load-as-larger-vector case. In this case, we take small bit-sized constant-offset loads, replace them with a 32-bit load, and shuffle the result around as needed. For scratch, this case is much trickier to get right because it often emits vec2 or wider which we would then have to lower again. We did this for other load and store ops because, for lower bit-sizes, we have to scalarize thanks to the byte scattered read/write instructions being scalar. However, for scratch we're not losing as much because we can't vectorize 32-bit loads and stores either. It's easier to just disallow it whenever we have to scalarize. Fixes: `53bfcdeecf` "intel/fs: Implement the new load/store_scratch..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6872>	2020-10-08 03:56:01 +00:00
Jason Ekstrand	9df9f940f0	iris: Add support for load_work_dim as a system value Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7047>	2020-10-07 16:01:31 -05:00
Juan A. Suarez Romero	6a44bda879	intel/uuid: use git-sha1/package for the driver UUID We can't read information from the loaded shared object because we have different objects for Vulkan and OpenGL drivers, but we need to share the same UUID for both. Hence let's use SHA1 from the Git commit and package version. v2: use also package version for the case of building from tarball (Eric) v3: fix typos in comment (Tapani) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Rohan Garg <rohan.garg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7025>	2020-10-07 11:11:34 +03:00
Juan A. Suarez Romero	456fa9b838	iris: plumb device/driver UUID generators Use the same generators as used in anv driver so both Vulkan and OpenGL drivers can share the same external memory objects. v2: removed extra parameter from function gen_uuid_compute_device_id Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Signed-off-by: Eleni Maria Stea <estea@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Rohan Garg <rohan.garg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7025>	2020-10-07 11:11:28 +03:00
Juan A. Suarez Romero	e9a766a8c0	intel: split driver/device UUID generators We need Vulkan and GL to produce the same UUIDs. So move the generator from ANV to a common code that can be shared by ANV and Iris driver. v2: fix android build (Tapani) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Rohan Garg <rohan.garg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7025>	2020-10-07 11:11:23 +03:00
Marcin Ślusarz	9c25689287	intel: drop likely/unlikely around INTEL_DEBUG It's included in declaration of INTEL_DEBUG. Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>	2020-10-06 18:43:07 +00:00
Marcin Ślusarz	e06da554e9	anv: drop likely/unlikely around INTEL_DEBUG It's included in declaration of INTEL_DEBUG. Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>	2020-10-06 18:43:07 +00:00
Marcin Ślusarz	4015e1876a	intel: add INTEL_DEBUG expected value in declaration Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>	2020-10-06 18:43:07 +00:00
Sagar Ghuge	bcfec61d1e	anv: Add driconf option to disable compression for 16bpp format On Fallout4, enabling HIZ_CCS_WT compression for the D16_UNORM format regresses performance by 2%. To avoid that, disable compression via a driconf option. The experiment showed that running Fallout4 with HIZ performs better than with HIZ_CCS and HIZ_CCS_WT. The reason is that the benchmark uses the depth pass with the D16_UNORM surface format, which fills the L3 cache, and the next pass doesn't make use of it, so we end up clearing the cache. v2: - Don't add conditional check in isl (Nanley, Jason) - Move disable_d16unorm_compression flag to instance (Lionel) - Use plane_format.isl_format (Nanley) v3: - Add more descriptive comment (Marcin Ślusarz) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6734>	2020-10-06 18:27:25 +00:00
Sagar Ghuge	49593205b9	anv: Factor out dri option initialization code in separate function Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6734>	2020-10-06 18:27:25 +00:00
Mike Blumenkrantz	c416adfb2d	anv: remove VkPipelineCacheCreateInfo::flags assert flags are handled, so this just crashes for no reason Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7029>	2020-10-06 10:29:33 -04:00
Lionel Landwerlin	9ad4b8b924	intel/dev: add a small non installable tool to print device info Mostly for debug purposes. $ ./build/src/intel/dev/intel_device_info /dev/dri/renderD128: name: Intel(R) UHD Graphics 620 (WHL GT2) gen: 9 PCI id: 0x3ea0 revision: 2 slice0.subslice0: 11111111 slice0.subslice1: 11111111 slice0.subslice2: 11111111 slices: 1 subslices: 3 EUs: 24 EU threads: 168 LLC: 1 threads per EU: 7 L3 banks: 4 max VS threads: 336 max TCS threads: 336 max TES threads: 336 max GS threads: 336 max WM threads: 256 max CS threads: 56 timestamp frequency: 12000000 v2: Missing license (Marcin) Fix stderr usage (Marcin) v3: Reformat topology printing (Marcin) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6957>	2020-10-06 12:31:16 +00:00
Lionel Landwerlin	79f3544412	intel/perf: fix crash when no perf queries are supported Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ec1fa1d51f` ("intel/perf: fix raw query kernel metric selection") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7024>	2020-10-06 10:20:57 +00:00
Jason Ekstrand	d82826ad44	anv: Implement VK_EXT_transform_feedback on Gen7 Things work a little differently on Gen7 than they do on Gen8+. In particular, SOBufferEnable lives in 3DSTATE_STREAMOUT but BufferPitch lives in 3DSTATE_SO_BUFFER. This leaves us having to marshal data around a bit more than we did on Gen8. Still, it's not too bad. Normally, I don't spend much time on Gen7 but XFB just became a hard requirement for DXVK so it stopped working for all our Haswell users. Let's get them happily playing their games again. 😸 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3532 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6997>	2020-10-05 22:34:07 +00:00
Vinson Lee	81cd4c8f59	intel/vec4: Remove leftover code from Gen8+ removal. Remove code missed in commit `2a49007411` ("intel/vec4: Remove all support for Gen8+ [v2]"). Fix defect reported by Coverity Scan. Logically dead code (DEADCODE) dead_error_begin: Execution cannot reach this statement: mcs.swizzle = 80U; Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6927>	2020-10-03 03:53:46 +00:00
Eric Anholt	6f3352b6a7	driconf: Stop quoting true/false in boolean option definitions. Now that we're not trying to evade preprocessor macro expansion in preprocessor string concatenation, we can use plain old bools in option setup. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6916>	2020-10-02 23:59:52 +00:00
Eric Anholt	8a05d6ffc6	driconf: Make the driver's declarations be structs instead of XML. We can generate the XML if anybody actually queries it, but this reduces the amount of work in driver setup and means that we'll be able to support driconf option queries on Android without libexpat. This updates the driconf interface struct version for i965, i915, and radeon to use the new getXml entrypoint to call the on-demand xml generation. Note that our loaders (egl, glx) implement the v2 function interface and don't use .xml when that's set, and the X server doesn't use this interface at all. XML generation tested on iris and i965 using adriconf Acked-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6916>	2020-10-02 23:59:52 +00:00
Jason Ekstrand	8427e56067	intel/fs: Don't use NoDDClk/NoDDClr for split SHUFFLEs When I copied and pasted the code from MOV_INDIRECT for handling the dependency controls, I missed a subtle difference between MOV_INDIRECT and SHUFFLE. Specifically, MOV_INDIRECT gets lowered to a narrow instruction on Gen7 by the SIMD width lowering whereas SHUFFLE has to split it in the generator. Therefore, the safety check for whether or not we can use dependency control has to be based on the lowered width rather than the width of the original instruction. Fixes: `a8ac61b0ee` "intel/fs: NoMask initialize the address..." Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3593 Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6989>	2020-10-02 19:53:56 +00:00
Jason Ekstrand	a8ac61b0ee	intel/fs: NoMask initialize the address register for shuffles Cc: mesa-stable@lists.freedesktop.org Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2979 Tested-by: Iván Briano <ivan.briano@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6825>	2020-10-02 00:42:56 +00:00
Anuj Phogat	545d852a7a	intel/gen9: Enable MSC RAW Hazard Avoidance Workaround # 22011374674 Applied to i965, iris and anv drivers No performance impact is observed with WA. Cc: mesa-stable Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-10-01 16:57:50 +00:00
Sagar Ghuge	b02bef01c8	intel/blorp: Conditionally clear full surface depth and stencil We should set "Full Surface Depth and Stencil Clear" field of WM_HZ_OP 3DSTATE packet, only when application requires the entire depth surface to be cleared. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6549>	2020-10-01 16:23:10 +00:00
Jason Ekstrand	d5849bc840	anv: Skip HiZ and CCS ambiguates which precede fast-clears This gets rid of multiple HiZ ambiguate operations per frame in Witcher 3. v2: - Fix typo (Tapani) Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6549>	2020-10-01 16:23:10 +00:00
Jason Ekstrand	e9d5ec342d	anv: Use more temp vars in cmd_buffer_begin_subpass This is a mostly cosmetic change but there is one subtle functional issue: If we ever render to a 3D depth image, we are now handling the base layer and number of layers correctly. I'm not sure rendering to 3D depth is even allowed but we can theoretically handle it now. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6549>	2020-10-01 16:23:10 +00:00
Jason Ekstrand	7c92e413af	anv: Allow HiZ clears for multi-view Now that we're enabling HiZ on multi-layer images, there's no reason why we can't enable HiZ clears for multi-view. The only reason I can think of why we didn't before was because no one thought to and the old code didn't. Enabling this means that an attachment will get HiZ cleared if and only if att_state->fast_clear. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6549>	2020-10-01 16:23:10 +00:00
Eric Anholt	618556a8cb	nir: Drop the high_offset argument to the load_store_vectorizer filter. Nothing uses it, and it's not clear to me what it provides over alignment/num_components/bit_size. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>	2020-09-30 19:53:43 +00:00
Eric Anholt	5f757bb95c	nir: Make the load_store_vectorizer provide align_mul + align_offset. It was passing an encoding of the two that wasn't good for ensuring "Don't combine loads that would make us straddle a vec4 boundary" for nir_lower_ubo_vec4. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>	2020-09-30 19:53:43 +00:00


5916 commits