fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-02-18 21:20:29 +01:00

Author	SHA1	Message	Date
Kenneth Graunke	b6878d456f	st/mesa, iris: Add optional CPU-based ASTC void extent denorm flushing Intel Gen9 GPUs have hardware ASTC support, but have a bug where they don't handle denormalized values in void extent blocks correctly. This isn't that hard to work around - on upload, we can detect such blocks, and flush any denorms to zero. Because we're altering the data behind the application's back, and applications can theoretically ask to download the original unaltered image data, we unfortunately need to maintain shadow copies of the data. To make sure that we don't accidentally skip the void-extent flushing via any fast-upload paths, and support download correctly, we plug this into the st/mesa compressed texture format fallback paths, which store a CPU copy of the original image data, and upload altered data. This is unfortunately common code for what's likely to be a single driver's issue (on a single generation), but it beats replicating an entire framework we already have inside the driver. Fixes dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.* using iris on Intel Gen9 GPUs. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4167 Reviewed-by: Emma Anholt <emma@anholt.net> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21943>	2023-03-17 21:30:48 +00:00
Michel Dänzer	86c6634897	intel/vk/grl: Do not use no_override_init_args for C++ It's only valid for C code. Avoids cc1plus: error: command-line option '-Wno-override-init' is valid for C/ObjC but not for C++ [-Werror] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21781>	2023-03-17 16:08:33 +00:00
Francisco Jerez	76b4255cd8	intel/fs: Fix register coalesce in presence of force_writemask_all copy source writes. This fixes the behavior of register coalesce in cases where the source of a copy is written elsewhere in the program by a force_writemask_all instruction, which could cause the overwrite to be executed for an inactive channel under non-uniform control flow, causing can_coalesce_vars() to give incorrect results. This has been reported in cases like: > while (true) { > x = imageSize(img); > if (non_uniform_condition()) { > y = x; > break; > } > } > use(y); Currently the register coalesce pass would coalesce x and y in the example above, which is invalid since in the example above imageSize() is implemented as a force_writemask_all SEND message, whose result is broadcast to all channels, so when a given channel executes 'y = x' and breaks out of the loop, another divergent channel can execute a subsequent iteration of the loop overwriting 'x' with a different value, hence coalescing y and x into the same register changes the behavior of the program. Note that this is a regression introduced by commit `a4b36cd3dd`. In order to avoid the problem without reverting that patch, we prevent register coalesce if there is an overwrite of the source with force_writemask_all behavior inconsistent with the copy and this occurs anywhere in the intersection of the live ranges of source and destination, even if it occurs lexically before the copy, since it might be physically executed after the copy under divergent loop control flow. Fixes: `a4b36cd3dd` ("intel/fs: Coalesce when the src live range is contained in the dst") Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21351>	2023-03-17 03:05:24 -07:00
Francisco Jerez	d4015bcb38	intel/fs: Fix copy propagation dataflow analysis in presence of force_writemask_all ACP overwrites. This fixes the behavior of copy propagation in cases where either the source or destination of an ACP is overwritten elsewhere in the program by a force_writemask_all instruction, which could cause the overwrite to be executed for an inactive channel under non-uniform control flow, causing the current per-channel dataflow propagation to give incorrect results. This has been reported in cases like: > while (true) { > x = imageSize(img); > if (non_uniform_condition()) { > y = x; > break; > } > } > use(y); Currently the copy propagation pass would propagate copy 'y = x' into 'use(y)', which is invalid since in the example above imageSize() is implemented as a force_writemask_all SEND message, whose result is broadcast to all channels, so when a given channel executes 'y = x' and breaks out of the loop, another divergent channel can execute a subsequent iteration of the loop overwriting 'x' with a different value, hence replacing 'y' with 'x' at 'use(y)' changes the behavior of the program. This patch extends the global dataflow analysis algorithm to determine whether there is any control flow path from a given copy to an overwrite of its source or destination which has force_writemask_all behavior inconsistent with the copy, and in such case prevents copy propagation for that ACP entry at any point of the program which can be reached from the overwrite, even if the copy is statically re-executed along all such control flow paths (as in the example above), since the execution of the overwrite for a given channel i may corrupt other channels j!=i inactive for the subsequently re-executed copy. 
Note that a simpler solution has been attempted which fully shuts down copy propagation if such a force_writemask_all ACP overwrite is present /anywhere/ in the program regardless of its location in the control flow graph, however that led to large regressions in some programs from shader-db (like a CS from Car Chase which would emit 53% more instructions). With this solution the only shaders that suffer instruction count regressions are a handful that seem to be getting misoptimized right now (e.g. some compute shaders from Deus Ex Mankind). This solution doesn't seem to affect the run-time of shader-db significantly, it's less than 1% higher with the fix applied. Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21351>	2023-03-17 03:05:20 -07:00
Francisco Jerez	1c1be23497	intel/fs: Track force_writemask_all behavior of copy propagation ACP entries. force_writemask_all determines whether all channels of the copy are actually valid, and may be required to be set for it to be propagated safely in cases where the destination of the copy is used by another force_writemask_all instruction, or when the copy occurs in a divergent control flow block different from its use. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21351>	2023-03-17 03:05:18 -07:00
Kenneth Graunke	14f9f98dcb	i965/vec4: Implement uclz in the vec4 backend Commit `28311f9d02` moved ufind_msb lowering to NIR and started emitting uclz. Unfortunately, the vec4 backend never actually implemented uclz. It's trivial to do. Now it does. Fixes: `28311f9d02` ("nir: intel/compiler: Move ufind_msb lowering to NIR") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21974>	2023-03-17 09:01:18 +00:00
Kenneth Graunke	e7ea2aa46c	intel/fs: Make bld.F16TO32 actually emit F16TO32 not F32TO16 Ahem, "add builder helpers that work on Gfx7"...now might actually work. Too much copy and paste... Fixes: `966995d911` ("intel/fs: Add builder helpers for F32TO16/F16TO32 that work on Gfx7.x") Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21974>	2023-03-17 09:01:18 +00:00
Kenneth Graunke	84197bc0a4	intel/vec4: Retype texture/sampler indexes to UD generate_tex() asserts that sampler_index.type == UD, but commit `83fd7a5ed1` removed the uint temporary, which caused us to see D at some points. Really, either should be fine, but let's just put the UD retype back. This fixes a ton of things in crocus. Fixes: `83fd7a5ed1` ("intel: Use nir_lower_tex_options::lower_index_to_offset") Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21974>	2023-03-17 09:01:18 +00:00
Anuj Phogat	b4b43aa912	anv: implement TES distribution mode WA 22012785325 Set TEDMODE_RR_STRICT when TEEnable is set. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21899>	2023-03-16 14:42:53 +00:00
Constantine Shablya	d53aba56db	anv: use vk_get_physical_device_features Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21754>	2023-03-16 08:23:29 +00:00
Mark Janes	a2e5e7daa0	intel: use generated helpers for Wa_1409433168/Wa_16011107343 HSD 1306463417 is a hardware defect. The originating software workaround for the issue is Wa_1409433168. Convert all references to the software workaround number, and use generated helpers instead of GFX comparisons. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21914>	2023-03-15 23:31:08 +00:00
José Roberto de Souza	eec5ddd0ed	anv: Handle external objects allocation in Xe External (imported or exported) objects need to have vm_id set to 0. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21885>	2023-03-15 18:17:11 +00:00
José Roberto de Souza	b2d82c25fb	anv: Properly alloc buffers that will be promoted to framebuffer in Xe KMD Xe KMD applies special caching handling to buffers that will be scanned out to the display, so it needs a flag set during allocation. Check whether VK_STRUCTURE_TYPE_WSI_MEMORY_ALLOCATE_INFO_MESA is present in AllocateMemory() and mark the buffer as scanout. All WSI code paths but one set VK_STRUCTURE_TYPE_WSI_MEMORY_ALLOCATE_INFO_MESA. The one that doesn't is only executed when WSI is initialized with wsi_device_options.sw_device = true, which is not the case for ANV. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21885>	2023-03-15 18:17:11 +00:00
José Roberto de Souza	a311c031f6	anv: Implement Xe version of anv_physical_device_get_parameters() Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21885>	2023-03-15 18:17:11 +00:00
Rohan Garg	becc1c5615	anv: break out of the loop when the first color attachment is found Fixes: `2bd304bc` ("anv: Skip the RT flush when doing depth-only rendering") Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21903>	2023-03-15 10:52:50 +00:00
Emma Anholt	a74d2ef17d	ci/iris: Add skips for slow tests on APL. These get reported as flakes for timing out before passing when the shader cache is hot. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21879>	2023-03-15 08:15:37 +00:00
Mohamed Ahmed	5ada09412f	anv: remove GetBufferMemoryRequirements2() Signed-off-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21898>	2023-03-15 00:30:35 +00:00
Dave Airlie	4e0d4aab48	anv: fix image height for field pictures. Fixes: `98c58a16ef` ("anv: add initial video decode support for h264.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21807>	2023-03-14 13:34:53 +00:00
Lionel Landwerlin	56474fae93	intel/fs: fix subgroup invocation read bounds checking nir->info.subgroup_size can be set to an enum : SUBGROUP_SIZE_VARYING = 0 SUBGROUP_SIZE_UNIFORM = 1 SUBGROUP_SIZE_API_CONSTANT = 2 SUBGROUP_SIZE_FULL_SUBGROUPS = 3 So compute the API subgroup size value and compare it to the dispatch size to determine whether we need some bound checking. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `9ac192d79d` ("intel/fs: bound subgroup invocation read to dispatch size") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21856>	2023-03-14 12:15:48 +00:00
Lionel Landwerlin	bf59cfcee1	intel/fs: prevent large vector ops generated by peephole_ffma Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21782>	2023-03-14 10:38:50 +00:00
Lionel Landwerlin	bc08f43991	intel/fs: add MOV source count validation Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21782>	2023-03-14 10:38:50 +00:00
Lionel Landwerlin	ed3c2f73db	intel/fs: fixup sources number from opt_algebraic Fixes issues with register_coalesce : fossilize-replay: brw_fs_register_coalesce.cpp:297: bool fs_visitor::register_coalesce(): Assertion `mov[i]->sources == 1' failed. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21782>	2023-03-14 10:38:50 +00:00
Lionel Landwerlin	18bdc71459	intel/fs: fix nir_opt_peephole_ffma max vec assumption There can be larger vec than vec4. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21782>	2023-03-14 10:38:50 +00:00
Lionel Landwerlin	efde1917c9	intel/fs: don't SEND messages as partial writes For instance, to load uniform data with the LSC we usually rely on transpose messages which have to execute in SIMD1. Those end up being considered as partial writes, so within loops their life span spreads across the whole loop, increasing register pressure. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21867>	2023-03-14 10:10:32 +00:00
Lionel Landwerlin	adcdc38f3b	anv: more formats for acceleration structure vertices Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21821>	2023-03-14 09:34:27 +00:00
Dave Airlie	cb24faf1a6	anv/video: disable picture id remapping. This isn't needed at the hw level with Vulkan Fixes: `98c58a16ef` ("anv: add initial video decode support for h264.") Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21433>	2023-03-14 07:32:00 +00:00
Dave Airlie	f85b2cbe33	anv/video: fix chroma qp to be an integer value. This is just a cleanup to the genxml Fixes: `98c58a16ef` ("anv: add initial video decode support for h264.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21433>	2023-03-14 07:32:00 +00:00
Lionel Landwerlin	d8013976c7	anv: export EXT_pipeline_library_group_handles only with RT Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21870>	2023-03-14 02:08:01 +00:00
Jordan Justen	48ff68820e	intel/dev: Enable MTL PCI ids Ref: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/drm/i915_pciids.h?h=v6.0-rc4#n736 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18481>	2023-03-13 10:17:51 +00:00
David Heidelberg	2b00eaaedc	ci/iris: update apl and glk expectations, after enabling Wayland support After enabling the Wayland platform for x86_64, multiple new tests were triggered, some of which timed out. Also wayland-dEQP-EGL.functional.negative_api.create_pixmap_surface now passes. Signed-off-by: David Heidelberg <david.heidelberg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21786>	2023-03-12 00:11:09 +00:00
José Roberto de Souza	43e21702f6	anv: Integrate gem vm bind and unbind kmd backend functions Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21698>	2023-03-11 17:56:01 +00:00
José Roberto de Souza	37fa2fa30e	anv: Add gem VM bind and unbind to backend Not using it yet, that will be done in the next patch. Xe only supports submission using VM. For i915 the backend functions are just a noop. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21698>	2023-03-11 17:56:01 +00:00
José Roberto de Souza	324d22d684	anv: Implement gem close and mmap for Xe backend Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21698>	2023-03-11 17:56:01 +00:00
José Roberto de Souza	149e945ad4	anv: Implement Xe functions to create and destroy VM Also using the vm_id to create gem buffers. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21698>	2023-03-11 17:56:01 +00:00
José Roberto de Souza	d5f767edf9	anv: Implement gem_create for Xe backend Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21698>	2023-03-11 17:56:01 +00:00
Felix DeGrood	341f1011a6	intel/perf: Hide extended metrics by default XE architecture enables many more metrics, perhaps too many for the average user. Reduce reported metrics to smaller subset, known as non-extended metrics, by default. Can re-enable extended metrics with env var INTEL_EXTENDED_METRICS=1 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21841>	2023-03-11 05:05:06 +00:00
Guilherme Gallo	256e7888fd	ci: Fix release build use for performance jobs This commit ensures that we are using mesa release builds in performance jobs. To achieve that, some modifications were made on top of https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21492. - Append the `BUILDTYPE` variable to the S3 artifact name (MINIO_ARTIFACT_NAME environment variable) to allow for better artifact management. - The ./artifacts directory has been added to the list of artifact directories for build-common. This ensures that the debian-release and debian-arm64-release jobs are the only ones necessary for running performance jobs. These jobs only produce artifacts via prepare-artifacts.sh when we are under the performance workflow. - Make lava-submit.sh behave similarly to baremetal jobs regarding the MINIO_ARTIFACT_NAME variable. For example, users can now easily differentiate between mesa-arm64.tar.zstd and mesa-arm64-release.tar.zstd by looking inside the `Downloading artifacts from s3` GitLab section. Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21804>	2023-03-10 21:40:23 +00:00
José Roberto de Souza	757e2dd692	intel/perf: Disable it for Xe KMD Xe doesn't have support for performance metrics yet. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21773>	2023-03-10 19:41:14 +00:00
Ian Romanick	28311f9d02	nir: intel/compiler: Move ufind_msb lowering to NIR Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Cycles in all programs: 9098346105 -> 9098333765 (-0.0%) Cycles helped: 6 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	08ca862ef8	intel/compiler: Tighter src and dest size bounds checking for some opcodes Enforce the sizes listed in the Skylake PRM: BFREV: source types: D destination types: D CBIT: source types: UB, UW, UD destination types: UD FBH: source types: D, UD destination types: UD FBL: source types: UD destination types: UD LZD: source types: D, UD destination types: UD v2: Update BFREV commit message documentation. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	0cc7bf63b7	nir: intel/compiler: Move ifind_msb lowering to NIR Unlike ufind_msb, ifind_msb is only defined in NIR for 32-bit values, so no @32 annotation is required. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	15c6c859cf	intel/compiler: Lower find_lsb in NIR No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Eric Engestrom	f5d3d1e7ed	meson: inline gtest_test_protocol now that it's always 'gtest' Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21485>	2023-03-10 07:20:29 +00:00
Sagar Ghuge	9a34b2ab0e	intel/compiler: Add swsb_stall debug option When enabled, on gfx12 plus, we will add the sync nop instruction after each instruction to make sure that current instruction depends on the previous instruction explicitly. This option will help us to get a hint if something is missing or broken in software scoreboard pass. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21797>	2023-03-10 06:55:39 +00:00
Lionel Landwerlin	5aec829f97	iris: trace frames with u_trace Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21648>	2023-03-10 00:36:41 +00:00
Kenneth Graunke	dfe652fb03	intel/eu: Simplify brw_F32TO16 and brw_F16TO32 Now that we aren't using them on Gfx8+ we can drop a lot of cruft. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21783>	2023-03-09 23:26:17 +00:00
Kenneth Graunke	c590a3eadf	intel/fs: Move packHalf2x16 handling to lower_pack() This mainly lets the software scoreboarding pass correctly mark the instructions, without needing to resort to fragile manual handling in the generator. We can also make small improvements. On Gfx 8LP-12.0, we no longer have the restrictions about DWord alignment, so we can simply write each half into its intended location, rather than writing it to the low DWord and then shifting it in place. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21783>	2023-03-09 23:26:17 +00:00
Kenneth Graunke	f5e5705c91	intel/fs: Use F32TO16/F16TO32 helpers in fquantize16 handling I originally thought that we were intentionally emitting the legacy opcodes here to make them opaque to the optimizer, so that it wouldn't eliminate the explicit type conversions, as they're actually required to do the quantization. But...we don't actually optimize those away currently anyway. So...go ahead and use the helpers for consistency. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21783>	2023-03-09 23:26:17 +00:00
Kenneth Graunke	44c6ccb197	Revert "intel/fs: Fix inferred_sync_pipe for F16TO32 opcodes" With the previous patch, we no longer need to special case this, as we emit a MOV with an HF source, rather than F16TO32 with an UW source, on all platforms that need scoreboarding. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21783>	2023-03-09 23:26:17 +00:00
Kenneth Graunke	309ec3725a	intel/fs: Use new F16TO32 helpers for unpack_half_split_* opcodes This gets us a MOV at the IR level on Gfx8+ which should be more optimizable than F16TO32. It also removes confusion about which pipe the instruction will run on. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21783>	2023-03-09 23:26:17 +00:00
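
Illustration for commits `76b4255cd8` and `d4015bcb38`: the divergent-loop hazard they describe can be reproduced with a small channel-level simulation. This is only a sketch under stated assumptions, not code from Mesa: `run_loop`, `exit_iteration`, and the per-iteration value `100 + iteration` are hypothetical names and data chosen for the example. The write to `x` is modeled as force_writemask_all (it lands in every channel, including ones that already left the loop), while `y = x` is an ordinary per-channel copy.

```python
def run_loop(num_channels, exit_iteration, coalesce):
    """Simulate 'while (true) { x = ...; if (cond) { y = x; break; } }' per channel.

    exit_iteration[ch] is the (hypothetical) iteration at which channel ch
    takes the break. With coalesce=True, x and y share one register.
    """
    x = [None] * num_channels
    # Coalescing x and y means both names refer to the same storage.
    y = x if coalesce else [None] * num_channels
    active = [True] * num_channels
    iteration = 0
    while any(active):
        value = 100 + iteration  # assumed: the result differs between iterations
        # force_writemask_all write: updates every channel, active or not.
        for ch in range(num_channels):
            x[ch] = value
        # Ordinary per-channel copy followed by break for channels whose
        # non-uniform condition is true on this iteration.
        for ch in range(num_channels):
            if active[ch] and exit_iteration[ch] == iteration:
                y[ch] = x[ch]
                active[ch] = False
        iteration += 1
    return list(y)


exits = [0, 2, 1, 0]
separate = run_loop(4, exits, coalesce=False)
coalesced = run_loop(4, exits, coalesce=True)
# Channels that broke out early keep their own value only when x and y
# live in different registers.
print(separate, coalesced)
```

With separate registers each channel keeps the value it copied when it broke out of the loop; with a shared register, later iterations executed by other channels overwrite it, which is the behavior change the two fixes are meant to rule out.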


9249 commits