fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 13:48:06 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	1e242785c3	intel/fs: Implement load/store_scratch on XeHP Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>	2021-06-25 00:18:29 +00:00
Jason Ekstrand	c38812be1d	intel/fs: Implement spilling on XeHP Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>	2021-06-25 00:18:29 +00:00
Francisco Jerez	4dc4284342	intel/fs: Implement Wa_14013745556 on TGL+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	c19cfa9dc2	intel/fs: Fix synchronization of accumulator-clearing W/A move on TGL+. Right now the accumulator-clearing move emitted by the generator for Wa_14010017096 inherits the SWSB field from the previous instruction. This can lead to redundant synchronization, or possibly more serious issues if the previous instruction had a TGL_SBID_SET SWSB synchronization mode. Take the SWSB synchronization information from the IR. Fixes: `a27542c5dd` ("intel/compiler: Clear accumulator register before EOT") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	63abc083ce	intel/fs: Teach IR about EOT instruction writing the accumulator implicitly on TGL+. This is unlikely to have had any negative side effect on the original TGL, but will lead to issues on XeHP+ if the software scoreboard pass isn't able to synchronize the accumulator writes. Fixes: `a27542c5dd` ("intel/compiler: Clear accumulator register before EOT") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	5e7f443de0	intel/fs: Add SWSB dependency annotations for cross-pipeline WaR data hazards on XeHP+. In cases where an in-order instruction is overwriting a register previously read by another in-order instruction, drop the dependency iff the previous read is guaranteed to have occurred from the same in-order pipeline. This should only have an effect on XeHP+ since previous Xe platforms only had one in-order FPU pipeline. The previous workaround we were using for this treated all ordered read dependencies as write dependencies to avoid noise from our simulation environment. Relative to our previous workaround this improves performance of GFXBench5 gl_tess by ~7% on a DG2 system among other single-digit percentual FPS improvements. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	d46bb14d14	intel/fs: Implement Wa_22012725308 for cross-pipe accumulator data hazard. The hardware fails to provide the expected data coherency guarantees for accumulator registers when accessed from multiple FPU pipelines. Fix this by tracking implicit accumulator accesses just like we do for regular GRF registers, but instead of adding synchronization annotations for any dependency we only do it for dependencies with a pipeline mismatch, since the hardware should be able to guarantee proper synchronization for matching pipelines. Note that this workaround handles RaW and WaW dependencies in addition to the WaR dependencies described in the hardware bug report even though cross-pipeline RaW accumulator dependencies should be extremely rare, since chances are the hardware will also hang if we ever hit such a condition. This only affects XeHP+, since all FPU instructions are executed as a single in-order pipeline on earlier Xe platforms. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	385da1fe36	intel/fs: Track single accumulator in scoreboard lowering pass. This change reduces the precision of the scoreboard data structure for accumulator registers, because the rules determining the aliasing of accumulator registers are non-trivial and poorly documented (e.g. acc0 overlaps the storage of acc1 when the former is accessed with an integer type). We could implement those rules but it wouldn't have any practical benefit since we currently only use acc0-1, and for the most part we can rely on the hardware's accumulator dependency tracking. Instead make our lives easier by representing it as a single register. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	231337a13a	intel/fs/xehp: Assert that the compiler is sending all 3 coords for cubemaps. As required by HSDES:14013363432. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Lionel Landwerlin	7ed0aaced7	nir: use a more fitting index for btd_stack_push_intel Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Lionel Landwerlin	423c47de99	nir: drop the btd_resume_intel intrinsic This is now 100% equivalent to the new rt_resume intrinsic. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Lionel Landwerlin	4d9fcf2799	intel/rt: switch to common pass for shader calls lowering v2: rename for new indices Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Jason Ekstrand	b66d3e627a	intel/fs: Don't pull CS push constants if uses_inline_data Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Jason Ekstrand	c92fd35848	intel/rt: Use reloc constants for the resume SBT It's going to be attached to the end of the shader binary, not an arbitrary table somewhere in memory. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Jason Ekstrand	705395344d	intel/fs: Add support for compiling bindless shaders with resume shaders Instead of depending on the driver to compile each resume shader separately, we compile them all in one go in the back-end and build an SBT as part of the shader program. Shader relocs are used to make the entries in the SBT point to the correct resume shader. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Jason Ekstrand	d055ac9bdf	intel/compiler: Add a U32 reloc type Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Jason Ekstrand	55508bbe66	intel/compiler: Generalize shader relocations a bit This commit adds a delta to be added to the relocated value as well as the possibility of multiple types of relocations. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Jason Ekstrand	f7668d6fe5	anv,iris: Move the SHADER_RELOC enums to brw_compiler.h They're common between the two drivers and we want to add a couple more that get emitted from code in src/intel/compiler. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Vinson Lee	5f771134ad	intel/vec4: Add missing break statement. Fix defect reported by Coverity Scan. Missing break in switch (MISSING_BREAK) unterminated_case: The case for value VEC4_OPCODE_ZERO_OOB_PUSH_REGS is not terminated by a break statement. Fixes: `89fd196f6b` ("intel/vec4: Add support for masking pushed data") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11347>	2021-06-18 05:02:23 +00:00
Dave Airlie	8da92b5c0a	intel/compiler: add flag to indicate edge flags vertex input is last 965 and the mesa st disagree on how vertex elements are ordered when edgeflags are involved. 965 wants them in gl_vert_attrib order, but gallium supplies the edgeflag as the last vertex element regardless. This adds a flag which is enabled for gen4/5 to denote that the edgeflag is at the end. When we reap 965 later we can resolve this better. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11146>	2021-06-14 06:05:18 +10:00
Dave Airlie	5f03570eaa	intel: reorder base program key. This gets hashed a lot, this reduces the size of this, and the other keys by a small amount Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11146>	2021-06-14 06:05:17 +10:00
Jason Ekstrand	e23b55c3f0	i965: Use nir_lower_passthrough_edgeflags Now that there's a common NIR pass, there's no point in us doing this in the back-end anymore. In order to use this pass in i965, we do have to make one tiny change. Gallium runs the pass after assigning input and output locations and so needs the pass to respect those locations and num_inputs. i965, however, runs it before any location assignment or I/O lowering so we don't care. We do, however, need the pass to succeed with num_inputs == 0 because we set that later. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11313>	2021-06-11 21:19:06 +00:00
Caio Marcelo de Oliveira Filho	8af6766062	nir: Move workgroup_size and workgroup_variable_size into common shader_info Move it out the "cs" sub-struct, since these will be used for other shader stages in the future. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>	2021-06-08 09:23:55 -07:00
Rhys Perry	1cbcfb8b38	nir, nir/algebraic: add byte/word insertion instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Caio Marcelo de Oliveira Filho	c8a7bd0dc8	nir: Rename WORK_GROUP (and similar) to WORKGROUP Be consistent with other usages in Vulkan and SPIR-V, and the recently added workgroup_size field. Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho	a71a780598	nir: Rename nir_intrinsic_load_local_group_size to nir_intrinsic_load_workgroup_size Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho	430d2206da	compiler: Rename local_size to workgroup_size Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Jason Ekstrand	9e0fd49858	intel/fs/ra: Fix payload node setup for SIMD16 on Gen4-5 Since `40e1d798c6`, we are now using physical register numbers for everything which makes it all simpler. In particular, we no longer need the special case for setting up the payload for SIMD16 on Gen4-5. This fixes a pile of piglit tests on ILK and similar. Fixes: `40e1d798c6` "intel/fs: Use ra_alloc_contig_reg_class()..." Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11221>	2021-06-07 16:52:19 +00:00
Marcin Ślusarz	2ebf4e984b	intel/disasm: remove useless space after "(" Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11070>	2021-06-07 08:46:11 +00:00
Marcin Ślusarz	daba2894ff	intel/disasm: decode/describe more send messages Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11070>	2021-06-07 08:46:11 +00:00
Eric Anholt	cf33316ec0	intel/vec4: Use ra_alloc_contig_reg_class() to reduce RA overhead. We go from 1672 RA regs to the real 128 HW regs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>	2021-06-04 19:08:57 +00:00
Eric Anholt	40e1d798c6	intel/fs: Use ra_alloc_contig_reg_class() to speed up RA. By using the new class type, we don't need to make 1928 different registers to represent each contiguous reg size starting from the actual 128 HW registers, or have a mapping between RA regs and HW base regs. With the number of regs reduced, and the fast q computation when using the new classes, we no longer need to compute our own q. This drops the FS RA initialization time on my CFL system from about 1ms to 50us. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>	2021-06-04 19:08:57 +00:00
Eric Anholt	95d41a3525	ra: Use struct ra_class in the public API. All these unsigned ints are awful to keep track of. Use pointers so we get some type checking. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>	2021-06-04 19:08:57 +00:00
Jason Ekstrand	06ae2723d1	intel/vec4: Also use MOV_FOR_SCRATCH for swizzle resolves In `2db8867943`, we introduced a new meta-op MOV_FOR_SCRATCH which is identical to MOV except it lets us identify MOVs emitted during spilling so we know not to re-spill those instructions. We emit them from shuffle_for_64bit_data whenever the new for_scratch parameter is true. Unfortunately, we missed the one used for resolving swizzles. Fixes: `2db8867943` "intel/vec4: Don't spill fp64 registers more..." Tested-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11155>	2021-06-03 06:14:17 +00:00
Jason Ekstrand	f63410eee6	intel/nir,i965: Move HW generation check for UBO pushing to i965 Iris only runs on BDW+ and ANV already handles this by not even trying on anything older than HSW. The only driver benefiting from this common check is i965. Moving it out makes the pass more generic and if some driver comes along which can push UBOs on IVB, it should work for that. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11145>	2021-06-03 05:12:33 +00:00
Dave Airlie	64fa67dd2f	intel/gfx6: move xfb_setup outside the gs compiler into the driver. This removes the use of a GL thing from the backend compiler. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11097>	2021-06-03 04:05:07 +00:00
Jason Ekstrand	f5e58838c2	intel/fs: Handle non-perspective-correct interpolation on gen4-5 Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11125>	2021-06-03 02:36:17 +00:00
Lionel Landwerlin	474eaa25ad	intel/fs: make sure shuffle is lowered to supported types On XeHP there are restrictions on types of source and destinations with float types. As shuffle is implemented using MOV we need to make sure we lower it to supported types. This fixes tests like : dEQP-VK.subgroups.arithmetic.framebuffer.subgroupexclusivemax_vec4_vertex dEQP-VK.subgroups.arithmetic.framebuffer.subgroupexclusivemul_f16vec3_vertex Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10902>	2021-05-22 21:55:33 +00:00
Felix DeGrood	380fa050f2	intel/compiler: balanced tileY/linear friendly LID order for CS Fixes perf regression introduced from tileY LID order for CS shaders that access both textures and buffers. Walks LIDs in X-major fashion, but with blocks of height 4. This maps LIDs per HW thread for SIMD8/16/32 as (2x4/4x4/8x4), which is always good for tileY resources and usually good for linear resources. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10733>	2021-05-22 00:15:25 +00:00
Felix DeGrood	c23e2a662a	intel/compiler: tileY friendly LID order for CS Compute shaders that access tileY resources (textures) benefit from Y-locality accesses. The easiest way to implement this is to walk local ids in Y-major fashion, instead of X-major fashion. Y-major local ids will reduce partial writes and increase cache locality for tileY accesses since tileY resource cachelines progress in the Y direction. Improves performance on TGL: Borderlands3.dxvk-g2 +1.5% Y-major can introduce a performance drop on CS that use a mixture of buffers and images. This should be fixed in the next commit. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10733>	2021-05-22 00:15:25 +00:00
Felix DeGrood	bbd6ce6e9d	intel/compile: refactor DERIVATIVE_GROUP logic Minor changes to logic to make following changes easier. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10733>	2021-05-22 00:15:25 +00:00
Felix DeGrood	ca59db9900	intel/compiler: Use switch for DERIVATIVE_GROUP logic Switch statement is more readable. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10733>	2021-05-22 00:15:25 +00:00
Jason Ekstrand	ebba3cad81	intel/vec4: Add support for UBO pushing Shader-db results on Haswell (vec4 only): total instructions in shared programs: 2853928 -> 2726576 (-4.46%) instructions in affected programs: 855840 -> 728488 (-14.88%) helped: 9500 HURT: 18 helped stats (abs) min: 1 max: 359 x̄: 13.54 x̃: 11 helped stats (rel) min: 0.44% max: 53.33% x̄: 19.13% x̃: 17.44% HURT stats (abs) min: 4 max: 124 x̄: 71.00 x̃: 92 HURT stats (rel) min: 3.64% max: 77.86% x̄: 46.43% x̃: 52.12% 95% mean confidence interval for instructions value: -13.78 -12.98 95% mean confidence interval for instructions %-change: -19.21% -18.81% Instructions are helped. total cycles in shared programs: 101822616 -> 60245580 (-40.83%) cycles in affected programs: 93312382 -> 51735346 (-44.56%) helped: 13292 HURT: 4506 helped stats (abs) min: 2 max: 1229260 x̄: 3370.82 x̃: 776 helped stats (rel) min: 0.04% max: 96.70% x̄: 47.56% x̃: 43.76% HURT stats (abs) min: 2 max: 17644 x̄: 716.37 x̃: 82 HURT stats (rel) min: 0.02% max: 491.80% x̄: 41.00% x̃: 11.11% 95% mean confidence interval for cycles value: -3037.07 -1635.03 95% mean confidence interval for cycles %-change: -26.03% -24.25% Cycles are helped. total spills in shared programs: 1080 -> 1314 (21.67%) spills in affected programs: 74 -> 308 (316.22%) helped: 0 HURT: 47 total fills in shared programs: 310 -> 497 (60.32%) fills in affected programs: 71 -> 258 (263.38%) helped: 0 HURT: 47 total sends in shared programs: 239884 -> 151799 (-36.72%) sends in affected programs: 129302 -> 41217 (-68.12%) helped: 9547 HURT: 0 helped stats (abs) min: 1 max: 226 x̄: 9.23 x̃: 8 helped stats (rel) min: 3.12% max: 98.15% x̄: 72.38% x̃: 80.00% 95% mean confidence interval for sends value: -9.48 -8.98 95% mean confidence interval for sends %-change: -72.80% -71.97% Sends are helped. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Jason Ekstrand	89fd196f6b	intel/vec4: Add support for masking pushed data This is the vec4 equivalent of `d0d039a4d3`, required for proper UBO pushing in vertex stages for Vulkan on HSW. Sadly, the implementation requires us to do everything in ALIGN1 mode and the vec4 instruction scheduler doesn't understand HW_GRF <-> UNIFORM interference so it's easier to do the whole thing in the generator. We add an instruction to the top of the program which just means "emit the blob" and all the magic happens in codegen. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Jason Ekstrand	a881f2295f	intel/vec4: Set up push ranges before we emit any code In order to avoid switching pull constants to push constants and then having to go back to pull, compute the push ranges up-front. This way we know by the time we emit code exactly what ranges are pushable. This is a bit inefficient in the case where the "normal" push constants get compacted. However, most apps don't use giant piles of dead uniforms combined with substantial UBO use so this should be ok. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Jason Ekstrand	c35501ffe8	intel/vec4: Update nr_params in pack_uniform_registers This is where we re-arrange and re-pack the params. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Jason Ekstrand	3d1ac996d0	intel/vec4: Add some asserts to move_push_to_pull Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Jason Ekstrand	2db8867943	intel/vec4: Don't spill fp64 registers more than once The way we handle spilling for fp64 in vec4 is to emit a series of MOVs which swizzles the data around and then a pair of 32-bit spills. This works great except that the next time we go to pick a spill reg, the compiler isn't smart enough to figure out that the register has already been spilled. Normally we do this by looking at the sources of spill instructions (or destinations of fills) but, because it's separated from the actual value by a MOV, we can't see it. This commit adds a new opcode VEC4_OPCODE_MOV_FOR_SCRATCH which is identical to MOV in semantics except that it lets RA know not to spill again. Fixes: `82c69426a5` "i965/vec4: support basic spilling of 64-bit registers" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Connor Abbott	a40714abf7	nir/lower_phis_to_scalar: Add "lower_all" option We don't want to have to deal with vector phis in freedreno, because vectors are always split/unsplit around vectorized instructions anyways, and the stated reason for not scalarising them (it hurting coalescing) won't apply to us because we won't be using nir_from_ssa. Add this option so that we don't have to do the equivalent thing while translating from NIR. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10809>	2021-05-17 09:59:45 +00:00
Marcin Ślusarz	3340d5ee02	intel: simplify is_haswell checks, part 1 Generated with: files=`git grep is_haswell | cut -d: -f1 | sort | uniq` for file in $files; do cat $file | \ sed "s/devinfo->ver <= 7 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" | \ sed "s/devinfo->ver >= 8 || devinfo->is_haswell/devinfo->verx10 >= 75/g" | \ sed "s/devinfo->is_haswell || devinfo->ver >= 8/devinfo->verx10 >= 75/g" | \ sed "s/devinfo.is_haswell || devinfo.ver >= 8/devinfo.verx10 >= 75/g" | \ sed "s/devinfo->ver > 7 || devinfo->is_haswell/devinfo->verx10 >= 75/g" | \ sed "s/devinfo->ver == 7 && !devinfo->is_haswell/devinfo->verx10 == 70/g" | \ sed "s/devinfo.ver == 7 && !devinfo.is_haswell/devinfo.verx10 == 70/g" | \ sed "s/devinfo->ver < 8 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" | \ sed "s/device->info.ver == 7 && !device->info.is_haswell/device->info.verx10 == 70/g" \ > tmpXXX mv tmpXXX $file done Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10810>	2021-05-17 09:46:45 +00:00


1747 commits