fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 15:20:10 +01:00

Author	SHA1	Message	Date
Sagar Ghuge	e6db2299a8	intel/compiler: Allow ternary add to promote source to immediate Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Sagar Ghuge	cde9ca616d	intel/compiler: Make decision based on source type instead of opcode This patch restructures the code a little bit to check if a source can be represented as an immediate operand. This is a foundation for the next patch, which adds checks for integer operands as well. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Sagar Ghuge	705285b9f4	intel/compiler: Add support for ternary add instruction on XeHP v2: - Re-arrange opcode in correct order (Matt Turner) - Move ADD3 case closer to LRP (Jason) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Jason Ekstrand	6642749458	intel/dev: Add a max_cs_workgroup_threads field This is distinct from max_cs_threads because it also encodes restrictions about the way we use GPGPU/COMPUTE_WALKER. This gets rid of the MIN2(64, devinfo->max_cs_threads) we have scattered all over the driver and puts it in a central place. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11861>	2021-07-14 23:02:34 +00:00
Ian Romanick	3cb203303c	intel/compiler: Update block IPs once in opt_cmod_propagation No difference proven at 95.0% confidence (n=10) in dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13. v2: Only update each block's IP data once instead of once per block. Suggested by Emma. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11632>	2021-07-14 09:57:06 -07:00
Ian Romanick	8f1052938d	intel/compiler: Update block IPs once in register_coalesce Performance improvement in dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 for n=30: release build (w/Fedora build flags): -0.82% ± 0.23% Meson -Dbuildtype=debugoptimized: -0.74% ± 0.27% The difference in the debugoptimized build is the calls to inst_is_in_block(block, this) still exist on each call to remove(). v2: Only update each block's IP data once instead of once per block. Suggested by Emma. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11632>	2021-07-14 09:57:04 -07:00
Ian Romanick	f3f3817307	intel/compiler: Update block IPs once in dead_code_eliminate Performance improvement in dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 for n=30: release build (w/Fedora build flags): -7.79% ± 0.25% Meson -Dbuildtype=debugoptimized: -5.10% ± 0.40% The difference in the debugoptimized build is the calls to inst_is_in_block(block, this) still exist on each call to remove(). v2: Only update each block's IP data once instead of once per block. Suggested by Emma. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11632>	2021-07-14 09:57:01 -07:00
Ian Romanick	8ca1bc5f94	intel/compiler: Add cfg_t::adjust_block_ips() method Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11632>	2021-07-14 09:56:59 -07:00
Ian Romanick	8206b04d43	intel/compiler: Add the ability to defer IP updates in backend_instruction::remove Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11632>	2021-07-14 09:56:46 -07:00
Jason Ekstrand	3d934ee03f	glsl: Delete lower_texture_projection This is only used by i965 and we've been getting it through nir_lower_tex since forever. Get rid of the GLSL IR pass. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11827>	2021-07-13 14:06:33 +00:00
Marcin Ślusarz	f3742b9c13	intel/compiler: document register types Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11677>	2021-07-12 13:27:41 +00:00
Lionel Landwerlin	91dcbf1f56	intel/compiler: Track latency/perf of LSC fences Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11759>	2021-07-12 11:39:03 +00:00
Andrii Simiklit	57f54bb9cc	Remove redundant assignment Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4957 Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11780>	2021-07-09 09:34:27 +00:00
Jason Ekstrand	624e799cc3	nir: Drop nir_ssa_def::name and nir_register::name We say that they're for debug only but we don't really have a good policy around when to set them and when not to. In particular, nir_lower_system_values and nir_lower_vars_to_ssa which are the chief producers of SSA values which might reasonably have a name do not bother to set one. We have some names set from things like BLORP and RADV's meta shaders but AFAICT, they're setting a name more because it's there than because they actually care. Also, most things other than nir_clone and nir_serialize don't bother to try and preserve them. You can see in the diffstat of this commit exactly what passes attempt to preserve names. Notably missing from the list is opt_algebraic which is the single largest source of SSA def churn and it happily throws names away. These observations lead me to question whether or not names are actually useful at all or if they're just taking up space (8B per instruction) and wasting CPU cycles (to ralloc_strdup on the off chance we do have one). I don't think I can think of a single time in recent history where I've been debugging a shader issue and a SSA value name has been there and been useful. If anything, the few times they are there, they just throw me off because they mess up the indentation in nir_print. iris shader-db on my system gets runtime -2.07734% +/- 1.26933% (n=5) Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5439>	2021-07-08 17:34:41 +00:00
Connor Abbott	e4e79de2a4	nir/subgroups: Support > 1 ballot components Qualcomm has a mode with a subgroup size of 128, so just emitting larger integer operations and then lowering them later isn't an option. This makes the pass able to handle the lowering itself, so that we don't have to go down to 64-thread wavefronts when ballots are used. (The GLSL and legacy SPIR-V extensions only support a maximum of 64 threads, but I guess we'll cross that bridge when we come to it...) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Yevhenii Kolesnikov	974c58b317	intel: fix leaking memory on shader creation ralloc_adopt takes care of all the shader's children, but shader itself ends up orphaned and never gets free'd. Fixes: `ef5bce9253` ("intel: Drop the last uses of a mem_ctx in nir_builder_init_simple_shader().") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4951 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11651>	2021-06-30 19:34:56 +03:00
Jason Ekstrand	f5876dfdb9	intel/fs: Lower uniform pull constant load message to LSC dataport Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	6362059b6b	intel/fs: Lower varying pull constant load message to LSC dataport Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	4fca64ad4d	intel/fs: Lower A64 atomic messages to LSC dataport Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	07a4bdf1e8	intel/fs: Lower A64 byte scattered r/w messages to LSC dataport Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Mark Janes	22d20dbb02	intel/fs: Lower A64 untyped r/w messages to LSC when available We set the ex_desc to 0, since the address surface type is FLAT. v2 (Sagar Ghuge): - Fix message descriptor encoding v2 (Jason Ekstrand): - Drop support for block messages Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com> Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	621cf9b1df	intel/fs: Lower Byte scattered r/w messages to LSC when available v2 (Jason Ekstrand): - Squash in brw_scheduler changes - Update brw_ir_performance Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	8f82c8aa1a	intel/fs: Lower untyped float atomic messages to LSC when available Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Mark Janes	bd40a1e8c9	intel/fs: Lower untyped atomic messages to LSC when available Bspec programming note mentions that "Atomic messages are always forced to "un-cacheable" in the L1 cache". We can make the L1 cache un-cacheable and L3 with write-back policy. v2: (Sagar Ghuge): - Fix caching policy for atomic messages - Fix simd exec size v3: (Sagar Ghuge): - Add atomic messages to brw_schedule_instructions v4: (Jason Ekstrand): - Rebase on lsc_msg_desc reworks Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com> Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Mark Janes	4f86a70599	intel/fs: Lower DW untyped r/w messages to LSC when available This puts the basic infrastructure in place for lowering logical dataport messages to LSC messages. We start with the two most obvious opcodes and add more in later patches. v2 (Sagar Ghuge): - Pass required params to message desc - Remove duplicate mlen calculation - Change commit message. v3 (Jason Ekstrand): - Drop TGM support Co-authored-by: Jason Ekstrand <mark.a.janes@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Mark Janes	32ec0662fd	intel/compiler: Add LSC messages to brw_schedule_instructions v2 (Jason Ekstrand): - Use lsc_msg_desc_opcode() - Drop all opcodes for now and add them in later patches. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Jason Ekstrand	8d3468ad5b	intel/compiler: Add LSC messages to brw_ir_performance This adds framework only. No opcodes. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	634925694d	intel/disasm: Disassemble LSC message extended descriptors v2 (Mark Janes): - changed to lsc convention v3 (Jason Ekstrand): - Use lsc_msg_desc_addr_type Co-authored-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	2605727a80	intel/disasm: Disassemble LSC messages v2 (Jordan Justen): - Use PRIu64 v3 (Jason Ekstrand): - Drop ranged fence ops, Jason v4: (Mark Janes) - fixed missing parameter to brw_message_desc_cmask_or_vector - changed to use lsc methods to extract fields v5 (Jason Ekstrand): - Squash original disassembler patch and fixes together - Use lsc_opcode_has_cmask - Prefix atomic ops with "atomic_" Co-authored-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Mark Janes	f5541cd4e9	intel/compiler: Add getter helpers for LSC message descriptor fields v2: (Sagar Ghuge): - rename addr_reg_size to src0_len to match with bspec v3 (Jason Ekstrand): - Re-arrange things in increasing bit order Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	4ff00194b7	intel/compiler: Add helpers for LSC message descriptors v2 (Jason Ekstrand): - Squash all the similar patches together - Re-arrange and rename some things to be more consistent - Add a lsc_opcode_has_cmask helper - Drop is_one_addr_reg v3 (Jason Ekstrand): - Add transpose - Re-order arguments to make more logical sense - Switch from `write` to `has_dest` Co-authored-by: Mark Janes <mark.a.janes@intel.com> Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	b67f1ff465	intel/compiler: Add support for LSC fence operations v2 (Jason Ekstrand): - Squash SLM and global fence ops together v3 (Jason Ekstrand): - Rework to use message descriptors instead of instruction fields v4 (Jason Ekstrand): - Don't pass BTI into back-end emit function. Always use FLAT. Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	cf612e4dc1	intel/compiler: Define new LSC data port encodings Xe-HPG comes with a massively reworked dataport. The new thing, called Load/Store Cache or LSC, has a significantly improved interface. Instead of bespoke messages for every case, there's basically one or two messages with different bits to control things like address size, how much data is read/written, etc. It's way nicer but also means we get to rewrite all our dataport encoding/decoding code. This patch kicks off the party with all of the new enums. v2 (Jason Ekstrand, Mark Janes): - Rename to LSC v3 (Jason Ekstrand): - Add numbers to all enums Co-authored-by: Mark Janes <mark.a.janes@intel.com> Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Emma Anholt	b18cf54f0d	intel: Early exit from inst_is_in_block(). Surely the compiler would sort that out, you would think. But no, my debugoptimized build improves dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 runtime by 25% on my SKL from this change. This was the slowest test in the GLES31 tests on APL in CI, at 22s. And yes, we were spending around half of our runtime in this function. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11631>	2021-06-29 16:48:40 +00:00
Marcin Ślusarz	2cf189cc88	intel/fs: use stack for temporary array "regs" is an array of 2 -> "m" must be <= 2 -> "components" array can be allocated on the stack Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11575>	2021-06-28 09:44:40 +00:00
Jason Ekstrand	1e242785c3	intel/fs: Implement load/store_scratch on XeHP Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>	2021-06-25 00:18:29 +00:00
Jason Ekstrand	c38812be1d	intel/fs: Implement spilling on XeHP Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>	2021-06-25 00:18:29 +00:00
Francisco Jerez	4dc4284342	intel/fs: Implement Wa_14013745556 on TGL+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	c19cfa9dc2	intel/fs: Fix synchronization of accumulator-clearing W/A move on TGL+. Right now the accumulator-clearing move emitted by the generator for Wa_14010017096 inherits the SWSB field from the previous instruction. This can lead to redundant synchronization, or possibly more serious issues if the previous instruction had a TGL_SBID_SET SWSB synchronization mode. Take the SWSB synchronization information from the IR. Fixes: `a27542c5dd` ("intel/compiler: Clear accumulator register before EOT") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	63abc083ce	intel/fs: Teach IR about EOT instruction writing the accumulator implicitly on TGL+. This is unlikely to have had any negative side effect on the original TGL, but will lead to issues on XeHP+ if the software scoreboard pass isn't able to synchronize the accumulator writes. Fixes: `a27542c5dd` ("intel/compiler: Clear accumulator register before EOT") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	5e7f443de0	intel/fs: Add SWSB dependency annotations for cross-pipeline WaR data hazards on XeHP+. In cases where an in-order instruction is overwriting a register previously read by another in-order instruction, drop the dependency iff the previous read is guaranteed to have occurred from the same in-order pipeline. This should only have an effect on XeHP+ since previous Xe platforms only had one in-order FPU pipeline. The previous workaround we were using for this treated all ordered read dependencies as write dependencies to avoid noise from our simulation environment. Relative to our previous workaround this improves performance of GFXBench5 gl_tess by ~7% on a DG2 system among other single-digit percentual FPS improvements. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	d46bb14d14	intel/fs: Implement Wa_22012725308 for cross-pipe accumulator data hazard. The hardware fails to provide the expected data coherency guarantees for accumulator registers when accessed from multiple FPU pipelines. Fix this by tracking implicit accumulator accesses just like we do for regular GRF registers, but instead of adding synchronization annotations for any dependency we only do it for dependencies with a pipeline mismatch, since the hardware should be able to guarantee proper synchronization for matching pipelines. Note that this workaround handles RaW and WaW dependencies in addition to the WaR dependencies described in the hardware bug report even though cross-pipeline RaW accumulator dependencies should be extremely rare, since chances are the hardware will also hang if we ever hit such a condition. This only affects XeHP+, since all FPU instructions are executed as a single in-order pipeline on earlier Xe platforms. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	385da1fe36	intel/fs: Track single accumulator in scoreboard lowering pass. This change reduces the precision of the scoreboard data structure for accumulator registers, because the rules determining the aliasing of accumulator registers are non-trivial and poorly documented (e.g. acc0 overlaps the storage of acc1 when the former is accessed with an integer type). We could implement those rules but it wouldn't have any practical benefit since we currently only use acc0-1, and for the most part we can rely on the hardware's accumulator dependency tracking. Instead make our lives easier by representing it as a single register. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Francisco Jerez	231337a13a	intel/fs/xehp: Assert that the compiler is sending all 3 coords for cubemaps. As required by HSDES:14013363432. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433>	2021-06-23 07:34:22 +00:00
Lionel Landwerlin	7ed0aaced7	nir: use a more fitting index for btd_stack_push_intel Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Lionel Landwerlin	423c47de99	nir: drop the btd_resume_intel intrinsic This is now 100% equivalent to the new rt_resume intrinsic. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Lionel Landwerlin	4d9fcf2799	intel/rt: switch to common pass for shader calls lowering v2: rename for new indices Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Jason Ekstrand	b66d3e627a	intel/fs: Don't pull CS push constants if uses_inline_data Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Jason Ekstrand	c92fd35848	intel/rt: Use reloc constants for the resume SBT It's going to be attached to the end of the shader binary, not an arbitrary table somewhere in memory. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00
Jason Ekstrand	705395344d	intel/fs: Add support for compiling bindless shaders with resume shaders Instead of depending on the driver to compile each resume shader separately, we compile them all in one go in the back-end and build an SBT as part of the shader program. Shader relocs are used to make the entries in the SBT point to the correct resume shader. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>	2021-06-22 21:09:25 +00:00

1832 commits