fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 04:58:08 +02:00

Author	SHA1	Message	Date
Eric Anholt	5f992802f5	nir/builder: Drop the mem_ctx arg from nir_builder_init_simple_shader(). This looks a lot more simple now! Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>	2020-11-11 08:50:29 -08:00
Eric Anholt	ef5bce9253	intel: Drop the last uses of a mem_ctx in nir_builder_init_simple_shader(). These two consumers were the only ones out of the ~65 calls to init_simple_shader, so there's a pretty clear consensus on how to allocate simple shaders. I suspect that actually these would be just fine with b.shader being the mem_ctx, but that would take a bit more rework. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>	2020-11-11 08:50:27 -08:00
Eric Anholt	4e9328e3b6	nir_builder: Return a new builder from nir_builder_init_simple_shader(). It's a little inline function, so we can just RAII it for better ergonomics. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>	2020-11-11 08:49:49 -08:00
Jason Ekstrand	68092df8d8	intel/nir: Lower 8-bit ops to 16-bit in NIR on Gen11+ Intel hardware supports 8-bit arithmetic but it's tricky and annoying: - Byte operations don't actually execute with a byte type. The execution type for byte operations is actually word. (I don't know if this has implications for the HW implementation. Probably?) - Destinations are required to be strided out to at least the execution type size. This means that B-type operations always have a stride of at least 2. This wreaks havoc on the back-end in multiple ways. - Thanks to the strided destination, we don't actually save register space by storing things in bytes. We could, in theory, interleave two byte values into a single 2B-strided register but that's both a pain for RA and would lead to piles of false dependencies pre-Gen12 and on Gen12+, we'd need some significant improvements to the SWSB pass. - Also thanks to the strided destination, all byte writes are treated as partial writes by the back-end and we don't know how to copy-prop them. - On Gen11, they added a new hardware restriction that byte types aren't allowed in the 2nd and 3rd sources of instructions. This means that we have to emit B->W conversions all over to resolve things. If we emit said conversions in NIR, instead, there's a chance NIR can get rid of some of them for us. We can get rid of a lot of this pain by just asking NIR to get rid of 8-bit arithmetic for us. It may lead to a few more conversions in some cases but having back-end copy-prop actually work is probably a bigger bonus. There is still a bit we have to handle in the back-end. In particular, basic MOVs and conversions because 8-bit load/store ops still require 8-bit types. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>	2020-11-09 18:58:51 +00:00
Jason Ekstrand	b98f0d3d7c	intel/nir: Lower 8-bit scan/reduce ops to 16-bit We can't really support these directly on any platform. May as well let NIR lower them. The NIR lowering is potentially one more instruction for scan/reduce ops thanks to not being able to do the B->W conversion as part of SEL_EXEC. For imax/imin exclusive scan, it's yet another instruction thanks to the extra imax/imin NIR has to insert to deal with the fact that the first live channel will contain the identity value which, when signed, will cast wrong. However, it does let us drop some complexity from our back-end so it's probably worth it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>	2020-11-09 18:58:51 +00:00
Jason Ekstrand	3ad2d85995	intel/nir: Refactor lower_bit_size_callback We want to use it for more than just ALU. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>	2020-11-09 18:58:51 +00:00
Jason Ekstrand	2c4b47184d	nir/lower_bit_size: Pass a nir_instr to the callback This way we can start supporting more than just ALU ops. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>	2020-11-09 18:58:51 +00:00
Caio Marcelo de Oliveira Filho	5d5f3e3a47	intel/fs: Implement nir_intrinsic_{load,store}_shared_block_intel Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448>	2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho	9fe158e1d1	intel/fs: Implement nir_intrinsic_{load,store}_ssbo_block_intel Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448>	2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho	d372abe397	intel/fs: Add surface OWORD BLOCK opcodes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448>	2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho	296137df53	intel/fs: Implement nir_intrinsic_{load,store}_global_block_intel Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448>	2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho	d3d2b73fa3	intel/fs: Add A64 OWORD BLOCK opcodes Based on a patch for OWORD BLOCK READ from Jason Ekstrand. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448>	2020-11-04 20:24:48 +00:00
Marcin Ślusarz	21ffacff8c	intel/compiler: remove branch weight heuristic As a result of this patch, compiler chooses SIMD32 shaders more frequently. Current logic is designed to avoid regressions from enabling SIMD32 at all cost, even though the cases where regression can happen are probably for smaller draw calls (far away from the camera and thus smaller). In Intel perf CI this patch improves FPS in: - gfxbench5 alu2: 21.92% (gen9), 23.7% (gen11) - synmark OglShMapVsm: 3.26% (gen9), 4.52% (gen11) - gfxbench5 car chase: 1.34% (gen9), 1.32% (gen11) No observed regressions there. In my testing, it also improves FPS in: - The Talos Principle: 2.9% (gen9) The other 16 games I tested had very minor changes in performance (2/3 positive, but not significant enough to list here). Note: this patch harms synmark OglDrvState (which is not in Intel perf CI) by ~2.9%, but this benchmark renders multiple scenes from other workloads (including OglShMapVsm, which is helped in standalone mode) in tiny rectangles. Rendering so small drastically changes branching statistics, which favors smaller SIMD modes. I assume this matters only in micro-benchmarks, as in real workloads more expensive (with more uniform branching behavior) draw calls dominate. Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7137>	2020-11-03 10:49:04 +00:00
Marcin Ślusarz	06764e0e5d	intel/compiler: use C++ template instead of preprocessor Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7382>	2020-11-03 10:42:29 +00:00
Marcin Ślusarz	e3f6a9ea36	intel: remove dead code Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7353>	2020-11-02 19:58:56 +00:00
Caio Marcelo de Oliveira Filho	ce0b72a13a	intel/fs: Don't emit_uniformize when getting a constant SSBO index Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7340>	2020-10-29 21:54:01 +00:00
Ian Romanick	67956689bb	nir: Rename replicated-result dot-product instructions All these instructions replicate the result of a N-component dot-product to a vec4. Naming them fdot_replicatedN gives the impression that are some sort of abstract dot-product that replicates the result to a vecN. They also deviate from fdph_replicated... which nobody would reasonably consider naming fdot_replicatedh. Naming these opcodes fdotN_replicated more closely matches what they are, and it matches the pattern of fdph_replicated. I believe that the only reason these opcodes were named this way was because it simplified the implementation of the binop_reduce function in nir_opcodes.py. I made some fairly simple changes to that function, and I think the end result is ok. The bulk of the changes come from the sed rename: sed --in-place -e 's/fdot_replicated\([234]\)/fdot\1_replicated/g' \ $(grep -r 'fdot_replicated[234]' src/) v2: Use a named parameter to binop_reduce instead of using isinstance(name, str). Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5725>	2020-10-22 18:00:19 +00:00
Caio Marcelo de Oliveira Filho	e7e24d5039	intel/fs: Handle nir_intrinsic_terminate For terminate operation, jump the invocation without predicating on the rest of the quad being disabled -- which is what is done for demote and discard. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7150>	2020-10-15 21:40:09 +00:00
Ian Romanick	262ca98b3a	intel/compiler: Remove Gen10-specific code Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899>	2020-10-15 09:29:53 -07:00
Kenneth Graunke	341f5bffb7	intel/compiler, anv: Delete cs_prog_data->slm_size cs_prog_data->slm_size is basically redundant with prog_data->total_shared, which is the field that we actually use for controlling the shared local memory size in all drivers. We were still using it in one place for VK_EXT_pipeline_executable_properties, but we should just fix that and delete the field. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7152>	2020-10-14 23:13:41 +00:00
Jason Ekstrand	f8117f7051	intel/fs: Allow constant-propagation into SAMPLEINFO and IMAGE_SIZE Without this, we end up with indirect sampler messages all the time because we don't propagate the texture/image BTI. This makes debugging shaders with imageSize or textureSamples in them a pain. Shader-db results on Ice Lake: total instructions in shared programs: 19720612 -> 19720564 (<.01%) instructions in affected programs: 4998 -> 4950 (-0.96%) helped: 12 HURT: 0 All affected shaders were compute shaders in Deus Ex: Mankind Divided. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6794>	2020-10-14 21:35:30 +00:00
Jason Ekstrand	5abac85177	intel/fs: Rework scratch handling on Gen9+ The current scratch mechanism uses an MRF hack where we reserve a few GRF registers to treat like the MRF and we collect the data into that MRF region before doing a scratch write. We also use that region for the header for scratch reads. This commit changes things and gets rid of the MRF hack. Instead, we reserve a single register (which RA is free to pick) for the scratch header and uses split sends for scratch writes to avoid having to do the copy. This should provide RA with more freedom in the presence of spilling as well as avoid some unnecessary data moves. In future, the new GEN9_SCRATCH_HEADER opcode gives us a place where we can do our own per-thread scratch base address calculations rather than depending on the scratch base address that gets pushed into g0. Having an opcode for this lets us do it once at the top of the shader rather than repeating it at every read/write. One other noticeable difference is the use of SHADER_OPCODE_SEND. We can get away with this thanks to the fact that we're now using a set to track which instructions are generated by spills and don't rely on the opcodes to find spill/fill instructions. This allows us to avoid adding more virtual opcodes and let the normal code paths handle things like scoreboard dependencies between header setup and the SEND. It also means that post-RA scheduling may be able to space out the header setup MOV and the SEND for better latency hiding. 
Shader-db results on Skylake: total spills in shared programs: 12137 -> 10604 (-12.63%) spills in affected programs: 6685 -> 5152 (-22.93%) helped: 274 HURT: 2 total fills in shared programs: 13065 -> 11515 (-11.86%) fills in affected programs: 9007 -> 7457 (-17.21%) helped: 275 HURT: 1 Shader-db results on Ice Lake: total spills in shared programs: 12482 -> 10953 (-12.25%) spills in affected programs: 6586 -> 5057 (-23.22%) helped: 275 HURT: 0 total fills in shared programs: 12819 -> 11234 (-12.36%) fills in affected programs: 7867 -> 6282 (-20.15%) helped: 274 HURT: 0 Shader-db results on Tigerlake: total spills in shared programs: 11689 -> 10233 (-12.46%) spills in affected programs: 4740 -> 3284 (-30.72%) helped: 259 HURT: 0 total fills in shared programs: 10840 -> 9443 (-12.89%) fills in affected programs: 6244 -> 4847 (-22.37%) helped: 259 HURT: 0 Fossil-db results on Ice Lake: Spills in all programs: 245249 -> 201633 (-17.8%) Fills in all programs: 366066 -> 314368 (-14.1%) More practically, this seems to give about a 0.5-1% perf boost in Witcher 3 (DXVK) and Shadow of the Tomb Raider (Vulkan native). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	e557af9781	intel/fs/ra: Use a set to track added spill/fill instructions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	f650c4c0c6	intel/fs/ra: Sanity-check our IP counts Starting with `e99081e76d`, we don't re-construct liveness information every time we spill a register. Instead, we're very careful to track which instructions are spill instructions and not contribute those to the IP count so that we can continue to use the old liveness information even though instructions have been added. This commit adds an assert that sanity-checks that we count the same number of instructions as our liveness information is based on. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	d80d0a6ced	intel/fs/ra: Store the last non-spill VGRF node Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	2af6528c33	intel/fs/ra: Refactor handling of Gen7 scratch reads The attempt at de-duplication with the gen7_read Boolean wasn't actually saving us anything. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	74a1843ca0	intel/fs/ra: Increment spill_offset as part of the emit_spill loop This makes it consistent with our handling of src.offset and with our handling of spill_offset in emit_unspill. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	06ebf23283	intel/fs: Add a SCRATCH_HEADER opcode This opcode is responsible for setting up the buffer base address and per-thread scratch space fields of a scratch message header. For the most part, it's a copy of g0 but some messages need us to zero out g0.2 and the bottom bits of g0.5. This may actually fix a bug when nir_load/store_scratch is used. The docs say that the DWORD scattered messages respect the per-thread scratch size specified in gN.3[3:0] in the message header but we've been leaving it zero. This may mean that we've been ignoring any scratch reads/writes from a load/store_scratch intrinsic above the 1KB mark. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Jason Ekstrand	24b64c8408	intel/fs: Copy the PTSS from g0 for scratch reads/writes In theory, this fixes a bug where we were dropping the PTSS bound on the floor. The hardware docs claim that the A32 DWORD and BYTE scattered read/write messages do a PTSS bounds check. However, in practice, it seems that the hardware ignores the bounds check so this doesn't actually matter. I verified this with the following couple of piglit tests: https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/399 In practice, this prevents the next commit from making a subtle behavioral change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>	2020-10-13 21:59:27 +00:00
Rhys Perry	8850a63161	radv/aco,nir/lower_subgroups: don't lower elect ACO can implement this better. fossil-db (Navi): Totals from 33 (0.02% of 135946) affected shaders: SGPRs: 1736 -> 1744 (+0.46%) VGPRs: 1680 -> 1656 (-1.43%) CodeSize: 246160 -> 245916 (-0.10%); split: -0.14%, +0.04% MaxWaves: 449 -> 461 (+2.67%) Instrs: 48301 -> 48266 (-0.07%); split: -0.12%, +0.05% Cycles: 469740 -> 469240 (-0.11%); split: -0.18%, +0.08% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:20 +00:00
Timur Kristóf	f11f4a2a4d	nir: Add ability to count primitives per stream. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	aac5adc3c2	nir: Count vertices per stream. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	2be99012e9	nir: Add ability to count emitted GS primitives. Add an option to nir_lower_gs_intrinsics which tells it to track the number of emitted primitives, not just vertices. Additionally, also make it per-stream. Also rename the set_vertex_count intrinsic to set_vertex_and_primitive_count. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Jason Ekstrand	3d22de05ca	intel/fs: Add an option to use dataport messages for UBOs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>	2020-10-08 01:17:06 -05:00
Jason Ekstrand	0d462dbee5	intel/fs: Add an alignment to VARYING_PULL_CONSTANT_LOAD_LOGICAL Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>	2020-10-08 01:14:46 -05:00
Jason Ekstrand	dd9c34a907	intel/nir: Lower load_global_constant in lower_mem_access_bit_sizes It's identical to nir_intrinsic_load_global except that it works on data that's guaranteed to be constant throughout the shader invocation. Fixes: `ff2f44d865` "intel/fs: Implement nir_intrinsic_load_global_constant" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6872>	2020-10-08 03:56:01 +00:00
Jason Ekstrand	fd04f858b0	intel/nir: Don't try to emit vector load_scratch instructions In `53bfcdeecf`, we added load/store_scratch instructions which deviate a little bit from most memory load/store instructions in that we can't use the normal untyped read/write instructions which can read and write up to a vec4 at a time. Instead, we have to use the DWORD scattered read/write instructions which are scalar. To handle this, we added code to brw_nir_lower_mem_access_bit_sizes to cause them to be scalarized. However, one case was missing: the load-as-larger-vector case. In this case, we take small bit-sized constant-offset loads replace it with a 32-bit load and shuffle the result around as needed. For scratch, this case is much trickier to get right because it often emits vec2 or wider which we would then have to lower again. We did this for other load and store ops because, for lower bit-sizes we have to scalarize thanks to the byte scattered read/write instructions being scalar. However, for scratch we're not losing as much because we can't vectorize 32-bit loads and stores either. It's easier to just disallow it whenever we have to scalarize. Fixes: `53bfcdeecf` "intel/fs: Implement the new load/store_scratch..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6872>	2020-10-08 03:56:01 +00:00
Jason Ekstrand	9df9f940f0	iris: Add support for load_work_dim as a system value Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7047>	2020-10-07 16:01:31 -05:00
Marcin Ślusarz	9c25689287	intel: drop likely/unlikely around INTEL_DEBUG It's included in declaration of INTEL_DEBUG. Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>	2020-10-06 18:43:07 +00:00
Vinson Lee	81cd4c8f59	intel/vec4: Remove leftover code from Gen8+ removal. Remove code missed in commit `2a49007411` ("intel/vec4: Remove all support for Gen8+ [v2]"). Fix defect reported by Coverity Scan. Logically dead code (DEADCODE) dead_error_begin: Execution cannot reach this statement: mcs.swizzle = 80U; Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6927>	2020-10-03 03:53:46 +00:00
Jason Ekstrand	8427e56067	intel/fs: Don't use NoDDClk/NoDDClr for split SHUFFLEs When I copied and pasted the code from MOV_INDIRECT for handling the dependency controls, I missed a subtle difference between MOV_INDIRECT and SHUFFLE. Specifically, MOV_INDIRECT gets lowered to a narrow instruction on Gen7 by the SIMD width lowering whereas SHUFFLE has to split it in the generator. Therefore, the safety check for whether or not we can use dependency control has to be based on the lowered width rather than the width of the original instruction. Fixes: `a8ac61b0ee` "intel/fs: NoMask initialize the address..." Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3593 Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6989>	2020-10-02 19:53:56 +00:00
Jason Ekstrand	a8ac61b0ee	intel/fs: NoMask initialize the address register for shuffles Cc: mesa-stable@lists.freedesktop.org Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2979 Tested-by: Iván Briano <ivan.briano@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6825>	2020-10-02 00:42:56 +00:00
Eric Anholt	618556a8cb	nir: Drop the high_offset argument to the load_store_vectorizer filter. Nothing uses it, and it's not clear to me what it provides over alignment/num_components/bit_size. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>	2020-09-30 19:53:43 +00:00
Eric Anholt	5f757bb95c	nir: Make the load_store_vectorizer provide align_mul + align_offset. It was passing an encoding of the two that wasn't good for ensuring "Don't combine loads that would make us straddle a vec4 boundary" for nir_lower_ubo_vec4. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>	2020-09-30 19:53:43 +00:00
Connor Abbott	b2ede6280c	intel/nir: Use nir control flow helpers Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6866>	2020-09-30 15:47:51 +00:00
Ian Romanick	1d71b1a311	intel/vec4: Remove everything related to VS_OPCODE_SET_SIMD4X2_HEADER_GEN9 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>	2020-09-28 11:43:10 -07:00
Ian Romanick	2a49007411	intel/vec4: Remove all support for Gen8+ [v2] v2: Restore the gen == 10 hunk in brw_compile_vs (around line 2940). This function is also used for scalar VS compiles. Squash in: intel/vec4: Reindent after removing Gen8+ support intel/vec4: Silence unused parameter warning in try_immediate_source Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>	2020-09-28 11:43:10 -07:00
Ian Romanick	60e1d0f028	intel/compiler: Remove INTEL_SCALAR_... env variables Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>	2020-09-28 11:43:10 -07:00
Ian Romanick	d0ce24c8ca	intel/vec4: Remove inline lowering of LRP Since `dd7135d55d` ("intel/compiler: Use the flrp lowering pass for all stages on Gen4 and Gen5"), it's not possible to get to this function on GPUs that don't have a LRP instruction. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>	2020-09-28 11:43:10 -07:00
Ian Romanick	86bab92aa4	intel/compiler: Don't fallback to vec4 when scalar GS compile fails [v2] v2: Add missing error string handling. Noticed by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>	2020-09-28 11:43:04 -07:00


1528 commits