fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 20:10:14 +01:00

Author	SHA1	Message	Date
Rhys Perry	69f9a96af1	nir: add nir_src_components_read() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12472>	2021-09-24 18:41:18 +00:00
Emma Anholt	aed4c0b5a9	nir: Drop the unused instr arg for src/dest copy functions. Now that we don't use ralloc, we don't need this arg to get at the right ralloc ctx. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:06 +00:00
Emma Anholt	d1a2870f78	nir: Add all allocated instructions to a GC list. Right now we're using ralloc to GC our NIR instructions, but ralloc has significant overhead for its recursive nature so it would be nice to use a simpler mechanism for GCing instructions. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:06 +00:00
Emma Anholt	b99efb8af0	nir: Pull the instr list free function out to a helper. With the de-rallocing, we're going to have some more places that free a list of instrs. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:05 +00:00
Emma Anholt	36d9bdca0b	nir: Add a nir_instr_free() to replace ralloc_free(instr). This will gain another step shortly. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:05 +00:00
Qiang Yu	7054c1b7fd	nir/linker: pack varyings with different interpolation qualifier Drivers like radeonsi load varyings in a scalar manner, so they prefer to pack varyings with different interpolation qualifiers into the same slot to save space. But drivers like panfrost/bifrost can load varyings in a vector manner, so they prefer to pack varyings with the same interpolation qualifier. A driver can add the interpolation qualifiers that can be packed into the same varying slot to the pack_varying_options nir option. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12537>	2021-09-09 06:00:58 +00:00
Rhys Perry	41ecef7855	nir: add sdot_2x16 and udot_2x16 opcodes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:27 +00:00
Rhys Perry	ae00f5af61	nir: separate lower_add_sat Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:27 +00:00
Emma Anholt	01759d3fb2	nir: Set .driver_location for GLSL UBO/SSBOs when we lower to block indices. Without this, there's no way to match the UBO nir_variable declarations to the load_ubo intrinsics referencing their data. Reviewed-by: Adam Jackson <ajax@redhat.com> Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12175>	2021-08-31 20:12:16 +00:00
Caio Marcelo de Oliveira Filho	f95daad3a2	nir: Add a way to identify per-primitive variables Per-primitive is similar to per-vertex attributes, but applies to all fragments of the primitive without any interpolation involved. Because they are regular input and outputs, keep track in shader_info of which I/O is per-primitive so we can distinguish them after deref lowering. These fields can be used combined with the regular `inputs_read`, `outputs_written` and `outputs_read`. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>	2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho	927584fa67	nir: Update documentation for location to mention Task/Mesh Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>	2021-08-28 03:56:42 +00:00
Lionel Landwerlin	a13e79843e	nir: prevent peephole from generating invalid NIR We can't append instructions following a return/halt instruction because the control flow helpers will modify the successor of the block containing the return/halt. And the NIR validator enforces that the return/halt must have the end of the function as successor. This tends to happen following lower_shader_calls lowering which inserts halts. This probably doesn't prevent the optimization, it'll just happen in one of the return shaders after the halt has been removed. v2: Move prev block ending check earlier in the function (Daniel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12506>	2021-08-25 11:38:21 +00:00
Ian Romanick	6c18a3b497	nir/opcodes: Add integer dot-product opcodes Six opcodes are added: sdot_4x8_iadd, udot_4x8_uadd, sudot_4x8_iadd, sdot_4x8_iadd_sat, udot_4x8_uadd_sat, and sudot_4x8_iadd_sat. These represent the combinations of integer dot-product and add that operate on packed source vectors. That is, the four 8-bit values for each vector are stored in a single 32-bit integer. Some hardware may prefer to operate on unpacked byte vectors. When such hardware comes to Mesa, we'll have to figure out how to name things. v2: Add nir_op_iudp4a and nir_op_iudp4a_sat instructions. These opcodes are not 2-source commutative. v3: Rename all opcodes to be more like some existing 4x8 opcodes. Suggested by Timur. Change type of packed vector sources to uint32, change types of constant folding variables to have explicit size, and delete some extra casts. All suggested by Jason. v4: Fix typo previously noticed by Alyssa but missed in v2. v5: Add has_sudot_4x8 flag. Requested by Rhys. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Qiang Yu	0b9639c35d	nir/loop_analyze: record induction variables for each loop For being used by uniform inline lowering pass. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>	2021-08-19 02:17:35 +00:00
Ian Romanick	f0a8a9816a	nir: intel/compiler: Add and use nir_op_pack_32_4x8_split A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO. This results in a lot of shifts and MOVs. When that pattern can be recognized, the individual 8-bit components can be packed much more efficiently. v2: Rebase on `b4369de27f` ("nir/lower_packing: use shader_instructions_pass") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>	2021-08-18 22:03:37 +00:00
Rhys Perry	ed70b256ce	nir: add ffma creation helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>	2021-08-16 17:19:45 +00:00
Emma Anholt	673cc9323a	nir: Move phi src setup to a helper. Cleans up the ralloc/list push code all over the tree. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11772>	2021-08-13 16:11:57 +00:00
Pierre-Eric Pelloux-Prayer	7684d57a05	nir: add a pass to optimize "gl_FragDepth = gl_FragCoord.z" away gl_FragDepth default value is gl_FragCoord.z so if a shader does: gl_FragDepth = gl_FragCoord.z we can drop this assignment. v2: use nir_ssa_scalar_resolved and don't do this if gl_FragDepth is written multiple times (Jason) v3: - move to its own pass (Jason) - handle var = NULL (Rhys) v4: refactoring (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10697>	2021-08-11 11:00:11 +02:00
Dave Airlie	ad92c2b253	nir: add fisnormal lowering just lower the 32-bit version for now. Thanks to alyssa for this suggested lowering. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12207>	2021-08-06 14:27:48 +10:00
Jason Ekstrand	0ddac113f8	nir: Removing uses of SSA defs destroys SSA liveness The liveness information will be a superset of real liveness so it's unlikely something will explode if it tries to use it. However, it is out-of-date and should be re-run if someone really wants it. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12186>	2021-08-03 21:36:53 +00:00
Timothy Arceri	a7f2e683de	nir: move nir_block_ends_in_break() to nir.h Will be used in a following commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>	2021-08-03 10:54:50 +00:00
Timothy Arceri	a9ed4538ab	nir: add indirect loop unrolling to compiler options This is where it should be rather than having to pass it into the optimisation pass every time. It also allows us to call the loop analysis pass without having to duplicate these options which we will do later in this series. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>	2021-08-03 10:54:50 +00:00
Emma Anholt	9ffd00bcf1	nir_to_tgsi: Pack our tex coords into vec4 nir_tex_src_backend[12]. For TGSI, we need the coordinate, comparator, bias, and LOD all together in the first two vec4 args, and by doing it in the backend we were generating extra MOVs. softpipe shader-db results: total instructions in shared programs: 2985416 -> 2953625 (-1.06%) instructions in affected programs: 499937 -> 468146 (-6.36%) total temps in shared programs: 544769 -> 565869 (3.87%) temps in affected programs: 105469 -> 126569 (20.01%) i915g shader-db: total instructions in shared programs: 371625 -> 369594 (-0.55%) instructions in affected programs: 24903 -> 22872 (-8.16%) total tex_indirect in shared programs: 11381 -> 11365 (-0.14%) tex_indirect in affected programs: 43 -> 27 (-37.21%) LOST: 7 GAINED: 16 The temps increase is the pre-existing issue that we never release temps for NIR regs, which doesn't matter much for softpipe (just memory/cache footprint) but does for i915g as seen by shaders that no longer compile (though overall we seem to win). Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11912>	2021-07-29 09:05:05 -07:00
Jason Ekstrand	74ec2b12be	nir/lower_tex: Rework invalid implicit LOD lowering Only fragment and some compute shaders support implicit derivatives. They're totally meaningless without helper invocations and some understanding of the dispatch pattern. We've got code to lower nir_texop_tex in these shader stages to use an explicit derivative of 0 but it was pretty badly broken: 1. It only handled nir_texop_tex, not nir_texop_txb or nir_texop_lod. 2. It didn't take min_lod into account 3. It was conflated with adding a missing LOD parameter to opcodes which expect one such as nir_texop_txf. While not really a bug, this does make it way harder to reason about the code. 4. Unless you set a flag (which most drivers don't), it left the opcode nir_texop_tex instead of nir_texop_txl which it should have been. This reworks it to go through roughly the same path as other LOD lowering only with a constant lod of 0 instead of calling out to nir_texop_lod. We also get rid of the lower_tex_without_implicit_lod flag because most drivers set it and those that don't are probably subtly broken. If someone really wants to get nir_texop_tex in their vertex shaders, they can write a new patch to add the flag back in. Fixes: `e382890e25` "nir: set default lod to texture opcodes that..." Fixes: `d5ac5d6e83` "nir: Add option to lower tex to txl when..." Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>	2021-07-23 15:53:57 +00:00
Jason Ekstrand	fa717a202c	docs,nir: Document NIR texture instructions Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>	2021-07-23 15:53:57 +00:00
Jason Ekstrand	4465ca296d	nir: Suffix all the MCS texture stuff _intel It's intel-specific, used to get at MSAA compression information. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>	2021-07-23 15:53:57 +00:00
Jason Ekstrand	60b5faf572	nir/lower_tex: Add a lower_txs_cube_array option Several bits of hardware require the division by 6 to happen in the shader. May as well have common lowering for it. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12005>	2021-07-22 14:22:35 -05:00
Jordan Justen	6898549d56	nir: Add nir_lower_image() to lower cube image sizes Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9466>	2021-07-21 11:02:15 -07:00
Sagar Ghuge	06ab737686	nir: Add optimizations for iadd3 This patch also adds has_iadd3 bit to give more control if backend supports ternary add instruction or not. v2: - Add patterns in late optimization (Connor Abbott) Suggested-by: Alyssa/Jason Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Jason Ekstrand	624e799cc3	nir: Drop nir_ssa_def::name and nir_register::name We say that they're for debug only but we don't really have a good policy around when to set them and when not to. In particular, nir_lower_system_values and nir_lower_vars_to_ssa which are the chief producers of SSA values which might reasonably have a name do not bother to set one. We have some names set from things like BLORP and RADV's meta shaders but AFAICT, they're setting a name more because it's there than because they actually care. Also, most things other than nir_clone and nir_serialize don't bother to try and preserve them. You can see in the diffstat of this commit exactly what passes attempt to preserve names. Notably missing from the list is opt_algebraic which is the single largest source of SSA def churn and it happily throws names away. These observations lead me to question whether or not names are actually useful at all or if they're just taking up space (8B per instruction) and wasting CPU cycles (to ralloc_strdup on the off chance we do have one). I don't think I can think of a single time in recent history where I've been debugging a shader issue and a SSA value name has been there and been useful. If anything, the few times they are there, they just throw me off because they mess up the indentation in nir_print. iris shader-db on my system gets runtime -2.07734% +/- 1.26933% (n=5) Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5439>	2021-07-08 17:34:41 +00:00
Connor Abbott	cc514bfa0e	nir: Add read_invocation_cond_ir3 intrinsic On qualcomm, we have shared registers similar to SGPR's on AMD. However, there is no readlane or readfirstlane primitive. shared registers can only be written to when just one lane is active. This means that we have to lower readInvocation(val, id) to something like: if (gl_SubgroupInvocation == id) { scalar_reg = val; } return scalar_reg; However it's a bit difficult to actually get the value of gl_SubgroupInvocation in the backend, because for compute it requires some calculations and we don't have any CSE support in the backend. This intrinsic lets us turn it into "readInvocationCond(val, id == gl_SubgroupInvocation)" in NIR at which point the backend code generation is a lot easier. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Connor Abbott	e4e79de2a4	nir/subgroups: Support > 1 ballot components Qualcomm has a mode with a subgroup size of 128, so just emitting larger integer operations and then lowering them later isn't an option. This makes the pass able to handle the lowering itself, so that we don't have to go down to 64-thread wavefronts when ballots are used. (The GLSL and legacy SPIR-V extensions only support a maximum of 64 threads, but I guess we'll cross that bridge when we come to it...) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Connor Abbott	90819b9b0e	nir/subgroups: Replace lower_vote_eq_to_ballot with lower_vote_eq Lower it to a vote instead of a ballot. This was only used for AMD, and in that case they're pretty much the same. However Qualcomm has a vote builtin, which we want to use instead of ballots. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Emma Anholt	4118264643	nir: Free the instructions in a DCE instr removal. No significant change in shader-db time (n=11), but should be a little win for memory usage by the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11628>	2021-07-06 11:24:48 -07:00
Emma Anholt	5618445d45	nir: Use remove_and_dce for nir_shader_lower_instructions(). Reduces the work that other shader passes have to do to look at dead code, and possibly extra rounds around the optimization loop if dce wasn't the last pass in it. shader-db runtime -1.12919% +/- 0.264337% (n=49) on SKL. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11628>	2021-07-06 11:24:45 -07:00
Emma Anholt	5251548572	nir: Add a nir_instr_remove that recursively removes dead code. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11628>	2021-07-06 11:24:43 -07:00
Rob Clark	c7b935962b	nir: Add pass to lower phi precision In addition to register pressure benefits from getting more fp16/int16, this avoids i2imp's from standing in the way of loop unrolling. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11545>	2021-06-29 23:27:28 +00:00
Emma Anholt	0afab39af9	nir: Add a helper for chasing movs with nir_ssa_scalar(). Sometimes you might want to find a constant source without going through all the copy prop and constant folding to make your source be a load_const. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11613>	2021-06-28 16:26:24 +00:00
Enrico Galli	8a5333c105	nir: Add modes filter to nir_sort_variables Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10989>	2021-06-24 20:05:13 +00:00
Jason Ekstrand	81cb20bd17	nir: Add a function for sorting variables Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10989>	2021-06-24 20:05:13 +00:00
Bas Nieuwenhuizen	8dfb240b1f	nir: Add raytracing shader call lowering pass. Really copying Jason's pass. Changes: - Instead of all the intel lowering introduce rt_{execute_callable,trace_ray,resume} - Add the ability to use scratch intrinsics directly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10339>	2021-06-21 21:23:51 +00:00
Jason Ekstrand	73188c6954	nir,docs: Add docs for NIR ALU instructions About half or more of the text here is actually from Connor Abbott. I've edited it a bit to bring it up-to-date and make a few things more clear. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11438>	2021-06-21 16:46:59 +00:00
Rhys Perry	ea68d4a676	nir/propagate_invariant: add invariant_prim option Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11035>	2021-06-21 15:13:05 +00:00
Emma Anholt	990c232603	nir: Add an interface for logging shaders with mesa_log*. For debug on Android, it's useful to be able to print shaders to the android log interface, since you don't usually have stdout/stderr. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9262>	2021-06-18 18:18:35 +00:00
Rhys Perry	1cbcfb8b38	nir, nir/algebraic: add byte/word insertion instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Caio Marcelo de Oliveira Filho	c8a7bd0dc8	nir: Rename WORK_GROUP (and similar) to WORKGROUP Be consistent with other usages in Vulkan and SPIR-V, and the recently added workgroup_size field. Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Hoe Hao Cheng	90a5fef85c	nir: define NIR_ALU_MAX_INPUTS Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11172>	2021-06-04 19:33:13 +00:00
Ian Romanick	880b00dc59	nir/lower_tex: Add support for lowering YUYV formats v2: Rebase on `bc438c91d9` ("nir/lower_tex: ignore texture_index if tex_instr has deref src") Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9610>	2021-05-21 01:40:22 +00:00
Ian Romanick	1358d93650	nir/lower_tex: Add support for lowering Y41x formats These are similar to AYUV, but the channel ordering is different... in such a way that there's no RGBA format that will make the channels line up right. v2: Rebase on `bc438c91d9` ("nir/lower_tex: ignore texture_index if tex_instr has deref src") Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9610>	2021-05-21 01:40:22 +00:00
Jason Ekstrand	b447f5049b	nir: Add a discard optimization pass Many fragment shaders do a discard using relatively little information but still put the discard fairly far down in the shader for no good reason. If the discard is moved higher up, we can possibly avoid doing some or almost all of the work in the shader. When this lets us skip texturing operations, it's an especially high win. One of the biggest offenders here is DXVK. The D3D APIs have different rules for discards than OpenGL and Vulkan. One effective way (which is what DXVK uses) to implement DX behavior on top of GL or Vulkan is to wait until the very end of the shader to discard. This ends up in the pessimal case where we always do all of the work before discarding. This pass helps some DXVK shaders significantly. v2 (Jason Ekstrand): - Fix a couple of typos (Grazvydas, Ian) - Use the new nir_instr_move helper - Find all movable discards before moving anything so we don't accidentally re-order anything and break dependencies v3 (Pierre-Eric): remove the call to nir_opt_conditional_discard based on Daniel Schürmann comment. v4 (Pierre-Eric): - handle demote intrinsics and drop derivatives_safe_after_discard - add early return if discards/demotes aren't used v5 (Pierre-Eric): - use pass_flags instead of instr set (Daniel Schürmann) v6 (Daniel Schürmann): - cleanup and fix pass_flags handling Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10522>	2021-05-19 18:04:44 +00:00

1005 commits