fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 05:00:09 +01:00

Author	SHA1	Message	Date
Timur Kristóf	7aa61c84fe	nir: Add new linking helper to set linked driver locations. This commit introduces a new function nir_assign_linked_io_var_locations which is intended to help with assigning driver locations to shaders during linking, primarily aimed at the VS->TCS->TES->GS stages. It ensures that the linked shaders have the same driver locations, and it also packs these as close to each other as possible. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4388>	2020-04-29 11:51:04 +00:00
Jonathan Marek	42093bb694	nir: add pack_32_2x16_split/unpack_32_2x16_split lowering The new option replaces the two other _split lowering options, since there's no need for separate options. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4738>	2020-04-27 18:40:03 +00:00
Alyssa Rosenzweig	42c9bbaeed	nir: Move nir_lower_mediump_outputs from ir3 (Original code from ir3) Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4716>	2020-04-27 16:32:24 +00:00
Gert Wollny	49ce749d0e	nir: Add umad24 and umul24 opcodes So far only the singed versions are defined. v2: Make umad24 and umul24 non-driver specific (Eric Anholt) v3: Take care of nir_builder and automatic lowering of the opcodes if they are not supported by the backend. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4610>	2020-04-23 18:23:04 +00:00
Rhys Perry	32d871b48f	nir/algebraic: don't undo lowering of 8/16-bit comparisons to 32-bit Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4387>	2020-04-23 10:57:38 +00:00
Alejandro Piñeiro	9fd180394b	nir: add nir_tex_instr_need_sampler helper That is basically nir_tex_instr sampler_index documentation comment expressed as a helper. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>	2020-04-22 23:43:18 +02:00
Jason Ekstrand	4d083b52c0	nir/dominance: Better handle unreachable blocks v2: Fix minor comments (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>	2020-04-20 03:46:29 +00:00
Timothy Arceri	c19ebca308	nir: add matrix_layout to nir_variable data This will be used by the following patch. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4623>	2020-04-18 11:50:44 +00:00
Caio Marcelo de Oliveira Filho	5dc85abc4f	nir: Add per_view attribute to nir_variable If a nir_variable is tagged with per_view, it must be an array with size corresponding to the number of views. For slot-tracking, it is considered to take just the slot for a single element -- drivers will take care of expanding this appropriately. This will be used to implement the ability of having per-view position in a vertex shader in Intel platforms. Acked-by: Rafael Antognolli <rafael.antognolli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>	2020-04-07 17:16:09 +00:00
Rob Clark	4638a16a93	nir: add some swizzle helpers Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4455>	2020-04-06 18:00:17 +00:00
Ian Romanick	62795475e8	nir/algebraic: Distribute source modifiers into instructions There are three main classes of cases that are helped by this change: 1. When the negation is applied to a value being type converted (e.g., float(-x)). This could possibly also be handled with more clever code generation. 2. When the negation is applied to a phi node source (e.g., x = -(...); at the end of a basic block). This was the original case that caught my attention while looking at shader-db dumps. 3. When the negation is applied to the source of an instruction that cannot have source modifiers. This includes texture instructions and math box instructions on pre-Gen7 platforms (see more details below). In many these cases the negation can be propagated into the instructions that generate the value (e.g., -(ab) = (-a)b). In addition to the operations implemtned in this patch, I also tried: - frcp - Helped 6 or fewer shaders on Gen7+, and hurt just as many on pre-Gen7. On Gen6 and earlier, frcp is a math box instruction, and math box instructions cannot have source modifiers. I suspect this is why so many more shaders are helped on Gen6 than on Gen5 or Gen7. Gen6 supports OpenGL 3.3, so a lot more shaders compile on it. A lot of these shaders may have things like cos(-x) or rcp(-x) that could result in an explicit negation instruction. - bcsel - Hurt a few shaders with none helped. bcsel operates on integer sources, so the fabs or fneg cannot be a source modifier in the bcsel itself. - Integer instructions - No changes on any Intel platform. Some notes about the shader-db results below. - On Tiger Lake, a single Deus Ex fragment shader is hurt for both spills and fills. - On Haswell, a different Deus Ex fragment shader is hurt for both spills and fills. - On GM45, the "LOST: 1" and "GAINED: 1" is a single Left4Dead 2 (very high graphics settings, lol) fragment shader that upgrades from SIMD8 to SIMD16. v2: Add support for fsign. Add some patterns that remove redundant negations and redundant absolute value rather than trying to push them down the tree. Tiger Lake total instructions in shared programs: 17611333 -> 17586465 (-0.14%) instructions in affected programs: 3033734 -> 3008866 (-0.82%) helped: 10310 HURT: 632 helped stats (abs) min: 1 max: 35 x̄: 2.61 x̃: 1 helped stats (rel) min: 0.04% max: 16.67% x̄: 1.43% x̃: 1.01% HURT stats (abs) min: 1 max: 47 x̄: 3.21 x̃: 2 HURT stats (rel) min: 0.04% max: 5.08% x̄: 0.88% x̃: 0.63% 95% mean confidence interval for instructions value: -2.33 -2.21 95% mean confidence interval for instructions %-change: -1.32% -1.27% Instructions are helped. total cycles in shared programs: 338365223 -> 338262252 (-0.03%) cycles in affected programs: 125291811 -> 125188840 (-0.08%) helped: 5224 HURT: 2031 helped stats (abs) min: 1 max: 5670 x̄: 46.73 x̃: 12 helped stats (rel) min: <.01% max: 34.78% x̄: 1.91% x̃: 0.97% HURT stats (abs) min: 1 max: 2882 x̄: 69.50 x̃: 14 HURT stats (rel) min: <.01% max: 44.93% x̄: 2.35% x̃: 0.74% 95% mean confidence interval for cycles value: -18.71 -9.68 95% mean confidence interval for cycles %-change: -0.80% -0.63% Cycles are helped. total spills in shared programs: 8942 -> 8946 (0.04%) spills in affected programs: 8 -> 12 (50.00%) helped: 0 HURT: 1 total fills in shared programs: 9399 -> 9401 (0.02%) fills in affected programs: 21 -> 23 (9.52%) helped: 0 HURT: 1 Ice Lake total instructions in shared programs: 16124348 -> 16102258 (-0.14%) instructions in affected programs: 2830928 -> 2808838 (-0.78%) helped: 11294 HURT: 2 helped stats (abs) min: 1 max: 12 x̄: 1.96 x̃: 1 helped stats (rel) min: 0.07% max: 17.65% x̄: 1.32% x̃: 0.93% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.45% max: 4.00% x̄: 3.72% x̃: 3.72% 95% mean confidence interval for instructions value: -1.99 -1.93 95% mean confidence interval for instructions %-change: -1.34% -1.29% Instructions are helped. total cycles in shared programs: 335393932 -> 335325794 (-0.02%) cycles in affected programs: 123834609 -> 123766471 (-0.06%) helped: 5034 HURT: 2128 helped stats (abs) min: 1 max: 3256 x̄: 43.39 x̃: 11 helped stats (rel) min: <.01% max: 35.79% x̄: 1.98% x̃: 1.00% HURT stats (abs) min: 1 max: 2634 x̄: 70.63 x̃: 16 HURT stats (rel) min: <.01% max: 49.49% x̄: 2.73% x̃: 0.62% 95% mean confidence interval for cycles value: -13.66 -5.37 95% mean confidence interval for cycles %-change: -0.69% -0.48% Cycles are helped. LOST: 0 GAINED: 2 Skylake total instructions in shared programs: 14949240 -> 14927930 (-0.14%) instructions in affected programs: 2594756 -> 2573446 (-0.82%) helped: 11000 HURT: 2 helped stats (abs) min: 1 max: 12 x̄: 1.94 x̃: 1 helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76% 95% mean confidence interval for instructions value: -1.97 -1.91 95% mean confidence interval for instructions %-change: -1.42% -1.37% Instructions are helped. total cycles in shared programs: 324829346 -> 324821596 (<.01%) cycles in affected programs: 121566087 -> 121558337 (<.01%) helped: 4611 HURT: 2147 helped stats (abs) min: 1 max: 3715 x̄: 33.29 x̃: 10 helped stats (rel) min: <.01% max: 36.08% x̄: 1.94% x̃: 1.00% HURT stats (abs) min: 1 max: 2551 x̄: 67.88 x̃: 16 HURT stats (rel) min: <.01% max: 53.79% x̄: 3.69% x̃: 0.89% 95% mean confidence interval for cycles value: -4.25 1.96 95% mean confidence interval for cycles %-change: -0.28% -0.02% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 14971203 -> 14949957 (-0.14%) instructions in affected programs: 2635699 -> 2614453 (-0.81%) helped: 10982 HURT: 2 helped stats (abs) min: 1 max: 12 x̄: 1.93 x̃: 1 helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76% 95% mean confidence interval for instructions value: -1.97 -1.90 95% mean confidence interval for instructions %-change: -1.42% -1.37% Instructions are helped. total cycles in shared programs: 336215033 -> 336086458 (-0.04%) cycles in affected programs: 127383198 -> 127254623 (-0.10%) helped: 4884 HURT: 1963 helped stats (abs) min: 1 max: 25696 x̄: 51.78 x̃: 12 helped stats (rel) min: <.01% max: 58.28% x̄: 2.00% x̃: 1.05% HURT stats (abs) min: 1 max: 3401 x̄: 63.33 x̃: 16 HURT stats (rel) min: <.01% max: 39.95% x̄: 2.20% x̃: 0.70% 95% mean confidence interval for cycles value: -29.99 -7.57 95% mean confidence interval for cycles %-change: -0.89% -0.71% Cycles are helped. total fills in shared programs: 24905 -> 24901 (-0.02%) fills in affected programs: 117 -> 113 (-3.42%) helped: 4 HURT: 0 LOST: 0 GAINED: 16 Haswell total instructions in shared programs: 13148927 -> 13131528 (-0.13%) instructions in affected programs: 2220941 -> 2203542 (-0.78%) helped: 8017 HURT: 4 helped stats (abs) min: 1 max: 12 x̄: 2.17 x̃: 1 helped stats (rel) min: 0.07% max: 15.25% x̄: 1.40% x̃: 0.93% HURT stats (abs) min: 1 max: 7 x̄: 2.50 x̃: 1 HURT stats (rel) min: 0.33% max: 4.76% x̄: 2.73% x̃: 2.91% 95% mean confidence interval for instructions value: -2.21 -2.13 95% mean confidence interval for instructions %-change: -1.43% -1.37% Instructions are helped. total cycles in shared programs: 321221791 -> 321079870 (-0.04%) cycles in affected programs: 126886055 -> 126744134 (-0.11%) helped: 4674 HURT: 1729 helped stats (abs) min: 1 max: 23654 x̄: 56.47 x̃: 16 helped stats (rel) min: <.01% max: 53.22% x̄: 2.13% x̃: 1.05% HURT stats (abs) min: 1 max: 3694 x̄: 70.58 x̃: 18 HURT stats (rel) min: <.01% max: 63.06% x̄: 2.48% x̃: 0.90% 95% mean confidence interval for cycles value: -33.31 -11.02 95% mean confidence interval for cycles %-change: -0.99% -0.78% Cycles are helped. total spills in shared programs: 19872 -> 19874 (0.01%) spills in affected programs: 21 -> 23 (9.52%) helped: 0 HURT: 1 total fills in shared programs: 20941 -> 20941 (0.00%) fills in affected programs: 62 -> 62 (0.00%) helped: 1 HURT: 1 LOST: 0 GAINED: 8 Ivy Bridge total instructions in shared programs: 11875553 -> 11853839 (-0.18%) instructions in affected programs: 1553112 -> 1531398 (-1.40%) helped: 7304 HURT: 3 helped stats (abs) min: 1 max: 16 x̄: 2.97 x̃: 2 helped stats (rel) min: 0.07% max: 15.25% x̄: 1.62% x̃: 1.15% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.05% max: 3.33% x̄: 2.44% x̃: 2.94% 95% mean confidence interval for instructions value: -3.04 -2.90 95% mean confidence interval for instructions %-change: -1.65% -1.59% Instructions are helped. total cycles in shared programs: 178246425 -> 178184484 (-0.03%) cycles in affected programs: 13702146 -> 13640205 (-0.45%) helped: 4409 HURT: 1566 helped stats (abs) min: 1 max: 531 x̄: 24.52 x̃: 13 helped stats (rel) min: <.01% max: 38.67% x̄: 2.14% x̃: 1.02% HURT stats (abs) min: 1 max: 356 x̄: 29.48 x̃: 10 HURT stats (rel) min: <.01% max: 64.73% x̄: 1.87% x̃: 0.70% 95% mean confidence interval for cycles value: -11.60 -9.14 95% mean confidence interval for cycles %-change: -1.19% -0.99% Cycles are helped. LOST: 0 GAINED: 10 Sandy Bridge total instructions in shared programs: 10695740 -> 10667483 (-0.26%) instructions in affected programs: 2337607 -> 2309350 (-1.21%) helped: 10720 HURT: 1 helped stats (abs) min: 1 max: 49 x̄: 2.64 x̃: 2 helped stats (rel) min: 0.07% max: 20.00% x̄: 1.54% x̃: 1.13% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.04% max: 1.04% x̄: 1.04% x̃: 1.04% 95% mean confidence interval for instructions value: -2.69 -2.58 95% mean confidence interval for instructions %-change: -1.57% -1.51% Instructions are helped. total cycles in shared programs: 153478839 -> 153416223 (-0.04%) cycles in affected programs: 22050900 -> 21988284 (-0.28%) helped: 5342 HURT: 2200 helped stats (abs) min: 1 max: 1020 x̄: 20.34 x̃: 16 helped stats (rel) min: <.01% max: 24.05% x̄: 1.51% x̃: 0.86% HURT stats (abs) min: 1 max: 335 x̄: 20.93 x̃: 6 HURT stats (rel) min: <.01% max: 20.18% x̄: 1.03% x̃: 0.30% 95% mean confidence interval for cycles value: -9.18 -7.42 95% mean confidence interval for cycles %-change: -0.82% -0.71% Cycles are helped. Iron Lake total instructions in shared programs: 8114882 -> 8105574 (-0.11%) instructions in affected programs: 1232504 -> 1223196 (-0.76%) helped: 4109 HURT: 2 helped stats (abs) min: 1 max: 6 x̄: 2.27 x̃: 1 helped stats (rel) min: 0.05% max: 8.33% x̄: 0.99% x̃: 0.66% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.94% max: 4.35% x̄: 2.65% x̃: 2.65% 95% mean confidence interval for instructions value: -2.31 -2.21 95% mean confidence interval for instructions %-change: -1.01% -0.96% Instructions are helped. total cycles in shared programs: 188504036 -> 188466296 (-0.02%) cycles in affected programs: 31203798 -> 31166058 (-0.12%) helped: 3447 HURT: 36 helped stats (abs) min: 2 max: 92 x̄: 11.03 x̃: 8 helped stats (rel) min: <.01% max: 5.41% x̄: 0.21% x̃: 0.13% HURT stats (abs) min: 2 max: 30 x̄: 7.33 x̃: 6 HURT stats (rel) min: 0.01% max: 1.65% x̄: 0.18% x̃: 0.10% 95% mean confidence interval for cycles value: -11.16 -10.51 95% mean confidence interval for cycles %-change: -0.22% -0.20% Cycles are helped. LOST: 0 GAINED: 1 GM45 total instructions in shared programs: 4989697 -> 4984531 (-0.10%) instructions in affected programs: 703952 -> 698786 (-0.73%) helped: 2493 HURT: 2 helped stats (abs) min: 1 max: 6 x̄: 2.07 x̃: 1 helped stats (rel) min: 0.05% max: 8.33% x̄: 1.03% x̃: 0.66% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.95% max: 4.35% x̄: 2.65% x̃: 2.65% 95% mean confidence interval for instructions value: -2.13 -2.01 95% mean confidence interval for instructions %-change: -1.07% -0.99% Instructions are helped. total cycles in shared programs: 128929136 -> 128903886 (-0.02%) cycles in affected programs: 21583096 -> 21557846 (-0.12%) helped: 2214 HURT: 17 helped stats (abs) min: 2 max: 92 x̄: 11.44 x̃: 8 helped stats (rel) min: <.01% max: 5.41% x̄: 0.24% x̃: 0.13% HURT stats (abs) min: 2 max: 8 x̄: 4.24 x̃: 4 HURT stats (rel) min: 0.01% max: 1.65% x̄: 0.20% x̃: 0.09% 95% mean confidence interval for cycles value: -11.75 -10.88 95% mean confidence interval for cycles %-change: -0.25% -0.22% Cycles are helped. LOST: 1 GAINED: 1 Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>	2020-04-01 00:28:38 +00:00
Jason Ekstrand	842338e2f0	nir: Add a nir_op_is_vec helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>	2020-03-31 00:18:05 +00:00
Iago Toral Quiroga	467c9a0faa	nir: add a bool bitsize lowering pass The pass lowers 1-bit booleans produced by NIR to the native bitsize of the operations that produce them. v2: change on lower_load_const_instr after upstream changes. Added TODO2 to explain it, as it was not properly tested yet (see already existing TODO) (Neil) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3885>	2020-03-24 23:21:21 +00:00
Caio Marcelo de Oliveira Filho	bf432cd831	nir: Add pass to combine adjacent scoped memory barriers SPIR-V generates very granular barriers, however HW and backends might not necessarily take advantage of those. This pass provides a general mechanism to combine such barriers. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>	2020-03-12 19:21:36 +00:00
Caio Marcelo de Oliveira Filho	d31a8ed8fd	nir: Reorder nir_scopes so wider scope has larger numeric value Makes code comparing and combining scopes slightly more readable. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>	2020-03-12 19:21:36 +00:00
Caio Marcelo de Oliveira Filho	67fc88fbb9	nir: Don't skip a bit in nir_memory_semantics There was another enum entry in the draft versions of nir_memory_semantics, but when it got dropped the entries were not updated. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>	2020-03-12 19:21:36 +00:00
Timur Kristóf	ec16535b49	nir: Add ability to lower non-const quad broadcasts to const ones. Some hardware doesn't support subgroup shuffle, and on such hardware it makes no sense to lower quad broadcasts to shuffle. Instead, let's lower them to four const quad broadcasts, paired with bcsel instructions. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4147>	2020-03-12 13:16:07 +00:00
Daniel Schürmann	ce87da71e9	nir: add pass to lower discard() to demote() This pass is intended to work around game bugs, only! It also lowers nir_intrinsic_load_helper_invocation to nir_intrinsic_is_helper_invocation for consistency. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>	2020-03-09 12:29:32 +00:00
Jason Ekstrand	349898a967	nir: Drop nir_tex_instr::texture_array_size It's set by lots of things and we spend a lot of time maintaining it but no one actually uses the value for anything useful. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3940> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3940>	2020-02-26 18:29:49 +00:00
Caio Marcelo de Oliveira Filho	956e4b2d37	nir, intel: Move use_scoped_memory_barrier to nir_options This option will be used later by GLSL, so move to a common struct. Because nir_options is filled in the compiler instead of the Vulkan driver, fix that up. GLSL will ignore that for now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913>	2020-02-24 19:12:11 +00:00
Caio Marcelo de Oliveira Filho	6ff898a653	nir: Add the alias NIR_MEMORY_ACQ_REL This will help upcoming C++ code that will have to combine those two semantics. In C++ it is not possible to do this without a cast or adding an operator\| to the enum. Since having the short form will also be convient to C, we picked the former solution. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913>	2020-02-24 19:12:11 +00:00
Eric Anholt	3e16434acd	nir: Move intel's intrinsic_image_coordinate_components() to core nir. This is a query that both Intel and freedreno need to do. We can simplify it a lot with the new glsl_get_sampler_dim_coordinate_components() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>	2020-02-24 18:25:02 +00:00
Alyssa Rosenzweig	7ab4e4dd96	nir: Add SSBO->global lowering pass To facilitate lowering SSBOs to globals, we need a load_ssbo_address intrinsic. This intrinsic takes an SSBO index and loads the address in global memory of the SSBO (likely implemented via a uniform in the driver). In the future, we'll support bounds checking, but at the moment this is not supported (this pass should only be used for trusted contexts at the moment, i.e. contexts without robustness extensions). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2753>	2020-02-21 13:06:22 +00:00
Arcady Goldmints-Orlov	e9f83185a2	Rename nir_lower_constant_initializers to nir_lower_variable_initalizers This is naming is more clear as nir_variables can be initializes not just with a nir_constant but with a pointer to another nir_variable. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047>	2020-02-12 15:41:49 +00:00
Arcady Goldmints-Orlov	7acc81056f	compiler/nir: Add support for variable initialization from a pointer Add a pointer_initializer field to nir_variable analogous to constant_initializer, which can be used to initialize the nir_variable to a pointer to another nir_variable. Just like the constant_initializer, the pointer_initializer gets eliminated in the nir_lower_constant_initializers pass. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047>	2020-02-12 15:41:49 +00:00
Eric Anholt	8d07d66180	glsl,nir: Switch the enum representing shader image formats to PIPE_FORMAT. This means you can directly use format utils on it without having to have your own GL enum to number-of-components switch statement (or whatever) in your vulkan backend. Thanks to imirkin for fixing up the nouveau driver (and a couple of core details). This fixes the computed qualifiers for EXT_shader_image_load_store's non-integer sizeNxM qualifiers, which we don't have tests for. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3d) Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>	2020-02-05 10:31:14 -08:00
Samuel Pitoiset	746e9e5d66	compiler: add a new explicit interpolation mode This introduces one more interpolation mode INTERP_MODE_EXPLICIT, which is needed for AMD_shader_explicit_vertex_parameter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	603e6ba972	nir: add two new texture ops for multisample fragment color/mask fetches This introduces: - nir_texop_fragment_mask_fetch (fetch a fragment mask from a compressed multisampled color surface) - nir_texop_fragment_fetch (fetch a color fragment for a particular sample at corresponding fragment mask index). These two texture operations are necessary for implementing SPV_AMD_shader_fragment_mask. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Ian Romanick	1bdfc6d7cb	nir/algebraic: Add lowering for 64-bit usub_sat v2: Rebase on `272e927d0e` ("nir/spirv: initial handling of OpenCL.std extension opcodes") v3: Add a new lower_usub_sat64 flag that only applies to the 64-bit version of the nir_op_usub_sat instruction. v4: Also enable the lowering when nir_lower_iadd64 is set. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	a483771045	nir/algebraic: Add lowering for 64-bit hadd and rhadd v2: Rebase on `272e927d0e` ("nir/spirv: initial handling of OpenCL.std extension opcodes") v3: Add a new lower_hadd64 flag that only applies to the 64-bit versions of the instructions. v4: Also enable the lowering when nir_lower_iadd64 is set. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Eric Anholt	d0975bfc4a	nir: Drop the ssbo_offset to atomic lowering. The arguments passed in were: - prog->info.num_ssbos - prog->nir->info.num_ssbos - arbitrary values for standalone compilers The num_ssbos should match between the prog's info and prog->nir's info until this lowering happens. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>	2020-01-21 10:06:23 -08:00
Jason Ekstrand	721666e52a	anv,nir: Lower quad_broadcast with dynamic index in NIR This is required for the subgroupBroadcastDynamicId feature that was added in Vulkan 1.2. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-15 08:34:57 -06:00
Rhys Perry	d8e05edbd9	nir/sink,nir/move: move/sink nir_op_mov Can uncover opportunities to move other instructions. This can increase register usage, but that doesn't seem to actually happen. This optimizes a pattern of a load_per_vertex_input followed by several moves and then a store_output in a different block. v2: add nir_move_copies to make it optional Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Acked-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>	2020-01-14 13:56:45 +00:00
Rhys Perry	1ffacc3ce1	nir/lower_gs_intrinsics: add option for per-stream counts Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2422> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2422>	2020-01-14 12:11:14 +00:00
Erik Faye-Lund	d9ff5f0414	nir/zink: move clip_halfz-lowering to common code Etnaviv also does the same thing, so let's try to avoid repetition here, and use the same for it code as well. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Paul Cercueil <paul@crapouillou.net>	2020-01-03 22:48:19 +00:00
Kenneth Graunke	19ed12afd1	st/nir: Optionally unify inputs_read/outputs_written when linking. i965 and iris use inputs_read/outputs_written for a shader stage to determine the layout of input and output storage. Adjacent stages must agree on the layout, so adjacent input/output bitfields must match. This patch adds a new nir_shader_compiler_options::unify_interfaces flag which asks the linker to unify the input/output interfaces between adjacent stages. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249>	2020-01-03 00:41:50 +00:00
Rob Clark	a8ec4082a4	nir+vtn: vec8+vec16 support This introduces new vec8 and vec16 instructions (which are the only instructions taking more than 4 sources), in order to construct 8 and 16 component vectors. In order to avoid fixing up the non-autogenerated nir_build_alu() sites and making them pass 16 src args for the benefit of the two instructions that take more than 4 srcs (ie vec8 and vec16), nir_build_alu() is has nir_build_alu_tail() split out and re-used by nir_build_alu2() (which is used for the > 4 src args case). v2 (Karol Herbst): use nir_build_alu2 for vec8 and vec16 use python's array multiplication syntax add nir_op_vec helper simplify nir_vec nir_build_alu_tail -> nir_builder_alu_instr_finish_and_insert use nir_build_alu for opcodes with <= 4 sources v3 (Karol Herbst): fix nir_serialize v4 (Dave Airlie): fix serialization of glsl_type handle vec8/16 in lowering of bools v5 (Karol Herbst): fix load store vectorizer Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-21 11:00:17 +00:00
Jonathan Marek	004797002f	nir: add option to lower half packing opcodes Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>	2019-12-16 19:20:07 -05:00
Timothy Arceri	56c25b938c	nir: add some fields to nir_variable_data These will be used to provide NIR linking functionality to GLSL. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Eric Anholt	8afab607ac	nir: Add a scheduler pass to reduce maximum register pressure. This is similar to a scheduler I've written for vc4 and i965, but this time written at the NIR level so that hopefully it's reusable. A notable new feature it has is Goodman/Hsu's heuristic of "once we've started processing the uses of a value, prioritize processing the rest of their uses", which should help avoid the heuristic otherwise making such systematically bad choices around getting texture results consumed. Results for v3d: total instructions in shared programs: 6497588 -> 6518242 (0.32%) total threads in shared programs: 154000 -> 152828 (-0.76%) total uniforms in shared programs: 2119629 -> 2068681 (-2.40%) total spills in shared programs: 4984 -> 472 (-90.53%) total fills in shared programs: 6418 -> 1546 (-75.91%) Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> (v1) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v2) v2: Use the DAG datastructure, fold in the scheduling-for-parallelism patch, include SSA defs in live values so we can switch to bottom-up if we want. v3: Squash in improvements from Alejandro Piñeiro for getting V3D to successfully register allocate on GLES3.1 dEQP. Make sure that discards don't move after store_output. Comment spelling fix.	2019-11-25 21:12:21 +00:00
Rhys Perry	ce9205c03b	nir: add a load/store vectorization pass This pass combines intersecting, adjacent and identical loads/stores into potentially larger ones and will be used by ACO to greatly reduce the number of memory operations. v2: handle nir_deref_type_ptr_as_array v3: assume explicitly laid out types for derefs v4: create less deref casts v4: fix shared boolean vectorization v4: fix copy+paste error in resources_different v4: fix extract_subvector() to pass nir_load_store_vectorize_test.ssbo_load_intersecting_32_32_64 v4: rebase v5: subtract from deref/offset instead of scheduling offset calculations v5: various non-functional changes/cleanups v5: require less metadata and preserve more v5: rebase v6: cleanup and improve dependency handling v6: emit less deref casts v6: pass undef to components not set in the write_mask for new stores v7: fix 8-bit extract_vector() with 64-bit input v7: cleanup creation of store write data v7: update align correctly for when the bit size of load/store increases v7: rename extract_vector to extract_component and update comment v8: prevent combining of row-major matrix column acceses v9: rework process_block() to be able to vectorize more v9: rework the callback function v9: update alignment on all loads/stores, even if they're not vectorized v9: remove entry::store_value, since it will not be updated if it's was from a vectorized load v9: fix bug in subtract_deref(), causing artifacts in Dishonored 2 v9: handle nir_intrinsic_scoped_memory_barrier v10: use nir_ssa_scalar v10: handle non-32-bit offsets v10: use signed offsets for comparison v10: improve create_entry_key_from_offset() v10: support load_shared/store_shared v10: remove strip_deref_casts() v10: don't ever pass NULL to memcmp v10: remove recursion in gcd() v10: fix outdated comment v11: use the new nir_extract_bits() v12: remove use of nir_src_as_const_value in resources_different v13: make entry key hash function deterministic v13: simplify mask_sign_extend() v14: add comment in hash_entry_key() about hashing pointers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v9)	2019-11-25 13:59:11 +00:00
Rhys Perry	c14f823ee5	nir: add nir_num_variable_modes and nir_var_mem_push_const These will be useful in the upcoming load/store vectorizer. v11: rebase Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-25 13:59:11 +00:00
Marek Olšák	ff71fae440	nir: strip as we serialize to remove the nir_shader_clone call Serializing stripped NIR is faster now. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-21 18:49:57 -05:00
Dave Airlie	d0d96053e6	nir: add 64-bit ufind_msb lowering support. (v2) This adds the option to lower 64-bit ufind_msb opcodes. v2: use split_x/y removes component loops (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-22 04:37:37 +10:00
Rhys Perry	9f92e8b721	nir: add nir_variable::index and nir_index_vars This will be useful as a deterministic identifier/index for the variable. v2: fix comment style Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1)	2019-11-20 15:05:42 +00:00
Rhys Perry	45a0b53490	nir: make nir_variable::{num_members,num_state_slots} a uint16_t Doesn't shrink it (at least, on x86-64) and leaves space for more members. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-20 15:05:42 +00:00
Neil Roberts	634eb9c04b	nir: Add a 8-bit bool type Adds nir_type_bool8 as well as 8-bit versions of all the bool opcodes. Reviewed-by: Rob Clark <robdclark@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	0f5640c577	nir: Add a 16-bit bool type Adds nir_type_bool16 as well as 16-bit versions of all the bool opcodes. Reviewed-by: Rob Clark <robdclark@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 14:09:43 +01:00
Marek Olšák	654efd38bb	nir: don't use GLenum16 in nir.h Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-19 18:20:12 -05:00
Marek Olšák	ec7d37c9c0	nir: move data.descriptor_set above data.index for better packing 4 bytes down Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-19 18:20:10 -05:00

... 3 4 5 6 7 ...

773 commits