fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-04-21 09:50:36 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	adc077797a	spirv: Add initial subgroup support Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	5162a1d884	nir: Add new SPIR-V ballot intrinsics and lowering Someone can make the lowering optional later if they want something different for their hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	752e969703	compiler: Add two new system values for subgroups This will be required for SPIR-V subgroup support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	34c60ea02b	nir: Add new SPIR-V ballot ALU intrinsics and lowering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	cc587ee9a7	spirv: Handle the new OpModuleProcessed instruction Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	ff9db1a4cc	nir/spirv: Add support for device groups Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	0128187335	spirv: Update the SPIR-V headers and json to 1.3.1 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	68af9f04a4	spirv: Rework barriers Our previous handling of barriers always used the big hammer and didn't correctly emit memory barriers when specified along with a control barrier. This commit completely reworks the way we emit barriers to make things both more precise and more correct. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	de518f38e5	spirv: Add a vtn_constant_value helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Simon Hausmann	fb5825e7ce	glsl: Fix memory leak with known glsl_type instances When looking up known glsl_type instances in the various hash tables, we end up leaking the key instances used for the lookup, as the glsl_type constructor allocates memory on the global mem_ctx. This patch changes glsl_type to manage its own memory, which fixes the leak and also allows getting rid of the global mem_ctx and its mutex. v2: remove lambda usage (Tapani) (+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Simon Hausmann <simon.hausmann@qt.io> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-07 14:33:34 +02:00
Caio Marcelo de Oliveira Filho	c17808562e	spirv: Add SpvCapabilityShaderViewportIndexLayerEXT This capability allows gl_ViewportIndex and gl_Layer to also be used as outputs in Vertex and Tessellation shaders. v2: Make conditional to the capability, add gl_Layer, add tessellation shaders. (Iago) v3: Don't export to tessellation control shader. v4: Add Reviewed-by tag. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 07:04:20 +01:00
Timothy Arceri	1fdb21541e	Revert "nir: bump loop unroll limit to 96." This reverts commit `2d36efdb7f`. The raised limit turns out to be harmful for more complex shaders: it causes excessive spilling in some Bioshock Infinite shaders. The fps for the ssao demo on radv remains unchanged when reverting this. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-07 15:10:05 +11:00
Ian Romanick	e3ea166a2c	nir: Simplify some comparisons like a+b < a All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14514555 -> 14514547 (<.01%) instructions in affected programs: 1972 -> 1964 (-0.41%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.41% -0.40% Instructions are helped. total cycles in shared programs: 533141444 -> 533136780 (<.01%) cycles in affected programs: 164728 -> 160064 (-2.83%) helped: 181 HURT: 3 helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30 helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80% HURT stats (abs) min: 4 max: 54 x̄: 24.00 x̃: 14 HURT stats (rel) min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68% 95% mean confidence interval for cycles value: -27.12 -23.58 95% mean confidence interval for cycles %-change: -3.54% -3.16% Cycles are helped. Sandy Bridge total instructions in shared programs: 10533667 -> 10533539 (<.01%) instructions in affected programs: 10148 -> 10020 (-1.26%) helped: 124 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04% 95% mean confidence interval for instructions value: -1.06 -1.00 95% mean confidence interval for instructions %-change: -2.46% -1.95% Instructions are helped. total cycles in shared programs: 146136887 -> 146132122 (<.01%) cycles in affected programs: 206382 -> 201617 (-2.31%) helped: 171 HURT: 0 helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30 helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67% 95% mean confidence interval for cycles value: -29.19 -26.54 95% mean confidence interval for cycles %-change: -3.20% -2.76% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886515 -> 7886507 (<.01%) instructions in affected programs: 3016 -> 3008 (-0.27%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.27% -0.26% Instructions are helped. total cycles in shared programs: 178100396 -> 178100388 (<.01%) cycles in affected programs: 156128 -> 156120 (<.01%) helped: 4 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: -3.68 1.68 95% mean confidence interval for cycles %-change: -0.03% <.01% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857872 -> 4857868 (<.01%) instructions in affected programs: 1544 -> 1540 (-0.26%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.28% -0.24% Instructions are helped. total cycles in shared programs: 122167654 -> 122167662 (<.01%) cycles in affected programs: 96248 -> 96256 (<.01%) helped: 0 HURT: 4 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: <.01% 0.02% Cycles are HURT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:30 -08:00
Ian Romanick	d1ed4ffe0b	nir: Use De Morgan's Law on logic compounded comparisons The replacement of the comparison operators must happen during this step. If it does not, the next pass of nir_opt_algebraic will reapply De Morgan's Law in the "opposite direction" before performing dead code elimination. The resulting infinite loop will eventually get OOM killed. Haswell, Broadwell, and Skylake had similar results. (Broadwell shown) total instructions in shared programs: 14808185 -> 14808036 (<.01%) instructions in affected programs: 13758 -> 13609 (-1.08%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3 helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01% 95% mean confidence interval for instructions value: -4.67 -2.97 95% mean confidence interval for instructions %-change: -1.09% -0.88% Instructions are helped. total cycles in shared programs: 559438333 -> 559435832 (<.01%) cycles in affected programs: 199160 -> 196659 (-1.26%) helped: 42 HURT: 3 helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51 helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40% HURT stats (abs) min: 2 max: 40 x̄: 27.33 x̃: 40 HURT stats (rel) min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74% 95% mean confidence interval for cycles value: -71.47 -39.69 95% mean confidence interval for cycles %-change: -1.64% -0.93% Cycles are helped. Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) total instructions in shared programs: 11811776 -> 11811553 (<.01%) instructions in affected programs: 15201 -> 14978 (-1.47%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6 helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26% 95% mean confidence interval for instructions value: -7.21 -4.23 95% mean confidence interval for instructions %-change: -1.48% -1.12% Instructions are helped. 
total cycles in shared programs: 257617270 -> 257614589 (<.01%) cycles in affected programs: 212107 -> 209426 (-1.26%) helped: 45 HURT: 0 helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54 helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32% 95% mean confidence interval for cycles value: -74.02 -45.14 95% mean confidence interval for cycles %-change: -1.59% -1.01% Cycles are helped. Iron Lake total instructions in shared programs: 7886648 -> 7886515 (<.01%) instructions in affected programs: 14106 -> 13973 (-0.94%) helped: 29 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4 helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81% 95% mean confidence interval for instructions value: -5.65 -3.52 95% mean confidence interval for instructions %-change: -1.03% -0.76% Instructions are helped. total cycles in shared programs: 178100812 -> 178100396 (<.01%) cycles in affected programs: 67970 -> 67554 (-0.61%) helped: 29 HURT: 0 helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12 helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54% 95% mean confidence interval for cycles value: -18.30 -10.39 95% mean confidence interval for cycles %-change: -0.71% -0.45% Cycles are helped. GM45 total instructions in shared programs: 4857939 -> 4857872 (<.01%) instructions in affected programs: 7426 -> 7359 (-0.90%) helped: 15 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4 helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77% 95% mean confidence interval for instructions value: -6.06 -2.87 95% mean confidence interval for instructions %-change: -1.06% -0.67% Instructions are helped. 
total cycles in shared programs: 122167930 -> 122167654 (<.01%) cycles in affected programs: 43118 -> 42842 (-0.64%) helped: 15 HURT: 0 helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16 helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54% 95% mean confidence interval for cycles value: -25.03 -11.77 95% mean confidence interval for cycles %-change: -0.82% -0.41% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
Ian Romanick	52607658ff	nir: Replace fmin(b2f(a), b) with a bcsel All of the affected shaders are HDR mappers from Serious Sam 3. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14516285 -> 14516273 (<.01%) instructions in affected programs: 348 -> 336 (-3.45%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -5.55% -3.06% Instructions are helped. total cycles in shared programs: 533163876 -> 533163808 (<.01%) cycles in affected programs: 1144 -> 1076 (-5.94%) helped: 4 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94% 95% mean confidence interval for cycles value: -18.84 -15.16 95% mean confidence interval for cycles %-change: -6.20% -5.68% Cycles are helped. Sandy Bridge total instructions in shared programs: 10533321 -> 10533309 (<.01%) instructions in affected programs: 372 -> 360 (-3.23%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -4.96% -2.86% Instructions are helped. total cycles in shared programs: 146136632 -> 146136428 (<.01%) cycles in affected programs: 11668 -> 11464 (-1.75%) helped: 12 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29% 95% mean confidence interval for cycles value: -17.66 -16.34 95% mean confidence interval for cycles %-change: -2.82% -1.58% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886301 -> 7886277 (<.01%) instructions in affected programs: 576 -> 552 (-4.17%) helped: 12 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -5.30% -3.72% Instructions are helped. total cycles in shared programs: 178113176 -> 178113176 (0.00%) cycles in affected programs: 2116 -> 2116 (0.00%) helped: 2 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58% 95% mean confidence interval for cycles value: -3.25 3.25 95% mean confidence interval for cycles %-change: -0.93% 0.94% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857756 -> 4857744 (<.01%) instructions in affected programs: 294 -> 282 (-4.08%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -5.71% -3.09% Instructions are helped. total cycles in shared programs: 122178730 -> 122178722 (<.01%) cycles in affected programs: 700 -> 692 (-1.14%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
Ian Romanick	b974dfee11	nir: Pull b2f out of bcsel All platforms had similar results. (Skylake shown) total instructions in shared programs: 14516592 -> 14516586 (<.01%) instructions in affected programs: 500 -> 494 (-1.20%) helped: 2 HURT: 0 total cycles in shared programs: 533167044 -> 533166998 (<.01%) cycles in affected programs: 6988 -> 6942 (-0.66%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
Ian Romanick	f50400cc80	nir: Replace an odd comparison involving fmin of -b2f I noticed the fge version while looking at a shader for an unrelated reason. The feq version prevents a regression in a later change that performs strength reduction of some compares. Broadwell and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14514808 -> 14514796 (<.01%) instructions in affected programs: 750 -> 738 (-1.60%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.83% max: 1.96% x̄: 1.40% x̃: 1.40% 95% mean confidence interval for instructions value: -6.67 0.67 95% mean confidence interval for instructions %-change: -2.43% -0.36% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533144939 -> 533144853 (<.01%) cycles in affected programs: 8911 -> 8825 (-0.97%) helped: 4 HURT: 0 helped stats (abs) min: 16 max: 32 x̄: 21.50 x̃: 19 helped stats (rel) min: 0.60% max: 1.89% x̄: 1.28% x̃: 1.31% 95% mean confidence interval for cycles value: -32.94 -10.06 95% mean confidence interval for cycles %-change: -2.30% -0.26% Cycles are helped. Haswell total instructions in shared programs: 13093785 -> 13093775 (<.01%) instructions in affected programs: 924 -> 914 (-1.08%) helped: 4 HURT: 2 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.82% max: 1.95% x̄: 1.39% x̃: 1.39% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19% 95% mean confidence interval for instructions value: -4.53 1.20 95% mean confidence interval for instructions %-change: -2.02% 0.97% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 409580553 -> 409580118 (<.01%) cycles in affected programs: 10909 -> 10474 (-3.99%) helped: 5 HURT: 1 helped stats (abs) min: 6 max: 222 x̄: 89.60 x̃: 18 helped stats (rel) min: 0.16% max: 24.72% x̄: 9.54% x̃: 1.78% HURT stats (abs) min: 13 max: 13 x̄: 13.00 x̃: 13 HURT stats (rel) min: 0.39% max: 0.39% x̄: 0.39% x̃: 0.39% 95% mean confidence interval for cycles value: -180.68 35.68 95% mean confidence interval for cycles %-change: -19.55% 3.79% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 11811851 -> 11811840 (<.01%) instructions in affected programs: 1032 -> 1021 (-1.07%) helped: 5 HURT: 1 helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 1 helped stats (rel) min: 0.63% max: 1.95% x̄: 1.13% x̃: 0.97% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19% 95% mean confidence interval for instructions value: -4.17 0.51 95% mean confidence interval for instructions %-change: -1.86% 0.36% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 257618403 -> 257618168 (<.01%) cycles in affected programs: 10784 -> 10549 (-2.18%) helped: 4 HURT: 2 helped stats (abs) min: 4 max: 220 x̄: 64.50 x̃: 17 helped stats (rel) min: 0.50% max: 24.34% x̄: 7.07% x̃: 1.72% HURT stats (abs) min: 9 max: 14 x̄: 11.50 x̃: 11 HURT stats (rel) min: 0.24% max: 0.42% x̄: 0.33% x̃: 0.33% 95% mean confidence interval for cycles value: -133.11 54.78 95% mean confidence interval for cycles %-change: -14.79% 5.59% Inconclusive result (value mean confidence interval includes 0). GM45, Iron Lake, and Sandy Bridge had similar results. 
(Sandy Bridge shown) total instructions in shared programs: 10533871 -> 10533859 (<.01%) instructions in affected programs: 865 -> 853 (-1.39%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.63% max: 1.83% x̄: 1.22% x̃: 1.21% 95% mean confidence interval for instructions value: -6.67 0.67 95% mean confidence interval for instructions %-change: -2.16% -0.29% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 146139904 -> 146139852 (<.01%) cycles in affected programs: 15213 -> 15161 (-0.34%) helped: 4 HURT: 0 helped stats (abs) min: 3 max: 18 x̄: 13.00 x̃: 15 helped stats (rel) min: 0.15% max: 0.84% x̄: 0.39% x̃: 0.29% 95% mean confidence interval for cycles value: -23.79 -2.21 95% mean confidence interval for cycles %-change: -0.88% 0.09% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
Ian Romanick	380136e998	nir: Mark bcsel-to-fmin (or fmax) transformations as inexact These transformations are inexact because section 4.7.1 (Range and Precision) says: Operations and built-in functions that operate on a NaN are not required to return a NaN as the result. The fmin or fmax might not return NaN in cases where the original expression would be required to return NaN. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-06 11:17:14 -08:00
Ian Romanick	4addd34b04	nir: Recognize some more open-coded fmin / fmax This transformation is inexact because section 4.7.1 (Range and Precision) says: Operations and built-in functions that operate on a NaN are not required to return a NaN as the result. The fmin or fmax might not return NaN in cases where the original expression would be required to return NaN. v2: Reorder operands and mark as inexact. The latter suggested by Jason. shader-db results: Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14514817 -> 14514808 (<.01%) instructions in affected programs: 229 -> 220 (-3.93%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4 helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12% total cycles in shared programs: 533145211 -> 533144939 (<.01%) cycles in affected programs: 37268 -> 36996 (-0.73%) helped: 8 HURT: 0 helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2 helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05% Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) total cycles in shared programs: 257618409 -> 257618403 (<.01%) cycles in affected programs: 12582 -> 12576 (-0.05%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05% No changes on Iron Lake or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-06 11:17:14 -08:00
Ian Romanick	3a944316c4	nir: Silence unused parameter warnings in generated nir_constant_expressions code Reduces my build from 2075 warnings to 2023 warnings by silencing 52 instances of things like src/compiler/nir/nir_constant_expressions.c: In function ‘evaluate_bfi’: src/compiler/nir/nir_constant_expressions.c:1812:61: warning: unused parameter ‘bit_size’ [-Wunused-parameter] evaluate_bfi(MAYBE_UNUSED unsigned num_components, unsigned bit_size, ^~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Jose Maria Casanova Crespo	4420d8866c	nir/search: Include 8 and 16-bit support in construct_value Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-01 09:16:03 -08:00
Jason Ekstrand	99ee40fb54	nir/search: Support 8 and 16-bit constants in match_value Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-03-01 09:15:01 -08:00
Alejandro Piñeiro	e72fb4e611	nir/serialize: handle var->name being NULL var->name could be NULL under ARB_gl_spirv, for example. In any case, the code already handles var->name being NULL when reading a variable, so it is consistent to do so when writing a variable too. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-01 08:23:33 +01:00
Jose Maria Casanova Crespo	02266f9ba1	spirv/i965/anv: Relax push constant offset assertions being 32-bit aligned The introduction of 16-bit types with VK_KHR_16bit_storage implies that push constant offsets could be multiples of 2 bytes. Some assertions are updated so that offsets need only be a multiple of the size of the base type, but in some cases we cannot assume even that, as doubles aren't always aligned to 8 bytes. For 16-bit types, the push constant offset takes into account the internal offset in the 32-bit uniform bucket, adding 2 bytes when we access elements that are not 32-bit aligned. In all 32-bit aligned cases it just becomes 0. v2: Assert offsets to be aligned to the dest type size. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	23ffb7c2d1	spirv: Calculate properly 16-bit vector sizes Range in 16-bit push constants load was being calculated wrongly using 4-bytes per element instead of 2-bytes as it should be. v2: Use glsl_get_bit_size instead of if statement (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Samuel Iglesias Gonsálvez	e207b2e2c8	glsl/linker: fix bug when checking precision qualifier According to GLSL ES 3.2 spec, see table in 9.2.1 "Linked Shaders" section, the precision qualifier should match for uniform variables. This also applies to previous GLSL ES 3.x specs. This 'if' checks the condition for uniform variables, while for UBOs it is checked in link_interface_blocks.cpp. Fixes: `b50b82b8a5` ("glsl/es31: precision qualifier doesn't need to match in shader interface block members") Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-28 07:04:13 +01:00
Timothy Arceri	0c1f37cc2d	nir: fix integer divide by zero crash during constant folding From the GLSL 4.60 spec Section 5.9 (Expressions): "Dividing by zero does not cause an exception but does result in an unspecified value." Fixes: `89285e4d47` "nir: add new constant folding infrastructure" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105271	2018-02-28 15:55:39 +11:00
Timothy Arceri	a050ea60ee	nir: add lower_ldexp to nir compiler options Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Francisco Jerez	69b4a9d21d	util/bitset: Make C++ wrapper trivially constructible. In order to fix a build failure on compilers not implementing unrestricted unions, which is a C++11 feature. v2: Provide signed integer comparison and assignment operators instead of BITSET_WORD ones to avoid spurious ambiguity warnings on comparisons with a signed integer literal. Fixes: `ba79a90fb5` "glsl: Switch ast_type_qualifier to a 128-bit bitset." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105238 Tested-by: Roland Scheidegger <sroland@vmware.com> Tested-By: George Kyriazis <george.kyriazis@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-27 11:38:18 -08:00
Francisco Jerez	c6c64d4d6a	glsl: Silence warnings when reading from a framebuffer fetch output. Framebuffer fetch outputs are implicitly initialized upon entry to the fragment shader. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	537bb1da98	glsl: Specify framebuffer fetch coherency mode in lower_blend_equation_advanced(). This requires passing an extra argument to the lowering pass because the KHR_blend_equation_advanced specification doesn't seem to define any mechanism for the implementation to determine at compile-time whether coherent blending can ever be used (not even an "#extension KHR_blend_equation_advanced_coherent" directive seems to be required in the shader source AFAICT). In the long run we'll probably want to do state-dependent recompiles based on the value of ctx->Color.BlendCoherent, but right now there would be no benefit from that because the only driver that supports coherent framebuffer fetch is i965 on SKL+ hardware, which are unable to support the non-coherent path for the moment because of texture layout issues, so framebuffer fetch coherency is always enabled for them. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	ef9e3f63ca	glsl: Add support for the framebuffer fetch layout(noncoherent) qualifier. This allows the application to request framebuffer fetch coherency with per-fragment output granularity. Coherent framebuffer fetch outputs (which is the default if no qualifier is present for compatibility with older versions of the EXT_shader_framebuffer_fetch extension) will have ir_variable_data::memory_coherent set to true. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	0aeec504b4	glsl: Allow layout token for EXT_shader_framebuffer_fetch_non_coherent. EXT_shader_framebuffer_fetch_non_coherent requires layout qualifiers even on GL(ES) 2. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	1bc01db95f	glsl: Initialize ir_variable_data::fb_fetch_output earlier for GL(ES) 2. At the same point where it is initialized on GL(ES) 3.0+ so we can implement some common layout qualifier handling in a future commit. Until now the fb_fetch_output flag would be inherited from the original implicit gl_LastFragData declaration at a later point in the AST to GLSL IR translation. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	6ebefb0fd5	glsl: Replace MESA_shader_framebuffer_fetch extension flags with EXT ones. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	ba79a90fb5	glsl: Switch ast_type_qualifier to a 128-bit bitset. This should end the drought of bits in the ast_type_qualifier object. The bitset_t type works pretty much as a drop-in replacement for the current uint64_t bitset. The only catch is that the bitset_t type as defined in the previous commit doesn't have a trivial constructor (because it has a user-defined constructor), so it cannot be used as union member without providing a user-defined constructor for the union (which causes it in turn to be non-trivially constructible). This annoyance could be easily addressed in C++11 by declaring the default constructor of bitset_t to be the implicitly defined one -- IMO one more reason to drop support for GCC 4.2-4.3. The other minor change was required because glsl_parser_extras.cpp was hard-coding the type of bitset temporaries as uint64_t, which (unlike would have been the case if the uint64_t had been replaced with e.g. an __int128) would otherwise have caused a build failure, because the boolean conversion operator of bitset_t is marked explicit (if C++11 is available), so the bitset won't be silently truncated down to 1 bit in order to use it to initialize the uint64_t temporaries (yikes). Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Marek Olšák	605a7f6db5	mesa: implement ARB_compatibility Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:15 +01:00
Samuel Pitoiset	63fb30c674	nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a) Similar for the 4 case. Suggested by Bas. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:45 +01:00
Samuel Pitoiset	b18997876f	nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b)) Otherwise the code size increases because the original fexp2() instructions can't be deleted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:43 +01:00
Samuel Pitoiset	3c40be126f	spirv: apply memory qualifiers to images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:39:53 +01:00
Kenneth Graunke	183ce5e629	glsl: Parse 'layout' as a token with advanced blending or bindless Both KHR_blend_equation_advanced and ARB_bindless_texture provide layout qualifiers, and are exposed in compatibility contexts. We need to parse the layout qualifier as a token in order for those to work, but forgot to extend this check. ARB_shader_image_load_store would need a similar treatment, but we don't expose that in legacy OpenGL contexts. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-02-21 17:50:57 -08:00
Timothy Arceri	cdeac00267	nir: remove old assert This was originally intended to make sure the remap location was not -1. However, the code has changed a lot since then: the location is now never set to -1 and we also handle components, meaning this old assert has been doing comparisons with the pointer to the array of component data. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183	2018-02-22 09:31:00 +11:00
Eric Anholt	4636ce362d	glsl/tests: Fix a compiler warning about signed/unsigned loop comparison. Fixes: `d32956935e` ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	1b313eedb5	glsl: Silence warnings in the uniform initializer test about 16-bit types They should probably get unit tests implemented, but this cleans up a bunch of warnings in my build for now. Fixes: `59f458cd87` ("glsl: Add 16-bit types") Cc: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Timothy Arceri	347038baa9	glsl/nir: add pixel_center_integer to shader info Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-19 08:47:48 +11:00
Eric Engestrom	a176b053b6	glsl: fix sizeof(pointer) bug Doesn't really change anything to the test though ¯\_(ツ)_/¯ CID: 1429511 Fixes: `e8495646af` "glsl/tests: changes to test_disk_cache_create test" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-16 12:04:29 +00:00
Marek Olšák	6b1e26e181	mesa: move STATE_LENGTH to shader_enums.h and use it everywhere Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	e149a0253c	mesa,glsl,nir: reduce gl_state_index size to 2 bytes Let's use the new gl_state_index16 type everywhere and remove the typecasts. This helps reduce the size of gl_program_parameter. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	65ed98839b	mesa: reduce the size of gl_program gl_program: 1456 -> 976 bytes Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Eric Anholt	21670f8208	glsl/tests: Fix strict aliasing warning about int64/double. Fixes: `4bf9862747` ("glsl/tests: Add UINT64 and INT64 types") Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2018-02-12 20:48:43 +00:00
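
Commits 380136e998 and 4addd34b04 above mark the bcsel-to-fmin/fmax rewrites as inexact because of NaN handling. The following is a minimal Python sketch of that difference, not code from Mesa. The helper names `bcsel_min` and `fmin_drop_nan` are hypothetical, and the NaN-dropping behaviour of `fmin_drop_nan` is an assumption about one behaviour the quoted spec text permits, not a statement about any particular driver or hardware.

```python
import math


def bcsel_min(a: float, b: float) -> float:
    """Open-coded minimum, i.e. bcsel(a < b, a, b)."""
    # Any comparison with NaN is false, so a NaN in b is selected here.
    return a if a < b else b


def fmin_drop_nan(a: float, b: float) -> float:
    """Hypothetical fmin that returns the non-NaN operand when one input is NaN."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b


nan = float("nan")

# With ordinary inputs the two forms agree.
ordinary = (bcsel_min(1.0, 2.0), fmin_drop_nan(1.0, 2.0))

# With a NaN operand the open-coded form yields NaN, while this fmin does not.
with_nan = (bcsel_min(1.0, nan), fmin_drop_nan(1.0, nan))

print("ordinary inputs:", ordinary)
print("NaN operand:    ", with_nan)
```

Because the two forms can differ only when a NaN is involved, a compiler may still apply the rewrite wherever exact NaN propagation is not required, which is what flagging the pattern as inexact expresses.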

2373 commits
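
The algebraic rewrites named in commits 63fb30c674 and b18997876f rest on two identities: exp2(2 * log2(a)) equals a * a, and exp2(a) * exp2(b) equals exp2(a + b). The Python sketch below checks them numerically for a few positive inputs. It is only an illustration under stated assumptions: the function names are hypothetical, the "4 case" is written as a squared square (one plausible form, not confirmed by the log), and the inputs are kept positive because `math.log2` rejects non-positive values.

```python
import math


def exp2(x: float) -> float:
    """Base-2 exponential, standing in for the fexp2 opcode."""
    return 2.0 ** x


def square_via_exp_log(a: float) -> float:
    """fexp2(fmul(flog2(a), 2)), the form before the rewrite."""
    return exp2(math.log2(a) * 2.0)


def square_via_mul(a: float) -> float:
    """fmul(a, a), the form after the rewrite."""
    return a * a


def fourth_via_exp_log(a: float) -> float:
    """fexp2(fmul(flog2(a), 4)), the '4 case' before the rewrite."""
    return exp2(math.log2(a) * 4.0)


def fourth_via_mul(a: float) -> float:
    """One plausible multiply-only form of the '4 case' (an assumption)."""
    sq = a * a
    return sq * sq


samples = [0.5, 1.0, 3.0, 10.0]
square_ok = all(
    math.isclose(square_via_exp_log(a), square_via_mul(a), rel_tol=1e-9)
    for a in samples
)
fourth_ok = all(
    math.isclose(fourth_via_exp_log(a), fourth_via_mul(a), rel_tol=1e-9)
    for a in samples
)
# fmul(fexp2(a), fexp2(b)) versus fexp2(fadd(a, b))
product_ok = all(
    math.isclose(exp2(a) * exp2(b), exp2(a + b), rel_tol=1e-9)
    for a in samples
    for b in samples
)

print(square_ok, fourth_ok, product_ok)
```

The results agree only to within floating-point rounding, which is why the check uses a relative tolerance rather than exact equality.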