Commit graph

111543 commits

Author SHA1 Message Date
Iago Toral Quiroga
09d230c6cf v3d: fix scheduling dependency tracking for ALU with small immediates
We were not accountint for small immediates in the B mux so the scheduler
was interpreting these are regular register file accesses, which could
lead to additional (incorrect) write-read dependencies.

Shader-db changes:

total instructions in shared programs: 9163664 -> 9137263 (-0.29%)
instructions in affected programs: 3931035 -> 3904634 (-0.67%)
helped: 12457
HURT: 2563

total max-temps in shared programs: 1325787 -> 1325597 (-0.01%)
max-temps in affected programs: 5746 -> 5556 (-3.31%)
helped: 186
HURT: 16
helped stats (abs) min: 1 max: 4 x̄: 1.12 x̃: 1
helped stats (rel) min: 1.45% max: 22.22% x̄: 4.42% x̃: 3.28%
HURT stats (abs)   min: 1 max: 3 x̄: 1.12 x̃: 1
HURT stats (rel)   min: 2.86% max: 10.00% x̄: 5.76% x̃: 5.88%
95% mean confidence interval for max-temps value: -1.04 -0.84
95% mean confidence interval for max-temps %-change: -4.16% -3.07%
Max-temps are helped.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-06 08:16:43 +02:00
Vasily Khoruzhick
b412e05751 lima/ppir: add missing handling of min/max ops for vec4 add slot
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-06 04:30:36 +00:00
Vasily Khoruzhick
5980565a37 lima/ppir: fix crash when program uses no registers at all
Program may need no regalloc at all, e.g. in case when program consists
of single discard op.

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-06 04:30:36 +00:00
Jason Ekstrand
b38dab101c util/hash_table: Assert that keys are not reserved pointers
If we insert a NULL key, it will appear to succeed but will mess up
entry counting.  Similar errors can occur if someone accidentally
inserts the deleted key.  The later is highly unlikely but technically
possible so we should guard against it too.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-06 00:27:53 +00:00
Jason Ekstrand
8306dabc03 util/set: Assert that keys are not reserved pointers
If we insert a NULL key, it will appear to succeed but will mess up
entry counting.  Similar errors can occur if someone accidentally
inserts the deleted key.  The later is highly unlikely but technically
possible so we should guard against it too.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-06 00:27:53 +00:00
Jason Ekstrand
7a18ce0b91 glsl/loop_analysis: Don't search for NULL variables in the hash table
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-06 00:27:53 +00:00
Jason Ekstrand
d96878a66a nir/propagate_invariant: Don't add NULL vars to the hash table
Fixes: 8410cf66d "nir/propagate_invariant: Skip unknown vars"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-06 00:27:53 +00:00
Ian Romanick
1c30d26d89 intel/compiler: Treat b32csel as potentially producing a Boolean result for resolve analysis
If the 2nd and 3rd source are both Boolean values, we can potentially
avoid a resolve by only resolving the result of the b32csel.

No changes on any Gen6+ Intel platform.

v2: Use ?: instead of cast from bool to unsigned.  Suggested by Caio.

Iron Lake
total instructions in shared programs: 8142729 -> 8142677 (<.01%)
instructions in affected programs: 12890 -> 12838 (-0.40%)
helped: 26
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.25% max: 0.74% x̄: 0.45% x̃: 0.38%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -0.52% -0.39%
Instructions are helped.

total cycles in shared programs: 188549632 -> 188549394 (<.01%)
cycles in affected programs: 60754 -> 60516 (-0.39%)
helped: 25
HURT: 1
helped stats (abs) min: 2 max: 26 x̄: 9.92 x̃: 8
helped stats (rel) min: 0.07% max: 2.23% x̄: 0.59% x̃: 0.27%
HURT stats (abs)   min: 10 max: 10 x̄: 10.00 x̃: 10
HURT stats (rel)   min: 0.70% max: 0.70% x̄: 0.70% x̃: 0.70%
95% mean confidence interval for cycles value: -12.91 -5.40
95% mean confidence interval for cycles %-change: -0.84% -0.23%
Cycles are helped.

GM45
total instructions in shared programs: 5013119 -> 5013093 (<.01%)
instructions in affected programs: 6764 -> 6738 (-0.38%)
helped: 13
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.24% max: 0.68% x̄: 0.43% x̃: 0.36%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -0.52% -0.34%
Instructions are helped.

total cycles in shared programs: 128977804 -> 128977700 (<.01%)
cycles in affected programs: 37738 -> 37634 (-0.28%)
helped: 13
HURT: 0
helped stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8
helped stats (rel) min: 0.18% max: 0.46% x̄: 0.30% x̃: 0.26%
95% mean confidence interval for cycles value: -8.00 -8.00
95% mean confidence interval for cycles %-change: -0.36% -0.24%
Cycles are helped.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-05 17:04:17 -07:00
Ian Romanick
0ba9497e66 intel/fs: Improve discard_if code generation
Previously we would blindly emit an sequence like:

        mov(1)          f0.1<1>UW       g1.14<0,1,0>UW
        ...
        cmp.l.f0(16)    g7<1>F          g5<8,8,1>F      0x41700000F  /* 15F */
(+f0.1) cmp.z.f0.1(16)  null<1>D        g7<8,8,1>D      0D

The first move sets the flags based on the initial execution mask.
Later discard sequences contain a predicated compare that can only
remove more SIMD channels.  Often times the only user of the result from
the first compare is the second compare.  Instead, generate a sequence
like

        mov(1)          f0.1<1>UW       g1.14<0,1,0>UW
        ...
        cmp.l.f0(16)    g7<1>F          g5<8,8,1>F      0x41700000F  /* 15F */
(+f0.1) cmp.ge.f0.1(8)  null<1>F        g5<8,8,1>F      0x41700000F  /* 15F */

If the results stored in g7 and f0.0 are not used, the comparison will
be eliminated.  This removes an instruction and potentially reduces
register pressure.

v2: Major re-write of the commit message (including fixing the assembly
code).  Suggested by Matt.

All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 17224434 -> 17198659 (-0.15%)
instructions in affected programs: 2908125 -> 2882350 (-0.89%)
helped: 18891
HURT: 5
helped stats (abs) min: 1 max: 12 x̄: 1.38 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 1.76% x̃: 1.02%
HURT stats (abs)   min: 9 max: 105 x̄: 51.40 x̃: 35
HURT stats (rel)   min: 0.43% max: 4.92% x̄: 2.34% x̃: 1.56%
95% mean confidence interval for instructions value: -1.39 -1.34
95% mean confidence interval for instructions %-change: -1.79% -1.73%
Instructions are helped.

total cycles in shared programs: 361468458 -> 361170679 (-0.08%)
cycles in affected programs: 38470116 -> 38172337 (-0.77%)
helped: 16202
HURT: 1456
helped stats (abs) min: 1 max: 4473 x̄: 26.24 x̃: 18
helped stats (rel) min: <.01% max: 28.44% x̄: 2.90% x̃: 2.18%
HURT stats (abs)   min: 1 max: 5982 x̄: 87.51 x̃: 28
HURT stats (rel)   min: <.01% max: 51.29% x̄: 5.48% x̃: 1.64%
95% mean confidence interval for cycles value: -18.24 -15.49
95% mean confidence interval for cycles %-change: -2.26% -2.14%
Cycles are helped.

total spills in shared programs: 12147 -> 12176 (0.24%)
spills in affected programs: 175 -> 204 (16.57%)
helped: 8
HURT: 5

total fills in shared programs: 25262 -> 25292 (0.12%)
fills in affected programs: 269 -> 299 (11.15%)
helped: 8
HURT: 5

Haswell
total instructions in shared programs: 13530316 -> 13502647 (-0.20%)
instructions in affected programs: 2507824 -> 2480155 (-1.10%)
helped: 18859
HURT: 10
helped stats (abs) min: 1 max: 12 x̄: 1.48 x̃: 1
helped stats (rel) min: 0.03% max: 27.78% x̄: 2.38% x̃: 1.41%
HURT stats (abs)   min: 5 max: 39 x̄: 25.70 x̃: 31
HURT stats (rel)   min: 0.22% max: 1.66% x̄: 1.09% x̃: 1.31%
95% mean confidence interval for instructions value: -1.49 -1.44
95% mean confidence interval for instructions %-change: -2.42% -2.34%
Instructions are helped.

total cycles in shared programs: 377865412 -> 377639034 (-0.06%)
cycles in affected programs: 40169572 -> 39943194 (-0.56%)
helped: 15550
HURT: 1938
helped stats (abs) min: 1 max: 2482 x̄: 25.67 x̃: 18
helped stats (rel) min: <.01% max: 37.77% x̄: 3.00% x̃: 2.25%
HURT stats (abs)   min: 1 max: 4862 x̄: 89.17 x̃: 35
HURT stats (rel)   min: <.01% max: 67.67% x̄: 6.16% x̃: 2.75%
95% mean confidence interval for cycles value: -14.42 -11.47
95% mean confidence interval for cycles %-change: -2.05% -1.91%
Cycles are helped.

total spills in shared programs: 26769 -> 26814 (0.17%)
spills in affected programs: 826 -> 871 (5.45%)
helped: 9
HURT: 10

total fills in shared programs: 38383 -> 38425 (0.11%)
fills in affected programs: 834 -> 876 (5.04%)
helped: 9
HURT: 10

LOST:   5
GAINED: 10

Ivy Bridge
total instructions in shared programs: 12079250 -> 12044139 (-0.29%)
instructions in affected programs: 2409680 -> 2374569 (-1.46%)
helped: 16135
HURT: 0
helped stats (abs) min: 1 max: 23 x̄: 2.18 x̃: 2
helped stats (rel) min: 0.07% max: 37.50% x̄: 2.72% x̃: 1.68%
95% mean confidence interval for instructions value: -2.21 -2.14
95% mean confidence interval for instructions %-change: -2.76% -2.67%
Instructions are helped.

total cycles in shared programs: 180116747 -> 179900405 (-0.12%)
cycles in affected programs: 25439823 -> 25223481 (-0.85%)
helped: 13817
HURT: 1499
helped stats (abs) min: 1 max: 1886 x̄: 26.40 x̃: 18
helped stats (rel) min: <.01% max: 38.84% x̄: 2.57% x̃: 1.97%
HURT stats (abs)   min: 1 max: 3684 x̄: 98.99 x̃: 52
HURT stats (rel)   min: <.01% max: 97.01% x̄: 6.37% x̃: 3.42%
95% mean confidence interval for cycles value: -15.68 -12.57
95% mean confidence interval for cycles %-change: -1.77% -1.63%
Cycles are helped.

LOST:   8
GAINED: 10

Sandy Bridge
total instructions in shared programs: 10878990 -> 10863659 (-0.14%)
instructions in affected programs: 1806702 -> 1791371 (-0.85%)
helped: 13023
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 1.18 x̃: 1
helped stats (rel) min: 0.07% max: 13.79% x̄: 1.65% x̃: 1.10%
95% mean confidence interval for instructions value: -1.18 -1.17
95% mean confidence interval for instructions %-change: -1.68% -1.62%
Instructions are helped.

total cycles in shared programs: 154082878 -> 153862810 (-0.14%)
cycles in affected programs: 20199374 -> 19979306 (-1.09%)
helped: 12048
HURT: 510
helped stats (abs) min: 1 max: 323 x̄: 20.57 x̃: 18
helped stats (rel) min: 0.03% max: 17.78% x̄: 2.05% x̃: 1.52%
HURT stats (abs)   min: 1 max: 448 x̄: 54.39 x̃: 16
HURT stats (rel)   min: 0.02% max: 37.98% x̄: 4.13% x̃: 1.17%
95% mean confidence interval for cycles value: -17.97 -17.08
95% mean confidence interval for cycles %-change: -1.84% -1.75%
Cycles are helped.

LOST:   1
GAINED: 0

Iron Lake
total instructions in shared programs: 8155075 -> 8142729 (-0.15%)
instructions in affected programs: 949495 -> 937149 (-1.30%)
helped: 5810
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.12 x̃: 2
helped stats (rel) min: 0.10% max: 16.67% x̄: 2.53% x̃: 1.85%
95% mean confidence interval for instructions value: -2.14 -2.11
95% mean confidence interval for instructions %-change: -2.59% -2.48%
Instructions are helped.

total cycles in shared programs: 188584610 -> 188549632 (-0.02%)
cycles in affected programs: 17274446 -> 17239468 (-0.20%)
helped: 3881
HURT: 90
helped stats (abs) min: 2 max: 168 x̄: 9.08 x̃: 6
helped stats (rel) min: <.01% max: 23.53% x̄: 0.83% x̃: 0.30%
HURT stats (abs)   min: 2 max: 10 x̄: 2.80 x̃: 2
HURT stats (rel)   min: <.01% max: 0.60% x̄: 0.10% x̃: 0.07%
95% mean confidence interval for cycles value: -9.35 -8.27
95% mean confidence interval for cycles %-change: -0.85% -0.77%
Cycles are helped.

GM45
total instructions in shared programs: 5019308 -> 5013119 (-0.12%)
instructions in affected programs: 489028 -> 482839 (-1.27%)
helped: 2912
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.13 x̃: 2
helped stats (rel) min: 0.10% max: 16.67% x̄: 2.46% x̃: 1.81%
95% mean confidence interval for instructions value: -2.14 -2.11
95% mean confidence interval for instructions %-change: -2.54% -2.39%
Instructions are helped.

total cycles in shared programs: 129002592 -> 128977804 (-0.02%)
cycles in affected programs: 12669152 -> 12644364 (-0.20%)
helped: 2759
HURT: 37
helped stats (abs) min: 2 max: 168 x̄: 9.03 x̃: 4
helped stats (rel) min: <.01% max: 21.43% x̄: 0.75% x̃: 0.31%
HURT stats (abs)   min: 2 max: 10 x̄: 3.62 x̃: 4
HURT stats (rel)   min: <.01% max: 0.41% x̄: 0.10% x̃: 0.04%
95% mean confidence interval for cycles value: -9.53 -8.20
95% mean confidence interval for cycles %-change: -0.79% -0.70%
Cycles are helped.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-05 17:04:13 -07:00
Ian Romanick
a288708506 intel/fs: Add need_dest parameter to fs_visitor::nir_emit_alu
This is the same as the need_dest parameter to
prepare_alu_destination_and_sources.  This allows us to not change the
register that is expected to hold an result if an instruction is
re-emitted.  This is particularly a problem if the re-emitted
instruction is a partial write.  A later patch will use this feature.

No shader-db changes on any Intel platform.

v2: Don't do the Boolean resolve when there is no destination.  If the
ALU instruction didn't write a register, there's nothing to resolve.
This replaces an earlier patch "intel/fs: Allocate dummy destination
register when need_dest is false".

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-05 17:04:08 -07:00
Ian Romanick
e13a5c7d67 intel/fs: Allow cmod propagation across reads and writes of different flags
This also helps a later patch (intel/fs: Improve discard_if code
generation) on about 200 shaders.

v2: Document that other instruction sequences are also valid in
subtract_merge_with_compare_intervening_mismatch_flag_write.  Suggested
by Caio.

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 17224438 -> 17224434 (<.01%)
instructions in affected programs: 296 -> 292 (-1.35%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.99% max: 1.92% x̄: 1.43% x̃: 1.40%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -2.04% -0.81%
Instructions are helped.

total cycles in shared programs: 361468455 -> 361468458 (<.01%)
cycles in affected programs: 2862 -> 2865 (0.10%)
helped: 2
HURT: 2
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.24% max: 0.39% x̄: 0.31% x̃: 0.31%
HURT stats (abs)   min: 3 max: 4 x̄: 3.50 x̃: 3
HURT stats (rel)   min: 0.32% max: 0.70% x̄: 0.51% x̃: 0.51%
95% mean confidence interval for cycles value: -4.34 5.84
95% mean confidence interval for cycles %-change: -0.70% 0.90%
Inconclusive result (value mean confidence interval includes 0).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-05 17:03:45 -07:00
Ian Romanick
8030cb75c1 intel/fs: Fix flag_subreg handling in cmod propagation
There were two errors.  First, the pass could propagate conditional
modifiers from an instruction that writes on flag register to an
instruction that writes a different flag register.  For example,

    cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F
    cmp.nz.f0.1(16) null:F, vgrf6:F, vgrf5:F

could be come

    cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F

Second, if an instruction writes f0.1 has it's condition propagated, the
modified instruction will incorrectly write flag f0.0.  For example,

    linterp(16) vgrf6:F, g2:F, attr0:F
    cmp.z.f0.1(16) null:F, vgrf6:F, vgrf5:F
    (-f0.1) discard_jump(16) (null):UD

could become

    linterp.z.f0.0(16) vgrf6:F, g2:F, attr0:F
    (-f0.1) discard_jump(16) (null):UD

None of these cases will occur currently.  The only time we use f0.1 is
for generating discard intrinsics.  In all those cases, we generate a
squence like:

    cmp.nz.f0.0(16) vgrf7:F, vgrf6:F, vgrf5:F
    (+f0.1) cmp.z(16) null:D, vgrf7:D, 0d
    (-f0.1) discard_jump(16) (null):UD

Due to the mixed types and incompatible conditions, this sequence would
never see any cmod propagation.  The next patch will change this.

No shader-db changes on any Intel platform.

v2: Fix typo in comment in test case subtract_delete_compare_other_flag.
Noticed by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-05 17:03:40 -07:00
Ian Romanick
2dd6013933 intel/fs: Add missing tests for cmod_propagate_not
Tests like this should have been added in 4467040cb6 ("i965/fs:
Propagate conditional modifiers from not instructions").

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-05 17:03:31 -07:00
Kenneth Graunke
6a7d387394 i965: Allow signed/unsigned integer conversions in miptree up/download
BLORP now handles this so there's no reason to fall back.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-05 16:58:07 -07:00
Kenneth Graunke
f06c86358c intel/blorp: Handle SINT/UINT clamping on blits.
This patch makes blorp_blit handle SINT<->UINT blit value clamping.
After reading the source's integer data (which is expanded to 32-bit),
we either IMAX with 0 (for SINT -> UINT, to clamp negative numbers) or
UMIN with (1 << 31) - 1 (for UINT -> SINT, to clamp positive numbers
outside of the representable range).

Such blits are not allowed by the OpenGL or Vulkan APIs directly:

   The Vulkan 1.1 spec for vkCmdBlitImage says:

   "Integer formats can only be converted to other integer formats with
    the same signedness."

   The GL 4.5 spec for glBlitFramebuffer says:

   "An INVALID_OPERATION error is generated if format conversions are
    not supported, which occurs under any of the following conditions:
    [...]
    * The read buffer contains unsigned integer values and any draw
      buffer does not contain unsigned integer values.
    * The read buffer contains signed integer values and any draw buffer
      does not contain signed integer values."

However, they are useful for other operations, such as texture upload
and download, which typically are implemented via blorp_blit().  i965
has code to fall back in this case (which the next commit will delete),
and Gallium expects blit() to handle this case for texture upload.

Fixes the following tests on iris:
- GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels
- GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo
- GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-05 16:58:07 -07:00
Caio Marcelo de Oliveira Filho
1aea4cd0d9 anv/pipeline: Move lowering of nir_var_mem_global later
This let deref optimizations apply to globals before lowering them.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-05 16:57:09 -07:00
Kenneth Graunke
4f3c82c72c st/nir: Don't use GLSL IR's MOD_TO_FLOOR lowering when using NIR.
Both GLSL IR and NIR perform the same mod -> floor lowering for 32-bit
types.  But nir_lower_double_ops is slightly more defensive against
lowered drcp precision loss, and handles mod(x, x) = 0 directly.  This
works well...assuming nir_lower_double_ops actually gets an fmod op to
lower in the first place.

The previous patches enabled NIR-based lowering for the remaining
drivers, so we can stop using the GLSL IR lowering when using NIR.

Fixes KHR-GL45.gpu_shader_fp64.builtin.mod_dvec[234] on iris.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-05 16:45:12 -07:00
Kenneth Graunke
f4d4c42608 radeonsi: Enable NIR's lower_fmod option.
Currently, st/mesa is always calling the GLSL IR lower_instructions()
pass with MOD_TO_FLOOR set, so mod operations will be lowered before
ever reaching NIR.  This enables the same lowering at the NIR level,
which will let me shut off the GLSL IR path for NIR-based drivers.

The AMD NIR backend also has code to handle fmod, so we could
potentially skip this and still be fine.  I don't have an opinion
on that.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-05 16:45:12 -07:00
Kenneth Graunke
e0641e0728 vc4: Enable NIR's lower_fmod option.
Currently, st/mesa is always calling the GLSL IR lower_instructions()
pass with MOD_TO_FLOOR set, so mod operations will be lowered before
ever reaching NIR.  This enables the same lowering at the NIR level,
which will let me shut off the GLSL IR path for NIR-based drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-06-05 16:45:12 -07:00
Kenneth Graunke
b0e3bd79dc v3d: Enable NIR's lower_fmod option.
Currently, st/mesa is always calling the GLSL IR lower_instructions()
pass with MOD_TO_FLOOR set, so mod operations will be lowered before
ever reaching NIR.  This enables the same lowering at the NIR level,
which will let me shut off the GLSL IR path for NIR-based drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-06-05 16:45:12 -07:00
Kenneth Graunke
c7d1b52a2c nir: Combine lower_fmod16/32 back into a single lower_fmod.
We originally had a single lower_fmod option.  In commit 2ab2d2e5, Sam
split 32 and 64-bit lowering into separate flags, with the rationale
that some drivers might want different options there.  This left 16-bit
unhandled, so Iago added a lower_fmod16 option in commit ca31df6f.

Now that lower_fmod64 is gone (in favor of nir_lower_doubles and
nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a
single lower_fmod flag again.  I'm not aware of any hardware which
need lowering for one bitsize and not the other.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-05 16:45:12 -07:00
Kenneth Graunke
edd45af9ba nir: Drop lower_fmod64 option.
nir_lower_doubles offers a wide variety of fp64 lowering, including
lowering fmod@64.  The version there also better handles imprecisions
due to lowered frcp@64.  Let's consolidate on one version.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-05 16:45:12 -07:00
Kenneth Graunke
dfb18f0a28 panfrost: Switch to nir_lower_doubles instead of lower_fmod64.
I don't think panfrost actually does doubles yet, but it at least
claims to support PIPE_CAP_DOUBLES, so at least pretend to switch
to the new lowering.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-05 16:45:12 -07:00
Kenneth Graunke
d13059f4d5 nouveau: Use nir_lower_doubles instead of lower_fmod64 on nvc0.
We currently have two duplicate mechanisms for lowering fmod@64.
One is a nir_opt_algebraic rule keyed off of options->lower_fmod64,
and the other is nir_lower_doubles, which offers a full gamut of
fp64 lowering.  The latter works slightly better in some corner cases,
so I'm trying to eliminate lower_fmod64 and drop the redundancy.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-05 16:45:12 -07:00
Kenneth Graunke
fa56a3795f gallium: Drop lower_fmod64 from drivers that don't support doubles.
Neither freedreno nor nv50 expose PIPE_CAP_DOUBLES, so there's no
fmod64 to be lowered.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-05 16:45:12 -07:00
Dylan Baker
e8a60e5d50 docs: update calendar, and news item and link release notes for 19.0.6 2019-06-05 16:42:36 -07:00
Dylan Baker
fccd44940d docs: Add SHA256 sums for 19.0.6 2019-06-05 16:40:57 -07:00
Dylan Baker
fe79d75ccf docs: Add relnotes for 19.0.6 2019-06-05 16:40:55 -07:00
Erik Faye-Lund
ff41ac7292 docs: add day of month to all news-entries
This makes it easier to batch-convert them to other structured
markup-formats.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:46 +02:00
Erik Faye-Lund
382776d241 docs: add MD5 checksums for 9.2.2 files
These checksums were obtained by downloading the releases from
ftp://ftp.freedesktop.org/pub/mesa/older-versions/9.x/9.2.2/ and
running md5sum on them.

Hopefully the server wasn't compromised since release.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:46 +02:00
Erik Faye-Lund
1941e642bc docs: use pre-block for showing commit-note
Having a single-item list for this seems odd. Let's just use a pre-block
in stead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:46 +02:00
Erik Faye-Lund
b16e593f79 docs: switch to definition list and code-tags
A definition list is a better semantic match for what this list is
supposed to convey, so let's use that instead. And while we're at it,
let's add some code-tags around filenames, as they stand a bit more out
that way.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:46 +02:00
Erik Faye-Lund
3b0d48e219 docs: combine headings
This is more in line with how we mark-up other definition lists, and
avoids portability issues with other markup-formats.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:46 +02:00
Erik Faye-Lund
942c4daac9 docs: more code-tags in llvmpipe.html
This makes the article a bit easier to read.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:46 +02:00
Erik Faye-Lund
52667f990e docs: use more code-tags in envvars.html
This wraps code, identifiers, values and paths in code-tags, which makes
them appear in a monospace-font for readability.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:46 +02:00
Erik Faye-Lund
d311d8f424 docs: use code-tags for envvars and options
This makes it a bit easier to tell what's what.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
5639f0d5ee docs: use dl instead of ul
A HTML definition-list is more semantically strong than just some
unordered list, and renders a bit cleaner by default. So let's use that
instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
0bca0f1aa2 docs: remove pointlessly repeated list
The examples listed above are exactly the same ones are we're about to
list, so let's just keep the list that defines what they do.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
aed4ac6da8 docs: remove stray whitespace
There's some stray whitespace in these files that doesn't do anything
useful. Let's get rid of if.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
20c56e18c2 docs: use proper links instead of code-tags
These links are a bit odd in that the URLs are simply placed in
code-tags. This makes them harder to work with. Let's use proper
links instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
c59c793ae5 docs: update doxygen-links
One of these URLs are dead these days, and the other one forwards to the
current one, doxygen.nl. Let's get these links up to date.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
7c4a4fb09a docs: remove some noisy spacing in pre-blocks
These newlines caused the blocks to have trailing newlines in them,
which renders a bit noisily.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
412046f74e docs: improve quoting slightly
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
8620f53212 docs: do not use br-tag for non-significant breaks
According to the W3C, we shouldn't use the br-tag unless the line-break
is part of the content:

https://www.w3.org/TR/2011/WD-html5-author-20110809/the-br-element.html

All of these instances are for non-content usage, and is as such technically
out-of-spec. So let's either remove them, or split paragraphs, based on
how related the content are.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
d5e273aad2 docs: remove pointless line-break
Line-breaks at the end of a paragraph doesn't do anything useful,
so let's just get rid of it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
db8211a883 docs: remove pointless trailing hard-breaks
Line-break at the end of an article is quite pointless, and doesn't do
much to increase the readability. Let's get rid of them.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
74a6a68196 docs: rewrite paragraph to be free-form
These half-way structured sections are needlessly problematic to
translate cleanly to other markup-languages, so let's just make this
into a free-form paragraph instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
9e5bc2c868 docs: use h4 instead of free-standing paragraphs and br-tags
This makes this document a bit more structured, which is generally
considered a good thing for HTML. It will also translate a bit better
into other markup-formats.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:45 +02:00
Erik Faye-Lund
38652a29ae docs: slightly reword paragraph and tweak markup
This makes this paragraph a bit easier to digest.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:44 +02:00
Erik Faye-Lund
b2ac7582d9 docs: remove stray space in code-block
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-05 23:48:44 +02:00