Find a file
Ian Romanick 9abeb3d739 intel/fs: Optimize integer multiplication of large constants by factoring
Many Intel platforms can only perform 32x16 bit multiplication.  The
straightforward way to implement 32x32 bit multiplications is by
splitting one of the operands into high and low parts called H and L,
repsectively.  The full multiplication can be implemented as:

         ((A * H) << 16) + (A * L)

On Intel platforms, special register accesses can be used to eliminate
the shift operation.  This results in three instructions and a temporary
register for most values.

If H or L is 1, then one (or both) of the multiplications will later be
eliminated.  On some platforms it may be possible to eliminate the
multiplication when H is 256.

If L is zero (note that H cannot be zero), one of the multiplications
will also be eliminated.

Instead of splitting the operand into high and low parts, it may
possible to factor the operand into two 16-bit factors X and Y.  The
original multiplication can be replaced with (A * (X * Y)) = ((A * X) *
Y).  This requires two instructions without a temporary register.

I may have gone a bit overboard with optimizing the factorization
routine.  It was a fun brainteaser, and I couldn't put it down. :) On my
1.3GHz Ice Lake, a standalone test could chug through 1,000,000 randomly
selected values in about 5.7 seconds.  This is about 9x the performance
of the obvious, straightforward implementation that I started with.

v2: Drop an unnecessary return.  Rearrange logic slightly and rename
variables in factor_uint32 to better match the names used in the large
comment.  Both suggested by Caio. Rearrange logic to avoid possibly
using `a` uninitialized. Noticed by Marcin.

v3: Use DIV_ROUND_UP instead of open coding it. Noticed by Caio.

Tiger Lake, Ice Lake, Haswell, and Ivy Bridge had similar results. (Ice Lake shown)
total instructions in shared programs: 19912558 -> 19912526 (<.01%)
instructions in affected programs: 3432 -> 3400 (-0.93%)
helped: 10 / HURT: 0

total cycles in shared programs: 856413218 -> 856412810 (<.01%)
cycles in affected programs: 122032 -> 121624 (-0.33%)
helped: 9 / HURT: 0

No shader-db changes on any other Intel platforms.

Tiger Lake and Ice Lake had similar results. (Ice Lake shown)
Instructions in all programs: 141997227 -> 141996923 (-0.0%)
Instructions helped: 71

Cycles in all programs: 9162524757 -> 9162523886 (-0.0%)
Cycles helped: 63
Cycles hurt: 5

No fossil-db changes on any other Intel platforms.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>
2022-11-08 00:02:16 +00:00
.github/workflows ci/macOS: Getting the installed binary to be artifacts 2022-10-26 02:16:35 +00:00
.gitlab/issue_templates gitlab: ask that reporters don't include long logs in descriptions 2022-06-08 15:06:51 +00:00
.gitlab-ci ci: Uprev kernel to 6.0 2022-11-04 14:51:24 +00:00
android Android.mk: Fix gnu++14 related build failures 2022-11-07 17:46:42 +00:00
bin bin/perf-annotate-jit.py: Update to Python 3. 2022-08-21 22:14:43 -07:00
build-support configure: commit test files 2017-10-16 16:32:43 -07:00
docs docs: update calendar and link releases notes for 22.2.3 2022-11-07 10:28:11 -08:00
include drm-uapi: bump headers 2022-11-03 20:24:41 +00:00
src intel/fs: Optimize integer multiplication of large constants by factoring 2022-11-08 00:02:16 +00:00
subprojects meson: upgrade zlib wrap 2022-10-20 22:52:06 +00:00
.dir-locals.el dir-locals.el: Adds White Space support 2016-11-14 19:17:49 +02:00
.editorconfig rusticl: added 2022-09-12 05:58:12 +00:00
.gitattributes Add new rules to .gitattributes 2022-01-19 15:17:17 +00:00
.gitignore .gitignore: Qualify the path for the ignored build directory. 2022-06-09 22:53:37 +00:00
.gitlab-ci.yml Revert "ci: Collabora farm maintanance" 2022-11-04 12:22:17 +00:00
.graphqlrc.yml ci/bin: Add utility to find jobs dependencies 2022-08-03 23:10:37 +00:00
.mailmap .mailmap: change spelling for Constantine Kharlamov 2022-08-12 13:11:03 +00:00
CODEOWNERS CODEOWNERS: remove rajnesh-kanwal as an Imagination maintainer 2022-10-31 23:59:41 +00:00
meson.build nine: enable on freedreno 2022-11-05 14:35:41 +00:00
meson_options.txt meson: simplified meson for enabling ray-tracing on Intel 2022-11-01 06:30:47 +00:00
README.rst docs: promote #dri-devel on oftc over freenode 2021-05-24 09:21:48 +00:00
VERSION VERSION: fix version as it will be a new year 2022-11-04 14:30:55 +00:00

`Mesa <https://mesa3d.org>`_ - The 3D Graphics Library
======================================================


Source
------

This repository lives at https://gitlab.freedesktop.org/mesa/mesa.
Other repositories are likely forks, and code found there is not supported.


Build & install
---------------

You can find more information in our documentation (`docs/install.rst
<https://mesa3d.org/install.html>`_), but the recommended way is to use
Meson (`docs/meson.rst <https://mesa3d.org/meson.html>`_):

.. code-block:: sh

  $ mkdir build
  $ cd build
  $ meson ..
  $ sudo ninja install


Support
-------

Many Mesa devs hang on IRC; if you're not sure which channel is
appropriate, you should ask your question on `OFTC's #dri-devel
<irc://irc.oftc.net/dri-devel>`_, someone will redirect you if
necessary.
Remember that not everyone is in the same timezone as you, so it might
take a while before someone qualified sees your question.
To figure out who you're talking to, or which nick to ping for your
question, check out `Who's Who on IRC
<https://dri.freedesktop.org/wiki/WhosWho/>`_.

The next best option is to ask your question in an email to the
mailing lists: `mesa-dev\@lists.freedesktop.org
<https://lists.freedesktop.org/mailman/listinfo/mesa-dev>`_


Bug reports
-----------

If you think something isn't working properly, please file a bug report
(`docs/bugs.rst <https://mesa3d.org/bugs.html>`_).


Contributing
------------

Contributions are welcome, and step-by-step instructions can be found in our
documentation (`docs/submittingpatches.rst
<https://mesa3d.org/submittingpatches.html>`_).

Note that Mesa uses gitlab for patches submission, review and discussions.