Find a file
Heiko Przybyl e933246013 r600/sb: Fix loop optimization related hangs on eg
Make sure unused ops and their references are removed, prior to entering
the GCM (global code motion) pass, to stop GCM from breaking the loop
logic and thus hanging the GPU.

Turns out, that sb has problems with loops and node optimizations
regarding associative folding:

- the global code motion (gcm) pass moves ops up a loop level/basic block
until they've fulfilled their total usage count
- if there are ops folded into others, the usage count won't be
fulfilled and thus the op moved way up to the top
- within GCM the op would be visited and their deps would be moved
alongside it, to fulfill the src constaints
- in a loop, an unused op is moved out of the loop and GCM would move
the src value ops up as well
- now here arises the problem: if the loop counter is one of the src
values it would get moved up as well, the loop break condition would
never get hit and the shader turn into an endless loop, resulting in the
GPU hanging and being reset

A reduced (albeit nonsense) piglit example would be:

[require]
GLSL >= 1.20

[fragment shader]

uniform int SIZE;
uniform vec4 lights[512];

void main()
{
    float x = 0;
    for(int i = 0; i < SIZE; i++)
        x += lights[2*i+1].x;
}

[test]
uniform int SIZE 1
draw rect -1 -1 2 2

Which gets optimized to:

===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN =====
===== 42 dw ===== 1 gprs ===== 2 stack =========================================
ALU 3 @24
     1      y: MOV                R0.y,  0
            t: MULLO_UINT         R0.w,  [0x00000002 2.8026e-45].x, R0.z

LOOP_START_DX10 @22
PUSH @6
ALU 1 @30 KC0[CB0:0-15]
     2 M    x: PRED_SETGE_INT     __.x,  R0.z, KC0[0].x
JUMP @14 POP:1
LOOP_BREAK @20
POP @14 POP:1
ALU 2 @32
     3      x: ADD_INT            R0.x,  R0.w, [0x00000002 2.8026e-45].x

TEX 1 @36
               VFETCH             R0.x___, R0.x,   RID:0   MFC:16 UCF:0 FMT[..]
ALU 1 @40
     4      y: ADD                R0.y,  R0.y, R0.x
LOOP_END @4
EXPORT_DONE        PIXEL 0     R0.____  EOP
===== SHADER_END ===============================================================

Notice R0.z being the loop counter/break condition relevant register
and being never incremented at all. Also some of the loop content
has been moved out of it, to fulfill the requirements for the one unused
op.

With a debug build of mesa this would produce an error like
error at : PRED_SETGE_INT     __, __, EM.2,    R1.x.2||FP@R0.z, C0.x
  : operand value R1.x.2||FP@R0.z was not previously written to its gpr
and the compilation would fail due to this. On a release build it gets
passed to the GPU.

When using this patch, the loop remains intact:

===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN =====
===== 48 dw ===== 1 gprs ===== 2 stack =========================================
ALU 2 @24
     1      y: MOV                R0.y,  0
            z: MOV                R0.z,  0
LOOP_START_DX10 @22
PUSH @6
ALU 1 @28 KC0[CB0:0-15]
     2 M    x: PRED_SETGE_INT     __.x,  R0.z, KC0[0].x
JUMP @14 POP:1
LOOP_BREAK @20
POP @14 POP:1
ALU 4 @30
     3      t: MULLO_UINT         T0.x,  [0x00000002 2.8026e-45].x, R0.z

     4      x: ADD_INT            R0.x,  T0.x, [0x00000002 2.8026e-45].x

TEX 1 @40
               VFETCH             R0.x___, R0.x,   RID:0   MFC:16 UCF:0 FMT[..]
ALU 2 @44
     5      y: ADD                R0.y,  R0.y, R0.x
            z: ADD_INT            R0.z,  R0.z, 1
LOOP_END @4
EXPORT_DONE        PIXEL 0     R0.____  EOP
===== SHADER_END ===============================================================

Piglit: ./piglit summary console -d results/*_gpu_noglx
        name: unpatched_gpu_noglx patched_gpu_noglx
        ----  ------------------- -----------------
        pass:               18016             18021
        fail:                 748               743
        crash:                  7                 7
        skip:                1124              1124
        timeout:                0                 0
        warn:                  13                13
        incomplete:             0                 0
        dmesg-warn:             0                 0
        dmesg-fail:             0                 0
        changes:                0                 5
        fixes:                  0                 5
        regressions:            0                 0
        total:              19908             19908

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94900
Tested-by: Heiko Przybyl <lil_tux@web.de>
Tested-on: Barts PRO HD6850
Signed-off-by: Heiko Przybyl <lil_tux@web.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-01-03 21:58:52 +01:00
bin Introduce .editorconfig 2016-08-31 17:06:54 -07:00
docs doc/features.txt: update for freedreno 2017-01-03 10:46:13 -05:00
doxygen doxygen: Plumb through gallium/ to automated documentation 2016-05-30 17:53:45 +01:00
include dri: Add __DRI_IMAGE_FORMAT_ARGB1555 2016-12-27 09:13:43 -08:00
m4 build: Remove unused AX_CHECK_COMPILE_FLAG macro 2016-07-25 15:14:12 +02:00
scons scons: Recognize LLVM_CONFIG environment variable. 2016-11-24 13:37:33 -08:00
scripts get_reviewer.pl: fix mesa check 2016-08-30 16:44:00 -04:00
src r600/sb: Fix loop optimization related hangs on eg 2017-01-03 21:58:52 +01:00
.dir-locals.el dir-locals.el: Adds White Space support 2016-11-14 19:17:49 +02:00
.editorconfig editorconfig: Fix up the tab rendering width. 2017-01-03 10:38:53 -08:00
.gitattributes Disable autocrlf for Visual Studio project files. 2008-02-28 12:34:01 +09:00
.gitignore .gitignore: Ignore tags generated by make tags 2016-08-24 11:33:48 +01:00
.mailmap .mailmap: Update my address again 2016-08-25 13:55:52 -07:00
.travis.yml travis: remove no longer needed libudev-dev dependency 2016-10-18 17:06:24 +01:00
Android.common.mk android: avoid using libdrm with host modules 2016-11-02 14:43:26 +00:00
Android.mk android: add support for libmesa_amdgpu_addrlib 2016-09-13 10:06:04 +10:00
appveyor.yml appveyor: Update winflexbison download URL. 2016-09-13 17:54:51 +01:00
autogen.sh autogen.sh: pass --force to autoreconf, quote ORIGDIR 2015-03-11 23:28:26 +00:00
CleanSpec.mk android: Depend on gallium_dri from EGL, instead of linking in gallium. 2015-06-09 11:38:45 -07:00
common.py scons: Recognize LLVM_CONFIG environment variable. 2016-11-24 13:37:33 -08:00
configure.ac configure: cleanup GLX_USE_TLS handling 2016-12-09 19:22:03 +00:00
install-gallium-links.mk gallium: Fix install-gallium-links.mk on non-bash /bin/sh 2016-10-10 08:56:12 -07:00
install-lib-links.mk install-lib-links: remove the .install-lib-links file 2015-02-24 15:33:25 +00:00
Makefile.am automake: don't forget to pick wglext.h in the tarball 2016-10-24 09:44:26 +01:00
REVIEWERS reviewers: add Rob H for the Android EGL+build parts 2016-11-21 16:01:06 +00:00
SConstruct scons: whitespace cleanup 2016-05-25 12:23:12 -06:00
VERSION docs: add 13.1.0-devel release notes template, bump version 2016-10-19 19:10:16 +01:00

File: docs/README.WIN32

Last updated: 21 June 2013


Quick Start
----- -----

Windows drivers are build with SCons.  Makefiles or Visual Studio projects are
no longer shipped or supported.

Run

  scons libgl-gdi

to build gallium based GDI driver.

This will work both with MSVS or Mingw.


Windows Drivers
------- -------

At this time, only the gallium GDI driver is known to work.

Source code also exists in the tree for other drivers in
src/mesa/drivers/windows, but the status of this code is unknown.

Recipe
------

Building on windows requires several open-source packages. These are
steps that work as of this writing.

- install python 2.7
- install scons (latest)
- install mingw, flex, and bison
- install pywin32 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
  get pywin32-218.4.win-amd64-py2.7.exe
- install git
- download mesa from git
  see http://www.mesa3d.org/repository.html
- run scons

General
-------

After building, you can copy the above DLL files to a place in your
PATH such as $SystemRoot/SYSTEM32.  If you don't like putting things
in a system directory, place them in the same directory as the
executable(s).  Be careful about accidentially overwriting files of
the same name in the SYSTEM32 directory.

The DLL files are built so that the external entry points use the
stdcall calling convention.

Static LIB files are not built.  The LIB files that are built with are
the linker import files associated with the DLL files.

The si-glu sources are used to build the GLU libs.  This was done
mainly to get the better tessellator code.

If you have a Windows-related build problem or question, please post
to the mesa-dev or mesa-users list.