mirror of
https://gitlab.freedesktop.org/mesa/mesa.git
synced 2026-01-03 13:40:11 +01:00
read-only mirror of https://gitlab.freedesktop.org/mesa/mesa
The execbuf2 kernel API requires us to construct two kinds of lists.
First is a "validation list" (struct drm_i915_gem_exec_object2[])
containing each BO referenced by the batch. (The batch buffer itself
must be the last entry in this list.) Each validation list entry
contains a pointer to the second kind of list: a relocation list.
The relocation list contains information about pointers to BOs that
the kernel may need to patch up if it relocates objects within the VMA.
This is a very general mechanism, allowing every BO to contain pointers
to other BOs. libdrm_intel models this by giving each drm_intel_bo a
list of relocations to other BOs. Together, these form "reloc trees".
Processing relocations involves a depth-first-search of the relocation
trees, starting from the batch buffer. Care has to be taken not to
double-visit buffers. Creating the validation list has to be deferred
until the last minute, after all relocations are emitted, so we have the
full tree present. Calculating the amount of aperture space required to
pin those BOs also involves tree walking, which is expensive, so libdrm
has hacks to try and perform less expensive estimates.
For some reason, it also stored the validation list in the global
(per-screen) bufmgr structure, rather than as an local variable in the
execbuffer function, requiring locking for no good reason.
It also assumed that the batch would probably contain a relocation
every 2 DWords - which is absurdly high - and simply aborted if there
were more relocations than the max. This meant the first relocation
from a BO would allocate 180kB of data structures!
This is way too complicated for our needs. i965 only emits relocations
from the batchbuffer - all GPU commands and state such as SURFACE_STATE
live in the batch BO. No other buffer uses relocations. This means we
can have a single relocation list for the batchbuffer. We can add a BO
to the validation list (set) the first time we emit a relocation to it.
We can easily keep a running tally of the aperture space required for
that list by adding the BO size when we add it to the validation list.
This patch overhauls the relocation system to do exactly that. There
are many nice benefits:
- We have a flat relocation list instead of trees.
- We can produce the validation list up front.
- We can allocate smaller arrays and dynamically grow them.
- Aperture space checks are now (a + b <= c) instead of a tree walk.
- brw_batch_references() is a trivial validation list walk.
It should be straightforward to make it O(1) in the future.
- We don't need to bloat each drm_bacon_bo with 32B of reloc data.
- We don't need to lock in execbuffer, as the data structures are
context-local, and not per-screen.
- Significantly less code and a better match for what we're doing.
- The simpler system should make it easier to take advantage of
I915_EXEC_NO_RELOC in a future patch.
Improves performance in Synmark 7.0's OglBatch7:
- Skylake GT4e: 12.1499% +/- 2.29531% (n=130)
- Apollolake: 3.89245% +/- 0.598945% (n=35)
Improves performance in GFXBench4's gl_driver2 test:
- Skylake GT4e: 3.18616% +/- 0.867791% (n=229)
- Apollolake: 4.1776% +/- 0.240847% (n=120)
v2: Feedback from Chris Wilson:
- Omit explicit zero initializers for garbage execbuf fields.
- Use .rsvd1 = ctx_id rather than i915_execbuffer2_set_context_id
- Drop unnecessary fencing assertions.
- Only use _WR variant of execbuf ioctl when necessary.
- Shrink the arrays to be smaller by default.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
||
|---|---|---|
| bin | ||
| docs | ||
| doxygen | ||
| include | ||
| m4 | ||
| scons | ||
| scripts | ||
| src | ||
| .dir-locals.el | ||
| .editorconfig | ||
| .gitattributes | ||
| .gitignore | ||
| .mailmap | ||
| .travis.yml | ||
| Android.common.mk | ||
| Android.mk | ||
| appveyor.yml | ||
| autogen.sh | ||
| CleanSpec.mk | ||
| common.py | ||
| configure.ac | ||
| install-gallium-links.mk | ||
| install-lib-links.mk | ||
| Makefile.am | ||
| REVIEWERS | ||
| SConstruct | ||
| VERSION | ||
File: docs/README.WIN32 Last updated: 21 June 2013 Quick Start ----- ----- Windows drivers are build with SCons. Makefiles or Visual Studio projects are no longer shipped or supported. Run scons libgl-gdi to build gallium based GDI driver. This will work both with MSVS or Mingw. Windows Drivers ------- ------- At this time, only the gallium GDI driver is known to work. Source code also exists in the tree for other drivers in src/mesa/drivers/windows, but the status of this code is unknown. Recipe ------ Building on windows requires several open-source packages. These are steps that work as of this writing. - install python 2.7 - install scons (latest) - install mingw, flex, and bison - install pywin32 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs get pywin32-218.4.win-amd64-py2.7.exe - install git - download mesa from git see https://www.mesa3d.org/repository.html - run scons General ------- After building, you can copy the above DLL files to a place in your PATH such as $SystemRoot/SYSTEM32. If you don't like putting things in a system directory, place them in the same directory as the executable(s). Be careful about accidentially overwriting files of the same name in the SYSTEM32 directory. The DLL files are built so that the external entry points use the stdcall calling convention. Static LIB files are not built. The LIB files that are built with are the linker import files associated with the DLL files. The si-glu sources are used to build the GLU libs. This was done mainly to get the better tessellator code. If you have a Windows-related build problem or question, please post to the mesa-dev or mesa-users list.