mirror of
https://gitlab.freedesktop.org/mesa/mesa.git
synced 2025-12-24 08:50:13 +01:00
glsl2: Add a README file for the new compiler.
This commit is contained in:
parent
2928588267
commit
e5cf3aadb8
1 changed files with 153 additions and 0 deletions
153
src/glsl/README
Normal file
153
src/glsl/README
Normal file
|
|
@ -0,0 +1,153 @@
|
|||
Welcome to Mesa's GLSL compiler. A brief overview of how things flow:
|
||||
|
||||
1) lex and yacc-based preprocessor takes the incoming shader string
|
||||
and produces a new string containing the preprocessed shader. This
|
||||
takes care of things like #if, #ifdef, #define, and preprocessor macro
|
||||
invocations. Note that #version, #extension, and some others are
|
||||
passed straight through. See glcpp/*
|
||||
|
||||
2) lex and yacc-based parser takes the preprocessed string and
|
||||
generates the AST (abstract syntax tree). Almost no checking is
|
||||
performed in this stage. See glsl_lexer.lpp and glsl_parser.ypp.
|
||||
|
||||
3) The AST is converted to "HIR". This is the intermediate
|
||||
representation of the compiler. Constructors are generated, function
|
||||
calls are resolved to particular function signatures, and all the
|
||||
semantic checking is performed. See ast_*.cpp for the conversion, and
|
||||
ir.h for the IR structures.
|
||||
|
||||
4) The driver (Mesa, or main.cpp for the standalone binary) performs
|
||||
optimizations. These include copy propagation, dead code elimination,
|
||||
constant folding, and others. Generally the driver will call
|
||||
optimizations in a loop, as each may open up opportunities for other
|
||||
optimizations to do additional work. See most files called ir_*.cpp
|
||||
|
||||
5) linking is performed. This does checking to ensure that the
|
||||
outputs of the vertex shader match the inputs of the fragment shader,
|
||||
and assigns locations to uniforms, attributes, and varyings. See
|
||||
linker.cpp.
|
||||
|
||||
6) The driver may perform additional optimization at this point, as
|
||||
for example dead code elimination previously couldn't remove functions
|
||||
or global variable usage when we didn't know what other code would be
|
||||
linked in.
|
||||
|
||||
7) The driver performs code generation out of the IR, taking a linked
|
||||
shader program and producing a compiled program for each stage. See
|
||||
ir_to_mesa.cpp for Mesa IR code generation.
|
||||
|
||||
FAQ:
|
||||
|
||||
Q: What is HIR versus IR versus LIR?
|
||||
|
||||
A: The idea behind the naming was that ast_to_hir would produce a
|
||||
high-level IR ("HIR"), with things like matrix operations, structure
|
||||
assignments, etc., present. A series of lowering passes would occur
|
||||
that do things like break matrix multiplication into a series of dot
|
||||
products/MADs, make structure assignment be a series of assignment of
|
||||
components, flatten if statements into conditional moves, and such,
|
||||
producing a low level IR ("LIR").
|
||||
|
||||
However, it now appears that each driver will have different
|
||||
requirements from a LIR. A 915-generation chipset wants all functions
|
||||
inlined, all loops unrolled, all ifs flattened, no variable array
|
||||
accesses, and matrix multiplication broken down. The Mesa IR backend
|
||||
for swrast would like matrices and structure assignment broken down,
|
||||
but it can support function calls and dynamic branching. A 965 vertex
|
||||
shader IR backend could potentially even handle some matrix operations
|
||||
without breaking them down, but the 965 fragment shader IR backend
|
||||
would want to break to have (almost) all operations down channel-wise
|
||||
and perform optimization on that. As a result, there's no single
|
||||
low-level IR that will make everyone happy. So that usage has fallen
|
||||
out of favor, and each driver will perform a series of lowering passes
|
||||
to take the HIR down to whatever restrictions it wants to impose
|
||||
before doing codegen.
|
||||
|
||||
Q: How is the IR structured?
|
||||
|
||||
A: The best way to get started seeing it would be to run the
|
||||
standalone compiler against a shader:
|
||||
|
||||
./glsl --dump-lir ~/src/piglit/tests/shaders/glsl-orangebook-ch06-bump.frag
|
||||
|
||||
So for example one of the ir_instructions in main() contains:
|
||||
|
||||
(assign (constant bool (1)) (var_ref litColor) (expression vec3 * (var_ref Surf
|
||||
aceColor) (var_ref __retval) ) )
|
||||
|
||||
Or more visually:
|
||||
(assign)
|
||||
/ | \
|
||||
(var_ref) (expression *) (constant bool 1)
|
||||
/ / \
|
||||
(litColor) (var_ref) (var_ref)
|
||||
/ \
|
||||
(SurfaceColor) (__retval)
|
||||
|
||||
which came from:
|
||||
|
||||
litColor = SurfaceColor * max(dot(normDelta, LightDir), 0.0);
|
||||
|
||||
(the max call is not represented in this expression tree, as it was a
|
||||
function call that got inlined but not brought into this expression
|
||||
tree)
|
||||
|
||||
Each of those nodes is a subclass of ir_instruction. A particular
|
||||
ir_instruction instance may only appear once in the whole IR tree with
|
||||
the exception of ir_variables, which appear once as variable
|
||||
declarations:
|
||||
|
||||
(declare () vec3 normDelta)
|
||||
|
||||
and multiple times as the targets of variable dereferences:
|
||||
...
|
||||
(assign (constant bool (1)) (var_ref __retval) (expression float dot
|
||||
(var_ref normDelta) (var_ref LightDir) ) )
|
||||
...
|
||||
(assign (constant bool (1)) (var_ref __retval) (expression vec3 -
|
||||
(var_ref LightDir) (expression vec3 * (constant float (2.000000))
|
||||
(expression vec3 * (expression float dot (var_ref normDelta) (var_ref
|
||||
LightDir) ) (var_ref normDelta) ) ) ) )
|
||||
...
|
||||
|
||||
Each node has a type. Expressions may involve several different types:
|
||||
(declare (uniform ) mat4 gl_ModelViewMatrix)
|
||||
((assign (constant bool (1)) (var_ref constructor_tmp) (expression
|
||||
vec4 * (var_ref gl_ModelViewMatrix) (var_ref gl_Vertex) ) )
|
||||
|
||||
An expression tree can be arbitrarily deep, and the compiler tries to
|
||||
keep them structured like that so that things like algebraic
|
||||
optimizations ((color * 1.0 == color) and ((mat1 * mat2) * vec == mat1
|
||||
* (mat2 * vec))) or recognizing operation patterns for code generation
|
||||
(vec1 * vec2 + vec3 == mad(vec1, vec2, vec3)) are easier. This comes
|
||||
at the expense of additional trickery in implementing some
|
||||
optimizations like CSE where one must navigate an expression tree.
|
||||
|
||||
Q: Why no SSA representation?
|
||||
|
||||
A: Converting an IR tree to SSA form makes dead code elmimination,
|
||||
common subexpression elimination, and many other optimizations much
|
||||
easier. However, in our primarily vector-based language, there's some
|
||||
major questions as to how it would work. Do we do SSA on the scalar
|
||||
or vector level? If we do it at the vector level, we're going to end
|
||||
up with many different versions of the variable when encountering code
|
||||
like:
|
||||
|
||||
(assign (constant bool (1)) (swiz x (var_ref __retval) ) (var_ref a) )
|
||||
(assign (constant bool (1)) (swiz y (var_ref __retval) ) (var_ref b) )
|
||||
(assign (constant bool (1)) (swiz z (var_ref __retval) ) (var_ref c) )
|
||||
|
||||
If every masked update of a component relies on the previous value of
|
||||
the variable, then we're probably going to be quite limited in our
|
||||
dead code elimination wins, and recognizing common expressions may
|
||||
just not happen. On the other hand, if we operate channel-wise, then
|
||||
we'll be prone to optimizing the operation on one of the channels at
|
||||
the expense of making its instruction flow different from the other
|
||||
channels, and a vector-based GPU would end up with worse code than if
|
||||
we didn't optimize operations on that channel!
|
||||
|
||||
Once again, it appears that our optimization requirements are driven
|
||||
significantly by the target architecture. For now, targeting the Mesa
|
||||
IR backend, SSA does not appear to be that important to producing
|
||||
excellent code, but we do expect to do some SSA-based optimizations
|
||||
for the 965 fragment shader backend when that is developed.
|
||||
Loading…
Add table
Reference in a new issue