The alloced_vec4/vec4 distinction was an experiment to expose the cost
of temps to the codegen. But the problem is that the temporary
production rule gets called after the emit rule that was using the
temp. We could have the args to emit_op be pointers to where the temp
would get allocated later, but that seems overly hard while just
trying to bring this thing up. Besides, the temps used in expressions
bear only the vaguest relation to how many temps will be used after
register allocation.
Ideally this would be hooked up by ir_print_visitor dumping into a
string that we could include as prog_instruction->Comment when in
debug mode, and not try keeping ir_instruction trees around after
conversion to Mesa. The ir_print_visitor isn't set up to do that for
us today.
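A rough sketch of what "dumping into a string" could look like, in plain C. Everything here is hypothetical: struct str_buf, struct expr and expr_to_comment are stand-ins invented for illustration, not ir_print_visitor or any Mesa type. The point is only that the printer appends into a growable buffer, so the result can be handed off as a comment string and the tree dropped afterwards.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable string; not a Mesa type. */
struct str_buf {
    char *data;   /* NUL-terminated once anything has been appended */
    size_t len;   /* bytes used, excluding the NUL */
    size_t cap;   /* bytes allocated */
};

/* Append printf-style text, growing the buffer as needed.
 * Returns 0 on success, -1 on allocation or formatting failure. */
int str_buf_printf(struct str_buf *b, const char *fmt, ...)
{
    va_list ap;
    int need;

    va_start(ap, fmt);
    need = vsnprintf(NULL, 0, fmt, ap);   /* measure only */
    va_end(ap);
    if (need < 0)
        return -1;

    if (b->len + (size_t)need + 1 > b->cap) {
        size_t cap = (b->len + (size_t)need + 1) * 2;
        char *p = realloc(b->data, cap);
        if (!p)
            return -1;
        b->data = p;
        b->cap = cap;
    }

    va_start(ap, fmt);
    vsnprintf(b->data + b->len, b->cap - b->len, fmt, ap);
    va_end(ap);
    b->len += (size_t)need;
    return 0;
}

/* Hypothetical stand-in for an IR expression tree. */
struct expr {
    const char *op;           /* operator or variable name */
    const struct expr *lhs;   /* NULL for leaves */
    const struct expr *rhs;   /* NULL for leaves and unary ops */
};

/* Print the tree in a parenthesized prefix form. */
int expr_print(struct str_buf *b, const struct expr *e)
{
    assert(b != NULL && e != NULL);
    if (!e->lhs)
        return str_buf_printf(b, "%s", e->op);
    if (str_buf_printf(b, "(%s ", e->op) || expr_print(b, e->lhs))
        return -1;
    if (e->rhs && (str_buf_printf(b, " ") || expr_print(b, e->rhs)))
        return -1;
    return str_buf_printf(b, ")");
}

/* Returns a malloc'ed string owned by the caller, or NULL on failure. */
char *expr_to_comment(const struct expr *e)
{
    struct str_buf b = { NULL, 0, 0 };

    if (expr_print(&b, e)) {
        free(b.data);
        return NULL;
    }
    return b.data;
}
```

The caller would store the returned string wherever the debug comment lives and free it together with the instruction.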
There are major missing pieces here. Most operations aren't
supported. Matrices need to be broken down to vector ops before we
get here. Scalar operations (RSQ, RCP) are handled incorrectly.
Arrays and structures are not even considered.
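To illustrate what "broken down to vector ops" means for matrices, here is a minimal sketch of mat4 * vec4 expressed as one MUL followed by three MADs over the matrix columns. It assumes the matrix is stored as four column vectors; struct vec4, op_mul_swz, op_mad_swz and lower_mat4_mul_vec4 are hypothetical names that only simulate the arithmetic on the CPU and are not part of this converter.

```c
#include <assert.h>

/* Hypothetical 4-component register; not a Mesa type. */
struct vec4 {
    float c[4];
};

/* dst = a * b.swz, with component `swz` of b replicated to all channels. */
struct vec4 op_mul_swz(struct vec4 a, struct vec4 b, int swz)
{
    struct vec4 r;
    int i;

    assert(swz >= 0 && swz < 4);
    for (i = 0; i < 4; i++)
        r.c[i] = a.c[i] * b.c[swz];
    return r;
}

/* dst = a * b.swz + acc, the multiply-add form of the op above. */
struct vec4 op_mad_swz(struct vec4 a, struct vec4 b, int swz, struct vec4 acc)
{
    struct vec4 r;
    int i;

    assert(swz >= 0 && swz < 4);
    for (i = 0; i < 4; i++)
        r.c[i] = a.c[i] * b.c[swz] + acc.c[i];
    return r;
}

/* mat4 * vec4 as four vector ops: each column is scaled by the matching
 * component of v and the results are accumulated. */
struct vec4 lower_mat4_mul_vec4(const struct vec4 col[4], struct vec4 v)
{
    struct vec4 r = op_mul_swz(col[0], v, 0);

    r = op_mad_swz(col[1], v, 1, r);
    r = op_mad_swz(col[2], v, 2, r);
    r = op_mad_swz(col[3], v, 3, r);
    return r;
}
```

With the matrix already split like this, the backend only ever has to handle vec4 operands.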
I'm not sure if I really got it right. This seems like one of those
"Duh, of course it works that way" things, but I'd like the
documentation to be readable by people not acquainted with OGL/D3D.
Only for the first RT at the moment, as there is no trivial way in galahad
to look at framebuffer state and (sadly) people don't usually calloc
their CSOs, so flags could be wrongly set.
On the other hand, of course, galahad will hopefully encourage more
people to calloc their CSOs. :3
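For readers wondering why calloc matters here, a small sketch follows. The struct layout and names (example_blend_state, example_rt_state, EXAMPLE_MAX_RT) are made up for illustration and are not the real CSO definitions. The idea is that a calloc'ed template reads as zero in every field the caller never touched, so a checker could safely look at all render targets instead of only the first.

```c
#include <assert.h>
#include <stdlib.h>

#define EXAMPLE_MAX_RT 8   /* hypothetical limit, for illustration only */

/* Hypothetical per-render-target state; not the real CSO layout. */
struct example_rt_state {
    unsigned blend_enable;
    unsigned colormask;
};

/* Hypothetical CSO template. */
struct example_blend_state {
    unsigned independent_blend_enable;
    struct example_rt_state rt[EXAMPLE_MAX_RT];
};

/* calloc zero-fills the template, so unset flags read as 0 rather than
 * whatever happened to be on the heap or stack. */
struct example_blend_state *example_blend_state_create(void)
{
    return calloc(1, sizeof(struct example_blend_state));
}

/* With a zeroed template, a validation layer could inspect every RT.
 * Returns how many RTs have blending enabled. */
unsigned example_count_blend_enabled(const struct example_blend_state *s)
{
    unsigned i, n = 0;

    assert(s != NULL);
    for (i = 0; i < EXAMPLE_MAX_RT; i++)
        if (s->rt[i].blend_enable)
            n++;
    return n;
}
```

Had the template come from malloc or an uninitialized stack variable, the fields for the unused RTs would be indeterminate and the count above would be meaningless.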
1. Move all GL entrypoint functions and files into src/mesa/main/
This includes the ARB vp/fp, NV vp/fp, ATI fragshader and GLSL bits
that were in src/mesa/shader/
2. Move src/mesa/shader/slang/ to src/mesa/slang/ to reduce the tree depth
3. Rename src/mesa/shader/ to src/mesa/program/ since all the
remaining files are concerned with GPU programs.
4. Misc code refactoring. In particular, I got rid of most of the
GLSL-related ctx->Driver hook functions. None of the drivers used
them.
Conflicts:
src/mesa/drivers/dri/i965/brw_context.c
And hook it up at the two sites it's called.
Note that with this change we still don't use glsl_type* objects as
talloc contexts (see things like get_array_instance, which accept both
a talloc 'ctx' and a glsl_type*). The reason for this is that
the code is still using many instances of glsl_type objects not created
with new.
This closes 3 leaks in the glsl-orangebook-ch06-bump.frag test:
total heap usage: 55,623 allocs, 55,618 frees
Leaving only 5 leaks to go.
Add a talloc ctx to both get_array_instance and the glsl_type
constructor in order to be able to call talloc_size instead of
malloc.
This fix now makes glsl-orangebook-ch06-bump.frag 99.99% leak free:
total heap usage: 55,623 allocs, 55,615 frees
Only 8 missing frees now.
Simply call talloc_strdup rather than strdup, using the talloc_parent
of our 'state' object (known here as yyextra).
This fix now makes glsl-orangebook-ch06-bump.frag 99.97% leak free:
total heap usage: 55,623 allocs, 55,609 frees
Only 14 missing frees now.
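The leak fixes above all rely on the same pattern: allocate from a parent context instead of calling malloc or strdup directly, so that freeing the context releases everything. Below is a minimal self-contained sketch of that pattern. It is deliberately NOT the talloc API: struct ctx, ctx_new, ctx_size, ctx_strdup and ctx_free are hypothetical stand-ins written for illustration, and real talloc does considerably more (nested contexts, destructors, and so on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of each allocation. The max_align_t member only
 * exists to keep the payload that follows suitably aligned. */
union ctx_block {
    union ctx_block *next;
    max_align_t align;
};

/* Hypothetical stand-in for a talloc-style context; NOT the talloc API. */
struct ctx {
    union ctx_block *head;   /* singly linked list of owned blocks */
};

struct ctx *ctx_new(void)
{
    return calloc(1, sizeof(struct ctx));
}

/* Counterpart of malloc(size), except that the memory is owned by c. */
void *ctx_size(struct ctx *c, size_t size)
{
    union ctx_block *b;

    assert(c != NULL);
    b = malloc(sizeof(*b) + size);
    if (!b)
        return NULL;
    b->next = c->head;
    c->head = b;
    return b + 1;   /* payload starts right after the header */
}

/* Counterpart of strdup(s), except that the copy is owned by c. */
char *ctx_strdup(struct ctx *c, const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = ctx_size(c, n);

    if (copy)
        memcpy(copy, s, n);
    return copy;
}

/* Frees the context and every block allocated from it.
 * Returns the number of blocks released. */
size_t ctx_free(struct ctx *c)
{
    size_t freed = 0;
    union ctx_block *b;

    assert(c != NULL);
    b = c->head;
    while (b) {
        union ctx_block *next = b->next;

        free(b);
        freed++;
        b = next;
    }
    free(c);
    return freed;
}
```

Because nothing allocated this way needs its own free call, a string duplicated in the lexer or a type object created on demand stops showing up as a missing free once its owning context is released.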