96 commits

Author SHA1 Message Date
Brian Paul
2b26a92cd3 gallium: s/false/FALSE/ 2009-01-10 14:58:44 -07:00
José Fonseca
42d0079002 rtasm: Remove spurious semi-colons after function bodies. 2008-12-30 17:06:51 +00:00
Robert Ellison
11fc390f64 CELL: use variant-length fragment ops programs
This is a set of changes that optimizes the memory use of fragment
operation programs (by using and transmitting only as much memory as is
needed for the fragment ops programs, instead of maximal sizes), as well
as eliminates the dependency on hard-coded maximal program sizes.  State
that is not dependent on fragment facing (i.e. that isn't using
two-sided stenciling) will only save and transmit a single
fragment operation program, instead of two identical programs.

- Added the ability to emit a LNOP (No Operation (Load)) instruction.
  This is used to pad the generated fragment operations programs to
  a multiple of 8 bytes, which is necessary for proper operation of
  the dual instruction pipeline, and also required for proper SPU-side
  decoding.

- Added the ability to allocate and manage a variant-length
  struct cell_command_fragment_ops.  This structure now puts the
  generated function field at the end, where it can be as large
  as necessary.

- On the PPU side, we now combine the generated front-facing and
  back-facing code into a single variant-length buffer (and only use one
  if the two sets of code are identical) for transmission to the SPU.

- On the SPU side, we pull the correct sizes out of the buffer,
  allocate a new code buffer if the one we have isn't large enough,
  and save the code to that buffer.  The buffer is deallocated when
  the SPU exits.

- Commented out the emit_fetch() static function, which was not being used.
2008-11-21 11:42:35 -07:00
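
As a rough illustration of the LNOP padding described in 11fc390f64: SPU instructions are 4 bytes wide, so rounding a program up to a multiple of 8 bytes means appending at most one LNOP. The helper name `pad_to_8_bytes` and the `SPU_LNOP` constant below are assumptions made for this sketch and are not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding of the SPU "lnop" instruction (11-bit opcode 0x001 in the
 * top bits of the word).  Check it against the SPU ISA before relying on it. */
#define SPU_LNOP 0x00200000u

/* Hypothetical helper: append LNOPs to a buffer of 4-byte SPU instructions
 * until its size in bytes is a multiple of 8.  Returns the new count. */
static size_t pad_to_8_bytes(uint32_t *code, size_t num_insts, size_t capacity)
{
    while ((num_insts * sizeof(uint32_t)) % 8 != 0) {
        assert(num_insts < capacity);   /* caller must leave room for padding */
        code[num_insts++] = SPU_LNOP;
    }
    return num_insts;
}
```

With this sketch a 3-instruction program grows to 4 instructions, while a program that already has an even instruction count is left unchanged.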
Robert Ellison
2c29a6896a CELL: fix stencil twiddling, stencil invert
Many stencil tests were failing because of a failure to read the
stencil buffer, due to "twiddling" (or "untwiddling") "an unsupported
texture format".  This is fixed for the case of a stencil/Z S824Z format
(which twiddles just like the 32-bit color formats).

tests/stencilwrap.c was failing on the GL_INVERT test, because
the emitted code for "spe_xori" turned out not to be an actual
"xori" instruction, but rather a "stqd" instruction, because
of a typo in the rtasm code.  This is now fixed, and
tests/stencilwrap now works.
2008-11-13 11:23:04 -07:00
Brian Paul
b44ec717c8 gallium: add missing prototypes 2008-11-12 11:09:12 -07:00
Brian Paul
1cd15f0370 cell: move semicolons to silence warnings w/ other compilers 2008-11-12 11:06:48 -07:00
Brian Paul
7f15e34cfa cell: fix typo in EMIT_ macro 2008-11-12 11:06:48 -07:00
Michal Krol
87f77105ce rtasm: Use INLINE keyword. Compile for all platforms, not only GALLIUM_CELL. 2008-11-12 18:44:20 +01:00
Michal Krol
8fee30064e rtasm: Compile only for GALLIUM_CELL. 2008-11-12 18:13:58 +01:00
Robert Ellison
90027f8578 CELL: two-sided stencil fixes
With these changes, the tests/stencil_twoside test now works.

- Eliminate blending from the stencil_twoside test, as it produces an
  unneeded dependency on having blending working

- The spe_splat() function will now work if the register being splatted
  and the destination register are the same

- Separate fragment code generated for front-facing and back-facing
  fragments.  Often these are the same; if two-sided stenciling is on,
  they can be different.  This is easier and faster than generating
  code that does both tests and merges the results.

- Fixed a cut/paste bug where if the back Z-pass stencil operation
  were different from all the other operations, the back Z-fail
  results were incorrect.
2008-11-11 13:57:10 -07:00
Brian Paul
f952aac1da gallium: grow SPE instruction buffer as needed 2008-10-29 16:56:28 -06:00
Brian Paul
725ba94ce5 gallium: no longer pass max_inst to ppc_init_func() 2008-10-29 16:35:59 -06:00
Brian Paul
a5d920297a gallium: use execmem for PPC code, grow instruction buffer as needed 2008-10-29 16:26:10 -06:00
Brian Paul
8828d52348 gallium: fix alignment parameter passed to u_mmAllocMem()
Was 32, now 5.  The param is expressed as a power of two exponent.
The net effect is that the alignment was a no-op on X86 but on PPC we
always got the same memory address every time rtasm_exec_malloc() was called.
2008-10-29 14:52:35 -06:00
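
The exponent convention behind 8828d52348 can be sketched as follows: an alignment argument of 5 means 2^5 = 32 bytes, whereas 32 would ask for 2^32 bytes. The helper `align_up_pow2` is hypothetical and only models the arithmetic; it is not the code inside u_mmAllocMem(). Shifting a 32-bit value by 32 is undefined in C, which is at least consistent with the platform-dependent behaviour the commit describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to a multiple of 2^align2 bytes.
 * align2 is a power-of-two exponent, not a byte count. */
static uint32_t align_up_pow2(uint32_t addr, unsigned align2)
{
    assert(align2 < 32);   /* 1u << 32 is undefined for a 32-bit type */
    uint32_t mask = (1u << align2) - 1u;
    return (addr + mask) & ~mask;
}
```

For example, `align_up_pow2(37, 5)` yields 64, the next 32-byte boundary.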
Brian Paul
3ad56968f0 gallium: prefix memory manager functions with u_ to differentiate from functions in mesa/main/mm.c 2008-10-29 14:19:12 -06:00
Brian Paul
09570d2e73 gallium: test for PIPE_OS_LINUX instead of __linux__ 2008-10-29 14:08:13 -06:00
Brian Paul
7640264064 gallium: added ppc_vnmsubfp() 2008-10-29 11:03:51 -06:00
Michel Dänzer
6b69e3c717 scons: ppc support. 2008-10-23 10:28:48 +02:00
Brian Paul
f8ab4feb75 gallium: remove ppc_vload_float(), rename ppc_vecmove() -> ppc_vmove(). 2008-10-22 17:21:43 -06:00
Brian Paul
3026616c48 gallium: added ppc_vzero() 2008-10-22 17:17:11 -06:00
Brian Paul
b06d072019 gallium: added ppc_vload_float(), for limited cases 2008-10-22 14:48:33 -06:00
Brian Paul
ebdc399d83 gallium: fix-up confusing register allocation masks in rtasm_ppc.c
Plus, add ppc_reserve_register() func.
2008-10-22 13:57:56 -06:00
Brian Paul
049f57f86a gallium: added ppc_lvewx() 2008-10-22 11:06:39 -06:00
Brian Paul
e0c6653a5f cell: implement many more PPC instructions for code gen 2008-10-22 10:35:38 -06:00
Brian Paul
d3403b5482 cell: add emit_RI10s() which does range checking on the 10-bit signed immediate field
This type of checking should be expanded to cover more instructions...
2008-10-10 14:57:57 -06:00
Brian Paul
f42ef6f39d cell: additional 'offset' checking in spe_lqd(), spe_stqd() 2008-10-10 14:44:52 -06:00
Brian Paul
78c67a726f cell: fix assertions in spe_lqd(), spe_stqd() 2008-10-10 14:36:18 -06:00
Robert Ellison
adeed0f90f CELL: fixing stencil bugs
These are the defects found and fixed so far.  Several more have
been observed; I'm working on them.

- Fixed an error in spe_load_uint() that caused incorrect values to be
  loaded if the given unsigned value had the low 18 bits as 0,
  and that caused inefficient code to be emitted if the given value
  had the high 14 bits as 0.

- Fixed a problem in stencil code generation where optional registers
  weren't tracked correctly.

- Fixed a problem where the stencil function NEVER was acting as ALWAYS.

- Fixed several problems that could occur if stenciling were enabled but
  depth was disabled.

- Fixed a problem with two-sided stencil writemask handling that could
  cause a stencil writemask to not be applied.

- Fixed several state permutations that were incorrectly flagged as
  not requiring stencil values to be calculated.
2008-10-10 14:15:51 -06:00
Keith Whitwell
d7f1cb5b5a Merge commit 'origin/gallium-0.1' into gallium-0.2
Conflicts:

	src/gallium/auxiliary/gallivm/instructionssoa.cpp
	src/gallium/auxiliary/gallivm/soabuiltins.c
	src/gallium/auxiliary/rtasm/rtasm_x86sse.c
	src/gallium/auxiliary/rtasm/rtasm_x86sse.h
	src/mesa/main/texenvprogram.c
	src/mesa/shader/arbprogparse.c
	src/mesa/shader/prog_statevars.c
	src/mesa/state_tracker/st_draw.c
	src/mesa/vbo/vbo_exec_draw.c
2008-10-10 15:23:36 +01:00
Brian Paul
7ac1fc7766 cell: fix incorrect bitmask in spe_load_uint() 2008-10-09 19:54:46 -06:00
Brian Paul
d48a92e880 cell: implement function calls from shader code. fslight demo runs now.
Used for SIN, COS, EXP2, LOG2, POW instructions.  TEX next.

Fixed some bugs in MIN, MAX, DP3, DP4, DPH instructions.

In rtasm code:
  Special-case spe_lqd(), spe_stqd() functions so they take byte offsets but
  low-order 4 bits are shifted out.  This makes things consistent with SPU
  assembly language conventions.
  Added spe_get_registers_used() function.
2008-10-08 20:44:32 -06:00
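
A minimal sketch of the offset convention from d48a92e880, combined with the kind of 10-bit signed range check mentioned for emit_RI10s(): the caller passes a byte offset, the low 4 bits are shifted out, and the result has to fit a signed 10-bit field. The function name, return convention, and error handling are assumptions for illustration, not the rtasm implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a byte offset into a 10-bit signed immediate
 * counted in 16-byte quadwords.  Returns 0 on success, -1 if the offset is
 * not quadword-aligned or does not fit the field. */
static int encode_quadword_offset(int byte_offset, uint32_t *field)
{
    assert(field);
    if (byte_offset % 16 != 0)
        return -1;                       /* low 4 bits must be zero */
    int quads = byte_offset / 16;        /* shift out the low 4 bits */
    if (quads < -512 || quads > 511)
        return -1;                       /* outside the signed 10-bit range */
    *field = (uint32_t)quads & 0x3ffu;   /* two's complement, 10 bits */
    return 0;
}
```

Under these assumptions a byte offset of 32 encodes as 2, and -16 encodes as 0x3ff.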
Brian Paul
5c57cbec32 gallium: asst. clean-ups
Don't use register qualifier.  Doxygen-ize comments.  Remove 'extern'.
2008-10-08 16:35:40 -06:00
Brian Paul
73d00b9e93 gallium: better instruction printing for SPE code 2008-10-08 16:33:04 -06:00
Brian
f7ee3c9792 gallium: replace assertion with conditional/recovery code
The assertion failed when we ran out of exec memory.
Found with conform texcombine test.
2008-10-06 18:31:56 -06:00
Keith Whitwell
7053f8c902 rtasm: fix debug build 2008-10-06 11:54:22 +01:00
Robert Ellison
afaa53040b CELL: changes to generate SPU code for stenciling
This set of code changes is for stencil code generation
support.  Both one-sided and two-sided stenciling are supported.
In addition to the raw code generation changes, these changes had
to be made elsewhere in the system:

- Added new "register set" feature to the SPE assembly generation.
  A "register set" is a way to allocate multiple registers and free
  them all at the same time, delegating register allocation management
  to the spe_function unit.  It's quite useful in complex register
  allocation schemes (like stenciling).

- Added and improved SPE macro calculations.
  These are operations between registers and unsigned integer
  immediates.  In many cases, the calculation can be performed
  with a single instruction; the macros will generate the
  single instruction if possible, or generate a register load
  and register-to-register operation if not.  These macro
  functions are: spe_load_uint() (which has new ways to
  load a value in a single instruction), spe_and_uint(),
  spe_xor_uint(), spe_compare_equal_uint(), and spe_compare_greater_uint().

- Added facing to fragment generation.  While rendering, the rasterizer
  needs to be able to determine front- and back-facing fragments, in order
  to correctly apply two-sided stencil.  That requires these changes:
  - Added front_winding field to the cell_command_render block, so that
    the state tracker could communicate to the rasterizer what it
    considered to be the front-facing direction.
  - Added fragment facing as an input to the fragment function.
  - Calculated facing is passed during emit_quad().
2008-10-03 18:05:14 -06:00
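
The "register set" idea from afaa53040b (allocate several registers, then free them all at once) might be modelled roughly as below. Every name here (`struct reg_alloc`, `reg_set_begin`, and so on) is hypothetical; the commit does not show the actual spe_function interface, so treat this as one possible design and not as the Mesa code.

```c
#include <assert.h>
#include <string.h>

#define NUM_REGS 128   /* the SPU has 128 registers */

/* set_of[r] is -1 when register r is free; otherwise it records the
 * register-set depth that was active when r was allocated. */
struct reg_alloc {
    signed char set_of[NUM_REGS];
    int depth;
};

static void reg_init(struct reg_alloc *ra)
{
    memset(ra->set_of, -1, sizeof ra->set_of);
    ra->depth = 0;
}

/* Allocate the lowest free register and tag it with the current set. */
static int reg_alloc_one(struct reg_alloc *ra)
{
    for (int r = 0; r < NUM_REGS; r++) {
        if (ra->set_of[r] < 0) {
            ra->set_of[r] = (signed char)ra->depth;
            return r;
        }
    }
    return -1;   /* out of registers */
}

static void reg_set_begin(struct reg_alloc *ra)
{
    assert(ra->depth < 127);
    ra->depth++;
}

/* Release every register allocated since the matching reg_set_begin(). */
static void reg_set_end(struct reg_alloc *ra)
{
    assert(ra->depth > 0);
    for (int r = 0; r < NUM_REGS; r++)
        if (ra->set_of[r] == ra->depth)
            ra->set_of[r] = -1;
    ra->depth--;
}
```

Registers allocated before `reg_set_begin()` survive the matching `reg_set_end()`, so long-lived values can coexist with per-stage temporaries.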
Keith Whitwell
6965532e14 rtasm: add sse_movntps 2008-10-03 13:50:34 +01:00
Keith Whitwell
66d4beb874 rtasm: add prefetch instructions 2008-10-02 10:19:48 -04:00
Keith Whitwell
102daee1b8 rtasm: add prefetch instructions 2008-10-02 12:59:14 +01:00
José Fonseca
6607f2cf19 rtasm: Implement immediate group 1 instructions. Fix SIB emission. 2008-09-29 19:09:39 +09:00
Brian Paul
938e12c1ca gallium: SPU register comments 2008-09-26 17:06:22 -06:00
Brian Paul
99cdfc997b cell: use different opcodes for spe_move() depending on even/odd address 2008-09-19 17:56:45 -06:00
Brian Paul
7af5f944e5 gallium: added spe_code_size() 2008-09-19 17:56:45 -06:00
Brian Paul
0838b70275 cell: change spe_complement() to take a src and dst reg, like other instructions 2008-09-19 09:36:29 -06:00
Robert Ellison
a57fbe53dc CELL: add codegen for logic op, color mask
- rtasm_ppc_spe.c, rtasm_ppc_spe.h: added a new macro function
  "spe_load_uint" for loading and splatting unsigned integers
  in a register; it will use "ila" for values 18 bits or less,
  "ilh" for word values that are symmetric across halfwords,
  "ilhu" for values that have zeroes in their bottom halfwords,
  or "ilhu" followed by "iohl" for general 32-bit values.

  Of the 15 color masks of interest, 4 are 18 bits or less,
  2 are symmetric across halfwords, 3 are zero in the bottom
  halfword, and 6 require two instructions to load.

- cell_gen_fragment.c: added full codegen for logic op and
  color mask.
2008-09-19 01:55:00 -06:00
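
The instruction selection described for spe_load_uint() in a57fbe53dc can be sketched as a classifier. The enum, the function names, and the order of the checks are assumptions for illustration. The sketch also assumes that the "15 color masks of interest" are the per-channel byte masks other than all channels enabled (0xffffffff); under that assumption the counts come out as the commit states: 4, 2, 3, and 6.

```c
#include <assert.h>
#include <stdint.h>

enum load_kind { LOAD_ILA, LOAD_ILH, LOAD_ILHU, LOAD_ILHU_IOHL };

/* Pick a load strategy for a 32-bit value, following the commit text. */
static enum load_kind classify_load(uint32_t value)
{
    if ((value & 0xfffc0000u) == 0)
        return LOAD_ILA;         /* fits in 18 bits */
    if ((value >> 16) == (value & 0xffffu))
        return LOAD_ILH;         /* both halfwords are identical */
    if ((value & 0xffffu) == 0)
        return LOAD_ILHU;        /* bottom halfword is zero */
    return LOAD_ILHU_IOHL;       /* general case: two instructions */
}

/* Tally the strategies over channel combinations 0..14, where bit i of
 * `bits` enables byte i of the mask.  Combination 15 (all channels) is
 * skipped; that exclusion is an assumption of this sketch. */
static void count_color_masks(unsigned counts[4])
{
    assert(counts);
    counts[0] = counts[1] = counts[2] = counts[3] = 0;
    for (unsigned bits = 0; bits < 15; bits++) {
        uint32_t mask = 0;
        for (unsigned ch = 0; ch < 4; ch++)
            if (bits & (1u << ch))
                mask |= 0xffu << (8 * ch);
        counts[classify_load(mask)]++;
    }
}
```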
Robert Ellison
f8bba34d4e CELL: finish fragment ops blending (except for unusual D3D modes)
- Added new "macro" functions spe_float_min() and spe_float_max()
  to rtasm_ppc_spe.{ch}.  These emit instructions that cause
  the minimum or maximum of each element in a vector of floats
  to be saved in the destination register.

- Major changes to cell_gen_fragment.c to implement all the blending
  modes (except for the mysterious D3D-based PIPE_BLENDFACTOR_SRC1_COLOR,
  PIPE_BLENDFACTOR_SRC1_ALPHA, PIPE_BLENDFACTOR_INV_SRC1_COLOR, and
  PIPE_BLENDFACTOR_INV_SRC1_ALPHA).

- Some revamping of code in cell_gen_fragment.c: use the new spe_float_min()
  and spe_float_max() functions (instead of expanding these calculations
  inline via macros); create and use an inline utility function for handling
  "optional" register allocation (for the {1,1,1,1} vector, and the
  blend color vectors) instead of expanding with macros; use the Float
  Multiply and Subtract (fnms) instruction to simplify and optimize many
  blending calculations.
2008-09-18 01:29:41 -06:00
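
To illustrate why fnms helps with blending in f8bba34d4e: the instruction computes c - a*b, so a term of the form x * (1 - f) can be rewritten as x - x*f and produced in one operation without first materializing (1 - f). The scalar model below is a sketch; `fnms` and `blend_one_minus` are illustrative names, and the commit does not show which blend terms actually use this rewrite.

```c
#include <assert.h>

/* Scalar model of the SPU fnms instruction: c - a * b. */
static float fnms(float a, float b, float c)
{
    return c - a * b;
}

/* Compute x[i] * (1 - f[i]) for the four floats of a vector register,
 * using the identity x * (1 - f) = x - x * f. */
static void blend_one_minus(const float x[4], const float f[4], float out[4])
{
    assert(x && f && out);
    for (int i = 0; i < 4; i++)
        out[i] = fnms(x[i], f[i], x[i]);
}
```

For instance, with x = 0.5 and f = 0.25 the result is 0.5 - 0.125 = 0.375.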
Brian Paul
ae3373441d gallium: emit SPU instructions in assembler-compatible syntax 2008-09-15 15:10:02 -06:00
Jonathan White
367774a62a Fixed emit_RRR 2008-09-15 11:57:59 -06:00
Brian Paul
8b5013d232 gallium: added print/dump code to SPE code emitter 2008-09-12 21:52:47 -06:00
Brian Paul
31a112cad4 gallium: added spe_splat_word() 2008-09-12 21:08:01 -06:00