Christoph Bumiller
e2dded78ea
nvc0: add MP trap handler for nve4
2013-03-12 12:55:37 +01:00
Christoph Bumiller
ae59a7d35d
nvc0: they removed the NTID,NCTAID,GRIDID registers on nve4
2013-03-12 12:55:37 +01:00
Christoph Bumiller
75f1f852b0
nvc0/ir: try to fix CAS (CompareAndSwap)
2013-03-12 12:55:37 +01:00
Christoph Bumiller
18fdfbdc32
nv50/ir: add CCTL (cache control) op
2013-03-12 12:55:37 +01:00
Christoph Bumiller
9db7e09cb4
nvc0/ir/emit: fix emission of large address offsets
2013-03-12 12:55:36 +01:00
Christoph Bumiller
7a91d3a2a4
nv50/ir: add support for different sampler and resource index on nve4
...
And remove non-working code for indirect sampler/resource selection.
Will be added back later.
Includes code from "nv50/ir/tgsi: Resource indirect indexing" by
Francisco Jerez (when mixing the R and S handles we can only specify
them via a register, i.e. indirectly, unless we upload all the used
handle combinations to c[] space, which we don't for now).
2013-03-12 12:55:36 +01:00
Christoph Bumiller
99e4eba669
nv50/ir: implement splitting of 64 bit ops after RA
2013-03-12 12:55:36 +01:00
Christoph Bumiller
ac9f19e485
nvc0/ir: skip back edges when determining latest sched value
2013-03-12 12:55:36 +01:00
Christoph Bumiller
f07c46a4f4
nvc0/ir: use large issue delay after RET, too
2013-03-12 12:55:36 +01:00
Christoph Bumiller
c3a5bc0bdf
nv50/ir: add support for barriers
...
nv50 part by Francisco Jerez.
2013-03-12 12:55:35 +01:00
Christoph Bumiller
d105b3df14
nvc0/ir: don't replace load from input in COMPUTE progs with VFETCH
2013-03-12 12:55:35 +01:00
Christoph Bumiller
4506ed28de
nvc0/ir: implement lowering of surface ops for nve4
2013-03-12 12:55:35 +01:00
Christoph Bumiller
8ac68b071d
nvc0/ir: add formatted surface load lib code, move to extra header
...
OpenGL is nice and makes the user specify a format with an image unit.
OpenCL is evil and doesn't, and what's better than adding a huge load
of functions that we call indirectly to handle the conversion?
2013-03-12 12:55:35 +01:00
Christoph Bumiller
c0fc3463e9
nvc0/ir: lower atomics in s[]
2013-03-12 12:55:35 +01:00
Christoph Bumiller
9c196779bc
nvc0/ir/emit: implement INSBF, EXTBF, PERMT and ATOM
2013-03-12 12:55:35 +01:00
Christoph Bumiller
d6c95f6819
nvc0/ir/target: some ops can't be predicated, e.g. CALL
2013-03-12 12:55:35 +01:00
Christoph Bumiller
c893b94060
nv50/ir: add support for indirect BRA,CALL
2013-03-12 12:55:34 +01:00
Christoph Bumiller
efe55075b5
nvc0/ir/emit: implement move to and logic ops on predicates
2013-03-12 12:55:34 +01:00
Christoph Bumiller
ce7610f7d5
nvc0/ir/emit: implement surface related ops
2013-03-12 12:55:34 +01:00
Christoph Bumiller
3741b7d844
nv50/ir: initialize CodeEmitters' specialized target fields
2013-03-12 12:55:34 +01:00
Christoph Bumiller
22b762f9b4
nv50/ir: add various new OPs that will be needed for compute
2013-03-12 12:55:34 +01:00
Francisco Jerez
c82714c593
nv50/ir: Rename "mkLoad" to "mkLoadv" for consistency.
2013-03-12 12:55:34 +01:00
Johannes Obermayr
6bca283ad5
nv50/nvc0: Build codegen in nv50.
...
This is required to make libnv50 independent of libnvc0.
2013-01-12 17:14:04 +01:00
Christoph Bumiller
1f079f9e58
nvc0/ir: allow neg,abs modifiers on OP_SET with integer result
2012-12-08 22:47:00 +01:00
Christoph Bumiller
7c6584b996
nvc0/ir/emit: fix check for flags register use in logic ops
2012-12-08 22:46:37 +01:00
Christoph Bumiller
f7599b2c32
nv50,nvc0: add support for cube map arrays
...
NOTE: nv50 support not enabled, someone with nva3/8 please fix.
2012-12-07 22:48:54 +01:00
Christoph Bumiller
3433471e8b
nvc0/ir: add initial code to support GK110 ISA encoding
2012-09-07 19:03:40 +02:00
Christoph Bumiller
79eed0d224
nvc0/ir: allow 64-bit constant loads on nve4
...
Looks like only 128-bit access doesn't work.
2012-05-29 17:00:10 +02:00
Christoph Bumiller
40c224a573
nvc0/ir: fix texture barrier insertion to prevent WAW hazards
...
Fixes, for instance, object highlighting in Diablo 3 (wine).
2012-05-29 15:01:41 +02:00
Christoph Bumiller
717f55d79d
nv50/ir: fix reversed order of lane ops in quadops
2012-05-17 15:24:58 +02:00
Christoph Bumiller
c19672f90a
nvc0/ir: allow abs,neg source modifiers with ceil,floor,trunc
2012-05-06 22:03:06 +02:00
Christoph Bumiller
38a20281fc
nvc0/ir: fix lowering of textureGrad
2012-05-06 22:03:06 +02:00
Christoph Bumiller
1f4c154f02
nv50/ir/opt: try to convert ABS(SUB) to SAD
2012-04-29 18:03:11 +02:00
Christoph Bumiller
d6ab3106cf
nvc0/ir: try to use the optimal texture op mode
...
Don't really know what they are yet but for groups of textures, the
last one should use mode "p" and the others "t".
2012-04-29 18:02:37 +02:00
Christoph Bumiller
afcd7b5d16
nvc0/ir: initial implementation of nve4 scheduling hints
2012-04-29 17:59:06 +02:00
Christoph Bumiller
00fe442253
nvc0/ir: implement better placement of texture barriers
...
Put them before first uses instead of right after the texturing
instruction and cull unnecessary barriers.
2012-04-29 17:56:57 +02:00
Christoph Bumiller
d9baa004ea
nvc0/ir/emit: fix emitTXQ 2nd src
2012-04-29 17:55:13 +02:00
Christoph Bumiller
3a9f036e00
nvc0/ir/target: integer ADD doesn't support ABS modifier
2012-04-29 17:54:34 +02:00
Christoph Bumiller
e44089b2f7
nvc0: add initial support for nve4+ (Kepler) chipsets
...
Most things that work on Fermi should work on Kepler too.
There are a few performance optimizations left to do, like better
placement of texture barriers and adding scheduling data to the
shader instructions (without them, a thread group will be masked
for 32 cycles after each single instruction issue).
2012-04-15 00:08:51 +02:00
Christoph Bumiller
322bc7ed68
nv50/ir: import nv50 target
2012-04-14 21:54:04 +02:00
Christoph Bumiller
15ce0f76e2
nv50/ir: fix off-by-ones in CSE and nvc0 insnCanLoad
2012-04-14 21:54:04 +02:00
Christoph Bumiller
e43a3a66a9
nv50/ir: rewrite the register allocator as GCRA, with spilling
...
This is more flexible than the linear scan, and we don't need the
separate allocation pass for constrained values anymore.
2012-04-14 21:54:03 +02:00
Christoph Bumiller
12a2f5121d
nvc0: fix emission of 3rd src in SET_AND,OR,XOR
2012-04-14 21:54:03 +02:00
Francisco Jerez
a05e6a3fa2
nv50/ir: Decouple object cloning logic from the sub-object recursion policy.
2012-04-14 21:54:01 +02:00
Christoph Bumiller
9362d4bc0a
nv50/ir: make Instruction::src/def container private
2012-04-14 21:54:00 +02:00
Christoph Bumiller
55f9bdb64e
nv50/ir/opt: improve post-multiply and check target for support
2012-04-14 21:54:00 +02:00
Christoph Bumiller
286abcb51e
nv50/ir: add isAccessSupported check for memory access coalescing
2012-04-14 21:54:00 +02:00
Christoph Bumiller
af0ce1dba8
nv50/ir: make use of TGSI_INTERPOLATE_COLOR
...
Flat SHADE_MODEL still overrides any non-flat interpolation
qualifier, but pulling that state out of the rasterizer cso
isn't really worth the effort, is it?
NOTE: This is a candidate for the 8.0 branch.
2012-01-12 22:38:01 +01:00
Christoph Bumiller
e4210a42bc
nvc0/ir: TXF array index already is an integer
2012-01-10 00:39:41 +01:00
Christoph Bumiller
7c6ca0367b
nvc0/ir/emit: fix modifiers of f32 add with long immediate
2012-01-10 00:36:59 +01:00