Reorder LLVM passes, running mem2reg earlier.

This gives a ~30% shader optimization time improvement on blender.
Tested by comparing the dumped LLVM modules.
Current ordering:
time ~/llvm-git/obj/Release-Asserts/bin/opt l.bc  -constprop -instcombine
-mem2reg -gvn  -simplifycfg
real    0m1.126s
user    0m1.108s
sys     0m0.012s

With this patch:
time ~/llvm-git/obj/Release-Asserts/bin/opt l.bc -mem2reg -constprop -instcombine   -gvn  -simplifycfg
real    0m0.885s
user    0m0.880s
sys     0m0.000s

The overall improvement in blender is ~15%.
Blender without the patch takes 1m13s:
edwin     5934 87.6 11.5 729440 458296 pts/5   SLl+ 17:35   1:13 blender

Blender with the patch takes 1m3s:
edwin     5726 94.2 11.2 716424 446168 pts/5   SLl+ 17:32   1:03 blender

It is still slow with the patch, but better (most of the optimization time is
taken up by GVN, see LLVM PR7023).

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: José Fonseca <jfonseca@vmware.com>
This commit is contained in:
Török Edwin 2010-05-03 07:43:03 -07:00 committed by José Fonseca
parent 723ab664f6
commit 15af543f10
2 changed files with 4 additions and 4 deletions

View file

@ -182,6 +182,8 @@ draw_llvm_create(struct draw_context *draw)
/* These are the passes currently listed in llvm-c/Transforms/Scalar.h,
* but there are more on SVN. */
/* TODO: Add more passes */
LLVMAddCFGSimplificationPass(llvm->pass);
LLVMAddPromoteMemoryToRegisterPass(llvm->pass);
LLVMAddConstantPropagationPass(llvm->pass);
if(util_cpu_caps.has_sse4_1) {
/* FIXME: There is a bug in this pass, whereby the combination of fptosi
@ -190,9 +192,7 @@ draw_llvm_create(struct draw_context *draw)
*/
LLVMAddInstructionCombiningPass(llvm->pass);
}
LLVMAddPromoteMemoryToRegisterPass(llvm->pass);
LLVMAddGVNPass(llvm->pass);
LLVMAddCFGSimplificationPass(llvm->pass);
init_globals(llvm);

View file

@ -185,6 +185,8 @@ lp_jit_screen_init(struct llvmpipe_screen *screen)
/* These are the passes currently listed in llvm-c/Transforms/Scalar.h,
* but there are more on SVN. */
/* TODO: Add more passes */
LLVMAddCFGSimplificationPass(screen->pass);
LLVMAddPromoteMemoryToRegisterPass(screen->pass);
LLVMAddConstantPropagationPass(screen->pass);
if(util_cpu_caps.has_sse4_1) {
/* FIXME: There is a bug in this pass, whereby the combination of fptosi
@ -193,9 +195,7 @@ lp_jit_screen_init(struct llvmpipe_screen *screen)
*/
LLVMAddInstructionCombiningPass(screen->pass);
}
LLVMAddPromoteMemoryToRegisterPass(screen->pass);
LLVMAddGVNPass(screen->pass);
LLVMAddCFGSimplificationPass(screen->pass);
}
lp_jit_init_globals(screen);