Intel Research, Intel Corp.
Institute of Computing Technology, CAS.
last updated Jul 15, 2003



Table of Contents


General Questions

  What is the relation between ORC and Pro64?
  Will there always be two separate sources, one ORC, the other Pro64?
  How well were the compilers tested?
  Can I use ORC to build the Linux/IA-64 kernel?
  What is the status of the native ORC compiler on IA-64 Itanium machines?
  Are there tools that come with the compilers?
  What are the differences between release 1.0 and 1.1?
  What are the differences between release 1.1 and 2.0?
  What are the differences between release 2.0 and 2.1?
  How does one generate the best optimized code using ORC?
  How to use Whirl and code generation phase profile/feedback?
  How to use the compilers to achieve peak performance?
  Are there documents that come with the compilers?

Interactions of ORC and IA-64 Linux Questions

  How do I install ORC compilers on IA-64 Linux systems?
  How do I install ORC compilers on IA32 Redhat 6.2 Linux systems?
  How do I install ORC compilers on IA32 Redhat 7.1 Linux systems?
  How do I install ORC compilers on IA32 Redhat 7.2 Linux systems?
  How do I make a natively built compiler?
  How do I install multiple versions of ORC compilers on IA-64 systems?
  How do I rebuild the Fortran front-end?
  If I do cross compile on an IA32 machine, how do I produce IA-64 binaries?

Reporting bugs and problems

  Where do I report bugs?
  How do I decide if the bug is ORC specific?
  How do I report bugs?

Using tools that come with the compiler

  How to use the hot path enumeration tool?
  How to use the cycle counting tool?

Adding new optimizations or phases

  Are there any coding conventions?
  How do I add my own tracing options?
  How do I change the driver and related files when I add an optimization?
  How do I effectively use gdb to debug the compilers?
  How do I make sure my new phase is not the cause of unnecessarily long compile time?

General Questions

What is the relation between ORC and Pro64?

ORC is based on Pro64. Major changes have been made in the code generator and the profiling framework. We intend to continue improving ORC in both features and performance for the Itanium processor and its follow-ups.

Will there always be two separate sources, one ORC, the other Pro64?

Currently the Pro64 source is hosted by the open64 user group, also found on SourceForge. The ORC 1.0 release has been merged into the Pro64 source and incorporated into the Open64 0.14 release. Further merging has been discussed, but there is no concrete proposal at this point.

How well were the compilers tested?

For the 1.0 release, a number of test suites for C, C++, and Fortran90 were used, as well as a number of open source programs. Tests were run at the -O2 and -O3 levels of optimization, on both simulators and Itanium systems. A number of large (>400,000 source lines) applications were also used, thanks to the University of Alberta, Tsinghua University, and the University of Minnesota.

The 1.1 release is not a full-scale release. Hence, it has not gone through the same rigorous testing as 1.0, although we have done our best to ensure its quality. The same test suites and open source programs were used during our testing, with more optimization combinations (-O2, -O3, IPA, profiling) than for 1.0. No external organizations were involved in this release, and the large applications used during 1.0 testing were not used for the 1.1 release either.


For the 2.0 release, we went through similarly rigorous testing as for the 1.0 release. Most of the testing was done with the cross compiler; the native compiler went through less testing due to lack of resources.


For the 2.1 release, we went through similarly rigorous testing as for the 2.0 release, and most of the testing was done with both the cross compiler and the native compiler.

Can I use ORC to build the Linux/IA-64 kernel?

We have not attempted to build the Linux/IA-64 kernel, and we have no plan to do so. Hence, it is safe to assume that ORC cannot build the Linux kernel. However, the following instructions are extracted from the README file of the 0.13 Pro64(tm) release:

Edit the Linux/Makefile and replace the definitions for CROSS_COMPILE, CC, and CFLAGS with the following:

  CC = orcc -D__KERNEL__ -I$(HPATH) -D__LP64__
  CFLAGS = $(CPPFLAGS) -Wall -Wstrict-prototypes -O0 -fomit-frame-pointer -ffixed-r15 \
           -CG:emit_unwind_info=off -Wf,-O2 -PHASE:p -D__OPTIMIZE -OPT:Olimit=5000


What is the status of the native ORC compiler on IA-64 Itanium machines?

For the 1.1 release, we have included a binary that is built natively. The binary is in the tar file orc-1.1.0.native.tgz. This compiler, although built at -O0, still outperforms a cross-built compiler running on an Itanium machine. This binary has not been tested as thoroughly as the cross-built compiler.

For the 2.0 release, we have included an optimized build of the compiler (-O2 with all internal consistency checks removed, except that WOPT.so and BE.so are built at -O0). The binary is in the tar file orc-2.0-bin.tar.gz.

You should see a large compile-time improvement from this compiler. This native compiler is not tested as thoroughly as the cross-built compiler.

Are there tools that come with the compilers?

We have included two new tools for performance tuning and analysis: "hpe.pl", a hot path enumeration tool, and "cycount.pl", a cycle counting tool. Details are described below.


What are the differences between release 1.0 and 1.1?

There are three major focuses in 1.1: functional IPA, performance improvement, and fully functional profile feedback. To turn on interprocedural optimization, add -IPA to your compile line; you still need to specify -O0, -O2, or -O3 in addition to the -IPA flag (see the example below). Performance has also improved compared with 1.0: we have measured an average 10%+ performance gain compiling with the 1.1 release under the same optimization flags. Two types of profile/feedback are now fully supported: Whirl-level profiling and code-generation-level profiling.
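For instance, a minimal compile line that turns on IPA on top of -O3 might look like the following (foo.c and the output name are only placeholders):

      orcc -O3 -IPA foo.c -o foo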

There is also a change in default optimization behavior. With the 1.1 release, alias analysis (memory disambiguation) defaults to assuming user code is ANSI-type compliant. Also, compiled code is assumed to become part of the main executable, where we assume functions defined in a main executable (non-DSO) will not be preempted. To get back the 1.0 behavior for both, use -OPT:Olegacy.

What are the differences between release 1.1 and 2.0?

The major focus in 2.0 has been performance. Although performance is not the primary goal of this Open Research Compiler, it is vital for researchers who base their work on this compiler to know that the binary produced by this compiler is very competitive; hence, their work is also trustworthy and sound. The concentration for ORC has always been on the integer side, mainly due to insufficient resources. We have not studied, nor conducted measurements of, the relative performance of floating point code for ORC compared to other research or production compilers. Given that ORC is based on SGI's product compiler, which is strong on scientific applications, we have reason to expect that it has the capability to be among the best, should resources be put into it. On the integer side, we are pleased to say that, to the best of our knowledge, the code produced by this compiler is very competitive with all existing IA-64 compilers at optimization levels -O2, -O3, and -IPA, with and without feedback information.

A major performance improvement of 2.0 over 1.1 is C++ code quality. With 2.0, we have spent a lot of effort to ensure C++-style code can achieve the same kind of speedup people expect when they write C code.

Besides performance enhancements, the 2.0 release also includes Itanium-2 micro-architecture support. A new Itanium-2 machine model has been added to the compiler, and compiler optimizations such as instruction scheduling and bundling are now Itanium-2 aware. The option to turn on Itanium-2 mode is -TARG:platform=itanium2 or -itanium2 (see the example below). The default for the 2.0 release is the Itanium 1 machine. Performance investigation and enhancements specific to Itanium-2 will come in the next release.
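For instance, a 2.0-style compile targeting Itanium-2 might be invoked as follows (foo.c is a placeholder source file; -TARG:platform=itanium2 could be used in place of -itanium2):

      orcc -O3 -itanium2 foo.c -o foo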


What are the differences between release 2.0 and 2.1?

The major focus in 2.1 has been performance on Itanium-2 systems. The 2.0 release was already capable of generating code based on the Itanium-2 machine model, but the performance focus of 2.0 was on Itanium rather than Itanium-2. After extensive work, we are pleased to say that, to the best of our knowledge, the code generated by ORC for both Itanium-2 and Itanium is very competitive with all existing IA-64 compilers, at optimization levels -O2, -O3, and -IPA, with and without feedback information.

In addition to all features in 2.0, the 2.1 release has added or improved a number of features: cache optimizations, loop-invariant code motion in the code generator, switch optimization, multi-way branches and renaming in instruction scheduling, SWP, unrolling, etc. We have also provided a new inter-procedural framework to balance RSE traffic and explicit spills. A dynamic instrumentation tool, called Pin, is also available along with 2.1.

The 2.1 release is also more robust than the 2.0 release, with a number of defects fixed in both the cross and native build environments.

By default, ORC 2.1 generates code for Itanium-2 systems. It can also generate code for Itanium by adding the option "-itanium".

How does one generate the best optimized code using ORC?

Optimization results will vary according to your specific program/application. Different optimization phases are involved at the various optimization levels: -O2 goes through the global optimizer and the code generator, -O3 additionally goes through the loop-level optimizer, and -IPA invokes the inter-procedural optimizer. Hence, in general, -O3 code should run faster than -O2 code, and -O3 -IPA should outperform code generated with -O3 alone. -O3 also allows the compiler to perform more aggressive optimizations that may cause differences in results (such as assuming wrap-around is safe, or allowing changes in floating-point operation ordering). Occasionally, when the compiler sees that the procedure it is compiling is too big, it might choose to turn off optimization to curtail compile time (you will see a warning message when this happens). You can use the option -OPT:Olimit=0 to ensure the compiler does not turn off optimizations. A typical set of invocations is sketched below.
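As an illustrative sketch (foo.c is a placeholder), the progression of optimization levels discussed above corresponds to command lines like:

      orcc -O2 foo.c -o foo
      orcc -O3 foo.c -o foo
      orcc -O3 -IPA -OPT:Olimit=0 foo.c -o foo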

How to use Whirl and code generation phase profile/feedback?


Profile/feedback is based on compile-time instrumentation, a training run that collects feedback data, and then another compilation/optimization based on the feedback data. There are two feedback phases implemented in ORC, and each phase adds one extra compilation pass. The two phases are the Whirl feedback phase and the code generation feedback phase. Whirl feedback primarily helps IPA do better inlining and other optimizations at the Whirl level. Please note that currently Whirl profiling is only enabled at -O3. Code generation phase feedback helps optimizations during code generation. To utilize both Whirl profiling and code generation phase profiling, the entire process becomes a three-pass compile/run process. The first pass instruments Whirl. The second pass reads the feedback data from the Whirl instrumentation and, at the same time, does instrumentation at code generation time. The third pass reads the feedback data generated from the Whirl instrumentation as well as from code generation time.

The options to do instrumentation are:
      -fb_create feed_back_file_name -fb_type={1, 2, 4, 8} -fb_phase={0, 4}
The options to use feedback data are:
      -fb_opt feed_back_file_name

where   fb_type  = 1 : whirl; 2 : cg edge; 4 : cg value; 8 : cg stride
        fb_phase = 0 : before the very high whirl optimizer (vho); 4 : before cg region formation
feed_back_file_name specifies where the feedback data file will be produced. Currently, only the above types and phases are supported.

  E.g., to use feedback data from Whirl profiling and edge profiling, one needs to go through the following steps:

        compile with options: -fb_create feed_back_file_name -fb_type=1 -fb_phase=0
        run the binary
        compile with options: -fb_opt feed_back_file_name -fb_create feed_back_file_name -fb_type=2 -fb_phase=4 -O3
        run the binary
        compile with options: -fb_opt feed_back_file_name
        final run of the binary

How to use the compilers to achieve peak performance?


To use the compilers to generate peak-performance binary code, you need to turn on all optimizations in ORC (including inter-procedural analysis (IPA), inlining, stride prefetching, procedure reordering, etc.), as well as the various kinds of profiling. The process consists of three phases, as described below:


1. The first phase is Whirl profiling instrumentation. An example of the compiler options used is shown below:

     "-O3 -ipa -fb_create fb.mid -fb_type=1 -fb_phase=0 "

    After linkage, run the application with "train" input data set. It'll generate whirl feedback files with names like fb.mid.instr0.aginqw.


2. The second phase involves Whirl profiling annotation, stride profiling instrumentation, and edge profiling instrumentation, e.g.,

    "-O3 -ipa -fb_opt fb.mid -fb_create fb.mid -fb_type=10 -fb_phase=4"

    After linkage, run the application with the "train" input data set. It'll generate feedback files with names like fb.mid.instr0.01324, which combine the information obtained by edge profiling and stride profiling.


3. The last phase uses all the profiling information collected and turns on all optimizations, e.g.,

    "-O3 -ipa -fb_opt fb.mid"

    For further optimization opportunities, you may use -OPT:Olimit=0 as described above.
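Putting the three phases together, a complete peak-performance build might look like the sketch below; app.c, app, and the training invocation are placeholders for your actual sources, binary, and training run:

      # Phase 1: Whirl profiling instrumentation
      orcc -O3 -ipa -fb_create fb.mid -fb_type=1 -fb_phase=0 app.c -o app
      ./app < train.in                  # training run; produces fb.mid.instr0.* files

      # Phase 2: Whirl feedback annotation + edge/stride profiling instrumentation
      orcc -O3 -ipa -fb_opt fb.mid -fb_create fb.mid -fb_type=10 -fb_phase=4 app.c -o app
      ./app < train.in

      # Phase 3: use all collected feedback with all optimizations on
      orcc -O3 -ipa -fb_opt fb.mid -OPT:Olimit=0 app.c -o app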

Are there documents that come with the compilers?


We have included in the release some documents related to our added features and optimizations; more will be coming. The documents are mostly in the code generator area, related to our changes and additions. For documents related to Pro64, please refer to the publication list. Those marked with * in the list reflect the actual implementation of various components of the compilers. We have also given three tutorials in the past two years: two at MICRO-34 and MICRO-35, and one at PACT02. Each tutorial covers different aspects of the compiler. You can find the tutorial material on the ORC site at SourceForge.

Interactions of ORC and IA-64 Linux Questions

How do I install ORC compilers on IA-64 Linux systems?

The binaries are packaged in tarball form. Just untar the downloaded file and run the script "install.sh". The help text of this script is:

Usage: install.sh [-hHnc] [-t toolroot] [-l native-archive-root]

              -h --help            give this help
              -H --hierarchy       print the orcc binary hierarchy
              -n --native          install native components
              -c --cross           install cross components
              -t --toolroot TR     use directory <TR> as the root of the orcc binaries
              -l --libroot LR      use directory <LR> as the root of the native archives
              -q --prerequire      print the pre-requirements
              -e --example         print an example installation session

             Invoke "install.sh -H" to get a better understanding of "toolroot" and "libroot".


To install the compiler on an IA-64 Linux system, simply invoke ./install.sh -n.
The environment variable $TOOLROOT affects installation as well as orcc's run-time behavior. $TOOLROOT refers to the root of the binary hierarchy of the ORC suite. For example, if this variable is not set, the full path of orcc is /usr/bin/orcc (which obviously requires root privileges); otherwise, it is ${TOOLROOT}/usr/bin/orcc.

  If $TOOLROOT is set, you need to add $TOOLROOT/usr/bin/ to $PATH.

  The following are two installation examples:

       e.g. 1
             Assume the account name is joesmith, the IA-64 host name is uranus, and joesmith's login shell is GNU bash.

[joesmith@uranus joesmith]$vi $HOME/.bashrc  # set $TOOLROOT and $PATH

[joesmith@uranus joesmith]$cat $HOME/.bashrc | grep "TOOLROOT\|PATH"
     export TOOLROOT=$HOME/music_and_photo
     export PATH=$PATH:$TOOLROOT/usr/bin
[joesmith@uranus joesmith]$source $HOME/.bashrc
[joesmith@uranus joesmith]$tar zxvf  orc-2.0-bin.tar.gz
[joesmith@uranus joesmith]$cd orc-2.0-bin; ./install.sh -n




 e.g. 2
             Assume the account name is joesmith, the IA-64 host name is uranus, and joesmith's login shell is GNU bash.

[joesmith@uranus joesmith]$tar zxvf orc-2.0-bin.tar.gz
[joesmith@uranus joesmith]$cd orc-2.0-bin; ./install.sh -n -t $HOME/music_and_photo
[joesmith@uranus joesmith]$vi $HOME/.bashrc  # set $TOOLROOT and $PATH

[joesmith@uranus joesmith]$cat $HOME/.bashrc | grep "TOOLROOT\|PATH"
     export TOOLROOT=$HOME/music_and_photo
     export PATH=$PATH:$TOOLROOT/usr/bin
[joesmith@uranus joesmith]$source $HOME/.bashrc




Please note that the binary in orc-2.0-bin.tar.gz is prepared for the RedHat 7.2 Linux system. Some users experienced problems when installing RedHat 7.2 on Itanium 1 machines; our tests show the compiler works fine on successfully installed machines.

How do I install ORC compilers on IA32 Redhat 6.2 Linux systems?

Sometimes it's desirable to install the compilers on IA32 machines and do cross compilation on a bare RedHat 6.2 Linux box. In this case you can use NUE, an Itanium simulation environment, to get a "virtual native IA-64 system" on IA32.

    1. Download NUE from HP website and install it.
    2. After installing NUE, you might need to re-link the following two directories (depending on your
        NUE version) to make orcc also work outside NUE.
        If /nue/usr/include/asm is a symbolic link, then change it to point to /nue/usr/src/linux/include/asm:

              cd /nue/usr/include

              rm asm
              ln -s  /nue/usr/src/linux/include/asm  asm 

        if /nue/usr/include/linux is a symbolic link then change it to /nue/usr/src/linux/include/linux/,  by:

                cd /nue/usr/include
                rm linux
                ln -s /nue/usr/src/linux/include/linux/ linux

     3.  Download gcc (we have only used the 2.95.2 and 2.96 releases), and then:
                configure --prefix=/usr; make all install

            Then do a gcc -v to make sure you have the right version.

       4.  You don't need to build the compiler binary at this point; just install it from the download.
    To set up the cross environment, keep in mind a number of things:
        1. The NUE environment is a simulated IA-64 system on top of the IA32 Linux box. However, you do not need to enter NUE to perform cross compilation.

        2. ORC relies on NUE's cpp (C preprocessing) functionality as well as its native header files. Therefore NUE is an essential part of the setup for cross compilation.
        3. By entering "nue", you are switched to a simulated native box. But you need not do so for cross compilation.
        4. From the IA32 Linux box, this file structure really lives under /nue.
        5. To do a cross compile, run ${TOOLROOT}/usr/ia64-orc-linux/bin/orcc file.c. NOTE: the path for the cross orcc is slightly different from that of the native one, i.e., "ia64-orc-linux" is interposed between "/usr" and "/bin".

How do I install ORC compilers on IA32 Redhat 7.1 Linux systems?

orc-2.0-bin.tar.gz does not work on this platform due to shared object compatibility problems. A cross compiler on a 7.1 system is possible, but you need to build it from the source tree.

How do I install ORC compilers on IA32 Redhat 7.2 Linux systems?

For an IA32 system with Redhat 7.2 installed, the ORC compiler will be used as a cross compiler. We have provided a script, INSTALL.cross, that will install the compiler binaries. The binaries are packaged in tarball form: just untar the downloaded file and run the script "install.sh -c" (see the example below). We recommend that users install NUE 1.1 on RedHat 7.2 or higher. Since NUE 1.0 does not run on this system, you will be doing strictly cross compilation in this case.
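For instance, a cross-installation session might look like the following sketch (the tarball name and the use of -t to pick a toolroot are assumptions modeled on the native installation examples above):

      tar zxvf orc-2.0-bin.tar.gz
      cd orc-2.0-bin; ./install.sh -c -t $TOOLROOT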

How do I make a natively built compiler?

We have provided a sample makefile (Make.native) in the same directory as Make.cross. Simply do a make with Make.native, as shown below.
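A minimal sketch, assuming you are in the directory containing Make.native and have the build environment set up:

      make -f Make.native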

How do I install multiple versions of ORC compilers on IA-64 systems?

Sometimes it's desirable to install multiple versions of the compilers on a machine, for debugging or experimental purposes. Assuming you have two different versions of the ORC compiler binaries installed in different directories, simply set the environment variables TOOLROOT and COMP_TARGET_ROOT to point to the directories for the desired binaries and archives, and it should work, as sketched below.
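For example, switching to a hypothetical installation under $HOME/orc-a might look like this (the path is a placeholder; point COMP_TARGET_ROOT at wherever the corresponding native archives were installed):

      export TOOLROOT=$HOME/orc-a
      export COMP_TARGET_ROOT=$HOME/orc-a
      export PATH=$TOOLROOT/usr/bin:$PATH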


How do I rebuild the Fortran front-end?


     The Fortran front-end (i.e., mfef90) will not be built by default. If you want to rebuild it, you need to install the ORC binary first.

     Then do the following:

        cd ${ORC_SRC_ROOT}/src/osprey1.0/targia64_ia64_nodebug/crayf90/sgi; make BUILD_COMPILER=SGI


If I do cross compile on an IA32 machine, how do I produce IA-64 binaries?

In order to pick up the right set of archives or dynamic shared libraries, the simplest way is to produce object files during cross compilation, copy the objects to the target IA-64 system, and do the final link there to produce the binary, as follows:

1. At IA32 side:

  [IA32]% orcc -c hello.c -o hello.o

  [IA32]% ftp IA64                    (transfer hello.o to an Itanium machine)

2. At IA64 side:

  [IA64]% orcc hello.o -o hello

  [IA64]% ./hello

  hello world!



The ORC 2.0 release also allows you to produce IA-64 binaries on IA32 machines directly. Simply follow the install instructions to set the environment variables and install the pre-built native archives properly; cross linking then works as sketched below.
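Assuming the environment variables and native archives are set up as described, a direct cross link might then be as simple as (hello.c is a placeholder):

      [IA32]% ${TOOLROOT}/usr/ia64-orc-linux/bin/orcc hello.c -o hello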

Reporting bugs and problems

Where do I report bugs?

We cannot promise to fix every bug reported. If you think you have uncovered a bug in the ORC compilers, you can post it to ipf-orc-support@lists.sourceforge.net. We will do our best if the bug is found to be ORC specific. For Pro64-specific bugs, we encourage you to post the problem to open64-devel@lists.sourceforge.net.

How do I decide if the bug is ORC specific?

Our changes are primarily in the code generator area. To find out whether the problem you have is due to our changes (a command-line sketch follows these steps):

If the bug persists at the -O0 level, it is likely Pro64 specific.

Otherwise, add "-CG:opt=0"; if the bug persists, it is most likely not in the code generator area.

Otherwise, add "-ORC:=off"; if the bug goes away, the bug is ORC related.
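For example, the triage sequence on a failing test case (test.c is a placeholder; use your original optimization level, shown here as -O3) might be:

      orcc -O0 test.c -o test              # bug persists?  likely Pro64 specific
      orcc -O3 -CG:opt=0 test.c -o test    # bug persists?  likely not in the code generator
      orcc -O3 -ORC:=off test.c -o test    # bug goes away?  likely ORC related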

How do I report bugs?

You can help us quickly turn around the fixes by doing some up front work. 

Minimize the test case. Provide a fully preprocessed file (-E option) to avoid dependence on specific header files (see the sketch after this list).

Give us a full command line description to compile the tests.

Tell us how to run your program if the symptom occurred at runtime. If the program needs data input, please append it as well.

You can even help by utilizing the triage tool we provided to narrow down the optimization, procedure and BB/region/instruction that is showing the problem.
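A sketch of preparing such a report for a hypothetical test case bug.c:

      orcc -E bug.c > bug.i        # fully preprocessed source to attach
      orcc -O3 bug.i -o bug        # the exact command line that reproduces the problem
      ./bug < input.dat            # how to run it, plus the input data if needed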


Using tools that come with the compiler

How to use the hot path enumeration tool?

               The hot path enumeration tool (hpe.pl) can be used to enumerate hot paths in a PU. It is suitable for performance analysis.

            It works on an assembly file (generated by ORC).

usage : hpe.pl [options] file
  -h  --help                     Display this information
  -pt --prob_threshold <num>     A path probability threshold to suppress less executed paths. Default is 0.4
  -fr --freq_ratio <num>         A path frequency ratio to suppress less executed paths. Namely, if a path's frequency is <num> times less than that of the hottest path in the same PU, we won't output it. Default is 100
  -pu --pu_name <str>            PU name to process. Default is null, which processes all PUs
  -i  --insn_ptn <ptn>           An instruction pattern (a regular expression) to collect statistics for along the hot paths
  -d  --davinci                  Produce DaVinci files for PUs
  -v  --verbose                  Give a verbose introduction to this tool
  file                           An assembly file generated by ORC
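For example, to enumerate hot paths in the PU main of a hypothetical ORC-generated assembly file foo.s, suppressing paths with probability below 0.3, one might run:

      hpe.pl -pt 0.3 -pu main foo.s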


How to use the cycle counting tool?

The cycle counting tool (cycount.pl) can be used to count the number of cycles of hot functions in the SPEC2KINT benchmarks. It is suitable for statically investigating ORC performance degradation. It works on SPEC2KINT assembly files generated by ORC.

Usage: cycount.pl [options] benchmarks
  -h --help              Display this information
  -d --spec_home <dir>   Set <dir> as the SPEC2KINT home. Default is $HOME/spec2000
  -l --log <filename>    Give a log file name. Default is STDOUT
  benchmarks             SPEC2KINT benchmarks. Type 'int' for all 12 SPEC2KINT benchmarks
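For example, to count cycles for all 12 SPEC2KINT benchmarks with the benchmark tree in its default location, logging the results to a file, one might run:

      cycount.pl -d $HOME/spec2000 -l cycles.log int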


Adding new optimizations or phases

Are there any coding conventions?

Please find the coding convention guideline here. This guideline is designed to stay close to the Pro64 coding style and convention. You can also find documentation on how the compiler handles memory management and various other issues.

How do I add my own tracing options?

One can have either a summarized or a detailed trace of any specific optimization performed. The traces are dumped to the file xxx.t, where xxx.o is the desired output object file. To add your own tracing options, do the following:


1. Make sure what you want to add is a phase-specific trace flag. Assume your phase is XYZ (a corresponding number xyz can be found in osprey1.0/common/util/tracing.h). It will be used like this:


    ( -Wb,-ttxyz,0xnn is the same )

2. Modify xyz_defs.h to define your flags

    (these are the minor numbers above, nn; the major number is mm).

    Please add them at the end.


3. Your code using these flags should be like:

      if (Get_Trace(TP_XYZ, YOUR_FLAG)) {
          fprintf(TFile, "...");   /* dump whatever you want to trace */
      }

   where TFile is the file handle of the trace file (see common/util/tracing.h).

4. common/util/tracing.h is the file of interest to add tracing.

How do I change the driver and related files when I add an optimization?

When you add your phase in the compilers, the following issues are important:

1. Suppose you want to add a phase inside the code generator

2. All files you want to add should have lower-cased file names, with words separated with underscore, such as: if_conv.cxx.

3. You should define (declare) a flag in cg_flags.cxx(.h) controlling whether your phase should be run.

4. You should add an element in the array Options_IPFEC in cgdriver.cxx describing your flags.

    There are already several flags there, you can take them as examples.

5. Your flags should have prefix: IPFEC_Enable_XXX, such as IPFEC_Enable_If_Conversion.

    At least, they should have prefix: IPFEC_XXX.

6. Other components are all similar; the files common/com/xxx_config.h and common/util/flags.{h,c} are the files of interest.

How do I effectively use gdb to debug the compilers?

1. Your gcc version needs to be 2.95.2 or 2.96

2. To enable debugging, you can build the entire compiler with "BUILD_OPTIMIZE=DEBUG".

    Or you can build individual components (such as wopt.so) by going into the corresponding targia_xxx directories and building with "BUILD_OPTIMIZE=DEBUG".

To single step inside the backend components (we'll show the CG portion, other components are similar): 
1. Remember to set the environment variable LD_LIBRARY_PATH. 

       For the cross compiler this variable should be set as
           export LD_LIBRARY_PATH=${TOOLROOT}/usr/ia64-orc-linux/lib/gcc-lib/ia64-orc-linux/2.0:$LD_LIBRARY_PATH

       For the native compiler it should be

           export LD_LIBRARY_PATH=${TOOLROOT}/usr/lib/gcc-lib/ia64-orc-linux/2.0:$LD_LIBRARY_PATH

2. First run orcc using options " ... -show -keep ... ". It will keep some needed intermediate files. 
   orcc -show -keep kk16.c
   It will print some info like this:

               /home/xyz/orc-2.0/usr/ia64-orc-linux/altbin/gcc -D_LANGUAGE_C -D_SGI_COMPILER_VERSION=10. -D__host_ia32 -D__INLINE_INTRINSICS -D_LP64 -D__ia64=1 kk16.c -E > kk16.i

               /usr/ia64-orc-linux/lib/gcc-lib/ia64-orc-linux/2.0/gfec -O0 -dx -quiet -dumpbase kk16.c kk16.i -o kk16.B

               /usr/ia64-orc-linux/lib/gcc-lib/ia64-orc-linux/2.0/be -PHASE:c -G8 -O0 -TENV:PIC -m1 -INTERNAL:return_val=on -INTERNAL:mldid_mstid=on -INTERNAL:return_info=on -show -TARG:abi=i64 -LANG:=ansi_c -fB,kk16.B -s -fs,kk16.s kk16.c


               Compiling kk16.c (kk16.B) -- Back End

               Compiling main(0)

               /usr/ia64-orc-linux/bin/as kk16.s -o kk16.o

               /usr/ia64-orc-linux/bin/ld -dynamic-linker /lib/ld-linux-ia64.so.2 -rpath-link ...


            3. Now we can use the input file and options for BE as the input to gdb when debugging BE:

                gdb /usr/ia64-orc-linux/lib/gcc-lib/ia64-orc-linux/2.0/be

                (gdb)break be_debug

                (gdb)run -PHASE:c -G8 -O0 -TENV:PIC -m1 -INTERNAL:return_val=on -INTERNAL:mldid_mstid=on -INTERNAL:return_info=on -show -TARG:abi=i64 -LANG:=ansi_c -fB,kk16.B -s -fs,kk16.s kk16.c


            4. After gdb stops, set another breakpoint and step into the component that you are interested in, for example:
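A sketch of what this might look like for the code generator; CG_Generate_Code is used here only as an assumed entry point in cg.so, so substitute the function you actually want to examine:

                (gdb)break CG_Generate_Code
                (gdb)continue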

How do I make sure my new phase is not the cause of unnecessarily long compile time?

All optimization phases are timed by the compiler for ease of measurement. Please make your phase timed also.

It's simple: you only need to add two lines to your code, one after entering your phase, for example (T_XYZ_CU is an assumed timer name; use the one you define in timing.h):

      Start_Timer(T_XYZ_CU);

and another before leaving your phase, for example:

      Stop_Timer(T_XYZ_CU);

and remember to include timing.h in your code.


Naturally, you'll need to define your timing phase; please look at the file be/com/timing.h.

After re-compiling, you can get timing info using -Wb,-ti1.
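For example, a compile that prints the phase timing report might look like this (foo.c is a placeholder):

      orcc -O2 -Wb,-ti1 foo.c -o foo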