Profilers

From HP-SEE Wiki


Revision as of 09:04, 18 April 2012


gprof

Section contributed by IPB

gprof is the profiler of the GNU project (www.gnu.org). It is free software: the source code is freely available and it is licensed under the terms of the GNU General Public License. gprof is not distributed as a separate package; it is part of binutils (http://www.gnu.org/software/binutils/), one of the basic packages of the GNU development suite, which makes gprof available on almost every Linux system.

gprof is available on the PARADOX Cluster as part of the installed binutils package, version 2.17.50. Besides the GNU compilers, the Intel and PGI compilers available on PARADOX can also generate gprof-compatible profiling output.

Users can perform profiling of their programs using the following steps:

Compiling and linking program with profiling enabled

To compile a source file for profiling, specify the -pg option when running the compiler. Compiling with -pg instruments the code so that gprof can report detailed information about program execution. The instrumentation changes how every function in the program is compiled: when a function is called, it records where it was called from. From this, the profiler can determine which function called it and count how many times it was called. For example, to enable gprof profiling of the NUQG 3D BEC imaginary- and real-time propagation source codes, the following lines are used:

 gcc -pg -O5 -c src/util/diffint.c -o diffint.o
 gcc -pg -O5 -c src/util/mem.c -o mem.o
 gcc -pg -O5 -c src/util/cfg.c -o cfg.o
 gcc -pg -O5 -c src/util/inout.c -o inout.o
 gcc -pg -O5 -c src/util/time.c -o time.o
 gcc -pg -O5 -c src/imagtime3d/imagtime3d.c -o imagtime3d.o    
 gcc -pg -O5 -o imagtime3d imagtime3d.o diffint.o mem.o cfg.o inout.o time.o -lm 
 gcc -pg -O5 -c src/realtime3d/realtime3d.c -o realtime3d.o    
 gcc -pg -O5 -o realtime3d realtime3d.o diffint.o mem.o cfg.o inout.o time.o -lm

If only some of the modules of the program are compiled with the -pg flag, the program can still be profiled, but the information about the modules compiled without -pg will be incomplete. For functions in those modules, the only information reported is the total time spent in them; there is no record of how many times they were called, or from where. To perform line-by-line profiling, also specify the -g option, which instructs the compiler to insert debugging symbols that map program addresses to source code lines.

Sun Studio performance tools

Intel VTune

Section contributed by IICT-BAS

Intel VTune Amplifier is a performance analysis application for both 32-bit and 64-bit x86-based machines. It can profile C, C++ and Fortran code and can be used for both serial and threaded applications. Additionally, VTune is available as an MPI-enabled version.

To use VTune, you must compile your code with the -g compiler flag to generate the debug information that VTune needs.

VTune provides several tools for analysis:

Software sampling
A feature limited to x86-compatible processors. It reports the time spent at different code locations along with the call stack, thus identifying hotspots that require optimization.
Locks and waits analysis
Used for locating long synchronization waits. Useful for detecting underutilization.
Threading timeline
A feature that allows the identification of synchronization issues.
Source view
Visualizes the sampling results mapped to the source code. Can also be drilled down to assembly code.
Hardware event sampling
A feature limited to Intel processors, because it uses the chip performance monitoring unit. Quite useful for minimizing cache misses and branch mispredictions.
PTU (Performance Tuning Utility)
An additional application that grants access to experimental performance analysis tools.

D8.2 related materials

http://hpseewiki.ipb.ac.rs/index.php/Optimization_techniques_for_scalability#Profilers
