Presented by

  • Ruud van der Pas

    Ruud is a Distinguished Engineer in the Oracle Linux and Virtualization organization at Oracle Corporation, where he works in the Tools team. There he is deeply involved in the development of the gprofng application profiling tool. Ruud studied mathematics and physics and has been involved with the performance of technical applications for well over 25 years. Before joining Oracle he worked at Sun Microsystems, SGI, Convex Computer Corporation, the University of Utrecht and Philips. Ruud regularly gives technical presentations and tutorials at conferences and workshops. These are mostly, but not always, related to the OpenMP parallel programming model. He is also a board member of IDC's HPC Advisory Committee, serves on the program committees of various international conferences, and has a strong interest in Interval Analysis and Interval Arithmetic. Ruud has published over 20 conference papers related to application tuning and parallelization as well as several technical white papers, and is co-author of the books "Using OpenMP" and "Using OpenMP - The Next Step", both published by MIT Press.

Abstract

In this talk we present an overview of gprofng, a next-generation profiling tool for Linux. This profiler has its roots in the Performance Analyzer from the Oracle Developer Studio product. However, gprofng is a standalone tool that specifically targets Linux. It includes several tools to collect and view the performance data. Various processors from Intel, AMD, and Arm are supported. The focus is on applications written in C, C++, Java, and Scala. For C/C++ we assume gcc has been used to build the code. In the case of Java and Scala, OpenJDK and compatible implementations are supported. One difference from the widely known gprof tool is that gprofng offers full support for shared libraries and multithreading using POSIX Threads, OpenMP, or Java Threads. Unlike gprof, gprofng can also be used when the source code of the target executable is not available. Gprofng also works with unmodified executables: there is no need to recompile or instrument the code. Profiling the production executable ensures that the profile reflects the actual run-time behaviour and conditions of a production run. After the data has been collected, the performance information can be viewed at the function, source, and disassembly level. Individual thread views are supported as well. Through command line options, the user specifies the information to be displayed. In addition, a simple yet powerful scripting feature can be used to produce a variety of performance reports in an automated way. This may also be combined with filters to zoom in on specific aspects of the profile. For example, it is easy to zoom in on one or more threads, and also to compare the behaviour across threads. One of the most powerful features of gprofng is the ability to compare two or more profiles. This makes it easy to spot regressions or find scalability bottlenecks, for example.
In the talk, we start with a description of the architecture of the gprofng tool suite. This is followed by an overview of the various tools that are available and their main features. A comparison with gprof will be made, but the bulk of the talk consists of examples that show the functionality and features. We conclude with the plans for future developments, which include a GUI to graphically navigate through the data.
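
The workflow outlined in the abstract (collect data from an unmodified executable, view it through command line options or a script, then compare profiles) might look roughly like the sketch below. It is an illustration only, not material from the talk. The names ./my_app, run1.er, run2.er, and report.script are hypothetical placeholders, and the profiling commands run only if gprofng and the executable are actually present.

```shell
#!/bin/sh
# Hypothetical names, chosen for illustration only.
APP=./my_app        # unmodified target executable
EXP1=run1.er        # experiment directory for the first run
EXP2=run2.er        # experiment directory for the second run

# A small gprofng script file: one command per line, written
# without the leading dash used on the command line.
cat > report.script <<'EOF'
# Limit the function list to 10 lines
limit 10
# Show the function overview
functions
EOF

if command -v gprofng >/dev/null 2>&1 && [ -x "$APP" ]; then
    # Collect performance data; -O overwrites an existing experiment.
    gprofng collect app -O "$EXP1" "$APP"
    gprofng collect app -O "$EXP2" "$APP"

    # Function-level view, driven by command line options.
    gprofng display text -limit 10 -functions "$EXP1"

    # The same kind of report, produced from the script file.
    gprofng display text -script report.script "$EXP1"

    # Compare the two profiles, showing the second as a ratio to the first.
    gprofng display text -compare ratio -functions "$EXP1" "$EXP2"
else
    echo "gprofng or $APP not found; skipping the profiling run"
fi
```

Keeping the report commands in a script file makes it possible to regenerate the same report for every new experiment, which is what makes an automated regression check practical.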