GDB Tutorial Command Line Walkthrough : Part 3

Infinite loop

Well, the segfault is gone, but now the code just runs in a continuous loop, which is arguably worse. Run the invader program from GDB and press Ctrl-C to halt execution. View a backtrace by typing bt to see where we are in the code (screenshot below). Note that in a larger, multi-threaded program, when you Ctrl-C into a loop, it may stop at different places each time, so it is often worth trying it more than once and examining the backtrace of each.

[Screenshot: GDB backtrace]

As you can see, GDB has listed an impressive selection of function calls.

This is the call stack, and it shows you how your program got to its current point. The top line is where the program stopped when you interrupted it, and the last line is the original function call.

This call stack consists almost entirely of system library calls, which are not that interesting, as it is unlikely that the bug is in anyone’s code but ours. To examine the frame you are interested in, select it using the frame command (f for short), giving it the frame number shown in the backtrace:

(gdb) f 9

GDB prints out the line about to be executed, which is the printf call. A look at the local variables with i lo doesn’t show anything untoward (you can also display the local variables for every stack frame in a backtrace with the command bt full), so what’s going on here?

At least some output is appearing on the screen when the program is run, so it must be the outer for loop (the one that contains the printf statement) that is running repeatedly. The loop only exits once count reaches 36 (TEMPLATE_LEN-1), so you can assume that count is never getting high enough to exit the loop. Put a watch on count again to see if it’s behaving as expected, i.e. incrementing by 2 each time:

(gdb) b 31
(gdb) c
(gdb) watch count
(gdb) c

As soon as GDB stops you can see what the problem is – count is being changed from 2 back to 0. The line reported is the for loop, but this is yet to be executed, so what was the previous statement that just changed count? That’s right – the last statement in the loop resets count to zero. The programmer resets the buffer after printing to the screen and has been over-zealous with their tidying up.
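The effect of that reset on the loop's control variable can be sketched in isolation. The snippet below is only an illustration, not part of the original program: it copies the template and the end-of-row test from Listing 2, drops all the printing, and counts how many times the outer loop runs. The names tmpl and outer_iterations, and the cap parameter that stops the runaway case, are made up for this sketch.

```c
#define TEMPLATE_LEN 37

/* Same values as the template array in Listing 2. */
static const int tmpl[TEMPLATE_LEN] =
{ 8,4,32,4,12,4,28,4,8,28,4,8,16,12,32,8,0,44,
  0,4,8,28,40,4,0,4,8,4,32,4,40,4,12,8,24,8,0 };

/*
 * Replays only the control flow of the outer loop in invader.c.
 * A non-zero reset_count mimics the stray "count = 0;" at the end of a row.
 * Returns the number of iterations executed, giving up once `cap` is reached.
 */
unsigned outer_iterations(int reset_count, unsigned cap)
{
    unsigned iterations = 0;
    unsigned count;

    for (count = 0; count < TEMPLATE_LEN - 1 && iterations < cap; count += 2)
    {
        ++iterations;

        /* same end-of-row test as the original program */
        if (tmpl[count] >= tmpl[count + 2] && reset_count)
        {
            count = 0;  /* the loop increment then turns this back into 2 */
        }
    }
    return iterations;
}
```

Without the reset the loop finishes after 18 iterations, one per template pair. With the reset it bounces between count values of 2 and 0 until the cap stops it, which matches what the watchpoint reported.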

Exit GDB, remove the count = 0; statement at line 47, recompile the program and hurrah! Everything works as expected.

[Screenshot: ASCII space invader]

Luckily this program only had a couple of errors in it, but the techniques that you’ve used to find them will enable you to track down all sorts of lovely bugs, from simple typos to subtle mistakes in pointer handling.

Debugging is very much like being a detective, and GDB is a great tool to aid you in confirming your suspicions about what might be going wrong.

Some other topics worth mentioning

1. Reversible debugging

One of the more exciting new features, available from GDB 7.0 onwards, is reversible debugging. This is still in its early stages of development, but you can see it at work as follows (type gdb --version to check if you have a recent enough version number). Start the invader program with GDB. Set a breakpoint at line 44. Type run and when the breakpoint is hit, type rec (or target record). Now use the n command twice and print out the buffer:

(gdb) p buffer
$1 = ' ' <repeats 45 times>
(gdb)

The buffer has been reset – oops! But we wanted to see what was in it, so let’s turn back time with rn (or reverse-next). The memset is ‘undone’, and now if we print out the buffer, we can see that it contains the characters that were printed to the screen. Amazing! Examples of other commands you can use to step backwards are rc (reverse-continue) and rs (reverse-step).

Reversible debugging is still in its infancy, and it does make your program run slower (because it is recording and storing so much extra information), but it has great potential and it is fun to see it at work.

2. Automate GDB startup

Each time GDB is run, it checks the local directory for a .gdbinit file and runs any commands it finds there. This is a really useful way to save configuration information that you don’t want to type in each time, such as any runtime arguments or GDB settings. For example, in Part 1 of the tutorial, you can see the following line is output by GDB in Figure 1:

Missing separate debuginfos, use: debuginfo-install glibc-2.11.1-6.i686

This is because more recent versions of GDB will warn you if you do not have the debug information installed for system libraries that are called by your application. For most of us, this is an unnecessary distraction. You can hide these reports by typing in

set build-id-verbose 0

when you start up GDB, or you can add this line to your .gdbinit file. Just open any text editor, type in your commands (one per line) and save the file as .gdbinit. This file needs to be saved in the directory that you run GDB from, which is usually the one that contains your executable. Remember that files that start with a dot are hidden on Linux, so you won’t be able to see it in the graphical browser unless you enable viewing of hidden files, but running ls -a on the command line will display it. You can set up a separate .gdbinit file for each application you are debugging, as long as your applications don’t all live in the same folder!

3. Attach to a process

You don’t have to start your application with GDB – you can also use GDB to attach to a running program. In a new terminal window, look up the process id (pid) by running ps and note down the number. Then start GDB in a separate terminal window and type attach <pid>, using the number you noted. GDB will halt the program’s execution and display the now familiar (gdb) prompt, allowing you to debug as if you had run the application directly. To let the program continue as normal, use the detach command to quietly slip away.

4. Core dumps

Traditionally, GDB has been used to examine core dump files. When a program crashes, the Linux kernel can create a “core dump” of data about what was happening in the program at that time. GDB can read this file to give you valuable information about a crash after the event. Many Linux distributions have this ability disabled by default – you need to type ulimit -c unlimited to allow the creation of the files. On Fedora you may also need to update the abrt package with yum update abrt, as it can interfere with core file creation. Once you’ve done that, when a crash occurs, a file called core.pid should be created in the same directory as the executable, where pid is the process id.
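If you would rather not rely on the shell’s ulimit setting, a program can raise its own core-file limit at startup. This is not something the tutorial’s programs do; it is a minimal sketch that assumes a POSIX system and uses the standard getrlimit() and setrlimit() calls. The function name enable_core_dumps is made up for this example.

```c
#include <sys/resource.h>

/*
 * Raise this process's core-file size limit (the soft limit) as far as
 * the hard limit allows. Returns 0 on success, -1 on failure (errno is set).
 * Hypothetical helper, not part of invader.c or add.c.
 */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    /* an unprivileged process may raise its soft limit up to the hard limit */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

Calling this at the top of main() only affects the calling process and its children. If the hard limit is itself zero, the call succeeds but core files stay disabled, so the ulimit route (or a system setting) is still needed in that case.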

To examine a core file, pass it to gdb along with the executable that produced it, for example gdb invader core.pid (obviously, using your core file’s actual name). GDB will load up all the info and it will look as though you have just run the program and seen the error. You’ll be able to view a backtrace and a range of other information, but it will be a frozen “snapshot” of execution. The advantage of using a core dump is that you don’t have to wait around for the crash to happen, and users can send core files back to the programmer to illustrate a problem, but beyond that there is no substitute for debugging a program live if you are able to.

The end of the beginning

GDB has hundreds of configuration settings and many more commands, so your next stop is the GDB documentation. There is also a slightly out-of-date, but still useful, GDB quick reference card (PDF) floating around online (just Google for GDB quick ref), which you can hide under some books on your desk, thus fooling people into thinking that you really have committed 500 slightly obscure commands to memory.

There should be nothing stopping you now, so enjoy using GDB and long may it aid you in your quest for wonderfully bug-free code.

Appendix: GDB Tutorial Quick Ref

b <line num/function>      breakpoint
r <args>                   run/restart, with program arguments
n                          next : step over next line
s                          step : step into next line
finish                     step out of function
c                          continue
l                          list next 10 lines of source
l -                        list previous 10 lines of source
i lo                       info locals – display all variables in the current stack frame
i b                        info breakpoints – display all watchpoints and breakpoints
watch <var>                watchpoint
d <num>                    delete breakpoint/watchpoint
d                          delete all breakpoints/watchpoints
attach <pid>               attach to running process
detach                     detach from running process
p                          print
bt                         backtrace
bt full                    backtrace and print all local variables for each frame
frame <num>                select frame in stack
rec                        target record (for reversible debugging)
rs                         reverse step
rc                         reverse continue
rn                         reverse next

 

GDB Tutorial Command Line Walkthrough : Part 2

Track down those segfaults

Now we’ve got an idea of how GDB works, let’s look at a more complex example.

The program in Listing 2 (see bottom of page) should print out a familiar character in your terminal window. It uses a C array called template to determine where to print out a character versus a space. The template contains numbers that are used in pairs – the first is the position in the buffer to start writing, the second is the number of characters to write. There are 18 pairs in total.

Compile the program as follows:

gcc -fno-stack-protector -g invader.c -o invader

(Note: The no-stack-protector option just allows us to easily see a buffer overrun at work without the compiler trying to save us from it. Protection is enabled by default on some platforms, such as Ubuntu, but not on others, like Fedora. Adding this flag will disable it.)

When you run the invader program, you should immediately see a segmentation fault. Oh no! Don’t despair, let’s run it again in GDB and see if it can help. The segfault occurs and GDB stops, reporting that the program received SIGSEGV, along with a function and line number (screenshot below).

[Screenshot: segfault in GDB]

The line of code that the problem occurred in is handily displayed for you. If you want to see more source code, you can list it with l, which displays 10 lines around the current line. If you type l again, you’ll get the next 10 lines. To go backwards, type l -.

The line that GDB has stopped at gives us a big clue. It is a comparison of values in an array, so the segfault is possibly caused by the use of an index that is past the end of the declared array. Type:

(gdb) i lo

This displays information on all the local variables in the current stack frame (i.e. the current function call). You can immediately see that the count variable is a very large number indeed, and our template array certainly isn’t that long. Let’s put a breakpoint in at line 29 and see what’s happening to count:

(gdb) b 29

You don’t have to quit GDB to start again – just type r and GDB will ask you if you want to run the program from the beginning. Type y and on the next run, when the program stops at the for loop, you can add a watchpoint:

(gdb) watch count

This is a lovely little command that asks GDB to keep an eye on count and report its value each time it is changed*. Now you’re all ready to see what’s going on, so continue with c. The next time the program breaks, we can see that two values of count are reported, old and new. So far this looks exactly as expected: the old value is 0 and the new value is 2, which is correct as the for loop increments by 2 each time. Press Return to repeat the last command, and GDB will break again the next time count changes. This time things look wrong. The old value is 2, but the new value is 64 (Figure 3).

[Figure 3: GDB watchpoint output]

Why is the value of count suddenly jumping to 64?

Let’s look at GDB’s output a little closer. On a watchpoint, GDB displays the line it is currently halted at. That line has NOT yet been executed. It’s important to remember this. So, count has been changed, and GDB has stopped at line 35. List the source to see the last instruction with l.

You’d be forgiven for looking at the source and thinking, well, the last instruction before line 35 was:

int i = 0;

at line 34; surely that’s not going to be doing anything bad?

But remember – GDB has stopped at the top of a for loop, and it may already have run several times. List the local variables with i lo and you can see that the value of i is not zero – therefore the loop has been executed several times and the last line to be run is actually the one below, where the character ‘@’ is written to the buffer.

You can also see that pos indexes to a position outside the declared buffer length of 45 (the exact value of pos at this point may vary). The decimal ASCII value of the ‘@’ character is 64, which is exactly what count is, so there you have it: the buffer is being overrun and as a consequence ‘@’ signs are being stamped all over the next variable in memory, which happens to be count.
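One way to make this kind of overrun impossible is to check the position against the buffer size before every write. The following is a minimal sketch of that idea, not the fix the tutorial goes on to make (the real bug is in how len is calculated). The function name fill_checked and its parameters are invented for illustration.

```c
#include <stddef.h>

#define BUF_LEN 45  /* same size as the buffer in Listing 2 */

/*
 * Write up to `len` '@' characters (ASCII 64) starting at `pos`, but never
 * touch the final byte, which is reserved for the null terminator, or
 * anything beyond it. Returns the number of characters actually written.
 */
size_t fill_checked(char *buffer, size_t buf_len, size_t pos, size_t len)
{
    size_t written = 0;

    if (buf_len == 0)
        return 0;

    while (written < len && pos < buf_len - 1)
    {
        buffer[pos++] = '@';
        ++written;
    }
    return written;
}
```

With the bad values from this session (a start position of 32 and a length of 33), the function stops after 12 characters instead of running past the end of the buffer and into count. A short count returned here would also be a useful hint that the template or the length calculation is wrong.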

Now you’ve found the problem, but you still need to know why it’s happening so you can fix it.

The pos variable should never be greater than the buffer length (if the template is correct), and the loop will increment pos while i is less than len. The value of len is set directly from the template and in this case is 33. However, if you look at the template you can see that there isn’t an instance of the value 33 anywhere.

To find out why len is such an odd number, you need to adjust the current session to focus on the len variable. Type d to delete all breakpoints, then set a new one at the point where len is set with b 32.

For reference, breakpoints (and watchpoints) can also be deleted individually: they are allocated numbers when you create them, so you can list and delete them one by one using i b followed by d <breakpoint number>.

Restart the program again with r. When GDB hits the new breakpoint, remember that the line you are looking at hasn’t yet been executed. Run this line, by typing n and then examine len with the print command:

(gdb) p len

And take a look at count too, to see where in the template we are obtaining len from:

(gdb) p count

The variable count is zero, so we’re reading the first item from the template instead of the second and then adding 1, making 9. This is our bug, as the template holds pair values and the second item should be used as the length. The programmer has made a typo (or misunderstood how arrays are referenced), by adding ‘+1’ outside the subscript operator at line 32. You can fix this by changing line 32 to:

int len = template[count+1];
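To see the difference between the two expressions side by side, here is a small sketch using the first pair from the template (start position 8, length 4). The names sample_pair, len_buggy and len_fixed are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* First pair of the invader template: start position 8, length 4. */
static const int sample_pair[2] = { 8, 4 };

/* Buggy form from Listing 2: reads the START POSITION and adds one to it. */
int len_buggy(const int *tmpl, size_t n, size_t count)
{
    assert(count < n);
    return tmpl[count] + 1;
}

/* Intended form: reads the SECOND item of the pair, which is the length. */
int len_fixed(const int *tmpl, size_t n, size_t count)
{
    assert(count + 1 < n);      /* the length slot must exist */
    return tmpl[count + 1];
}
```

For count equal to 0, len_buggy(sample_pair, 2, 0) gives 9, the value seen in the GDB session, while len_fixed(sample_pair, 2, 0) gives the intended length of 4.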

Now you need to exit GDB with q, recompile the source and re-run. You have to exit GDB completely if you change source files – if you don’t, GDB won’t pick up your changes and you’ll be debugging the same old version of the program as before. This is something that even veterans of GDB sometimes forget!

So there we have it, our first buffer overrun tracked down with minimum pain and fuss. Just a little more to do and we should have a fully functioning program.

On to Part 3.

*The watch command is very useful, but you should be aware that in a multithreaded environment, watch will not notify you of changes that are made to your selected variable while the thread that contains it is not being executed. Therefore, if you are facing a memory overwrite issue that originates from another thread, you won’t be notified at the time it happens; you will only see the change when you switch back to the thread that contains the variable you are watching.

Listing 2

Copy and paste the code below into a file named invader.c.

/* 
# invader.c 
# Print a pattern to the terminal using a predefined area 
# of 44 characters width and 16 lines height 
# The template defines the pattern with a series of pairs: 
# the first is the position in the current line that should 
# be filled in, the second is how many characters should be filled. 
*/ 

#include <stdio.h>  
#include <string.h> 

#define BUF_LEN 45  /*44 plus null terminator*/ 
#define TEMPLATE_LEN 37 

int template[TEMPLATE_LEN] = 
{ 8,4,32,4,12,4,28,4,8,28,4,8,16,12,32,8,0,44, 
0,4,8,28,40,4,0,4,8,4,32,4,40,4,12,8,24,8,0 }; 

int main() 
{ 
    unsigned int count = 0; 
    char buffer[BUF_LEN]; 

    memset(buffer, ' ', BUF_LEN); 
    buffer[BUF_LEN-1] = 0x00; 

    printf("\n\n"); 
    for (count = 0 ; count < TEMPLATE_LEN-1; count+=2) 
    { 
        int pos = template[count]; 
        int len = template[count]+1;				 

        int i = 0;
        for (i = 0 ; i < len ; ++i, ++pos)
        {
            buffer[pos] = '@';
        }

        /*check for end of row*/
        if (template[count] >= template[count+2])
        {
            /*print and reset*/
            printf("  %s\n  %s\n", buffer, buffer);
            memset(buffer, ' ', BUF_LEN);
            buffer[BUF_LEN-1] = 0x00;
            count = 0;
        }
    }
    printf("\n\n"); 

    return 0; 
}

GDB Tutorial Command Line Walkthrough : Part 1

Master command line debugging with GDB

Whether you’ve spent hours fine-tuning printf statements to track down a persistent bug, or you just fancy impressing someone with your command line skills, GDB has the answers.

Introduction

This tutorial will take a walk-through approach to finding the bugs in two short C programs, covering around two dozen of the most useful GDB commands. This will get you used to using GDB against real code, which should leave you with enough knowledge to confidently approach debugging larger applications.

A word about GDB

You can skip this bit if you’re desperate to get started 🙂

The GNU Debugger (or GDB) is an incredibly powerful program. It may be free, but don’t underestimate its capabilities. It can be used on dozens of processors, running a wide variety of Unix-based operating systems, from small embedded projects to large, multi-threaded applications, and it has a range of followers, from large teams of software professionals to students. Its appeal isn’t just in the price tag (did I mention it was free?), as GDB is as capable as, if not more capable than, many commercial debuggers out there. The trick, as always, is in learning how to use it.

A lot of people are put off by the command line interface, especially those who are used to using a graphical front end that allows you to see variable values and switch threads with one mouse click. While there are GUIs available for GDB, and support for it in many IDEs, including SlickEdit, Eclipse and CodeBlocks, once you have learnt how to use it in its raw form, you will find that it is elegant, fast and robust.

Getting ready

Fedora should have GDB installed by default, but if you have an older or minimal distro, you can install it with:

yum install gdb

For this tutorial, you will also need the gcc compiler, which again you can install with yum:

yum install gcc

Ubuntu users can make sure they have the compiler and debugger by running:

sudo apt-get install build-essential

Simple first example

Right, let’s write some bad code and see what GDB can do. We’ll start gently with a very simplistic example. Compile the code in Listing 1 (you can copy the code from the bottom of this page):

gcc -g add.c -o add

The -g option tells gcc to include debugging information in the executable.

If you run this program, you’ll see that it tells you:

Sum of 200 and 800 is 160000

Let’s use GDB to see what’s happening. On the command line, in the same directory as your executable, start up the debugger with your program name as an argument:

gdb add

You will see a (gdb) prompt, ready for the next instruction. For reference, the screenshot below displays the full output of the commands that follow.

[Screenshot: GDB session]

GDB commands are usually whole words, but most of them have an abbreviated form. In this tutorial I have used the abbreviated forms of all commands, where they apply, as this is a much more efficient way to work (much less typing!). For the most part, the longer form is obvious, but where it isn’t, I’ve added it in brackets so you know exactly what the abbreviation stands for.

To make a start, insert a breakpoint at the main function:

(gdb) b main

Then run the program by typing r. If you wanted or needed to pass any command line arguments to your program, you would include them here, after the r.

Press Return and GDB runs the program until it enters main, at which point it stops and waits for your next command. You can see it prints out the line of source code that is about to be executed.

Let’s move on by typing n. This moves us to the next line of code (and steps over any function calls). Now try pressing the Return key without any commands. GDB repeats the last command entered, so it moves us on to the next line again.

In the source code, our printf statement calls an Add() function, which might be a good place to look. Use the s command to step into the function. GDB enters Add() and shows the line that multiplies i by j. This is the source of our error, so now we know what to change to fix the problem.
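A minimal sketch of the fix, assuming (as the program’s output message suggests) that Add() is meant to perform plain integer addition:

```c
/* Corrected Add(): the buggy version in Listing 1 returns i * j. */
int Add(int i, int j)
{
    return i + j;   /* add, rather than multiply */
}
```

With this change in add.c, recompiling and running the program should report a sum of 1000 for 200 and 800 rather than 160000.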

If you enter a function and want to step back out again, use the command finish (which has no short form).

You can stop debugging your program with the ki command to kill it, which leaves you in GDB ready for another session; or you can exit GDB completely at any point with q for quit.

Now you’ve got the basics under your belt, let’s move on to Part 2.

Listing 1

Copy and paste the code below into a file named add.c.

/* add.c  
 * Adds two integers and prints the result 
 */

#include <stdio.h>

int Add(int i, int j);

int main()
{
    int i = 200;
    int j = 800;
    printf("Sum of %d and %d is %d\n", i, j, Add(i, j));
    return 0;
}

int Add(int i, int j)
{
    return i * j;
}