GDB Tutorial Command Line Walkthrough : Part 2

Track down those segfaults

Now we’ve got an idea of how GDB works, let’s look at a more complex example.

The program in Listing 2 (see bottom of page) should print out a familiar character in your terminal window. It uses a C array called template to determine where to print out a character versus a space. The template contains numbers that are used in pairs – the first is the position in the buffer to start writing, the second is the number of characters to write. There are 18 pairs in total.

Compile the program as follows:

gcc -fno-stack-protector -g invader.c -o invader

(Note: The -fno-stack-protector option just allows us to easily see a buffer overrun at work without the compiler trying to save us from it. Stack protection is enabled by default on some platforms, such as Ubuntu, but not on others, like Fedora. Adding this flag will disable it.)

When you run the invader program, you should immediately see a segmentation fault. Oh no! Don’t despair, let’s run it again in GDB (start it with gdb invader, then type r) and see if it can help. The segfault occurs and GDB stops, reporting that the program received SIGSEGV, along with a function and line number (screenshot below).

[Screenshot: segfault in gdb]

The line of code that the problem occurred in is handily displayed for you. If you want to see more source code, you can list it with l, which displays 10 lines around the current line. If you type l again, you’ll get the next 10 lines. To go backwards, type l -.

The line that GDB has stopped at gives us a big clue. It is a comparison of values in an array, so the segfault is possibly caused by the use of an index that is past the end of the declared array. Type:

(gdb) i lo

This is short for info locals, and displays information on all the local variables in the current stack frame (i.e. the current function call). You can immediately see that the count variable is a very large number indeed, and our template array certainly isn’t that long. Let’s put a breakpoint in at line 29, the start of the outer for loop, and see what’s happening to count:

(gdb) b 29

You don’t have to quit GDB to start again – just type r and GDB will ask you if you want to run the program from the beginning. Type y and on the next run, when the program stops at the for loop, you can add a watchpoint:

(gdb) watch count

This is a lovely little command that asks GDB to keep an eye on count and report its value each time it changes*. Now you’re all ready to see what’s going on, so continue with c. The next time the program breaks, two values of count are reported, old and new. So far this looks exactly as expected: the old value is 0 and the new value is 2, which is correct as the outer for loop increments count by 2 each time. Press Return to repeat the last command, and GDB will break again the next time count changes. This time things look wrong. The old value is 2, but the new value is 64 (Figure 3).

[Figure 3: gdb watchpoint]

Why is the value of count suddenly jumping to 64?

Let’s look at GDB’s output a little more closely. When a watchpoint triggers, GDB displays the line it is currently halted at. That line has NOT yet been executed. It’s important to remember this. So, count has been changed, and GDB has stopped at line 35. List the source with l to see the code around that line.

You’d be forgiven for looking at the source and thinking, well, the last instruction before line 35 was:

int i = 0;

at line 34; surely that’s not going to be doing anything bad?

But remember – GDB has stopped at the top of a for loop, and it may already have run several times. List the local variables with i lo and you can see that the value of i is not zero – therefore the loop has been executed several times and the last line to be run is actually the one below, where the character ‘@’ is written to the buffer.

You can also see that pos indexes a position outside the declared buffer length of 45 (the exact value of pos at this point may vary). The decimal ASCII value of the ‘@’ character is 64, which is exactly what count is, so there you have it: the buffer is being overrun and as a consequence ‘@’ signs are being stamped all over the next variable in memory, which happens to be count.

Now you’ve found the problem, but you still need to know why it’s happening so you can fix it.

The pos variable should never be greater than the buffer length (if the template is correct), and the loop will increment pos while i is less than len. The value of len is set directly from the template and in this case is 33. However, if you look at the template you can see that there isn’t an instance of the value 33 anywhere.

To find out why len is such an odd number, you need to adjust the current session to focus on the len variable. Type d (delete) to delete all breakpoints, confirming with y if asked, then set a new one at the point where len is set with b 32.

For reference, breakpoints (and watchpoints) can also be deleted individually: they are allocated numbers when you create them, so you can list them with i b (info breakpoints) and delete them one by one with d <breakpoint number>.

Restart the program again with r. When GDB hits the new breakpoint, remember that the line you are looking at hasn’t yet been executed. Run this line by typing n, and then examine len with the print command:

(gdb) p len

And take a look at count too, to see where in the template we are obtaining len from:

(gdb) p count

The variable count is zero and len is 9, so we’re reading the first item from the template (8) instead of the second, and then adding 1, making 9. This is our bug, as the template holds pairs of values and the second item of each pair should be used as the length. The programmer has made a typo (or misunderstood how arrays are referenced) by adding ‘+1’ outside the subscript operator at line 32. You can fix this by changing line 32 to:

int len = template[count+1];

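Now exit GDB with q, recompile the source and re-run. Whenever you change a source file you must rebuild the executable before debugging again; if you don’t, you’ll be debugging the same old version of the program as before. (Quitting is not strictly required: GDB checks whether the executable has changed each time you type r and re-reads it if so. Starting afresh simply keeps things tidy.) Forgetting to rebuild is something that even veterans of GDB sometimes do!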

So there we have it, our first buffer overrun tracked down with minimum pain and fuss. Just a little more to do and we should have a fully functioning program.

On to Part 3.

*The watch command is very useful, but you should be aware that in a multithreaded program, watch may not notify you of changes that are made to your selected variable while the thread that contains it is not being executed. GDB’s documentation describes this limitation for software watchpoints; hardware watchpoints, where the platform supports them, watch the expression in all threads. So if you are facing a memory overwrite that originates from another thread, you may not be notified at the time it happens, and will only see the change when you switch back to the thread that contains the variable you are watching.

Listing 2

Copy and paste, or download the invader.c file here.

/* 
# invader.c 
# Print a pattern to the terminal using a predefined area 
# of 44 characters width and 16 lines height 
# The template defines the pattern with a series of pairs: 
# the first is the position in the current line that should 
# be filled in, the second is how many characters should be filled. 
*/ 

#include <stdio.h>  
#include <string.h> 

#define BUF_LEN 45  /*44 plus null terminator*/ 
#define TEMPLATE_LEN 37 

int template[TEMPLATE_LEN] = 
{ 8,4,32,4,12,4,28,4,8,28,4,8,16,12,32,8,0,44, 
0,4,8,28,40,4,0,4,8,4,32,4,40,4,12,8,24,8,0 }; 

int main() 
{ 
    unsigned int count = 0; 
    char buffer[BUF_LEN]; 

    memset(buffer, ' ', BUF_LEN); 
    buffer[BUF_LEN-1] = 0x00; 

    printf("\n\n"); 
    for (count = 0 ; count < TEMPLATE_LEN-1; count+=2) 
    { 
        int pos = template[count]; 
        int len = template[count]+1;				 

        int i = 0;
        for (i = 0 ; i < len ; ++i, ++pos)
        {
            buffer[pos] = '@';
        }

        /*check for end of row*/
        if (template[count] >= template[count+2])
        {
            /*print and reset*/
            printf("  %s\n  %s\n", buffer, buffer);
            memset(buffer, ' ', BUF_LEN);
            buffer[BUF_LEN-1] = 0x00;
            count = 0;
        }
    }
    printf("\n\n"); 

    return 0; 
}

GDB Tutorial Command Line Walkthrough : Part 1

Master command line debugging with GDB

Whether you’ve spent hours fine-tuning printf statements to track down a persistent bug, or you just fancy impressing someone with your command line skills, GDB has the answers.

Introduction

This tutorial will take a walk-through approach to finding the bugs in two short C programs, covering around two dozen of the most useful GDB commands. This will get you used to using GDB against real code, which should leave you with enough knowledge to confidently approach debugging larger applications.

A word about GDB

You can skip this bit if you’re desperate to get started 🙂

The GNU Debugger (or GDB) is an incredibly powerful program. It may be free, but don’t underestimate its capabilities. It can be used on dozens of processors, running a wide variety of Unix-based operating systems, from small embedded projects to large, multi-threaded applications, and it has a range of followers, from large teams of software professionals to students. Its appeal isn’t just in the price tag (did I mention it was free?), as GDB is just as capable as, if not more capable than, many commercial debuggers out there. The trick, as always, is in learning how to use it.

A lot of people are put off by the command line interface, especially those who are used to a graphical front end that lets you see variable values and switch threads with one mouse click. While there are GUIs available for GDB, and support for it in many IDEs, including SlickEdit, Eclipse and CodeBlocks, once you have learnt how to use it in its raw form, you will find that it is elegant, fast and robust.

Getting ready

Fedora should have GDB installed by default, but if you have an older or minimal distro, you can install it with:

yum install gdb

For this tutorial, you will also need the gcc compiler, which again you can install with yum:

yum install gcc

Ubuntu users can make sure they have the compiler and debugger by running:

sudo apt-get install build-essential

Simple first example

Right, let’s write some bad code and see what GDB can do. We’ll start gently with a very simple example. Compile the code in Listing 1 (you can copy or download the code from the bottom of this page):

gcc -g add.c -o add

The -g option tells gcc to include debugging information in the executable.

If you run this program, you’ll see that it tells you:

Sum of 200 and 800 is 160000

Let’s use GDB to see what’s happening. On the command line, in the same directory as your executable, start up the debugger with your program name as an argument:

gdb add

You will see a (gdb) prompt, ready for the next instruction. For reference, the screenshot below displays the full output of the commands that follow.

[Screenshot: gdb session]

GDB commands are usually whole words, but most of them have an abbreviated form. In this tutorial I have used the abbreviated forms of all commands, where they apply, as this is a much more efficient way to work (much less typing!). For the most part, the longer form is obvious, but where it isn’t, I’ve added it in brackets so you know exactly what the abbreviation stands for.

To make a start, insert a breakpoint at the main function:

(gdb) b main

Then run the program by typing r. If you wanted or needed to pass any command line arguments to your program, you would include them here, after the r.

Press Return and GDB runs the program until it enters main, at which point it stops and waits for your next command. You can see it prints out the line of source code that is about to be executed.

Let’s move on by typing n (next). This moves us to the next line of code (and steps over any function calls). Now try pressing the Return key without any commands. GDB repeats the last command entered, so it moves us on to the next line again.

In the source code, our printf statement calls an Add() function, which might be a good place to look. Use the s command to step into the function. GDB enters Add() and shows the line that multiplies i by j. This is the source of our error, so now we know what to change to fix the problem.

If you enter a function and want to step back out again, use the command finish (which can be shortened to fin).

You can stop debugging your program with the ki (kill) command, which leaves you in GDB ready for another session; or you can exit GDB completely at any point with q for quit.

Now you’ve got the basics under your belt, let’s move on to Part 2.

Listing 1

Copy and paste, or download the add.c file here.

/* add.c  
 * Adds two integers and prints the result 
 */

#include <stdio.h>

int Add(int i, int j);

int main()
{
    int i = 200;
    int j = 800;
    printf("Sum of %d and %d is %d\n", i, j, Add(i, j));
    return 0;
}

int Add(int i, int j)
{
    return i * j;
}