Dynamic Memory in C (malloc, calloc, realloc, free)

Let’s take a look at the four functions that let us use dynamic memory in C. Dynamic memory just means memory that we allocate on the heap at runtime, instead of on the stack.

Why would you want to use dynamic memory?

You might want to create a variable or object that persists beyond the scope of the function it is created in (e.g. to share it between functions). A local variable on the stack disappears when its function returns, so unless you fall back on a global or static variable, you need to create it dynamically, and then remember to deallocate it at a later point. Alternatively, you might not know the size of something until runtime (e.g. when reading in a file), in which case the heap is a better place to store your data. Why? Because the stack is limited in size and may be too small for data whose size is only known at runtime.
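As a minimal sketch of the first case, the helper below builds a string on the heap and hands it back to the caller, who then owns the memory and has to free it. The name make_greeting is made up for illustration and is not part of any library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated "Hello, <name>!" string,
 * or NULL if the allocation fails. The caller must free() the result. */
char* make_greeting(const char* name)
{
	const char* prefix = "Hello, ";
	size_t length = strlen(prefix) + strlen(name) + 2; /* '!' plus '\0' */

	char* greeting = malloc(length);
	if (greeting == NULL)
	{
		return NULL;
	}

	strcpy(greeting, prefix);
	strcat(greeting, name);
	strcat(greeting, "!");

	/* The memory outlives this function because it lives on the heap */
	return greeting;
}
```

A caller would check the returned pointer for NULL, use the string, and then pass it to free when finished.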

How do you dynamically allocate in C?

There are three functions for memory allocation (plus one for deallocation). Each allocator returns a void* to the memory that has been allocated. In C a void* converts implicitly to any other object pointer type, so the casts in the example below are optional (C++ requires them).

Let’s look at each in turn:

1. malloc

This is the most commonly used method. Simply pass in how big you want your memory to be (in bytes), and you get a pointer to that memory back. The memory is uninitialized. If it fails it returns NULL.

2. calloc

Instead of passing in a single size, you tell calloc how many elements you want and how big each one is, e.g. 10 ints or 16 structs. The memory is initialized to zeros. If it fails it returns NULL.
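To make the two-argument form concrete, here is a small sketch. The helper name make_zeroed_ints is an assumption for illustration only.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates an array of 'count' ints, all set to zero.
 * Returns NULL on failure. The caller must free() the result. */
int* make_zeroed_ints(size_t count)
{
	/* First argument: number of elements. Second: size of each element. */
	return calloc(count, sizeof(int));
}
```

With malloc you would have to multiply the two values yourself and then clear the memory in a separate step.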

3. realloc

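This function resizes an existing block of memory, making the allocation bigger or smaller. It may resize the block in place, or it may allocate a new block, copy the contents across (up to the smaller of the two sizes) and free the old one; either way it returns a void* that you must use from then on. If it fails it returns NULL and leaves the original block untouched (see the comments in the code below for why you should pay careful attention to how you use realloc). Don't rely on passing in a size of zero to free the memory: the result is implementation-defined in older standards and undefined behaviour in C23, so call free instead.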
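One way to keep realloc safe is to wrap it so that the original pointer is only replaced on success. The sketch below assumes an array of ints; the name resize_int_array and its return convention are made up for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resizes *array to hold 'new_count' ints.
 * Returns 1 on success and updates *array. Returns 0 on failure,
 * leaving *array pointing at the original, still-valid block. */
int resize_int_array(int** array, size_t new_count)
{
	/* Refuse a zero size (not portable) and sizes that would overflow */
	if (new_count == 0 || new_count > SIZE_MAX / sizeof(int))
	{
		return 0;
	}

	/* Keep the result in a temporary so the original pointer survives a failure */
	int* resized = realloc(*array, new_count * sizeof(int));
	if (resized == NULL)
	{
		return 0;
	}

	*array = resized;
	return 1;
}
```

Because the caller's pointer is never overwritten with NULL, the data is still reachable after a failed resize and can be freed normally.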

And what do you do when you’re finished?

You call free, nice and easy, and your memory is released (although it is not “deleted” as such – the data may exist until something else overwrites it, or part of it, or until the program ends).

Note that on POSIX systems, malloc, calloc and realloc set errno (to ENOMEM) if they fail; the C standard itself does not require this. You can find out how to use errno here!

Can I see an example?

Sure – here’s a little bit of code that uses all four methods so you can see them at work:

#include <stdlib.h>

#define BIG_NUMBER 1024
#define SMALL_NUMBER 16

struct msg
{
	int code;
	char message[BIG_NUMBER];
};

int main(void)
{
	char* buffer;
	struct msg* messagelist;

	/* Allocate some memory from the heap */
	buffer = (char*)malloc(BIG_NUMBER);
	if (buffer != NULL)
	{
		/* I can use the memory safely */
	}

	/* Reduce the size of the memory */
	char* smallbuffer = (char*)realloc(buffer, SMALL_NUMBER);
	if (smallbuffer != NULL)
	{
		/* I can use the memory safely */
	}

	/*******************************************
	 * NOTE: Look carefully at the realloc call above.
	 * If the call to realloc had failed and I had assigned
	 * it to the original buffer like so:
	 *     buffer = (char*)realloc(buffer, SMALL_NUMBER);
	 * then my buffer would have been set to NULL and I would
	 * not only lose access to the data that was stored
	 * there, but I'd create a memory leak too!
	 *******************************************/

	/* Allocate some memory from the heap */
	messagelist = (struct msg*)calloc(SMALL_NUMBER, sizeof(struct msg));
	if (messagelist != NULL)
	{
		/* I can use the memory safely */
	}

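	/* Remember to clear up after myself */
	if (smallbuffer != NULL)
	{
		/* realloc succeeded: 'buffer' is no longer valid, so free only the new pointer */
		free(smallbuffer);
	}
	else
	{
		/* realloc failed: the original block is untouched and still needs freeing */
		free(buffer);
	}
	free(messagelist);

	/* NOTE: after a successful realloc I DON'T free the 'buffer' variable, */
	/* because realloc already released (or reused) that block for me 🙂 */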

	return EXIT_SUCCESS;
}

Declaring Variables in Switch Statements

There you are, happily programming away, when suddenly you get a compile error:

error: jump to case label
error: crosses initialization of 'int x'

“Huh?” You say, peering at the computer screen. Your code looks fine, so what does it mean?

Look closely at your switch statement. A switch statement contains case labels, which provide options to do different things after checking the value of a variable. However, what you may not realise is that the code under all of these case labels actually exists in the same scope.

Why does scope matter?

If you declare a variable after a case label, you are actually declaring that variable for all subsequent labels without realising it.

The trouble is that a switch jumps straight to the matching label. If it jumps to a later case, the variable is in scope there, but the line that initializes it has been skipped, so its value would be garbage. You could also end up trying to declare the same variable twice (if you’re doing similar things for each case).

To stop you from doing this, the C++ compiler flags an error and stops compilation.

Oh right. But how do I fix it?

You can still declare variables in switch statements; you just have to put curly brackets around the code after the case label, which gives the variable a scope of its own.

Compare the two examples below. The first one generates an error. The second lets you compile and move on.

1. This generates a compile error:

switch (y)
{
case 0:
    int x = 42;
    cout << "I declared variable x." << endl;
    break;
case 1:
    cout << "Variable x is still in scope!" << endl;
    break;
default:
    break;
}

2. This compiles successfully:

switch (y)
{
case 0:
    {
        int x = 42;
        cout << "I declared variable x." << endl;
    }
    break;
case 1:
    cout << "I don't know about variable x." << endl;
    break;
default:
    break;
}
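The examples above are C++ (they use cout), but the curly bracket fix works the same way in C. Here is a rough sketch in C with a made-up function called describe, where two cases each declare their own x without clashing:

```c
/* Hypothetical function: each case that needs a local variable gets its own block. */
int describe(int y)
{
	int result = 0;

	switch (y)
	{
	case 0:
		{
			int x = 42; /* x only exists inside these braces */
			result = x;
		}
		break;
	case 1:
		{
			int x = 7; /* a separate variable, despite the same name */
			result = x * 2;
		}
		break;
	default:
		break;
	}

	return result;
}
```

Anything that needs to survive past the closing bracket (like result here) has to be declared before the switch.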

 

Bitwise RGBA Values

Let’s take a look at bit shifting in practice.

Say we have a variable called colour that contains an RGBA value. If you have never had any experience with graphics, all you need to know is that the colours you see on your screen may be represented as a combination of four different values: red, green, blue and alpha. The alpha value describes the opacity (zero is fully transparent, the maximum is fully opaque), while the red, green and blue values are combined to describe the final colour.

RGBA values are usually stored in a single 32-bit integer, with 8 bits used for each component:

RRRRRRRR GGGGGGGG BBBBBBBB AAAAAAAA

All well and good, but imagine we need to know what the green value is independently of everything else. How can we extract this information? And moreover, how do we get a colour encoded into the variable in the first place?

Setting an RGBA value

Imagine we want to set our colour to a bright yellow, fully opaque. This uses the RGBA components:

R) 0xFF
G) 0xCC
B) 0x00
A) 0xFF

As binary this looks like:

11111111 11001100 00000000 11111111

OK, we could set the colour variable using a large number:

unsigned int colour = 4291559679;

but that isn’t a very intuitive (or re-usable) solution.

Instead we’ll shift each component to its correct position and combine them one at a time using bitwise OR:

unsigned int colour =
0xFFu | (0x00u << 8) | (0xCCu << 16) | (0xFFu << 24);

To fully break down what is happening here, let’s look at the binary behind the scenes:

The first step is a bitwise OR on 0xFF with 0x00 shifted left by 8 places:

0000 0000 1111 1111 // 0xff
0000 0000 0000 0000 // 0x00 << 8
___________________
0000 0000 1111 1111

The next step is a bitwise OR on the result with 0xCC shifted left 16 places:

0000 0000 0000 0000 1111 1111 // result
1100 1100 0000 0000 0000 0000 // 0xCC << 16
_____________________________
1100 1100 0000 0000 1111 1111

And finally a bitwise OR on the result with 0xFF shifted left 24 places:

0000 0000 1100 1100 0000 0000 1111 1111 // result
1111 1111 0000 0000 0000 0000 0000 0000 // 0xFF << 24
_______________________________________
1111 1111 1100 1100 0000 0000 1111 1111 // 0xFFCC00FF, or 4291559679

The final result is the number we want to assign to the colour integer.
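The same steps can be wrapped up in a small function. This is only a sketch: pack_rgba is a made-up name, and it assumes the RRGGBBAA layout shown earlier.

```c
#include <stdint.h>

/* Hypothetical helper: packs four 8-bit components into RRGGBBAA order. */
uint32_t pack_rgba(uint8_t red, uint8_t green, uint8_t blue, uint8_t alpha)
{
	/* Cast before shifting so the shifts happen on an unsigned 32-bit value */
	return ((uint32_t)red << 24) |
	       ((uint32_t)green << 16) |
	       ((uint32_t)blue << 8) |
	       (uint32_t)alpha;
}
```

Calling pack_rgba(0xFF, 0xCC, 0x00, 0xFF) gives 0xFFCC00FF, the bright yellow from the walkthrough.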

Extracting an RGBA value

Now say we want to extract that green value from our colour integer. We can simply do the following:

int green = (colour & 0x00FF0000) >> 16;

What’s happening here?

First off, we’re masking our colour variable using bitwise AND to effectively “turn off” all the components that we aren’t interested in:

1111 1111 1100 1100 0000 0000 1111 1111
0000 0000 1111 1111 0000 0000 0000 0000
_______________________________________
0000 0000 1100 1100 0000 0000 0000 0000

Then we shift 16 places to the right to put our green component in the lowest byte:

0000 0000 0000 0000 0000 0000 1100 1100 // 0xCC

Simple! Now we know how to extract any component we choose by adjusting the mask and number of places shifted accordingly.
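For reference, here is what that looks like for all four components. The function names are made up for illustration, and the masks assume the RRGGBBAA layout used throughout.

```c
#include <stdint.h>

/* Hypothetical helpers: each one masks off a component, then shifts it
 * down into the lowest byte. */
uint8_t rgba_red(uint32_t colour)
{
	return (uint8_t)((colour & 0xFF000000u) >> 24);
}

uint8_t rgba_green(uint32_t colour)
{
	return (uint8_t)((colour & 0x00FF0000u) >> 16);
}

uint8_t rgba_blue(uint32_t colour)
{
	return (uint8_t)((colour & 0x0000FF00u) >> 8);
}

uint8_t rgba_alpha(uint32_t colour)
{
	/* Alpha is already in the lowest byte, so no shift is needed */
	return (uint8_t)(colour & 0x000000FFu);
}
```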

Bitwise operator summary

A quick guide to which operator to use when.

Bitwise AND

  • Use with a mask to check if bits are on or off
  • Turn off individual bits

Bitwise OR

  • Turn on individual bits

Bitwise XOR

  • Toggle bits on and off, like a switch

Bitwise NOT

  • Invert a mask, so that it can be combined with AND to turn off individual bits

Bitwise left and right shift

  • Extract bytes from longer variables
  • Insert bytes into longer variables
  • Multiplication and division by powers of 2 (but be cautious with signed integers, remainders and overflow)

This is not an exhaustive list, but a basic guide. Have fun with bitwise operators, and if you want more examples and ideas, have a look at Bit Twiddling Hacks, a fantastic collection of code snippets from Sean Eron Anderson.
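As a rough illustration of the AND, OR, XOR and NOT entries in the guide, here is a sketch that treats a single byte as a set of on/off flags. All four function names are made up for this example.

```c
#include <stdint.h>

/* AND with a mask: check whether any of the masked bits are on */
int bit_is_set(uint8_t flags, uint8_t mask)
{
	return (flags & mask) != 0;
}

/* OR with a mask: turn bits on */
uint8_t set_bits(uint8_t flags, uint8_t mask)
{
	return (uint8_t)(flags | mask);
}

/* AND with an inverted (NOT) mask: turn bits off */
uint8_t clear_bits(uint8_t flags, uint8_t mask)
{
	return (uint8_t)(flags & ~mask);
}

/* XOR with a mask: flip bits, like a switch */
uint8_t toggle_bits(uint8_t flags, uint8_t mask)
{
	return (uint8_t)(flags ^ mask);
}
```

For example, starting from 0x01, setting the 0x04 bit gives 0x05, and toggling 0x04 again takes it back to 0x01.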

Bit Shifting

The left and right shift operators are the last two bitwise operators for us to look at.

In C (and C++), << and >> are used to shift bits around inside bytes. Not just in any random fashion – these operators move all the bits either to the left or to the right, exactly as their names imply.

Left bit shift

This is the easy one. For unsigned integer types of any size, applying this operator moves the bits to the left and pads with zeros from the right. The bits that move off to the left are discarded – they are gone forever and you can’t get them back. (Be careful with signed types: in C, left shifting a negative value, or shifting a bit into or past the sign bit, is undefined behaviour.)

Each place that the bits are shifted is the equivalent of multiplying by 2, provided no bits fall off the left-hand end.

62 << 2 = 248    //or, multiply by 4

In binary:

0011 1110 //62
(00)1111 1000 // 248 (discarded) padded

The next example shows what happens if you use an integer that is not large enough to hold the result. 120 << 2 should give you 480, but if we store the result in a single byte (e.g. an unsigned char), we lose bits from the left and the result is 224:

120 << 2 = 224

In binary:

0111 1000 //120
(01)1110 0000 // 224 (discarded) padded
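Both examples can be reproduced with a one-byte type. This is a sketch with a made-up helper name, and it assumes you shift by fewer than 8 places.

```c
#include <stdint.h>

/* Hypothetical helper: left shift whose result is stored back in one byte.
 * Assumes 'places' is less than 8. */
uint8_t shift_left_in_byte(uint8_t value, unsigned int places)
{
	/* 'value' is promoted to int for the shift itself;
	 * the cast back to uint8_t keeps only the low 8 bits */
	return (uint8_t)(value << places);
}
```

The comment matters: in C the shift is carried out at int width, so 120 << 2 really is 480 until it is squeezed back into a byte.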

Right bit shift

The right shift does the opposite of the left shift: the bits move to the right, and the bits that move off the right-hand end are discarded. For an unsigned integer the left is always padded with zeros, so each place shifted is the equivalent of dividing by 2 (rounding down towards zero).

62 >> 2 = 15    // or, divide by 4

In binary:

0011 1110 //62
0000 1111(10) // 15 padded (discarded)
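A quick sketch to confirm that, for unsigned values, shifting right matches dividing by a power of 2. The helper name is made up, and it assumes 'power' is smaller than the number of bits in an unsigned int.

```c
/* Hypothetical helper: divides an unsigned value by 2 to the power of 'power'.
 * Assumes 'power' is less than the width of unsigned int. */
unsigned int divide_by_power_of_two(unsigned int value, unsigned int power)
{
	/* The bits shifted off the right are the remainder, which is thrown away */
	return value >> power;
}
```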

Right shifting signed integers

If you apply a right shift to a signed integer that holds a negative value, the result is implementation-defined, so it may vary. On some machines, under some compilers, it will pad with sign bits, i.e. 1s for a negative number and 0s for a positive number, since the leftmost bit is the sign bit for signed integers. This is also termed the ‘arithmetic shift’. On other machines, with other compilers, it will always pad with zeros. This is termed the ‘logical shift’.

GCC, which is the compiler I use, implements the arithmetic shift. You can easily write a few lines of code to test out what your own compiler does if you can’t find details in the documentation.

Note that with an arithmetic shift, right shifting a negative signed integer rounds towards negative infinity, not zero. -50 >> 2 will give -13, but -50 / 4 will give -12.

Two’s complement is used to represent signed integers, so the binary representation for the negative numbers below is probably not familiar. However, you can still see how the shift is applied here. (I’d like to cover two’s complement representation at some point in the future – it requires its own post to be explained effectively.)

1100 1110 // -50
1111 0011(10) // -13 padded (discarded)
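If you want to check what your own compiler does, a few lines are enough. This is a minimal sketch; the function name is made up.

```c
/* Returns 1 if this compiler pads with the sign bit (arithmetic shift)
 * when right shifting a negative int, or 0 if it pads with zeros. */
int right_shift_is_arithmetic(void)
{
	int value = -1; /* all ones in two's complement */

	/* An arithmetic shift keeps the sign bit set, so the result stays negative */
	return (value >> 1) < 0;
}
```

If it returns 1, then -50 >> 2 gives -13 as described above; if it returns 0, your compiler uses the logical shift and you will get a large positive number instead.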

What would I use bit shifting for?

We’ll take a look at some examples next week – right now I have a 3 week old baby that needs feeding!