Menu

Programming Fundamentals: Pointers

July 3, 2012 - Programming

What is a Pointer?

To understand a pointer, one must better understand variables. In a statically typed language such as C, a variable has three properties- it has a type, it has a name, and it has a value.

In the above example, int is the type of this variable, variable is the name, and 400 is it’s value. A Pointer is essentially a variable that has a type and a name, but get’s it’s value from elsewhere. For example, in the following C program:

a has a type, name, and a value. b and c, however, are declared as pointers. This is done by placing a asterisk between the type and the variable name. For the most part, a pointer is just a memory pointer, usually a long integer. in this example we set b to point to the address of a, and then set c to be the same as b. When we change a, the “values” of the other variables change to.

Since pointers are fundamentally integers of some description, all operations assume you are dealing with the pointer value (as in, the integer pointing into memory) as opposed to the value stored at that location. In order to get the value stored at the location a pointer points, you need to dereference the pointer. In C, this is accomplished by prefixing the variable name with a asterisk, as shown in the above printf()’s for b and c. dereferencing a pointer returns a non-pointer value of the type of that pointer, in this case, while b is a int*, *b is a int. Without the dereference, printf() will print out the numeric value of the pointer, which is not desired.

Why are Pointers important?

Pointers are important simply because they are used for a vast number of implementations of algorithms. Pointers form the basis of References, which are used in other languages like C# and Java. Pointers are essentially what a number of algorithms are built on; Sorting algorithms deal with pointers within a larger structure, Linked Lists deal obviously with a set of elements linked via pointers, and so on.

The Dangers

The Dangers of pointers are pretty easy to understand. In order to dereference a pointer, the memory it points at has to be valid. The most common problem stemming from this is dereferencing a NULL Pointer. (NULL being 0). in C and C++, doing this will either crash the program (without an error message, unless special care is taken), or cause undefined results. C++ has a number of library classes and templates (such as auto_ptr and smart pointers) designed to make these problems easier to identify by using C++ capabilities such as operator overloading. C#, Java, and other higher level languages are of course not exempt from the problems with pointers, because Pointers are references and both are accosted by the same set of problems. These come about in the form of NullReferenceExceptions. These managed languages do mitigate some common problems such as creating a pointer but making it point to the wrong location, or faulty pointer arithmetic, and so forth, by making those unnecessary (C# makes it possible using unsafe{} code blocks, though). The Core capability of Pointers is aliasing- that is, being able to refer to one thing in multiple locations by different names. in C, if you pass a pointer to a function, that function can change the contents of what pointer is pointing at, but it cannot change where the pointer points.

Function Pointers

Before we talk about Function pointers, one needs to understand Functions themselves. a function is in a programming sense very much the same as a function in the scientific sense; it takes one or more inputs and returns a result. For example, in C:

This C code actually does a lot “under the covers” within the assembly:

Functions in low-level Assembly language consist of three parts: a prolog, an epilog, and the body. the prolog does the task of “housekeeping” and setting up the stack frame for the function. The epilog tears it down. Without getting into to many details, sometimes these parts of code need to be done before calling the function, or from within the called function. Either way, it boils down to a Assembly “CALL” instruction. The CALL Instruction (again, without going to in depth because if I do that I’m bound to make numerous factual errors, assuming I haven’t already) essentially moves the instruction pointer (which indicates the current location of program execution) to a new location. It also saves the return location for the subsequent RET instruction that is usually the last instruction executed in a function. the CALL instruction takes one thing: a Pointer.

Of course, we don’t usually work in assembly, do we? So how does this work in higher level languages, such as C? C actually does the work for you in many respects. In the above program, for example, we didn’t have to know about epilog, prolog, the stack frame, or any of that sort of stuff. We simply defined functions and used them. A Function pointer is a pointer which points at a function, rather than a storage location. Because of the Prolog and Epilog code needed to handle parameters and return values and stack frames, the Function pointer type needs to include information about Function parameters. a Function Pointer is defined in C this way:

This creates a function Pointer ptFunction which is made to point at a function accepting two int arguments and returning an int, and that Pointer is initialized to NULL. In order to use it, it needs to have a value, this is done by making the pointer actually point to something:

Obviously, the real power in Function Pointers comes from being able to change what they point at. Add to this you can have functions that return other functions, and you have the beginnings of Functional Programming style.

In my next entry on this subject, we will move into C# and explore what C# delegates add to this and how they differ from your standard Function Pointer.

Have something to say about this post? Comment!