March 3, 2017

What's the Point? Adventures with Pointers in C

Abbott: What’s the point?

Costello: I know the point, but where does it point to?!?

Abbott: What compiler are you running?

Costello: Don’t act surprised, my name’s not compiler, and I’m walking, not running. Do I look like I’m in a hurry?

It’s too bad Abbott and Costello aren’t alive today. I’m sure they would do a worthy update of the “Who’s On First” skit that would put mine to shame.

We don’t have Abbott and Costello to entertain us anymore. But that’s alright. We programmers are a pretty fun bunch, and computers have a way of keeping us entertained.

As a programmer, you might consider the question, “What’s the point?” to be out of scope. But a slightly different question that definitely is relevant to a C programmer is “What does that pointer point to?” Let’s consider an example:

[Image: source listing of example.c]

(example.c available to copy and paste here)

If you are accustomed to looking at C code, you might notice something strange. The first printf statement seems normal enough. It prints out the memory address of the first element of a two-element array of chars. But what does the next printf statement do??? How many elements are there in an array called 0??? 0 isn’t a declared variable, and variable names can’t begin with a digit, anyway. What is this code going to do? Will it even compile? And if it does, what will the output be? Let’s toss it at the compiler and see what happens …

[Image: terminal screenshot of gcc compiling example.c]

OK, but what is the output upon execution? Before we get to that …

How to Look Smart Without Answering the Question

When your friend asks you a question about some obscure corner of the C language, answer the question with another question, like so: “What compiler will you use to compile this code you speak of? I need to know what compiler you’re using to give a definitive answer.” This retort should give you some breathing room if you’re not up to speed on C arcana.

Little Do They Know …

… that you are persistent, and you actually *will* install a different C compiler so that you can be the coder in-the-know. What compiler to install? Well, gcc and clang are two of the most commonly used compilers, and should be for a while (from a 2017 vantage point). So, having already seen gcc pass our example code, let’s see what clang 3.8.0 has to say about our strange syntax:

[Image: terminal screenshot of clang compiling example.c]

Small print says: clang passed code without complaint!

clang doesn’t seem to mind this strange code either. What do these compilers know that we don’t?

What We've Covered So Far:

We cooked up a program with some strange syntax (here), specifically this fragment from the 5th line: &0[c]. It seems to be asking the C compiler to reference an array that doesn’t have a permissible variable name. The thing inside the square brackets is a pointer c, which points to a location in the upper reaches of memory space on my typical PC laptop running a 64-bit operating system. We know that by using the ampersand, like so: &0[c], we will get not the value stored at 0[c] but the memory location of 0[c].

Here is the output (as compiled by gcc 4.8.4, on a 64-bit Linux system):

[Image: terminal screenshot of the program's output]

&c[0] == &0[c] !!!

What Does This Output Tell Us?

The salient point is that we printed out the location of an array element using two different spellings: &c[0] and &0[c]. What we see above when we examine the output is that the two memory locations are one and the same. Why should this be? Here is the conclusion that I am forced to reach: the 0 was never an array name at all. C defines the subscript expression a[b] as shorthand for *(a + b), and since addition doesn’t care about the order of its operands, 0[c] means *(0 + c), which is exactly *(c + 0), which is c[0]. I could have typed any integer in front of the brackets, as long as it meets this condition:

0 ≤ Number Outside the Brackets < Number of Elements in the Array.

So the number in front of the brackets is not a hard-coded memory location. It is just the ordinary array index, written on the unexpected side of the brackets, and C still determines where in memory your variables live. Is it wise to write your array accesses this way? Generally, no. But is it possible? Yes, the evidence is right here, and it even works on two different compilers, because it follows from the language definition rather than from a quirk of either one.

And this gets me what?

By experimenting with code, we gain insight into the inner workings of the C language, as implemented by the compiler. And we are one step closer to the most sublime art: creating and comprehending obfuscated code!

Thanks to my classmate Julija Lee who wrote and tested code for this blog post!

Also, extra credit to you if you modify and test the example code. What happens if the pointer points to something other than a char?

Check out my github account here, and follow me on twitter!