pointers to arrays

Sat Feb 18 14:32:47 AEST 1989

In article <244 at tityus.UUCP> jim at athsys.uucp (Jim Becker) writes:
>... I wanted to return a pointer to an allocated array of char*
>pointers via a procedure call.

Probably not.  You probably wanted a pointer that pointed *at* (not
`to') a block of memory (`array') containing a series of `char *'
objects each pointing at a block of memory containing a series of
`char's.  The type of such a pointer is `char **'.
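As a minimal sketch of what that looks like, here is one way to build
and return such a `char **'.  The names make_strings and free_strings,
and the idea of copying from an existing list, are made up for
illustration; nothing in the quoted article specifies them.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate a block of n `char *' objects and
 * point each one at a freshly allocated copy of src[i].
 * Returns a pointer at the first `char *', or NULL on failure.
 */
char **make_strings(const char *const *src, size_t n)
{
	char **v;
	size_t i;

	v = malloc(n * sizeof *v);
	if (v == NULL)
		return NULL;
	for (i = 0; i < n; i++) {
		v[i] = malloc(strlen(src[i]) + 1);
		if (v[i] == NULL) {
			while (i > 0)		/* undo the partial work */
				free(v[--i]);
			free(v);
			return NULL;
		}
		strcpy(v[i], src[i]);
	}
	return v;
}

/* Release everything make_strings() allocated. */
void free_strings(char **v, size_t n)
{
	size_t i;

	if (v == NULL)
		return;
	for (i = 0; i < n; i++)
		free(v[i]);
	free(v);
}
```

The caller writes `char **v = make_strings(list, n);' and then uses
v[i] as an ordinary string and v[i][j] as an ordinary char.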

You might ask, `what is the difference between a pointer that points
``at'' a block of memory and one that points ``to'' an array?'  The
distinction is somewhat artificial (and I made up the words for some
netnews posting in the past).  Given a pointer to array pa:

	int a[5];
	int (*pa)[5] = &a;	/* pANS C semantics for &a */

I can get a pointer that points `at' the array instead:

	int *p = &a[0];

The latter is the more `natural' C version of the former: typically
a pointer points at the first element of a group (here 5).  The rest
of the group can be reached via pointer arithmetic: *(p+3), aka p[3],
refers to the same location as a[3].

The pointer need not point at the first element, as long as it points
somewhere into the object:

	p = &a[2];

Now p[1] refers to a[3]; p[-2] refers to a[0].  To use pa to get at
a[3] one must write (*pa)[3] (or, equivalently, pa[0][3]).
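The following sketch checks those claims by comparing addresses.  The
function name same_places and the values stored in a are my own
choices for the example.

```c
/*
 * Returns 1 if every access path described above reaches the
 * same int objects, 0 otherwise.
 */
int same_places(void)
{
	int a[5] = { 10, 11, 12, 13, 14 };
	int (*pa)[5] = &a;	/* points `to' the whole array */
	int *p = &a[2];		/* points `at' a[2] */

	return &p[1] == &a[3]		/* one element past p */
	    && &p[-2] == &a[0]		/* two elements before p */
	    && &(*pa)[3] == &a[3]	/* via the pointer to array */
	    && &pa[0][3] == &a[3]	/* the same, spelled differently */
	    && p[1] == 13
	    && (*pa)[3] == 13;
}
```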

The thing that is most especially confusing, but that really makes
the difference, is that *pa, aka pa[0], refers to the entire array
`a'.  *p refers only to one element of the array.  This can be seen
in the result produced by `sizeof': (sizeof *p)==(sizeof(int)), but
(sizeof *pa)==(sizeof(int[5]))==(5 * sizeof(int)).
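A small sketch that lets the compiler report the two sizes.  The
function names are hypothetical; only `sizeof' itself is doing any
work here.

```c
#include <stddef.h>

/* Size of what an `int *' points at: one int. */
size_t size_via_p(void)
{
	int a[5];
	int *p = &a[0];

	return sizeof *p;
}

/* Size of what an `int (*)[5]' points to: the whole array. */
size_t size_via_pa(void)
{
	int a[5];
	int (*pa)[5] = &a;

	return sizeof *pa;
}
```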

Pointers to entire arrays are not particularly useful unless there
are several arrays:

	int twodim[3][5];

Now we can use pa to point to (not at) any of the three array-5-of-int
elements of twodim:

	pa = &twodim[1];	/* or pa = twodim + 1, in Classic C */

and now (*pa)[3] (or pa[0][3]) is an alias for twodim[1][3].  Note
especially that since pa[0] names the *entire* array-5-of-int at
twodim[1], pa[-1] names the entire array-5-of-int at twodim[0].
*Pointer arithmetic moves by whole elements, even if those
elements are aggregates.*  Thus pa[-1][2] is an alias for twodim[0][2].
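To make the aliases concrete, here is a sketch that fills twodim with
recognisable values and compares addresses.  The function name
row_pointer_demo and the fill pattern (10 * row + column) are
assumptions made for the example.

```c
/*
 * Returns 1 if pa, set to &twodim[1], reaches the elements the
 * text says it does; 0 otherwise.
 */
int row_pointer_demo(void)
{
	int twodim[3][5];
	int (*pa)[5];
	int i, j;

	for (i = 0; i < 3; i++)
		for (j = 0; j < 5; j++)
			twodim[i][j] = 10 * i + j;

	pa = &twodim[1];
	return &(*pa)[3] == &twodim[1][3]
	    && &pa[0][3] == &twodim[1][3]
	    && &pa[-1][2] == &twodim[0][2]	/* one whole row back */
	    && &pa[1][4] == &twodim[2][4]	/* one whole row forward */
	    && pa[-1][2] == 2;			/* 10 * 0 + 2 */
}
```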

This is merely a convenience, for we can do the same with p:

	p = &twodim[1][0];

Now p points to the 0'th element of the 1'th element of twodim---the
same place that pa[0][0] names.  p[3] is an alias for twodim[1][3].  To
get at twodim[0][2], take p[(-1 * 5) + 2], or p[-3].  Arrays are
stored in row-major order with the columns concatenated without gaps;
they can be `flattened' (viewed as linear, one-dimensional) with
impunity.  (The flattening concept extends to arbitrarily deep
matrices, so that a six-dimensional array can be viewed as a string of
five-D arrays, each of which can be viewed as a string of four-D
arrays, and so forth, all the way down to a string of simple values.%)
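The flattening rule can be written down as a short loop.  What follows
is only a sketch: flat_index and real_offset_2d are names I invented,
and the second function measures byte offsets through `char *' rather
than dereferencing something like p[-3], so that the check does not
depend on how a particular compiler treats indexing across a row
boundary.

```c
#include <stddef.h>

/*
 * Hypothetical helper: row-major element offset of the element
 * with subscripts idx[0..n-1] in an array whose dimensions are
 * dims[0..n-1].  Each step treats what has been accumulated so far
 * as a count of whole sub-arrays.
 */
size_t flat_index(const size_t *dims, const size_t *idx, size_t n)
{
	size_t off = 0, k;

	for (k = 0; k < n; k++)
		off = off * dims[k] + idx[k];
	return off;
}

/*
 * Element offset of twodim[i][j] from the start of twodim, as the
 * compiler actually laid the array out.
 */
size_t real_offset_2d(size_t i, size_t j)
{
	static int twodim[3][5];

	return (size_t)((char *)&twodim[i][j] - (char *)twodim)
	    / sizeof(int);
}
```

For twodim, flat_index with dims {3, 5} and subscripts {1, 0} gives 5,
and with {0, 2} gives 2; the difference of 3 is where the p[-3] above
comes from.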

Once you understand this, and see why C guarantees that p[-3],
pa[-1][2], and twodim[0][2] are all the same, you are well on your way
to understanding C's memory model (not `paradigm': that means
`example').  You will also see why pa can only point to objects of type
`array 5 of int', not `array 17 of int', and why the size of the array
is required.
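One way to see why the size matters is to measure how far a pointer
to array moves when 1 is added to it.  This is a sketch with made-up
function names (step5, step17); the arrays exist only to give the
pointers something valid to point to.

```c
#include <stddef.h>

/* Bytes between pa and pa + 1 when pa points to `array 5 of int'. */
size_t step5(void)
{
	int m[2][5];
	int (*pa)[5] = m;	/* m decays to &m[0] */

	return (size_t)((char *)(pa + 1) - (char *)pa);
}

/* The same measurement for `array 17 of int'. */
size_t step17(void)
{
	int m[2][17];
	int (*pb)[17] = m;

	return (size_t)((char *)(pb + 1) - (char *)pb);
}
```

The two steps differ (5 ints versus 17 ints), so the compiler cannot
generate code for pa[-1] or pa + 1 without knowing which one it has;
the two pointer types are not interchangeable.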

-----
% For fun: the six-D array `char big[2][3][5][4][6][10]' occupies
  7200 bytes (assuming one byte is one char).  If the first byte is at
  byte address 0xc400, find the byte address of big[1][0][3][1][5][5].
  I hid my answer as a message-ID in the references line.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at mimsy.umd.edu	Path:	uunet!mimsy!chris