writing code.. (rather: design etc..)

Thu Aug 1 03:57:21 AEST 1985

I do not "write" the program on paper, but I also do not believe in
pure terminal (?) programming. I use large amounts of desk space
for reference material, relevant program listings, highlighters,
several coffee cups and other miscellania. I found that the
following works best for me:

	Context files: typically contains notes on implementation
		       details, visual abstractions of complex
		       data structures, photocopies of relevant
		       articles from IEEE and ACM journals, listings
		       of programs that are somehow relevant to the
		       problem at hand. (As an example, my context
		       file on regular expressions contains the 
		       photocopies of the original article by
		       Ken Thompson (CACM 11, #6 1968), another
		       very interesting implementation by Martin
		       Richards (Software- Practice & Experience,
		       Vol. 9 1979), listings of public-domain
		       implementations of grep (DECUS C distn.),
		       ratfor implementation of makpat, amatch etc.)
		       and a page of references, including coffee 
		       stains..)

	Blackboard:    Scribbles and flow control / interaction
		       diagrams for interesting parts of the program
		       at hand. 

	Note pad:      I use pseude-code for complex algorithms only.
		       One cannot always go to C directly from a
		       theoretical description of the algorithm, say
		       the description of "patricia trees" in JACM.
		       (A Space- Economical Suffix Tree Construction 
		       Algorithm, J. ACM 23, April 1976). Obviously,
		       these notes are kept in the context file. 

		       Graphic abstractions of certain parts of an
		       algorithm or a program help a great deal in
		       discovering shortcomings and/or new implementation
		       strategies. (I draw well :-) Consider a simple
		       routine such as translit in m4. (translit(dest,from,
		       to)) Although it is very straightforward to implement
		       this routine directly on the terminal, the critical
		       part of translit is its *side effect* part:
		       a call such as translit(str,"abcde","bd-?-") will
		       map character "a" to "b". But notice that "b" is mapped
		       to "d" which is mapped to "?". Thus, a is ultimately
		       mapped to "?". In one sitting, one is tempted to make
		       multiple passes over str to resolve all side effect
		       mappings such as this. I had such an implementation
		       about a year ago. This time, I did a small graphic
		       representation of what was going on in terms of this
		       *side effect* mapping, and this visual representation
		       revealed an implementation that is about 5 times
		       faster than the previous one, and resolved all *side
		       effect* mappings in one pass. (routine is included
		       at the end of this article..)

	Pseudo-code:   I have my own pseudo-code, which is a cross between
		       C and English. Although certain design methodologies
		       (such as Jackson's) look interesting, I do not yet
		       use them, since I am not yet convinced that they
		       offer any great advantages, undoubtedly an arguable
		       case.

What is the use of all this: It turns out that all this preperation
pays off very handomely if one has to prepare a "walk through the 
implementation of X" type document, or one has to make a presentation
about X. Of course, the context file is saved for future use on similar
projects, now including the listings of X. One can use the context file
to follow the thought process of the original implementor, find all the
algorithms, references, similar implementations as one coherent whole. So
far, I have resisted the temptation of saving the photos of my blackboard. :-)
I am sure many others have similar techniques in software development. I
would be interested in hearing about them. To finish off, a quote from
Knuth:
	"That is my new hobbyhorse, to say that people should
	think about the communication of the program while
	writing it. This makes it easier to write. The suprising
	thing is, that even though I'm writing programs that are
	better documented, it's taking me less time to write the
	program."

	Computer Language, Vol. 2, May 1985

Oz
----------- map routine ------------
/*
 * map(dest, source, from, to)
 * 
 * map every character of source that is specified in "from"
 * into "to" and replace in dest. (source remains unmodified)
 *
 * This is a semi-standard implementation of translit(s,from,to)
 * routine of M4. Within mapvec (an identity vector), we replace every
 * character of from with the corresponding character in to. If to is
 * shorter than from, than the corresponding entries are null, which
 * means that those characters dissapear altogether. Furthermore,
 * imagine map(sourcestring, "srtin", "rn..*") type call. In
 * this case, `s' maps to `r', `r' maps to `n' and `n' maps to `*'.
 * Thus, `s' ultimately maps to `*'. In order to achieve this effect
 * in an efficient manner (i.e. without multiple passes over the
 * destination string), we loop over mapvec, starting with the initial
 * source character. if the character value (dch) in this location is 
 * different than the source character (sch), sch becomes dch, once
 * again to index into mapvec, until the character value stabilizes
 * (i.e. sch = dch, in other words mapvec[n] == n). Even if the entry 
 * in the mapvec is null for an  ordinary character, it will stabilize, 
 * since mapvec[0] == 0 at all times. At the end, we restore mapvec 
 * back to normal where mapvec[n] == n for 0 <= n <= 127. This strategy, 
 * along with the restoration of mapvec, is about 5 times faster than 
 * any algorithm that makes multiple passes over destination string.
 *
 * Ozan S. Yigit (oz)
 * July 15 1985
 *
 */

map(dest,src,from,to)
register char *dest;
register char *src;
register char *from;
register char *to;
{
        register char *t0;
	register char sch, dch;
	static char mapvec[128] = {
		0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
		12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 
		24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 
		36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 
		48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 
		60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 
		72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 
		84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 
		96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 
		108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 
		120, 121, 122, 123, 124, 125, 126, 127 
	};

        if (*src) {
                t0 = from;
		while (*from)           /* initialise mapvec...  */
			mapvec[*from++] = (*to) ? *to++ : (char) 0;

		while (*src) {		/* map and follow chains */
			sch = *src++;
			dch = mapvec[sch];
			while (dch != sch) {
				sch = dch;
				dch = mapvec[sch];
			}
			if (*dest = dch) 
				dest++;
		}

		while (*t0)		/* restore mapvec back..*/
			mapvec[*t0] = *t0++;
        }
	*dest = (char) 0;
}
-- 
   __ O		Usenet: [decvax|allegra|linus|ihnp4]!utzoo!yetti!oz
  / /\___	Bitnet: oz@[yuleo|yuyetti]
   .		-------------------------
  / \		Support GNU. Consider the 3-musketeers' motto:
 / --+		ONE FOR ALL - ALL FOR ONE
/