I wrote my first compiler for class at Purdue University. I chose to code in assembler and did not suffer for it.
The professor announced a semester competition with two awards, one for the fastest program, the other for the smallest. I won both.
Gordon Letwin was in the class with me. Only the two of us chose to code in assembler. I was pleased to have bettered him too.
Representation
A fellow student chose to write in Snobol which made the first assignment (parsing) easy. He suffered later.
Most students chose to code in Fortran. Dynamic data structures weren't easy for them either.
Machine words were large and character addressing was done with shifts and masks.
6 CDC Display Code (bits) per (chars) 60 CDC CPU Memory (bits) per (words)
There were sizes that worked out well and I knew what these were. After packing characters into a symbol table entry everything became an address that fit nicely into all of the CPU registers.
7 (chars) for symbol name 18 (bits) for memory address SUM (words) per symbol table entry
Parsing theory had not yet fully emerged at this time. The text talked about Floyd-Evans reductions. I might have done something similar.
Aside: As I now search the web for a definition of the technique I've learned more of Robert Floyd and especially enjoyed Knuth's "Robert W Floyd, In Memoriam". ACM SIGACT News. wikipedia
Our compilers executed from in-memory data structures. This made them technically interpreters leaving code generation for some other class.
Debugging
Early on I wrote a macro for printing debugging information. A key feature was that I could in one line retrieve a value and print it with a label.
Symbol Table Size = 26
This would take two lines in Fortran and involved counting Hollerith characters to make a proper Format statement. Yuck.
I protected all my register contents so that I could insert debug statements freely and even move them around while debugging.
With each run my compiler explained to me step-by-step what is was doing on my behalf. Every run taught me something.
Extensions
The text prescribed a scalar numerical subset which it called Mini PL/1. We were encouraged to go beyond.
I had an IBM language reference manual which I studied carefully. There was much talk of high-level languages at the time. PL/1 was often cited as an example.
I chose to implement dynamic arrays which involved indirect reference to storage through dope vectors as discussed at length in the manual. wikipedia
I was pleased that this increment in functionality was only a similar increment in complexity of my compiler, a much better experience than that of my Expression Calculator only moths earlier.