PDP-8 Challenge: Addressing modes, part 2: Memory paging and Direct addressing

A brief preface: octal is the standard numbering system for the PDP-8, but for most of us decimal is easier to understand. As a general rule, I will use octal when referring to memory addresses or encoded data, and decimal for quantities and most other things. To make it clear which one I'm using, I will adopt the C language practice of writing octal numbers with a leading 0. If there is no leading zero, the number is decimal.

So as I discussed in the last post, all operations are encoded into a single 12-bit word. The operation code and the operand are both contained within that one word, unlike most other architectures where the operand is a second or even a third byte after the first. And of these 12 bits, only 7 are used for the memory address. 7 bits can represent only 128 distinct values, but even a basic PDP-8 has 4K words of RAM. So how does this work?

The PDP-8 breaks its entire bank of RAM into 128-word pages. The first starts at address 0, the second at 0200, the third at 0400, and so on. There is no actual physical break in RAM, of course; it's just a paging scheme to support the instruction set.
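To make the paging arithmetic concrete, here is a minimal sketch in Python. The function names are my own, purely for illustration, and note that Python spells octal with a 0o prefix (0o200) rather than the leading 0 used in this post. Since a page is 128 words, the page number is just the address shifted right by 7 bits, and the offset within the page is the low 7 bits.

```python
PAGE_SIZE = 0o200      # 128 words per page
MEMORY_SIZE = 0o10000  # 4K (4096) words in a basic PDP-8


def split_address(addr):
    """Return (page number, offset within the page) for a 12-bit address."""
    if not 0 <= addr < MEMORY_SIZE:
        raise ValueError("address must fit in 12 bits")
    return addr >> 7, addr & 0o177


def page_start(page):
    """Return the first address of the given page."""
    return page * PAGE_SIZE


# 4096 words divided into 128-word pages gives 32 pages.
print(MEMORY_SIZE // PAGE_SIZE)
# Pages 0, 1 and 2 start at 0, 0200 and 0400.
print([oct(page_start(p)) for p in range(3)])
```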

To better understand this, let's examine a single assembler instruction: TAD 0250, which for the purposes of this example is in location 0200 (as we'll see, the location of the instruction is important for Direct addressing). As mentioned in the last post, TAD is the mnemonic for two's complement addition to the accumulator. It will take whatever value is in address 0250 and add it to the accumulator. Now you might be wondering how 0250 is encoded in 7 bits since, as we've already observed, the largest possible 7-bit number is 127, or 0177. The answer, of course, is that 0250 isn't encoded in full. Since 0250 is found within the same page as the instruction itself (the page goes from 0200 to 0377), only the address relative to the start of the page is encoded. Thus the operand for the instruction is encoded as 050, not 0250.
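The same reasoning can be checked with a few lines of Python. This is only a sketch with helper names I made up for the example (and Python writes octal as 0o250 rather than 0250): it confirms that the instruction and its operand share a page, then keeps just the low 7 bits of the operand address.

```python
OFFSET_MASK = 0o177  # the 7 address bits available in an instruction word


def same_page(a, b):
    """True if both addresses fall in the same 128-word page."""
    return (a >> 7) == (b >> 7)


def page_offset(addr):
    """Return the address relative to the start of its page."""
    return addr & OFFSET_MASK


instruction_location = 0o200
operand_address = 0o250

print(same_page(instruction_location, operand_address))  # True
print(oct(page_offset(operand_address)))                 # 0o50
```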

So why does the assembler make us enter 0250 if it's just going to convert it to 050? Well, this is where bit 4 comes in, the bit that says whether we're working with the current page or page 0. Page 0 is the first page, covering addresses 0 to 0177, and it is the only page that is accessible from all other pages. In the above instruction, TAD 0250 would be encoded with bit 4 set to 1. But set bit 4 to 0 and now it's accessing page 0. So that's why the instruction is written TAD 0250 instead of TAD 050; it tells the assembler you want the current page's 050, not page 0's 050. Of course, in a real program, you'd likely use labels instead of hard-coding addresses, and so much of this would be taken care of automatically, but it's still important to understand as we work through the two addressing modes. The Direct mode is what I have been discussing in this post. The Indirect mode is the other mode, and it will be the topic of the next post.
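Here is a rough sketch, in Python, of the decision an assembler has to make for a Direct-mode instruction, along with the reverse calculation the machine performs. The function names are hypothetical and this is not how any particular assembler is written. It assumes the standard PDP-8 word layout, which is not spelled out in this post: the opcode in the top three bits (TAD is opcode 1), the indirect bit next (left at 0 here), then bit 4 as the page bit, then the 7-bit offset. DEC numbers bits from 0 at the most significant end, so bit 4 has the value 0200. Python writes octal as 0o200.

```python
TAD = 0o1            # TAD opcode (PDP-8 instruction set, not stated in this post)
PAGE_BIT = 0o200     # bit 4: 1 = current page, 0 = page 0
OFFSET_MASK = 0o177  # bits 5-11: offset within the page


def encode_direct(opcode, location, target):
    """Encode a Direct-mode instruction stored at `location` that refers to `target`."""
    target_page = target >> 7
    if target_page == 0:
        page_bit = 0                  # page 0 is reachable from anywhere
    elif target_page == location >> 7:
        page_bit = PAGE_BIT           # same page as the instruction
    else:
        raise ValueError("target is not reachable with Direct addressing")
    return (opcode << 9) | page_bit | (target & OFFSET_MASK)


def effective_address(word, location):
    """Recover the address a Direct-mode instruction word refers to."""
    offset = word & OFFSET_MASK
    if word & PAGE_BIT:
        # Current page: combine the instruction's page with the offset.
        return (location & ~OFFSET_MASK & 0o7777) | offset
    return offset                     # page 0


print(oct(encode_direct(TAD, 0o200, 0o250)))  # 0o1250 (current page)
print(oct(encode_direct(TAD, 0o200, 0o050)))  # 0o1050 (page 0)
print(oct(effective_address(0o1250, 0o200)))  # 0o250
```

The two encodings differ only in bit 4, which is exactly the distinction between writing TAD 0250 and TAD 050 for an instruction sitting in the page that starts at 0200. A target on any other page, such as 0450, raises an error in this sketch because Direct addressing cannot reach it.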

One final note for 6502 aficionados. Page 0 may sound a lot like the 6502's Zero Page, and there certainly are some similarities, but there are also important differences. Most notably they differ in that, as far as I can tell, there is no real performance benefit to be had by using page 0, other than how it may simplify your code overall. That is to say, the Zero Page on the 6502 is used for oft-accessed data because it's just faster and requires fewer bytes to encode the instruction, but this isn't the case on the PDP-8. An instruction fetching data from page 0 takes 12 bits, and so does one fetching from the current page. And the number of cycles each requires appears to be the same. So it's certainly a good idea to use page 0 when necessary, but there should be no performance-related compulsion to do so.

Saturday, August 22, 2015