As I've been learning the 65816 instruction set for this project, I thought I knew what I was getting into... but as usual, things have a way of becoming twice as interesting as I expected. I'm learning a bunch, however, and I thought that I'd throw out some discoveries and tips as I learn the processor and discover how to use the ISA to build software for the Foenix.
Today's note is about the stack and the interesting hybrid 8/16-bit nature of the 65816 processor.
So the first thing to know is that when you're working with the 65816, it can run in 65C02 emulation mode, with near perfect compatibility with the published 6502 and 65C02 instruction sets. However, the Foenix is going to run in Native mode, which allows the CPU to access 16MB of RAM and store 16-bit values in the accumulator, X, and Y registers.
This is where things get fun. You can switch the registers back and forth between 8- and 16-bit mode on the fly by changing the M and X flags in the P register. M (bit 5, $20) controls the width of the accumulator, and X (bit 4, $10) controls the width of the X and Y index registers. SEP sets flags and REP clears them, and a set flag means 8 bits. If you see SEP #$20 in a code sequence, that's the programmer setting the accumulator to 8-bit mode.
This is also where it gets tricky. If you're writing public APIs, such as kernel code, you need to restore the previous state of the registers when exiting your subroutine.
For example, in the LOCATE API, which moves the cursor around on the screen, I use the accumulator in 16-bit mode to store the memory address the cursor sits on top of. If the screen starts at $1000, and the cursor is on row 5, column 12, that means I need to add 5*80+12 to figure out which location in memory matches the character cell under the cursor.
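As a rough sketch of that arithmetic, assuming an 80-column screen at $1000 and 16-bit row and column variables in direct page, the multiply by 80 can be done with shifts, since 80 = 64 + 16. The labels SCREEN_BASE, CURSOR_ROW, CURSOR_COL, CURSOR_ADDR and TEMP are made up for illustration and are not the kernel's actual names.

```
        .cpu "65816"
SCREEN_BASE = $1000     ; assumed start of screen memory
CURSOR_ROW  = $10       ; hypothetical 16-bit direct page variables
CURSOR_COL  = $12
CURSOR_ADDR = $14
TEMP        = $16

        * = $2000
        REP #$30        ; A, X, Y to 16 bits
        .al             ; tell the assembler A is 16 bits wide
        .xl             ; ...and X/Y as well
        LDA CURSOR_ROW  ; e.g. 5
        ASL A
        ASL A
        ASL A
        ASL A           ; A = row * 16
        STA TEMP
        ASL A
        ASL A           ; A = row * 64
        CLC
        ADC TEMP        ; A = row * 80
        CLC
        ADC CURSOR_COL  ; A = row * 80 + column
        CLC
        ADC #SCREEN_BASE ; A = address of the character cell
        STA CURSOR_ADDR
```

For row 5, column 12 this works out to $1000 + 412 = $119C.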
But here's the thing... when in 16-bit mode, the CPU reads and writes two bytes when using the LDA and STA instructions. Since I'm working with byte-oriented screen memory, I need to set A to 8-bit mode so I can write a single character to the screen. Oh, and did I mention I need to remember what A was before I started the LOCATE routine?
That's where the stack comes in. You can store the value of A on the stack, then pull it back out later. But here's the important part: when A is set to 8 bit mode, only one byte gets pushed onto the stack. And when it's in 16-bit mode, two bytes get pushed. And you don't always know the state of A when you start.
So the key is to remember the width of the register, so that you can set that back when you're done. The way to do that is to push P, the Flags, then recover it before pulling A back off again.
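To make the problem concrete, here is a deliberately broken sketch (not code from the kernel) of what happens when the width of A changes between the push and the pull:

```
        .cpu "65816"
        * = $2000
        REP #$20        ; A is 16 bits wide
        .al
        LDA #$1234
        PHA             ; pushes two bytes: $12, then $34
        SEP #$20        ; A is now 8 bits wide
        .as
        PLA             ; pulls only one byte ($34); $12 is still on the stack
        RTS             ; BUG: the return address is now read from the wrong place
```

The stray byte left on the stack shifts everything pulled after it, including the return address.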
Enter code pattern 1: Preserving Registers
When entering a subroutine, preserve registers that you intend to modify. You can do this by placing them on the stack. And any time you push an unknown value onto the stack, push the flags on afterward. When you reach the end of your routine, pull the flags back off first, then the registers you saved. This ensures that the flags are in the state they were in before your subroutine was called.
Here's one way to preserve the registers.
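PHA ; Save A (one or two bytes, depending on the M flag)
PHX ; Save X (one or two bytes, depending on the X flag)
PHY ; Save Y
PHP ; Save the flags last, so they record the widths used above
; do stuff
PLP ; Restore the flags first, so the widths match the pushes
PLY
PLX
PLA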
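However, it's missing something... when you switch A to 8-bit mode, the top 8 bits are actually preserved in the register (this hidden half is called B). They're just not used most of the time, so if A is in 8-bit mode when the routine is called, only the low byte gets saved.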
So here's another pattern to consider:
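PHP ; Preserve the flags
REP #$30 ; Set all registers to 16 bit mode
PHA ; Push all 16 bits of A, including the hidden B byte
PHX
PHY
; do stuff
REP #$30 ; Back to 16 bits so the pulls match the pushes
PLY
PLX
PLA ; Retrieve A
PLP ; Retrieve the flags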
This captures the full accumulator, making sure to grab the hidden B value.
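If the hidden B byte seems abstract, this small sketch (illustrative values only) shows it surviving an 8-bit load. XBA swaps the two halves of the accumulator, so it can be used to get at the hidden byte while in 8-bit mode:

```
        .cpu "65816"
        * = $2000
        REP #$20        ; A is 16 bits wide
        .al
        LDA #$AB12      ; full accumulator = $AB12
        SEP #$20        ; A is 8 bits wide; the high byte ($AB) is now hidden in B
        .as
        LDA #$34        ; only the low byte changes: full accumulator = $AB34
        XBA             ; swap the halves: A = $AB, B = $34
```

A routine that saves A with an 8-bit PHA and then does a 16-bit load would silently destroy that $AB for the caller.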
Should you use the second pattern? It's up to you. There are arguments to be made for brevity and simplicity vs completeness and preserving calling programs' data. I'd love to hear what experienced '816 developers have to say.
QUOTE:"But here's the thing... when in 16-bit mode, the CPU reads and writes two bytes when using the LDA and STA instructions. Since I'm working with byte-oriented screen memory, I need to set A to 8-bit mode so I can write a single character to the screen. Oh, and did I mention I need to remember what A was before I started the LOCATE routine? "
Can't you just PAD the upper 8 bits with ZEROs ?? $0030h instead of $30h ?
Dan
For right now, I'm just going to start assembly with 64TASS to assemble for my C64 not the Foenix. This will familiarize me with the assembler ONLY. I suppose 2 (TWO) .BYT commands at the beginning of my code will give the necessary address for my C64. I have only worked with 3 assemblers over the past 40 years...TRS-80 Z80, PAL for the C64 and MPASM for the PIC 16F877 AND PIC 16F887 PIC CHIP.
DAN
So my entire FastFingers program is written using Brad Templeton's PAL. I'm going to try and import the source into 64TASS and continue assembling with that to get familiar with this Assembler before moving to the 65C816. It will be easy enough to move the machine code back to my C64 because I have Jim Brain's sdcard reader, BUT there is a funky thing about the Commodore disk system. The first 2 bytes of the chunk that I'm going to load must contain the DESTINATION ADDRESS, so that the code gets loaded into the proper place in memory.
So 2 questions
1) Will 64TASS do that automatically ?
2) Will that even be an issue with the Foenix ?
Dan
Stef, you're right about the shipped RAM size, and that's why I don't consider it such a big deal. My thought was that the system should be able to accommodate the full 16MB if the user should choose to install more RAM. One idea I mentioned in another post was to have a context switcher in the KERNAL and allow multiple relocatable programs to be loaded at once. This wouldn't be the same thing as a full pre-emptive or cooperative multitasking OS, but rather more like being able to switch among several single-tasking Foenix machines. A flash memory would serve just as well for that purpose. (The Model 100 accomplished something similar with a NiCad battery to preserve the RAM contents.)
Not sure how you would map an SD card into the CPU's address space, as I was under the impression SD behaves more like a disk drive than RAM. You guys undoubtedly know more about that than I do.
Hi, guys!
Believe it or not, I was studying 65C816 assembly language with an eye towards building a similar system when I stumbled upon this project. I catch the occasional 8-Bit Guy video from time to time, but his mentioning a C-64 successor "dream computer" evidently slipped right past me. I won't nitpick the specs too much, except to say I agree with those who think this machine should be able to use the full 16MB, even if it's not installed by default. Oh, and I'd put it all in one case with a clicky switch keyboard, like an old school home computer. Other than that, I'd say you guys are on the right track as far as the feature set.
That's not why I'm here though. I, too, am having a devil of a time keeping all of the 816's native mode features straight. While Eyes and Lichty adequately explain things in their book, there's quite a lot of information to juggle in my head, and I'm hoping to study some more fully developed assembly routines than the examples they provide. Who knows, I may even get good enough to contribute to the Foenix development!
Got a question about the motherboard. There's a big green device that looks like it may be a rechargeable battery at the front righthand corner. Are you guys taking a page from the TRS-80 Model 100 and using a battery-backed SRAM to preserve user files? That was something I was thinking about doing with the excess addressable memory space in my design.
Regards,
James
Note to self. When the computer boots up, the screen has 0 columns and 0 rows. If you call LOCATE to set the position of the cursor, that routine tries to scroll the screen up one row when the cursor position is below the last row of the screen.
Since the screen has 0 rows, the cursor will always be below the last row of the screen.
This is how you create infinite loops.
Ugh. This bit me again this weekend. I made a mistake and did my PHP first and PLP last. I kept pulling the wrong address off the stack and RTLing to the wrong address. I'm going to strongly suggest people consistently stick to the first pattern and just forget the hidden "B" register exists. Push P last and Pull P First. Unless there's a reason to preserve the top of A, PHP last/PLP first is the pattern I'm using everywhere.
And for our next topic:
Fun with Banks...
The 65816 can access 16 megabytes of RAM due to its 24-bit address bus. However, the internal address registers are still only 16 bits wide. The 816 handles this through the use of bank registers. There are three Bank registers on the 65816 CPU:
The Data Bank Register, aka DBR or B (8 bits)
The Program Bank Register, aka PBR or K (8 bits)
The Direct Page Register, aka DP or D (16 bits)
Normally, we don't get to directly address any of these registers. In fact, the only way to load the Data Bank register is through the stack.
This example loads the accumulator with $F8 (the first bank of the kernel ROM), pushes it to the stack, then pulls that back out to the Data Bank:
SEP #$20 ; don't forget A has to be set to 8 bits
LDA #$F8 ; Set the bank number
PHA ; Push it to the stack
PLB ; And finally pull it to the actual bank
Like so many other commonly repeated tasks, I've written this into a macro: "setdbr".
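As a rough sketch, such a macro could look like the following in 64TASS syntax. The body is an assumption about its shape based on the sequence above, not the actual kernel macro. Note that this version clobbers A and leaves it in 8-bit mode.

```
        .cpu "65816"
setdbr  .macro          ; hypothetical body; \1 is the bank number
        SEP #$20        ; A to 8 bits
        .as             ; tell the assembler A is 8 bits wide
        LDA #\1         ; bank number
        PHA             ; push it to the stack
        PLB             ; pull it into the Data Bank register
        .endm

        * = $2000
        #setdbr $F8     ; Data Bank = $F8
```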
The Direct Page register is easier to access, as it has dedicated Transfer instructions:
TCD, TDC
TCD transfers the 16-bit Accumulator (also known as "C") to the Direct Page register. Since the keyboard buffer lives on the page starting at $0F00, I use the following code to initialize D before reading the keyboard buffer:
PHD ; remember the state of D so we can get back to it later
REP #$30 ; sets A, X, Y to 16 bits
LDA #KEY_BUFFER ; keyboard buffer location.
TCD ; This actually sets the Direct page register
    ; to the 16-bit value in the accumulator.
; ... read from the buffer ....
PLD ; finally, switch back to the original direct page
Finally... the Program Bank (K). The good news is that changing K is very simple. Far JMP and JSR instructions always set the program bank. That's the purpose, in fact. The key take-away from this is that a far JSR (also called a JSL) must always be paired with a far return, or RTL.
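The reason for the pairing is the stack again: JSL pushes three bytes (the program bank plus a 16-bit return address), and only RTL pulls all three back. A minimal sketch, with MY_ROUTINE as a made-up label:

```
        .cpu "65816"
        * = $2000
        JSL MY_ROUTINE  ; pushes 3 bytes: program bank, then 16-bit return address
        ; execution resumes here after the RTL
        BRK

MY_ROUTINE              ; hypothetical subroutine
        NOP             ; ... do the work ...
        RTL             ; pulls all 3 bytes back; an RTS would pull only 2
```

Returning with RTS from a routine called by JSL would leave the bank byte on the stack and never restore K.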
As a consequence of this, I've decided to write all of my kernel subroutines to end with an RTL. This means you should always use JSL to call one of the public ROM subroutines. For example:
JSL GETCHW ; waits for and returns the next keystroke
This highlights something else worth mentioning: we are including a memory map for the entire system as a set of assembly include files, so everyone can (and should) use constants and named labels, like KEY_BUFFER or GETCHW.
Having fun are we?
Make sure to get those screen routines ready as soon as possible, you never know when they could become useful! ;o)
Cheers!
Stefany