Some thoughts on a ROM BASIC interpreter for this system.
I grew up coding Commodore BASIC, and I'd love to keep the same keywords used in CBM BASIC, but also add some of the techniques and keywords later introduced in QuickBASIC and Visual Basic.
The biggest thing I'd love to see is structured BASIC. This is a generic term that covers BASIC that does not use line numbers and which includes named subroutines and structures like WHILE/WEND, DO/LOOP, and IF/END IF. The other thing that really needs consideration is user-defined types, aka structures and objects.
I'd also strongly encourage the use of long variable names and a requirement to DIM variables; this would naturally also mean including the concept of variable scope, something that's not present in line-numbered BASIC.
To give you an idea of what a structured BASIC program might look like:
list
sub Hello(Name as String)
print "Hello " & Name
end sub
function GetName() as String
dim name as string
input "What is your name?", name
return name
end function
sub main()
dim name as string
name=GetName()
Hello name
end sub
Ready.
run
What is your name? Tom
Hello Tom
Ready.
█
I'm thinking a full screen editor is also required... that's not hard to do, and having built in text editing should be useful for other applications, too.
Obviously, CBM BASIC was horribly slow; I can think of some ways to speed that up. One of those things is real time bytecode compilation, with all portions of a BASIC program compiled to binaries. A variable-name table would convert long variable names to 16-bit numbers, eliminating the need for the system to parse and look up variable names in real time.
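A minimal Python sketch of the variable-name table idea, just to make it concrete. Everything here is an assumption for illustration: the class name `VariableNameTable`, the case-insensitive matching, and the little-endian two-byte encoding are not part of any existing design.

```python
class VariableNameTable:
    """Maps long variable names to compact 16-bit IDs when a line is entered."""

    MAX_IDS = 0x10000  # a 16-bit ID space: 0..65535

    def __init__(self):
        self._ids = {}    # name -> id
        self._names = []  # id -> name, kept so LIST can show the original name

    def intern(self, name):
        """Return the ID for `name`, allocating a new one on first use."""
        key = name.upper()  # assumption: variable names are case-insensitive
        if key in self._ids:
            return self._ids[key]
        if len(self._names) >= self.MAX_IDS:
            raise OverflowError("variable-name table is full")
        self._ids[key] = len(self._names)
        self._names.append(key)
        return self._ids[key]

    def name_of(self, var_id):
        """Reverse lookup, used when listing the program."""
        return self._names[var_id]


def encode_id(var_id):
    """Two-byte little-endian form, as the ID might be stored in the bytecode."""
    return var_id.to_bytes(2, "little")


table = VariableNameTable()
score_id = table.intern("playerScore")
lives_id = table.intern("livesRemaining")
```

At run time the interpreter would then index straight into a value array with the ID instead of comparing name strings; the name lookup happens only once, when the line is typed in.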
I'd also introduce block I/O operations, extending GET# with a block size and adding PUT# (similar to PRINT# but without the intrinsic carriage return at the end).
Some 25 years ago I used a C64 to learn programming. Before learning assembler I used BASIC: the built-in BASIC V2, Ultrabasic 64, some hacked version of Simon's Basic, and (at the end) Warsaw Basic 3.2 (a Polish BASIC extension; version 3.2 was released in 1991 and was quite powerful, but was probably not distributed outside our country). My 0.02 USD regarding BASIC (let's call it my dream Commodore BASIC) - please don't consider it a 'wishlist', this is just an idea:
0. Goals:
- reuse as much as possible from existing BASIC V2 keywords and general interpreter architecture concept
- reasonable compatibility with BASIC V2
- extend BASIC with features from BASIC V7, BASIC interpreters from other 8-bit machines, and BASIC extensions from the 1980s (possibly early 1990s), but designed a little bit better; the authors of these extensions had to 'tap into' the existing BASIC interpreter mechanisms, whereas any Commodore employee wanting to extend BASIC could simply have modified the interpreter's internal architecture.
1. For compatibility purposes, I think we should keep line numbers. We can have an 'autonumber' feature in the screen editor, and keyboard shortcuts to insert a new line before the current one, renumbering the following part of the program where needed (for example, we are on line 110, the previous line is 109, and the user decides to insert a line before 110 by pressing some keyboard shortcut). This would keep the spirit of the old system, while modernizing it a bit.
2. Extensions should include longer variable names (already mentioned), modulo operator, native hex/bin support (present in many extensions of the era in different forms), PRINT USING directive, IF ... THEN ... ELSE, ON ERROR GOSUB ..., REPEAT ... UNTIL, DOS Wedge ('@'), etc.
3. We have to be reasonable - DLOAD might have been nice on the C128, but on the C256 (where we will probably have RAM disks, SD cards, probably a floppy drive, possibly some network storage, and who knows what else, and no tape at all) it would be just plain stupid.
4. I like the idea of defining variables by specifying their types - but for compatibility reasons I would assume VA is a float, VA% is an integer and VA$ is a string (we can consider undefined variables as deprecated, though). The number of variable types should be limited - STRING, INTEGER (24 bit, to match the addressing capability of the CPU), LIBRARY (my idea, more on this later), maybe BYTE, arrays (an array of bytes might be especially useful - for sprite blobs, etc.)
5. Some way of structuring code is a must if you are planning 1MB of BASIC memory; both Simon's Basic and Warsaw Basic had it (and probably several other extensions too). But I think Tom went too far - there shouldn't be any Main; on the more modern Amiga there was no Main either, and the first symbol was the main one. Besides... let me propose a different syntax for subroutines (this is what they were called in BASIC back then):
100 SUB checkStrLen, str AS STRING, num AS INTEGER, ret AS INTEGER REF
101 IF LEN(str) > num THEN ret = 0 ELSE ret = 1
102 RETURN
...
103 DIM result AS INTEGER
104 GOSUB checkStrLen, "Snaffucate me!", 5, result REF
Short, reuses existing keywords and behavior, does not extend the grammar too much (Warsaw Basic used the pound sign to mark variables passed as parameters, but that is probably not the best idea), and shouldn't be hard to implement even with the BASIC V2 interpreter architecture. There is no "END SUB" or anything like that - it would take parser complexity to a whole new level, and we don't want that. To simplify the interpreter implementation we can limit the number of variables passed by reference to some sane value (let's say 4).
In my opinion GOTO / GOSUB line number should be allowed, but considered deprecated.
6. In the previous example, on line 104 the interpreter has to search for checkStrLen, which hurts performance. Later C256 BASIC revisions could have a dedicated BASIC area holding a subroutine name cache. Alternatively, each line that starts a subroutine (and also the first line of the code) could hold a vector to the next such line, to speed up subroutine lookup.
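A rough Python sketch of both lookup ideas from point 6, under stated assumptions: the `Line` record, the `next_sub` field (standing in for the "vector to the next one"), and the `cache` dictionary are all hypothetical names. The return value of `link_subs` plays the role of the vector stored in the first line of the code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    number: int
    text: str
    sub_name: Optional[str] = None  # set when the line starts a SUB
    next_sub: Optional[int] = None  # index of the next SUB line (the "vector")


def link_subs(lines):
    """Chain all SUB lines together; return the index of the first one."""
    head = None
    prev = None
    for i, line in enumerate(lines):
        if line.text.startswith("SUB "):
            # The name is everything between "SUB " and the first comma.
            line.sub_name = line.text[4:].split(",")[0].strip()
            if prev is None:
                head = i
            else:
                lines[prev].next_sub = i
            prev = i
    return head


def find_sub(lines, head, name, cache):
    """Look up a SUB by name: cache first, then walk the chain of vectors."""
    if name in cache:
        return cache[name]
    i = head
    while i is not None:
        if lines[i].sub_name == name:
            cache[name] = i  # remember it for the next GOSUB
            return i
        i = lines[i].next_sub
    raise LookupError("undefined subroutine: " + name)


program = [
    Line(100, "SUB checkStrLen, str AS STRING, num AS INTEGER, ret AS INTEGER REF"),
    Line(101, "IF LEN(str) > num THEN ret = 0 ELSE ret = 1"),
    Line(102, "RETURN"),
    Line(103, "SUB other"),
    Line(104, "RETURN"),
]
head = link_subs(program)
cache = {}
target = find_sub(program, head, "other", cache)
```

With the chain alone, a lookup only visits lines that start a subroutine rather than every line; the cache then turns repeated calls into a single table hit. The chain would have to be rebuilt whenever a SUB line is added, removed, or moved.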
7. Simon's Basic used to have LOCAL and GLOBAL keywords; I don't remember the usage, but my proposal for C256 BASIC would be:
DIM globeVar AS INTEGER GLOBAL
defines a global variable - it goes to a separate 'global' area of variables. Afterwards, any time the interpreter wants to access a variable, it searches:
- the variables passed to the current subroutine by reference (they can have different names than the original ones)
- local variables (part of the current subroutine context)
- global variables (common context)
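The search order above can be sketched in Python as follows. This is only an illustration under assumptions: the `Ref` class, the `resolve` function, and the use of dictionaries as variable areas are invented for the example.

```python
class Ref:
    """A by-reference parameter: an alias for a variable in another context."""

    def __init__(self, store, name):
        self.store = store  # the variable area that owns the real variable
        self.name = name    # its name in that area (may differ from the alias)


def resolve(name, refs, local_vars, global_vars):
    """Return (store, key) for `name`: refs first, then locals, then globals."""
    if name in refs:
        ref = refs[name]
        return ref.store, ref.name
    if name in local_vars:
        return local_vars, name
    if name in global_vars:
        return global_vars, name
    raise NameError("undefined variable: " + name)


# Caller context: 'result' is passed by reference under the name 'ret'.
caller_locals = {"result": -1}
global_vars = {"globeVar": 7, "ret": 99}  # a global 'ret' that must be shadowed
refs = {"ret": Ref(caller_locals, "result")}
local_vars = {"str": "Snaffucate me!", "num": 5}

# Equivalent of: IF LEN(str) > num THEN ret = 0 ELSE ret = 1
store, key = resolve("ret", refs, local_vars, global_vars)
store[key] = 0 if len(local_vars["str"]) > local_vars["num"] else 1
```

The write lands in the caller's `result`, and the global `ret` is untouched because by-reference parameters are searched before globals.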
8. Warsaw Basic 3.2 used to have one unique feature: it could store external procedures in separate files on the floppy disk, and I would really like to have this feature in my dream BASIC. The mechanism was a little bit awkward, but - as I mentioned earlier - the authors (Krzysztof Gajewski and Bogusław Radziszewski) had to 'tap into' the existing interpreter, manipulating vectors in a clever way to achieve the desired result. My proposal:
REM file on a disk, let's call it FOOBAR.LIB - only public subroutines are visible outside
100 PUBLIC SUB checkStrLen, str AS STRING, num AS INTEGER, ret AS INTEGER REF
101 IF LEN(str) > num THEN ret = 0 ELSE ret = 1
102 RETURN
REM now the main program
100 DIM fooBar AS LIBRARY: DIM result AS INTEGER
101 LOAD fooBar, "FOOBAR.LIB", 8
102 GOSUB fooBar.checkStrLen, "Shaffucate Me!", 5, result REF
Upon encountering a LOAD directive with a LIBRARY variable, the interpreter loads the code from the file on device 8 into a variable of type LIBRARY. Upon encountering GOSUB, it searches the library code for the checkStrLen public (!) procedure - if found, the code from the library is called (more on this later).
8. I think we need a nice way to combine BASIC with assembler - my proposal: FOOBAR.LIB could be either BASIC code, or position-independent assembler code (we might develop some simple file format, with an identifier to distinguish it from BASIC and a BSS section to save storage space). GOSUB should provide the assembler code with a vector pointing to the beginning of checkStrLen; by calling a well-defined interface, the code should be able to retrieve the parameters, starting from the method name. Something like SYS 49152, 7, 34 on a C64, where the assembler routine called back into BASIC to get the parameters.
9. The BASIC interpreter should be split into 2 parts:
- the core - BASIC V2 with the extensions which are mostly machine-independent; this way the 8-Bit Guy team could reuse C256 interpreter (and participate in the development...) - all tokens would be 1 byte here
- extensions providing additional commands (graphics, sound, etc.); each token would consist of 2 bytes: the extension number (1st byte, with one value reserved for the in-ROM machine-specific extension) and the internal extension token. Additional extensions could be added by the user in the same way (each of them could add up to 256 tokens, and multiple extensions could be installed at the same time).
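A small Python sketch of how one-byte core tokens and two-byte extension tokens could coexist in a tokenized line. The byte ranges are purely an assumption made for the example (core tokens from 0x80 upwards, bytes 0xF0..0xFF acting as extension-number prefixes, which would cap the number of extensions at 16); the post does not specify how the first byte is told apart from a core token.

```python
CORE_TOKEN_FIRST = 0x80  # assumption: core tokens have the high bit set
EXT_PREFIX_FIRST = 0xF0  # assumption: 0xF0..0xFF select an extension


def encode_ext(ext_no, token):
    """Encode an extension token as two bytes: prefix, then internal token."""
    if not 0 <= ext_no <= 0xFF - EXT_PREFIX_FIRST:
        raise ValueError("extension number out of range")
    if not 0 <= token <= 0xFF:
        raise ValueError("internal token out of range")
    return bytes([EXT_PREFIX_FIRST + ext_no, token])


def decode(line):
    """Split a tokenized line into core tokens, extension tokens, and text."""
    out = []
    i = 0
    while i < len(line):
        b = line[i]
        if b >= EXT_PREFIX_FIRST:
            if i + 1 >= len(line):
                raise ValueError("truncated extension token")
            out.append(("ext", b - EXT_PREFIX_FIRST, line[i + 1]))
            i += 2
        elif b >= CORE_TOKEN_FIRST:
            out.append(("core", b))
            i += 1
        else:
            out.append(("char", chr(b)))
            i += 1
    return out


# One core token, a space, token 5 of extension 0, then a literal digit.
tokenized = bytes([0x99]) + b" " + encode_ext(0, 0x05) + b"1"
decoded = decode(tokenized)
```

The interpreter's dispatch loop would stay a single table jump for core tokens and pay one extra byte fetch only for extension commands.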
10. The BASIC memory organization - my proposal:
Area 0: current context GOSUB stack, vectors to start/end of other areas, etc. - in a predefined place in memory, to increase performance
Area 1: BASIC code, constant size
Area 2: local context: legacy 'GOSUB lineNumber' stack, loops (FOR, REPEAT, etc.) stack, variables (including the LIBRARY type variable, with content in the same format as Area 1), grows upwards
Area 3: free space
Area 4: global context (global variables), grows downwards
Calling a subroutine: for the purpose of the subroutine, we create a new local context at the beginning of free space ('Area 0' is temporarily pushed onto the end of the old local context and restored during RETURN); the global context stays the same.
Calling a public subroutine of a library: we create a new local context (like before), and the library variable content becomes our new BASIC code area; there is no access to the global context (we force a well-defined interface, and providing a global context for a library could complicate the implementation too much and hurt performance, as we might be forced to relocate it if some other area grows in size; the 8-bit world had its limitations... unless we find some cheap way to implement this).
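To make the layout easier to picture, here is a toy Python model of the local context growing upwards, the global context growing downwards, and a subroutine call opening a new local context at the start of free space. The class and method names are hypothetical, and as a simplification the saved state is kept in a Python list rather than pushed onto the end of the old local context as described above.

```python
class BasicMemory:
    """Toy model: Area 1 (code), Area 2 (local, up), Area 4 (global, down)."""

    def __init__(self, code_size, total_size):
        self.local_base = code_size      # Area 2 starts right after the code
        self.local_top = code_size       # next free byte; grows upwards
        self.global_bottom = total_size  # Area 4 grows downwards from the top
        self.saved = []                  # stand-in for the pushed 'Area 0' state

    def free(self):
        """Size of Area 3, the gap between the local and global contexts."""
        return self.global_bottom - self.local_top

    def alloc_local(self, size):
        if size > self.free():
            raise MemoryError("out of memory")
        addr = self.local_top
        self.local_top += size
        return addr

    def alloc_global(self, size):
        if size > self.free():
            raise MemoryError("out of memory")
        self.global_bottom -= size
        return self.global_bottom

    def call(self):
        """Enter a subroutine: the new local context starts at free space."""
        self.saved.append(self.local_base)
        self.local_base = self.local_top

    def ret(self):
        """RETURN: discard the callee's locals and restore the old context."""
        self.local_top = self.local_base
        self.local_base = self.saved.pop()


mem = BasicMemory(code_size=0x1000, total_size=0x10000)
a = mem.alloc_local(16)    # variable in the main context
g = mem.alloc_global(32)   # DIM ... GLOBAL
mem.call()
b = mem.alloc_local(8)     # local variable inside the subroutine
mem.ret()
```

Because the two contexts grow towards each other, neither needs a fixed size; "out of memory" is simply the point where they would meet.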
[edit1] @team, which W65C816 assembler are you currently using? [edit2] Do we already have some idea for an interface between the screen editor and BASIC/monitor/etc.?
[edit3] Looked into the 65816 assembly spec... I'm not yet sure, but it might be much easier to implement the interpreter if each of these areas (with the exception of free space) were limited in size to 64KB. We could then use a little bit of self-modifying code to read/write data from each of the areas quickly; I mean using the Absolute Long Indexed, X addressing mode (as far as I can tell there is no long form indexed by Y), where we modify the absolute address of the relevant instruction(s) while switching the area. But it would be nice to move the code of the libraries loaded from the disk out of the variables area, so that it wouldn't be affected by this limit. It's not as easy to design as it seems :)