Some thoughts on a ROM BASIC interpreter for this system.
I grew up coding Commodore BASIC, and I'd love to keep the same keywords used in CBM BASIC, but also add some of the techniques and keywords later introduced in Quick BASIC and Visual BASIC.
The biggest thing I'd love to see is structured BASIC. This is a generic term that covers BASIC that does not use line numbers and which includes named subroutines and structures like WHILE/WEND, DO/LOOP, and IF/END IF. The other thing that really needs consideration is user defined types, aka structures and objects.
I'd also strongly encourage the use of long variable names and a requirement to DIM variables; this would naturally also mean including the concept of variable scope, something that's not present in line-numbered BASIC.
To give you an idea of what a structured BASIC program might look like:
list
sub Hello(Name as String)
    print "Hello " & Name
end sub
function GetName() as String
    dim name as string
    input "What is your name?", name
    return name
end function
sub main()
    dim name as string
    name=GetName()
    Hello name
end sub
Ready.
run
What is your name? Tom
Hello Tom
Ready.
█
I'm thinking a full screen editor is also required... that's not hard to do, and having built in text editing should be useful for other applications, too.
Obviously, CBM BASIC was horribly slow; I can think of some ways to speed that up. One of those things is real time bytecode compilation, with all portions of a BASIC program compiled to binaries. A variable-name table would convert long variable names to 16-bit numbers, eliminating the need for the system to parse and look up variable names in real time.
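To make the variable-name table idea concrete, here is a minimal sketch in Python (used only for illustration; the real thing would be 65816 code). The class name, the case-insensitive matching, and the little-endian byte order are all my assumptions, not part of the proposal.

```python
class VariableTable:
    """Maps long variable names to compact 16-bit IDs (hypothetical layout)."""

    MAX_IDS = 0x10000  # size of the 16-bit ID space

    def __init__(self):
        self._ids = {}    # name -> id, used while compiling
        self._names = []  # id -> name, used when listing the program

    def intern(self, name):
        key = name.upper()  # assumption: names are case-insensitive
        if key in self._ids:
            return self._ids[key]
        if len(self._names) >= self.MAX_IDS:
            raise OverflowError("variable table full")
        var_id = len(self._names)
        self._ids[key] = var_id
        self._names.append(key)
        return var_id

    def name_of(self, var_id):
        return self._names[var_id]


table = VariableTable()
first = table.intern("playerScore")
second = table.intern("FREDERICK")
again = table.intern("PLAYERSCORE")       # same variable, different case
encoded = second.to_bytes(2, "little")    # assumption: little-endian storage
```

At run time the interpreter would only ever see the two-byte ID, so the length of the name costs nothing after the first parse.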
I'd also introduce block I/O operations, extending GET# with a block size and adding PUT# (similar to PRINT# but without the intrinsic carriage return at the end).
Some 25 years ago I used a C64 to learn programming. Before learning assembler I used BASIC: the built-in BASIC V2, Ultrabasic 64, some hacked version of Simon's Basic, and (at the end) Warsaw Basic 3.2 (a Polish BASIC extension; version 3.2 was released in 1991 and was quite powerful, but it was probably not distributed outside our country). Here is my 0.02 USD regarding BASIC (let's call it my dream Commodore BASIC) - please don't consider it a 'wishlist', this is just an idea:
0. Goals:
- reuse as much as possible from existing BASIC V2 keywords and general interpreter architecture concept
- reasonable compatibility with BASIC V2
- extend BASIC with features from BASIC V7, BASIC interpreters from other 8 bit machines, and BASIC extensions from the 1980s (possibly early 1990s), but designed a little bit better; the authors of these extensions had to 'tap into' existing BASIC interpreter mechanisms, whereas any Commodore employee wanting to extend BASIC could just modify the BASIC internal architecture.
1. For compatibility purposes, I think we should keep line numbers. We can have an 'autonumber' feature in the screen editor, and a keyboard shortcut that inserts a new line before the current one, renumbering part of the program when no number is free (for example, we are on line 110, the previous line is 109, and the user decides to insert a line before 110). This would keep the spirit of the old system, while modernizing it a bit.
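The insert-with-renumbering shortcut could behave roughly like the Python sketch below. The function name `insert_before`, the step of 10, and the midpoint rule are my assumptions; a real editor would also have to patch GOTO/GOSUB targets that point into the renumbered tail, which this sketch does not do.

```python
def insert_before(program, target, text, step=10):
    """Insert `text` before line `target` in a sorted list of (number, text)."""
    numbers = [number for number, _ in program]
    idx = numbers.index(target)
    prev = numbers[idx - 1] if idx > 0 else 0
    if target - prev > 1:
        # A free number exists: use the midpoint and touch nothing else.
        new_number = prev + (target - prev) // 2
        return program[:idx] + [(new_number, text)] + program[idx:]
    # No free number: the new line takes the target's number and
    # everything from the target onwards is renumbered.
    tail = [text] + [line for _, line in program[idx:]]
    renumbered = [(target + i * step, line) for i, line in enumerate(tail)]
    return program[:idx] + renumbered


program = [(100, "A"), (109, "B"), (110, "C"), (120, "D")]
squeezed = insert_before(program, 110, "NEW")  # 109 and 110 are adjacent
midpoint = insert_before(program, 109, "X")    # room between 100 and 109
```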
2. Extensions should include longer variable names (already mentioned), modulo operator, native hex/bin support (present in many extensions of the era in different forms), PRINT USING directive, IF ... THEN ... ELSE, ON ERROR GOSUB ..., REPEAT ... UNTIL, DOS Wedge ('@'), etc.
3. We have to be reasonable - DLOAD might have been nice on C128, but on C256 (when we will probably have RAM disks, SD cards, probably a floppy drive, possibly some network storage, and who knows what else, and no tape at all) would be just plain stupid.
4. I like the idea of defining variables with their types specified - but for compatibility reasons I would assume VA is float, VA% is integer and VA$ is a string (we can consider undefined variables as deprecated, though). The number of variable types should be limited - STRING, INTEGER (24 bit, to match the addressing capability of the CPU), LIBRARY (my idea, more on this later), maybe BYTE, and arrays (an array of bytes might be especially useful - for sprite blobs, etc.)
5. Some kind of structuring code is a must if you are planning 1MB of BASIC memory; both Simon's Basic and Warsaw Basic had it (several other extensions probably too). But I think Tom went too far - there shouldn't be any Main, on a more modern Amiga there was no Main either; the first symbol was the main one. Besides... let me propose a different syntax of subroutines (this is how they were called in BASIC back then):
100 SUB checkStrLen, str AS STRING, num AS INTEGER, ret AS INTEGER REF
101 IF LEN(str) > num THEN ret = 0 ELSE ret = 1
102 RETURN
...
103 DIM result AS INTEGER
104 GOSUB checkStrLen, "Snaffucate me!", 5, result REF
Short, reuses existing keywords and behavior, does not extend the grammar too much (Warsaw Basic used the pound sign for passing variables as parameters, but it is probably not the best idea), and shouldn't be hard to implement even with the BASIC V2 interpreter architecture. There is no "END SUB" or anything like that - it would take the parser complexity to a whole new level, and we don't want that. To simplify the interpreter implementation we can limit the number of variables passed by reference to some sane amount (let's say 4).
In my opinion GOTO / GOSUB line number should be allowed, but considered deprecated.
6. In the previous example, on line 104 the interpreter has to search for checkStrLen, which hurts the performance. Later C256 BASIC revisions could have some dedicated BASIC area with subroutine name cache. Or each line being a start of a subroutine (and the first line of the code also) could hold the vector to the next one, to speed-up the subroutine lookup.
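A subroutine name cache of the kind suggested here can be sketched in a few lines of Python. The function name `build_sub_cache` and the plain-text line format are assumptions made for illustration; a real interpreter would scan tokenized lines and store addresses rather than list indexes.

```python
def build_sub_cache(lines):
    """Scan the program once and map each SUB name to the index of its line."""
    cache = {}
    for index, (number, text) in enumerate(lines):
        words = text.replace(",", " ").split()
        if len(words) > 1 and words[0].upper() == "SUB":
            cache[words[1].upper()] = index
    return cache


lines = [
    (100, "SUB checkStrLen, str AS STRING, num AS INTEGER, ret AS INTEGER REF"),
    (101, "IF LEN(str) > num THEN ret = 0 ELSE ret = 1"),
    (102, "RETURN"),
    (103, "DIM result AS INTEGER"),
    (104, 'GOSUB checkStrLen, "Snaffucate me!", 5, result REF'),
]
cache = build_sub_cache(lines)
target_index = cache["CHECKSTRLEN"]  # one dictionary lookup instead of a scan
```

With the cache built once (for example at RUN), the GOSUB on line 104 no longer has to walk the program text to find line 100.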
7. Simon's Basic used to have LOCAL and GLOBAL keywords; I don't remember the usage, but my proposal for C256 BASIC would be:
DIM globeVar AS INTEGER GLOBAL
defines a global variable - it goes to a separate 'global' area of variables. Afterwards, any time the interpreter wants to access a variable, it searches:
- the variables passed to the current subroutine by reference (they can have different names than the original ones)
- local variables (part of the current subroutine context)
- global variables (common context)
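The three-step search order above can be modelled like this in Python. The `Context` class and its dictionaries are hypothetical stand-ins for the interpreter's variable areas; only the lookup order comes from the proposal.

```python
class Context:
    """One subroutine context: by-reference parameters, locals, shared globals."""

    def __init__(self, global_vars):
        self.refs = {}     # parameter name -> (owning dict, original name)
        self.locals = {}
        self.globals = global_vars

    def _resolve(self, name):
        if name in self.refs:            # 1. variables passed by reference
            return self.refs[name]
        if name in self.locals:          # 2. local variables
            return self.locals, name
        if name in self.globals:         # 3. global variables
            return self.globals, name
        raise NameError(name)

    def get(self, name):
        store, key = self._resolve(name)
        return store[key]

    def set(self, name, value):
        store, key = self._resolve(name)
        store[key] = value


shared = {"globeVar": 7}
caller = Context(shared)
caller.locals["result"] = -1

callee = Context(shared)
callee.refs["ret"] = (caller.locals, "result")  # 'ret' aliases caller's 'result'
callee.locals["num"] = 5
callee.set("ret", 1)                            # writes through to the caller
```

Note that a local with the same name as a global would shadow it, simply because locals are searched first.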
8. Warsaw Basic 3.2 used to have one unique feature: it could store external procedures in separate files on the floppy disk, and I would really like to have this feature in my dream BASIC. The mechanism was a little bit awkward, but - as I mentioned earlier, the authors (Krzysztof Gajewski and Bogusław Radziszewski) had to 'tap in' into existing interpreter, manipulating vectors in a clever way to achieve the desired result. My proposal:
REM file on a disk, let's call it FOOBAR.LIB - only public subroutines are visible outside
100 PUBLIC SUB checkStrLen, str AS STRING, num AS INTEGER, ret AS INTEGER REF
101 IF LEN(str) > num THEN ret = 0 ELSE ret = 1
102 RETURN
REM now the main program
100 DIM fooBar AS LIBRARY: DIM result AS INTEGER
101 LOAD fooBar, "FOOBAR.LIB", 8
102 GOSUB fooBar.checkStrLen, "Shaffucate Me!", 5, result REF
Upon encountering LOAD directive with LIBRARY variable, the interpreter loads the code from file on device 8 - into a variable of type LIBRARY. Upon encountering GOSUB, it searches the library code for the checkStrLen public (!) procedure - if found, code from library is called (more on this later).
8. I think we need a nice way to combine the BASIC with assembler - my proposal: FOOBAR.LIB could be either BASIC code, or a position-independent assembler code (we might develop some simple file format, with identifier to distinguish it from BASIC and a BSS section to save storage space). GOSUB should provide the assembler code with a vector pointing to the beginning of checkStrLen; by calling a well-defined interface, it should be able to retrieve the parameters, starting from method name. Something like SYS 49152, 7, 34 on a C64, where assembler routine was calling BASIC to get the parameters.
9. The BASIC interpreter should be split into 2 parts:
- the core - BASIC V2 with the extensions which are mostly machine-independent; this way the 8-Bit Guy team could reuse C256 interpreter (and participate in the development...) - all tokens would be 1 byte here
- extensions providing additional commands (graphics, sound, etc.); each token would consist of 2 bytes: extension number (1st byte, one reserved for in ROM machine-specific extension), and internal extension token, additional extensions could be added by user in the same way (each of them could add up to 256 tokens, multiple extensions could be installed at the same time).
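One way the one-byte/two-byte token split could be decoded is sketched below in Python. The proposal does not say how the first byte of an extension token is told apart from a core token, so the `$F0-$FF` prefix range is purely my assumption, as are the token values and the SPRITE/PLAY keywords.

```python
# Illustrative tables only; none of these values come from the proposal.
CORE_TOKENS = {0x80: "END", 0x81: "FOR", 0x99: "PRINT"}
EXTENSIONS = {
    0: {0x01: "SPRITE"},  # assumption: extension 0 is the in-ROM machine-specific one
    1: {0x01: "PLAY"},    # a hypothetical user-installed extension
}
EXT_BASE = 0xF0  # assumption: bytes $F0-$FF select extension 0-15


def decode(stream):
    """Turn a tokenized byte string back into a list of keywords and characters."""
    out = []
    i = 0
    while i < len(stream):
        byte = stream[i]
        if byte >= EXT_BASE:
            # Two-byte token: extension number, then the extension's own token.
            out.append(EXTENSIONS[byte - EXT_BASE][stream[i + 1]])
            i += 2
        elif byte in CORE_TOKENS:
            out.append(CORE_TOKENS[byte])
            i += 1
        else:
            out.append(chr(byte))  # plain character
            i += 1
    return out


listing = decode(bytes([0x99, 0x20, 0xF0, 0x01]))
```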
10. The BASIC memory organization - my proposal:
Area 0: current context GOSUB stack, vectors to start/end of other areas, etc. - in a predefined place in memory, to increase performance
Area 1: BASIC code, constant size
Area 2: local context: legacy 'GOSUB lineNumber' stack, loops (FOR, REPEAT, etc.) stack, variables (including the LIBRARY type variable, with content in the same format as Area 1), grows upwards
Area 3: free space
Area 4: global context (global variables), grows downwards
Calling a subroutine: for the purpose of the subroutine, we create a new local context at the beginning of free space ('Area 0' is temporarily pushed at the end of the old local context and restored during RETURN); the global context stays the same.
Calling a public subroutine of a library: we create a new local context (like before), and the library variable content becomes our new BASIC code area. There is no access to the global context (we force a well-defined interface; providing a global context for a library could complicate the implementation too much and hurt performance, since we might be forced to relocate it if some other areas grow in size. The 8-bit world had its limitations... unless we find some cheap way to implement this).
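The interplay of Areas 2 to 4 can be sketched with a small Python model. The class and method names are hypothetical, and the sketch only tracks addresses: the local context grows upwards, the global context grows downwards, and a call starts a new local context at the current start of free space.

```python
class BasicMemory:
    """Tracks the boundaries of the local, free, and global areas."""

    def __init__(self, start, end):
        self.local_top = start    # Area 2 grows upwards from here
        self.global_bottom = end  # Area 4 grows downwards from here
        self.frames = []          # where each caller's local context ended

    def free(self):
        return self.global_bottom - self.local_top  # size of Area 3

    def alloc_local(self, size):
        if size > self.free():
            raise MemoryError("out of memory")
        addr = self.local_top
        self.local_top += size
        return addr

    def alloc_global(self, size):
        if size > self.free():
            raise MemoryError("out of memory")
        self.global_bottom -= size
        return self.global_bottom

    def call(self):
        # New local context begins at the start of free space.
        self.frames.append(self.local_top)

    def ret(self):
        # RETURN discards the callee's context.
        self.local_top = self.frames.pop()


mem = BasicMemory(0x1000, 0x2000)
a = mem.alloc_local(16)
g = mem.alloc_global(32)
mem.call()
b = mem.alloc_local(8)
mem.ret()
```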
[edit1] @team, what W65C816 assembler are you currently using? [edit2] Do we already have some idea for an interface between the screen editor and BASIC/monitor/etc.?
[edit3] Looked into the 65816 assembly spec... I'm not yet sure, but it might be much easier to implement the interpreter if each of these areas (with the exception of free space) were limited in size to 64KB. We could then use a little bit of self-modifying code to read/write data from each of the areas quickly; I mean using Absolute Long Indexed X/Y addressing modes, where we modify the absolute address of the relevant instruction(s) while switching the area. But it would be nice to move the code of the libraries loaded from the disk out of the area of variables, so that it wouldn't be affected by this limit. It's not as easy to design as it seems :)
Hi friends, how are you? I'm writing a new BASIC for the Color Computer. I'm just starting out and defining all the functions and improvements. It is written in assembler, which is a very long job given the 2, 4, or 8 KB of space available. I don't know if I could help, or share things that might help you.
I like line numbers...
There, I said it! :-)
It is hard for me to imagine a "hobbyist" computer from the 1980s where you couldn't pick up a copy of Ahl's BASIC Computer Games book, do a bit of typing, and have some fun.
There were a lot of other things floating around available (Pascal, FORTH, C, etc.), but interpreted BASIC tended to be "foundational" while others were options.
I don't know if anyone has looked at EhBASIC, but it is a fairly robust 6502 implementation. I'm not sure what the licensing implications of the various flavors of BASIC would be. Lee Davison (who wrote EhBASIC) has passed away and he definitely had expressed an aversion to people commercializing his work without his permission. But it would be a solid, "independent" version of BASIC to use as a starting point.
One of the things I have been looking at is how to use an SD card with ROM images as a "bootstrap". That way I can load something like EhBASIC or drop back to something like a minimalist implementation of VTL (or Tiny BASIC) if I'm feeling nostalgic. That is probably a bit more primitive than where the C256 is headed though.
Thanks,
Jim
Old(ish) BASIC programmer here (40 years of BASIC, from mainframe types through the 6502 Apple/PET/BBC Micro types and many others too numerous to mention...).
Also someone who wrote their own "ideal" BASIC a few years ago, although mine is in C and really aimed at 32 (or 64) bit Linux systems.
Just wondering how the BASIC is progressing, really, and if anyone might be interested in some notes I have on the subject?
Cheers,
-Gordon
Hi @Tom ,
Have you considered using an open-source BASIC interpreter to ease your work? I just googled this topic and found https://www.thefreecountry.com/compilers/basic.shtml (for example, I quickly noticed Vintage Basic).
Will this C256 BASIC support commands for advanced graphics based on the hardware specs? I'm very curious to test it when it is available, to play with them and see whether it would be feasible to create a GUI toolkit :-) I recently tried to do something in assembly, but my skills are nil, and in C128 BASIC it's just too frustrating not to have a structured language and data, and it is too slow for graphics.
Tom,
It would be nice to improve on the 2-character variable-name limitation for this project... I’ve also found programming for the 64 easier when in 128 mode because of improved commands such as RENUMBER and AUTO (well, and 80-column mode too).
What improvements in math functions do you think this version of BASIC should have?
It is highly unlikely that in 1987 Commodore would abandon the use of line numbers! The Amiga's BASIC still used line numbers in the 90's. I can't imagine ever working with BASIC code that used label names in place of line numbers... it might as well be called "iC" (interpreted C) at that point. I feel BASIC should remain traditional BASIC as a ROM based utility, and have advanced BASIC or something that strays from the traditional fundamentals that BASIC has always been built on.
Just wondering...wouldn’t Commodore likely have gone with a GUI...such as GEOS... with the next gen? It would seem the logical step...perhaps as a dual/boot option?
e.g.
1. Boot BASIC
2. Boot GUI
or
press C= key to boot BASIC (like 64 mode on a 128)
defaulting to GUI with no keypress
?
....never mind on the assembler....😂. Needs a linker, debugger, monitor...leave that for software.
Yes, that’s the jump table I was referring to. Obviously, with this new system we don’t wanna be tied down to the limiting features of the classic systems. I think we should take advantage of the real power this new computer has to offer. That being said... I’m not an engineer, nor do I have the technical background most of you do... so my opinions will take on a less detailed “system user” perspective.
Anyway, I’m super excited to see the discussions on the possibilities that are available and on ways to define the specific “character” of the machine.
As for the idea of integrating a machine language monitor/text editor, as an end user I tend to like more bells and whistles...makes programming more fun. What would be the limitations on this? Is it flash/rom size? What about an integrated assembler? Or would that be opening up a can of trouble?
Evolution of BASIC 7.0 would definitely be awesome! My opinion is this was the best version on a Commodore. Enhancing it to take advantage of the Foenix would be a great next step. As to eliminating line numbers...I’m one of those who is a huge fan of “GOTO (linenumber)”, but I also see the advantages of evolving BASIC to run without them.
Not sure what commands would be implemented for all the new features...would be nice to have powerful commands taking advantage of the upgraded sound.
Oh, also, just wondering if the Foenix would continue with the “standard” jump tables?
Let me remind people that it was possible to have a structured BASIC on a C64, and that there was an even earlier structured language on the PET: Symbolic/Structured BASIC
https://telarity.com/~dan/cbm/languages.html
I believe this was ported to C64 at some point. I can also point you to DotBASIC Plus because it actually works on a C64 and it could be employed even on later C= 8 bit machines had Commodore continued the line.
http://dotbasic.cbm8bit.com/about.html
http://dotbasic.cbm8bit.com/files/pdf/dotbasic-manual.pdf
There was GeoBASIC for GEOS, and then there were other languages like Fortran, Pascal, and even Ada, as well as Forth.
All of these were available for C64.
Haha wow, I thought I was the very last VB6 developer out there. Nice to meet you! My work is boring, I do all my fun things at home. I used to do a lot of PowerPC and 8051 reversing/assembly but, it has been a long time. This sounds like a great project to get back into things.
OMG... yes, I write VB6, classic ASP, and c# code. All my assembly code is at home, for fun, on my Altair or Commodore. (I'm about to install a z80 in my Altair emulator, which is going to allow me to do more in CP/M.)
Heh, I'm working outside of my comfort zone, too. I'm a database and web developer; writing a programming language is something I've dreamed of, but never done as a complete project. So don't assume that my ideas will make the final cut or are official. I'm really hoping to see Stef get some excitement at CRX next month.
Oh, I have been writing Visual Basic 6 code everyday for the last 20 years (my work refuses to put effort into upgrading) so I'm at least very familiar with that syntax. Haha not that it will help much but, who knows.
Cool, I'm by no means even remotely qualified, but if you want to bounce ideas back and forth sometime, just let me know.
@Redline99 I don't think I mentioned that part. Bear in mind that this is mostly still a mental exercise at this point. Having said that... QuickBASIC and Visual BASIC (the definitive versions, as far as I'm concerned) pass variables by reference by default. However, you can explicitly pass a variable by value by using a modifier in the function declaration.
SUB DoThis(byval X, byref Y)
X = 3
Y = X * 2
END SUB
X=2
DoThis X,Y
---
After running this block of code, X is still 2 (because it was passed by value, which created a new instance on the stack), and Y is 6, because it was passed by reference.
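For anyone who wants to poke at the difference, the same behaviour can be imitated in Python. This is only an analogy: the `Ref` class is a hypothetical stand-in for a by-reference slot, since Python has no BYREF keyword of its own.

```python
class Ref:
    """A one-slot box standing in for a variable passed BYREF."""

    def __init__(self, value=0):
        self.value = value


def do_this(x, y):
    # x arrives as a plain number, so assigning to it only changes the
    # local copy (BYVAL). y is a shared box, so the caller sees the write.
    x = 3
    y.value = x * 2


x = 2
y = Ref()
do_this(x, y)
```

After the call, `x` is still 2 and `y.value` is 6, matching the description above.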
As to the stack frame... all variables will be the same size on the stack (probably 9 or 12 bytes). Variable-length elements (strings or complex types) will be in the string table. The string table will be garbage collected when necessary.
How can I possibly be not excited by this!
=) Don't get too excited yet. That's basically a reskin of the terminal emulator that I use to run my Altair. I changed the emulator code to handle PETSCII control characters, so the reverse text at the bottom is actually drawn by using the CHR$ codes. This will let me prototype screen output code.
Tom, in these older versions of BASIC, how do they handle passing variables to procedures "By Value or By Reference"? Your ideas sound really nice. How do you intend to implement your stack frame, especially if you have reference type variables? Maybe I didn't catch it in your post, I'll reread it. :)
Tom, this is so awesome... Can't wait to see that running on the system... So cool! You're awesome!
Also, a prototype full screen text editor
My intent is to donate this to the project for inclusion on the system ROM. So any language can be coded on the system; you'd set the compile options at the top of your main source file.
You actually have good points, and those are things I've considered in the conceptual design.
What I've got in mind is just an extension of the current system of tokenization, but with every value, including variable names, being turned into a token. The thing is, CBM BASIC (and all of the related 8 and 16 bit interpreters) don't tokenize variable names, nor do they pre-process literals. Those are stored as literal ASCII values. I'd pre-parse the entire thing down to binary code, including the variable names.
This means programmers need to declare variable names ahead of time. That's a good thing.
So a statement like FOR X=1 TO 10 would be converted to binary values like
$01 (The next byte will be a BASIC keyword)
$04 (FOR. Get it?)
$21 (the next word is an integer variable reference)
$000E (The 14th integer variable happens to be called X. It could be Y, Q, or FREDERICK.)
$11 (the next value is an integer literal)
$0001 (1)
$11 (another integer literal)
$000A (10)
$00 (end of line)
So the nice thing about this system is executing statements is much quicker. The interpreter knows routine 4 expects 3 parameters and so it reads the next 6 bytes and drops those values into the Parameter slots in Direct Page. It then calls the FOR procedure in the BASIC ROM.
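Here is a small Python sketch that emits the byte sequence listed above for FOR X=1 TO 10. The marker values ($01, $04, $21, $11, $00) are taken from the listing; the function name and the high-byte-first order of the 16-bit values are my assumptions, chosen only so the output reads the same way as the listing.

```python
KEYWORD = 0x01  # the next byte is a BASIC keyword
FOR = 0x04      # keyword number for FOR
INT_VAR = 0x21  # the next word is an integer variable reference
INT_LIT = 0x11  # the next word is an integer literal
EOL = 0x00      # end of line


def encode_for(var_id, start, stop):
    """Encode 'FOR <var> = <start> TO <stop>' as pre-parsed bytecode."""
    out = bytearray([KEYWORD, FOR])
    out += bytes([INT_VAR]) + var_id.to_bytes(2, "big")  # assumption: high byte first
    out += bytes([INT_LIT]) + start.to_bytes(2, "big")
    out += bytes([INT_LIT]) + stop.to_bytes(2, "big")
    out.append(EOL)
    return bytes(out)


code = encode_for(14, 1, 10)  # X is the 14th integer variable
```

The whole statement fits in 12 bytes, and nothing in it needs to be parsed as text at run time.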
Earlier in the code, we had to declare X with the DIM statement:
DIM X AS INT or just DIM X INT
At that point, the interpreter created a variable on the heap, assigned its type as INT ($01), and remembered its name in the string table.
Speaking of strings... strings will be stored in their own bank, the string table. String literals, Variable names, function names, label names, and REMarks will be stored in the string table. The starting values of String variables will also be stored there; if strings are later modified, a new copy will be placed on the string table and the original marked for garbage collection..... although I expect we might add a length specifier to string declarations, like so:
DIM A$ STRING * 10
This would allocate 11 bytes for A$, 10 for its value, plus one byte for the null terminator. Also, note that the $ doesn't mean "string" any more. It's just an optional part of the variable name. That could just as easily be just A.
Okay, that's a long way to get around to saying... the parser will handle scope appropriately, including being able to handle recursion and multiple uses of the same variable. Every DIM creates a distinct instance of a variable, and explicit declaration is part of that process.
So will this be compatible with CBM BASIC? Absolutely. When a CBM BASIC program is loaded the first time, its tokens will be turned back into text and then re-encoded using the Structured BASIC format. Since we know CBM BASIC doesn't expect or allow variable declarations, the interpreter will create a declaration section at the top of the program. It will also eliminate line numbers, adding labels as needed to satisfy GOTO and GOSUB statements.
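The line-number-to-label step could look something like this Python sketch. The `L<number>:` label syntax and the function name are hypothetical, and the sketch only handles plain `GOTO n` and `GOSUB n`; ON ... GOTO lists, `THEN n`, and keywords inside strings or REMs would need extra care.

```python
import re

JUMP = re.compile(r"\b(GOTO|GOSUB)\s+(\d+)")


def strip_line_numbers(lines):
    """lines: list of (number, text). Returns text lines, labelled where needed."""
    # First pass: collect every line number that is actually a jump target.
    targets = set()
    for _, text in lines:
        targets.update(int(number) for _, number in JUMP.findall(text))
    # Second pass: drop the numbers, emit labels, and rewrite the jumps.
    out = []
    for number, text in lines:
        if number in targets:
            out.append(f"L{number}:")
        out.append(JUMP.sub(lambda m: f"{m.group(1)} L{m.group(2)}", text))
    return out


converted = strip_line_numbers([(10, 'PRINT "HI"'), (20, "GOTO 10")])
```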
I am very close to my community, it is very important to me... I feel that communication and transparency are key... (to what... I don't know... ;o) ) Too much greed, dishonesty and bullshit these days.
As far as I am concerned you can come and stir things up as much as you want! At least you are engaging and I can certainly appreciate that! ;o)
Cheers!
Yes, I am going back to my schematic once again!
S
Oh, don't mind me coming in here and stirring things up! I'm sure whatever is used it will work out nicely. It's really nice that you have engagement with your community, I never expected your quick response!
Get back to that schematic!
:)
Redline,
As much as I respect Tom and his ideas (he actually got me to rethink my view of BASIC), I still need to emphasize that the ideology behind this project is to create something that could have followed the C128 in 1987. So, it... is very easy to go overboard very quickly, thinking that we can overcome the things we didn't like back then and fix them with 2018 technology.
2018 technology is everywhere, and I for one am not looking to go easy on myself just because it makes sense for 2018. Obviously, and unfortunately, there will be exceptions to that rule, and I will be the first one to break it...
What people want to do with the unit once it exists is really not my business, but what it is released with as part of the machine itself needs to be as close as possible to what Commodore could/would have decided to create after the lukewarm release of the C128, assuming that Jack had not left the company and considering the Japanese competition coming to town... Lots at stake in 1987.
So, keeping to the look and feel of the time is super important. In the best case, having a Microsoft (or compatible) BASIC version 10 or 12 or 20 (I don't really care) on boot that supports the hardware features of the Foenix would be awesome. I mean, the C816 is already 14-fold faster than a 64, plus all the DMA features, etc... I think the FOENIX BASIC would be pretty awesome! ;o)
Alright... back to the schematic!
S
Writing a recursive descent parser for the structured syntax version can be tricky because of the ambiguity it can cause for the lexer/parser. This might not be an issue in these versions that Tom mentioned because of the limited number of keywords (I'm not familiar with them). It might come down to how much compatibility you are looking for but, as a newcomer here, Tom's previous post sounds like a nice approach. I need to do some catch-up so I don't sound like such an idiot. :)
Yeah, it works like Lua or Python... The indentation doesn't actually matter, though. Instead, certain keywords (If/Then, Do, While, and For) begin blocks, and their matching keywords end them.
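Matching block keywords is essentially a stack exercise. Below is a rough Python sketch of a checker; the closing keywords (END IF, LOOP, WEND, NEXT) are borrowed from QuickBASIC, and the rule that IF only opens a block when nothing follows THEN is my assumption.

```python
CLOSERS = {"END IF": "IF", "LOOP": "DO", "WEND": "WHILE", "NEXT": "FOR"}


def check_blocks(lines):
    """Return True if every block opener has its matching closer."""
    stack = []  # (opener keyword, line number)
    for lineno, line in enumerate(lines, 1):
        text = line.strip().upper()  # indentation is ignored entirely
        closer = next(
            (c for c in CLOSERS if text == c or text.startswith(c + " ")), None
        )
        if closer:
            if not stack or stack[-1][0] != CLOSERS[closer]:
                raise SyntaxError(f"line {lineno}: unexpected {closer}")
            stack.pop()
            continue
        opener = text.split(" ", 1)[0] if text else ""
        if opener in CLOSERS.values():
            # Assumption: 'IF ... THEN <statement>' on one line is not a block.
            if opener == "IF" and not text.endswith("THEN"):
                continue
            stack.append((opener, lineno))
    if stack:
        raise SyntaxError(f"line {stack[-1][1]}: {stack[-1][0]} never closed")
    return True


ok = check_blocks([
    "for i = 1 to 3",
    "  if i = 2 then",
    "    print i",
    "  end if",
    "next i",
])
```

The indentation in the example is only there for the reader; the checker strips it before looking at the keywords.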
I actually started an editor prototype, but never got very far with it because I've been pretty loaded down with work and life events. I will have a little more free time this summer, so I plan to at least get that to the point where I can show what I'm talking about.
There's certainly no reason you can't target any version of BASIC you want. I like BASIC 7's keywords, but I'm thinking even that needs some pruning and some grafting from other flavors. Also, bulk I/O is a must - the lack of a simple "read n bytes from disk " is a huge shortcoming in C64 basic.
@Tom, also, when you have a chance, could you email me through the web site with an address where I can reach you? I have other things I would like to talk to you about off the forum. Thank you! Cheers!
S
@Tom, you know, the more I look at it, the more I think this structured BASIC could actually be cool with the full screen editor. I never put much thought into how I would go about writing a BASIC interpreter, but I agree that the line numbers would be a pain in the ass. So this structured BASIC is BASIC but works like Python, with the indentation as the delimiter. Could we keep the same CBM vocabulary, plus more specialized instructions? I also like the embedded text editor.
CBM BASIC is slow, but the power this system will have should mitigate that somewhat. I realize you are probably after a "clean" ROM for BASIC, but these are just thoughts from someone who used CBM BASIC V2 through V7. The C128 had an excellent BASIC, as did the Plus/4 to a degree. Compatibility with those keywords would be an awesome thing.