Let's Make Robots!

Is there interest in a new BASIC compiler for AVRs?

I want to find out how much interest there would be in a new BASIC compiler for AVR processors.  I have been working on one in bits and pieces for about a year and a half.  I would like to get motivated to complete it, so I want to know if others would be interested in it.  This is a VERY large project, so it will take a lot of time.  It will be about 10,000 lines of code for the first useable release. If I really get going on it, I think I could have an initial beta release before fall. My motivation level will be directly proportional to interest.  I figure LMRians are the best target audience.  Below are some details.  I would appreciate comments for or against.

Specifics of the project:

1.  A powerful, useful, yet simple to use and learn compiled language that runs fast.

2. Have true functions and procedures and local variables with good structure, like a "professional" language, but keeping the flavor of BASIC. Not like early microcomputer BASIC or PICBASIC or PICAXE BASIC, but not C or C++, either.  In between.

3.  Designed specifically from the ground up for small, embedded systems.  Fast, compact code and direct hardware access are primary goals.

4.  Easily extensible for new features.  Easily portable to other processors ( ARM, maybe PIC).

5.  Free and probably open source, but I would maintain control of what goes into the official version.  Feedback from users would be the main development criterion.

6.  Command line only to start, eventually integrated into some kind of IDE ( Eclipse, Arduino, or something similar).

7.  Eventually, a PC version (compiled or interpreted) so you could build and test your programs without having to download them.  It would provide much better debugging for a lot of code while behaving as close to the target as possible.  It might also be useful on its own.

8.  Run on Linux (first) and Windows.  I don't have a Mac so that would have to be further down the road, but a Linux version should port easily.


If you are interested, please let me know your experience/ability level and whether you would be able/willing to use a command line version as an early tester.  If you think it is a bad idea, please tell me why.  I have most of the design work done, and if there is interest in it I will post more details about the language itself for comments, suggestions, and other feedback.  Doesn't mean I will implement your suggestions, but I will certainly listen.

Anyway, just fishing for interest right now.  Let me know what you think and if you have any questions, please ask.


Over the years I have written a lot of compilers and interpreters for both large and small systems. It was always fun and I would learn something with each write. Here are some of the things I have learned and will pass them on for whatever it's worth.

Don't completely re-invent the wheel each time. Use as many of the tools at your disposal as you can (e.g. yacc, bison, lex, etc.). These are really helpful on the compiler side.

For small target systems especially, do all you can at compile time. Investing time in a good bytecode-interpreter type of system is well worthwhile if you plan to generate non-native target code.

Add the least processing you can to the target. (I know I'm kind of stressing this point.)

If you don't go the byte compiler and interpreter route, then a code generator/translator is a great way of doing it. This route also helps with things like data types and more complex operations like floating-point operations (and expressions in general). You could easily fork the Arduino IDE and place your translator on top to generate the C/C++, then use the normally provided AVR toolchain to compile and load to the target device. This would probably be the approach I would take for a system like Arduino, which has a good open tool set to build on.
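To make the translator idea a bit more concrete, here is a very rough sketch in C++. The two-statement dialect (LET and PRINT) and the function name translateLine are invented purely for illustration; nothing about the actual language has been posted yet, and a real translator would parse expressions properly rather than copy them through as text.

```cpp
#include <sstream>
#include <string>

// Hypothetical toy translator: turns one line of an invented BASIC-like
// dialect into one line of C. The LET/PRINT syntax is an assumption made
// for this sketch, not the syntax of the language discussed in the thread.
std::string translateLine(const std::string& line) {
    std::istringstream in(line);
    std::string keyword;
    in >> keyword;                       // first word decides the statement

    std::string rest;
    std::getline(in, rest);              // everything after the keyword
    std::size_t start = rest.find_first_not_of(' ');
    rest = (start == std::string::npos) ? "" : rest.substr(start);

    if (keyword == "LET") {
        // LET x = 1 + 2  ->  x = 1 + 2;   (variable assumed declared elsewhere)
        return rest + ";";
    }
    if (keyword == "PRINT") {
        // PRINT x  ->  printf("%d\n", x);  (integers only in this sketch)
        return "printf(\"%d\\n\", " + rest + ");";
    }
    // Anything else is passed through as a C comment so it stays visible.
    return "/* unsupported: " + line + " */";
}
```

With this sketch, translateLine("LET x = 1 + 2") gives the C text `x = 1 + 2;`, and the generated file would then go through the usual C toolchain for the target.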

Just my .00000002c worth. Please take it with care :)


BTW, it's nice to see a kindred spirit in this discussion.  When I tell people (even other software engineers at work) I'm working on a compiler they look at me like I'm crazy.  I am, but that's beside the point.  I get a lot of "why" like many of the comments here.  Basically because I'm weird like that.  Just my way.  I don't like anything being a black-box that I don't understand completely.  I really do appreciate the advice and input from you and everyone else.  Always good. 

I understand, and compilers have always been my favorite hacking projects! Good luck and have fun!!!

Ha, ha, ha!  Maybe I should try that.  Learn by osmosis while I sleep. Wonder if they've made an audio book of it? My copy is a PDF rather than hardcopy.  A PC under the pillow might be uncomfortable.  Maybe Adobe can read it to me while I sleep.


My copy is old and very worn like my old K&R white book. Hmmm, a PDF copy would be nice!

Thanks for the advice.  I've written some smaller compilers and interpreters, and I'm aware of the usefulness of tools like Lex and Yacc.  For this project, though, it's all going to be done by hand.  This is first and foremost an exercise for me.  A LARGE exercise, but still.  I do want something useful to come out of so much work.

The lexer is almost trivial; it just takes a bit of time to make sure it's right.  I'm probably going to try something I haven't done before with the parser.  The ones I've written before were in Pascal and C, all recursive descent.  This being C++, I will still use recursive descent, but in a very object-oriented fashion, with a class for each grammar rule.  I haven't done it that way before.

The output will be native code for whatever embedded machines it targets (AVR first, with ARM and probably PIC, among others, as future additions), but the PC implementation will interpret the internal intermediate language.  I will be using quadruples internally, which should be easy enough to scan through and interpret.  I'm not really concerned about space or time efficiency on the PC.  It will be plenty fast and compact enough for its intended use. Rather than writing extra code for some mythical bytecode machine interpreter, I will just interpret the quads directly.

At this point I don't plan to support separately compiled modules.  I want the simplicity of compile and go, without linking.  Modular code can be included in the main file (like Turbo Pascal prior to version 4.0), and you get a hex file out.  That also gives a simple solution to global optimization (although I don't plan a LOT of optimization).  I don't plan to "write" an IDE.  I think integrating with something like Eclipse or the Arduino IDE is a good route.  There are quite a few out there I could use.
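For anyone wondering what "a class for each grammar rule" plus directly interpreted quads might look like, here is a minimal sketch. It is only an illustration under my own assumptions: the three-rule arithmetic grammar, the Quad layout, and every name in it (Context, FactorRule, TermRule, ExprRule, interpret, evaluate) are made up for this example and are not taken from the actual compiler.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// One quadruple: result = a op b. For op 'k', a holds a literal constant.
struct Quad { char op; int a; int b; int result; };

// Shared parser state: source text, cursor, emitted quads, temp counter.
struct Context {
    std::string src;
    std::size_t pos = 0;
    std::vector<Quad> quads;
    int nextTemp = 0;

    char peek() {                        // skip blanks, look at next char
        while (pos < src.size() && src[pos] == ' ') ++pos;
        return pos < src.size() ? src[pos] : '\0';
    }
    int emit(char op, int a, int b) {    // append a quad, return its temp
        quads.push_back({op, a, b, nextTemp});
        return nextTemp++;
    }
};

struct ExprRule { static int parse(Context& c); };

// factor := number | '(' expr ')'
struct FactorRule {
    static int parse(Context& c) {
        if (c.peek() == '(') {
            ++c.pos;
            int t = ExprRule::parse(c);
            if (c.peek() != ')') throw std::runtime_error("expected ')'");
            ++c.pos;
            return t;
        }
        if (!std::isdigit(static_cast<unsigned char>(c.peek())))
            throw std::runtime_error("expected number");
        int value = 0;
        while (c.pos < c.src.size() &&
               std::isdigit(static_cast<unsigned char>(c.src[c.pos])))
            value = value * 10 + (c.src[c.pos++] - '0');
        return c.emit('k', value, 0);
    }
};

// term := factor (('*' | '/') factor)*
struct TermRule {
    static int parse(Context& c) {
        int left = FactorRule::parse(c);
        while (c.peek() == '*' || c.peek() == '/') {
            char op = c.src[c.pos++];
            int right = FactorRule::parse(c);
            left = c.emit(op, left, right);
        }
        return left;
    }
};

// expr := term (('+' | '-') term)*
int ExprRule::parse(Context& c) {
    int left = TermRule::parse(c);
    while (c.peek() == '+' || c.peek() == '-') {
        char op = c.src[c.pos++];
        int right = TermRule::parse(c);
        left = c.emit(op, left, right);
    }
    return left;
}

// Walk the quads in order, the way a PC-side interpreter might.
int interpret(const std::vector<Quad>& quads) {
    std::vector<int> temps(quads.size());   // one temp per quad
    for (const Quad& q : quads) {
        switch (q.op) {
            case 'k': temps[q.result] = q.a; break;
            case '+': temps[q.result] = temps[q.a] + temps[q.b]; break;
            case '-': temps[q.result] = temps[q.a] - temps[q.b]; break;
            case '*': temps[q.result] = temps[q.a] * temps[q.b]; break;
            case '/':
                if (temps[q.b] == 0) throw std::runtime_error("divide by zero");
                temps[q.result] = temps[q.a] / temps[q.b];
                break;
        }
    }
    return temps.empty() ? 0 : temps.back();
}

// Parse an expression to quads, then interpret them.
int evaluate(const std::string& text) {
    Context c;
    c.src = text;
    ExprRule::parse(c);
    return interpret(c.quads);
}
```

Here evaluate("2 + 3 * 4") returns 14 and evaluate("(1 + 2) * 3") returns 9. The same quad list could be handed to a native code generator instead of interpret, which is the appeal of keeping one intermediate form for both the PC and the embedded targets. The sketch does not check for trailing input or integer overflow.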

In short the goals are:  a simple but powerful language.  Simple compilation process without all the details and overhead in most compiled languages. Good, fast, small native code generation for best performance.  Easily portable ( dev platform and target).

Thanks a lot for the advice and input.  It's very helpful, even if I don't follow it.

I have a lot of PIC ICs that I don't use because of lack of support (or ease of application development).

I would like to eventually have it produce PIC code, but that isn't going to be included at first.  I don't use PICs.  The PIC architecture is very crude and difficult to generate good code for.  The minimal instruction set and the segmented memory are a compiler writer's nightmare.  Compiled code can quickly become inefficient.  However, I would like to port the compiler to it.  It just isn't a first priority.

Of course, if you wanted to do that part... :-)

Thanks for the input

It certainly has my support.  Different people prefer different tools, this is always the case, so the more the merrier.  I learnt BASIC almost 40 years ago and do prefer it over Arduino.  To my mind 'String', 'STRING' and 'string' as commands should be identical. I hate Arduino C for being so finicky over capitalisation, but in reality, that is the language.  Beginners (and many not so beginners :) ) can have enough problems without worrying about whether they need to capitalise letters in the command names.  Just my 2c worth (possibly worth less!).

Thank you for the input.  I have been programming for over 30 years, and now do it professionally.  Capitalization rules can be quite arbitrary in some cases, but can be very important for sanity when writing large programs.  I've chosen a mixture for this language:  all user defined names (variables, functions, etc.) are case sensitive but keywords (if, print, for, etc.) built into the language are not.  That way you can spell your names with any capitalization you want, but you have to be consistent with the names you define.  I agonized a long time over requiring consistency in user defined names.  In the end, good software engineering practice won and I decided to make them case sensitive.
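As a rough illustration of that mixed rule, a lexer could fold a word to lower case only for the keyword lookup and keep the original spelling for everything else. This is just a sketch under my own assumptions: the keyword list holds only the three examples mentioned above, and the names TokenKind, Token, and classifyWord are hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

enum class TokenKind { Keyword, Identifier };

struct Token {
    TokenKind kind;
    std::string text;   // keywords: normalized spelling; identifiers: as typed
};

// Classify one word: keywords match in any capitalization, while
// user-defined names keep their exact spelling (so they stay case sensitive).
Token classifyWord(const std::string& word) {
    // Example keyword set only; a real language would have many more.
    static const std::set<std::string> keywords = {"if", "print", "for"};

    std::string lowered = word;
    std::transform(lowered.begin(), lowered.end(), lowered.begin(),
                   [](unsigned char ch) {
                       return static_cast<char>(std::tolower(ch));
                   });

    if (keywords.count(lowered) != 0) {
        return {TokenKind::Keyword, lowered};
    }
    return {TokenKind::Identifier, word};
}
```

So PRINT, Print, and print all come back as the same keyword, while Count and count remain two different identifiers.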