milliForth

285 points by binarycrusader 1 year ago | 83 comments
  • tromp 1 year ago
    > However, milliFORTH appears to be the smallest programming language implementation ever, beating out sectorLISP2, a mind-blowing 436 byte implementation of LISP, by 14 bytes.

    The sectorlambda implementation of Binary Lambda Calculus is shorter yet at 383 bytes [1].

    And the BLC self-interpreter is only 29 bytes [2].

    > A FORTH in 422 bytes — the smallest real programming language ever, as of yet.

    That may still be true, as BLC is an esoteric programming language.

    [1] https://justine.lol/lambda/

    [2] https://ioccc.org/2012/tromp/hint.html

    • fuzzballcat 1 year ago
      Author here! I was trying to be very careful about making sure to phrase things in a way that was clear I was referring to "real" languages (ones that are used for actual production purposes), but inevitably I forgot to do so where you've highlighted. I've updated the wording so it's more clear.
    • RodgerTheGreat 1 year ago
      The main advantage of a miniature Forth like this over BLC, Lisp, Brainfuck, etc, is that Forth grants low-level access to the hardware; you can easily bootstrap from it to something indistinguishable from a feature-rich Forth with a full suite of metaprogramming capabilities, or implement an entire operating system on top of it.
      • jlokier 1 year ago
        Miniature Lisps are similar to Forth in this way. Having worked with both I'm inclined to favour Lisps for providing cleaner structure and abstractions for code and data to build up from, with low overhead despite the superfical differences. (See SectorLisp2 (436 bytes) vs SectorForth (491 bytes) size: https://justine.lol/sectorlisp2/).

        Lisps are given low-level hardware access primitives (peek/poke etc) when they are designed for bootstrapping an OS, and low-level OS primitives (such as system calls) when they are designed for bootstrapping a rich environment on a different OS. Basically the same primitives as Forth for the same purposes.

        • tromp 1 year ago
          Is low-level hardware access part of the language definition [1] ?

          [1] https://forth-standard.org/standard/words ?

          • mepian 1 year ago
            Nothing is stopping Lisp from having low-level access to hardware.
          • eichin 1 year ago
            Hmm, I'd be a little surprised if this was smaller than some of the 8-bit era forths (the TIL book had a 50ish-byte "inner interpreter" for Z80 but that didn't include any of the baseline "words" that this does.) I'm sure it wins for 32-bit systems though.
          • fuzzballcat 1 year ago
            Funnily enough, milliFORTH is now smaller than the Binary Lambda Calculus 383 byte implementation.
            • n_plus_1_acc 1 year ago
              [flagged]
              • ikari_pl 1 year ago
                um, sir... please show us the implementation of eval.
                • rcxdude 1 year ago
                  it's evals all the way down.
                  • mjbrusso 1 year ago
                    Where do I find the implementation of int 0x16 and int 0x10?
                    • Y_Y 1 year ago
                      Easy, just make a CPU where "eval" is an instruction
                • PaulHoule 1 year ago
                  When I was in high school I wrote a nice FORTH for the TRS-80 Color Computer using the OS-9 operating system which was a Unix-like multitasking OS that would fit on a 6809 microcomputer.

                  I think it was around 2000 lines of assembly code to implement most of the FORTH-83 standard although mine was unusual in that it did not support the block-based I/O that was common on “language system” FORTHs but instead it had handle-based API for accessing files similar to Unix, C and MS-DOS in version 2 and up.

                  The programming environment was a lot like Linux overall in that I’d use an ed or vi clone to edit files, then run something like an assembler or C compiler. I’d run my FORTH binary and it would present an interactive environment like most FORTHs.

                  • anotherhue 1 year ago
                    I love projects like these, reminds me of the magic within the machine, as opposed to the normal cacophony of the world that comes via the machine.
                    • digitalsankhara 1 year ago
                      I do like this reasoning. Magic within the machine for me was programming forth as part of my post grad (controlling and processing proton precession magnetometers) and the device was a Triangle Digital Services (now defunct as a viable company I think) TDS2020 forth SBC.

                      Almost a zen like experience - you, the machine and your focus. No world.

                    • rep_lodsb 1 year ago
                      You could save one byte by replacing "0=" with "0<>" (well, except for the name, maybe use "0#"?):

                          pop ax
                          neg ax     ;sets carry flag if nonzero
                          sbb ax,ax
                          push ax
                      
                      And another two bytes in DOCOL:

                          ;add ax,4
                          ;mov si,ax
                          xchg ax,si
                          lodsw
                          lodsw
                      
                      Maybe even free up a word register to always hold the address of DOCOL, then you only need two bytes at each colon definition. If it's possible without adding any extra instructions, this should save 4 bytes in total (one stosw in COLON, the DOCOL.addr variable, one lodsw in the above version of DOCOL)
                      • fuzzballcat 1 year ago
                        Good thinking! I wouldn't have thought of that `xchg` one in DOCOL, very clever.

                        I'll futz around with making DOCOL's address as a register and see if I can make that work, though I don't know if that will work since I think some of the mechanisms in place have additional functions.

                        EDIT: So, I don't think this will work. I'd be using `dx` to store DOCOL's address, but this appears to get clobbered somewhere, fixing which would outweigh the 1 byte benefit you end up actually getting. I'll keep thinking about it though!

                      • lynx23 1 year ago
                        Porting JonesFORTH to x86-64 was one of the most rewarding fun-project experiences lately. Forth is fun to play with.
                        • alexisread 1 year ago
                          Looking at the code, this looks remarkably similar to sectorforth? Thing is, with sectorforth and this, 2 of the primitives are not required so reducing the VM to 6ops, so you can probably go smaller. Looks as though Cesar was happy with fitting into a sector: https://github.com/cesarblum/sectorforth/issues
                          • fuzzballcat 1 year ago
                            [Author here.] Indeed, sectorFORTH (as I have mentioned) was the primary inspiration and guide for this project! Ironically, in some places I even had a rather different design which converged on a more sectorFORTH-like design - it's really well put together.

                            I've removed some primitives further from milliFORTH, but I haven't touched the arithmetic base. I hadn't seen those issues on sectorFORTH though, very interesting - I will look into it!

                            • sophacles 1 year ago
                              Total tangent - I've often been annoyed at some library or whatever because I didn't like the way it was implemented or the api it exposed or whatever. So I try to reimplement it myself to my tastes, and eventually once I understand the problem and iterate a few times I end up with something remarkably similar to the library I initially didn't like.

                              It's a good reason to at least try to implement your own sometimes, rather than just use libraries - even if you end up just using the supported external dep, you'll understand it a lot better.

                              (not saying that's the case here, just that you reminded me of that with your convergance...)

                              • alexisread 1 year ago
                                Ah yes apologies, that'll teach me to look at the code before the readme!
                            • jart 1 year ago
                              When SectorLISP claimed to be the tiniest programming language in the world, it included (1) a Turing machine program, (2) an implementation of SectorLISP written in SectorLISP, and (3) a proof that it could run software that had been written for the original LISP 1.5 system built by LISP's inventor. If milliForth is making the same claim, then it should hold itself to the same standard. Otherwise it's just a calculator written by an anonymous person. I filed an issue here: https://github.com/fuzzballcat/milliForth/issues/8
                              • benj111 1 year ago
                                I recommend people follow that link just to see the post at the bottom showing numbers being defined from nothing (well @sp, 0 also has to be defined)
                              • mikewarot 1 year ago
                                I've always wondered how to boot a virtual machine from a single sector with no other OS... now I know, which is cool.
                                • badcppdev 1 year ago
                                  Does anyone in the Forth community know if Charles Moore is still with us?
                                  • thesuperbigfrog 1 year ago
                                    He has recently done some amazing low power work with GreenArrays:

                                    "GreenArrays is shipping its 144-core asynchronous chip that needs little energy (7 pJ/inst). Idle cores use no power (100 nW). Active ones (4 mW) run fast (666 Mips), then wait for communication (idle).

                                    Tight coding to minimize instructions executed will minimize power. The programmer can also reduce instruction fetches, transistor switching and duty cycle."

                                    https://youtu.be/0PclgBd6_Zs

                                    It is like the elegance of Forth in low power, multi-core hardware.

                                    • badcppdev 1 year ago
                                      I think you're right that he's still at Green Arrays. Still a director there apparently.

                                      I do have to note that the linked video is from 2013 so I assume they've moved on from that.

                                      • jpfr 1 year ago
                                        Well, he surely isn't doing it for the money.

                                        The Moore Microprocessor Patent Portfolio was very lucrative. Eventually it ended up with a patent marketer. Nearly everyone doing processors had to buy in.

                                        And as the inventor Chuck got his share out of it...

                                        https://web.archive.org/web/20071226155608/http://www.linuxe...

                                        • LeonenTheDK 1 year ago
                                          I have heard nothing new out of Green Arrays but the site still seems to be selling the boards. If I had free cash available I'd try ordering one just to see. It's a really interesting concept and I'd love to see it developed.
                                      • t-3 1 year ago
                                        He is. SVFIG's annual "Forth Day" meeting is on November 18th and he's likely to be there giving the traditional "Fireside Chat".
                                        • badcppdev 1 year ago
                                          That's awesome to know. I remember reading his blog years ago when he was setting up Green Arrays
                                        • weinzierl 1 year ago
                                          Yes, and so is Elizabeth Rather.
                                          • Stratoscope 1 year ago
                                            • tromp 1 year ago
                                              Still going strong at 85 years old...
                                          • tomcam 1 year ago
                                            justine is probably working on a version so small it will trigger the singularity irrevocably
                                            • fuzzballcat 1 year ago
                                              Indeed! It's going to somehow be smaller than the 99 byte BF one...
                                            • LordShredda 1 year ago
                                              Does the benefit of it being embeddable on a QR code outweigh the lack of quality of life features like subtraction? Nevertheless, truly an impressive feat that shows how simple computers can be without all these modern API layers.
                                              • gavinhoward 1 year ago
                                                For one specific purpose, the benefit could be worth it: bootstrapping.

                                                Guix bootstraps from a tiny audited binary, and milliForth could be used for the same purpose.

                                                Imagine bootstrapping a full Linux distro from a milliForth binary and source code for everything. Everything would be fully auditable, no Trusting Trust problem, and a full Software Bill of Materials.

                                                • eternityforest 1 year ago
                                                  I always thought of FORTH as the #1 hardest to think of uses for, out of all the non-esoteric and non-obsolete languages. That's a really neat application!
                                                  • defrost 1 year ago
                                                    Booting from a firmware Forth loader was how the early Sun SunOS (BSD) workstations did their thing - you could hotkey to stop the default OS load and boot from an image on an alternative drive, across a network, or modify the Forth loader to <imagination>
                                                    • nine_k 1 year ago
                                                      Beside bootstrapping, Forth works well on tiny MCUs.
                                                      • TerrifiedMouse 1 year ago
                                                        I believe Forth is used quite a bit in the embedded world - or used to be. I read, long ago from the Usenet days, you use it when you don't want to write assembly but C is too "heavy" for your hardware.
                                                        • adastra22 1 year ago
                                                          Bitcoin's smart contracting language is a Forth variant, and arguably this was a very astute and smart choice by Satoshi.

                                                          When designing a multi-party contract, the essential problem is that two people specify the terms they each want to apply to any case in which the funds must be spent, then these two sets of requirements must be merged into a single program. This is, in general, difficult to do securely. We can establish conventions that are relatively easy to follow, but it would be a lot nicer and more powerful if we could syntactically enforce that both sets of requirements are enforced in the combined program.

                                                          Concatenative languages like Forth meet this requirement nicely. If you represent the spend requirements as a program, then combining the two programs together in such a way that they both are equally enforce is as simple as literally concatenating the two programs together.

                                                          For example, suppose Alice's requirement is that Alice signs the transaction with her key, and Bob's requirement is that Bob signs with his key, and the spending transaction is after some specified time T. Expressed in bitcoin script:

                                                          Alice: <AlicePubkey> CHECKSIGVERIFY

                                                          Bob: <T> CHECKLOCKTIMEVERIFY DROP <BobPubkey> CHECKSIGVERIFY

                                                          The combined script that meets both these sets of requirements is as simple as putting Alice's script, then Bob's, unaltered:

                                                          Alice&Bob: <AlicePubkey> CHECKSIGVERIFY <T> CHECKLOCKTIMEVERIFY DROP <BobPubkey> CHECKSIGVERIFY 1

                                                          (The `1` at the end is a quirk of bitcoin that it has to finish with a non-zero value on the stack. This combined script could also be simplified in a couple of ways. Also there's a couple of ways in which this can fail in practice. Alice's script could contain OP_RETURN, for example, which causes the entire script to become unspendable. Or a mismatched IF/ELSE. A better designed and strongly typed Forth dialect would fix these issues.)

                                                          Bitcoin's Forth is not type checked, but suppose that it were. And furthermore, suppose that it had a powerful dependently typed system that captured various key signing and stack requirements at the type level. It could be used to track not just what a program does, but also the properties of a program. Alice could put a constraint in her program that says "lock time can be no later than April 2024," and this becomes part of both the input and output type requirements of her program. Then when Bob's program is specified with T=15 May 2024, then his program no longer type checks when concatenated to the end of Alice's.

                                                          No one has yet written a system like this, but it would be really powerful if it did exist. Alice writes here smart contract conditions all by lonesome self, and Bob writes his. Then they literally concatenate one program to the other, and if it type checks then Alice and Bob can be certain that both sets of conditions are satisfied.

                                                        • lifthrasiir 1 year ago
                                                          But that doesn't directly relate to the verifiability of milliForth itself. An extremely shortened code can be harder to verify, for example it may work as intended unless a very specific input is used to break out of its sandbox (so to say). Bootstrapping needs a short and readable enough seed for that reason, and I can't be entirely sure that it is indeed the case for milliForth.
                                                          • taneq 1 year ago
                                                            The sneakiest possible program representable in n bytes is never less sneaky than the sneakiest possible program representable in n-1 bytes, assuming you can pad a program out with nops.
                                                            • alexisread 1 year ago
                                                              Technically I guess what you'd do is hand type in binary code to create a hex editor, then bootstrap this forth off that, by hand-typing it in.

                                                              If you include formal verification tools as part of the stack then you can verify the tools by eye, type them in, and use them to verify the function of the stack. Admittedly Forth is tricky to do here, but something like lisp/scheme/wat can actually be proven out with say microkanren.

                                                              From there we can trust the software stack.

                                                          • Avshalom 1 year ago
                                                            subtraction is trivial to add in userspace once you've loaded it though

                                                              : - sp@ @ nand 1 + + ;
                                                            
                                                            or as it appears in the hello_world.FORTH file

                                                              : dup sp@ @ ;
                                                              : invert dup nand ;
                                                              : negate invert 1 + ;
                                                              : - negate + ;
                                                            
                                                            There is of course no benefit to this thing at all other than reclaiming the crown from those deviant lispers... well I suppose if you're making a your own computer from scratch like https://www.homebrewcpuring.org/ it might be a useful starting point.
                                                            • benj111 1 year ago
                                                              : - not 1 + + ;

                                                              Disclaimer: I've never written forth, so this may not be valid for any particular forth implementation.

                                                            • tomberek 1 year ago
                                                              Look at how much room you have for data! I wonder what we can fit in there.

                                                              More seriously, a metacircular example to draw from would be: https://github.com/kragen/stoneknifeforth

                                                              • wkjagt 1 year ago
                                                                Would it be possible to do a similar thing in a similar size for, for example, the 6502? I would love to give that a try, even though I probably lack the skills to make it that small. Or maybe it's even impossible.
                                                                • H8crilA 1 year ago
                                                                  How fast is the grabage collector? And is it web scale? I'm concerned by the lack of native JSON support.
                                                                • 1 year ago
                                                                  • mikewarot 1 year ago
                                                                    Has anyone figured out how to pipe input to the qemu vm under WSL 2? I've tried all sorts of things, and nothing seems to work.

                                                                    The included py_autotype.py takes forever, but nothing outputs in the qemu vm as it is running.

                                                                    WSL 2 does have an X server running (xeyes works).

                                                                    Update: running the script against gedit doesn't work either... it's not a qemu issue

                                                                    • fuzzballcat 1 year ago
                                                                      > The included py_autotype.py takes forever, but nothing outputs in the qemu vm as it is running.

                                                                      Keep in mind, nothing SHOULD output while it's running, until you get to the very end and there's a 'hello world' printed.

                                                                      This being said, the autotype's lethargy severely irks me. But I couldn't make it go faster, as pyautogui kept tripping up. If someone finds a better way (i.e. a way that isn't my first random thought), it will be much appreciated.

                                                                      • 1 year ago