A trustworthy, free (libre), Linux capable, self-hosting 64bit RISC-V computer

246 points by caned 1 year ago | 57 comments
  • amelius 1 year ago
    > The chip foundry wouldn't know what the FPGA will be used for, and where the proverbial "privilege bit" will end up being laid out on the chip, which mitigates against Privilege Escalation hardware backdoors. Exposure is limited to DoS attacks being planted into the silicon during FPGA fabrication, which yields a significantly improved level of assurance (i.e., the computer may stop working altogether, but can't betray its owner to an adversary while pretending to operate correctly).

    I suppose in theory the FPGA could contain a hidden CPU that has full read/write access to the FPGA program.

    Further, if the system becomes popular and more FPGAs need to be produced for the same system or the next generation, then the foundry has additional information and they can make a good guess of where the privilege bit will be. Even simpler, they could program an FPGA with the code and figure it out manually.

    • karma_pharmer 1 year ago
      > I suppose in theory the FPGA could contain a hidden CPU that has full read/write access to the FPGA program.

      All of them do at this point. It isn't hidden.

      You can't buy a large FPGA without an ARM core in it. The ARM cores all have an opaque signed blob running in EL3 that you can't replace. This isn't a soft core on the fabric; it's dedicated silicon. And it has access to the ICAP (internal configuration access port) on Xilinx devices, and the equivalent on other manufacturers' parts.

      • picture 1 year ago
        I'm not sure what you mean. The biggest FPGAs do not have hard processors built in. Those which do feature hard cores usually advertise it specifically.

        See Xilinx's UltraScale+ Kintex and Virtex series, both of which can have more logic elements than the Zynq MP equivalents: https://docs.amd.com/v/u/en-US/ds890-ultrascale-overview

        • almostgotcaught 1 year ago
          Yeah, OP (the person you're responding to) is wearing a tinfoil hat. Source: I work for AMD (formerly Xilinx). There are about a billion reasons it isn't true, starting with the fact that we'd charge you more money for the IP instead of hiding it, and ending with the fact that UltraScale parts are sold to the DoD and there's very little room for shenanigans there.
        • asddubs 1 year ago
          What is it for?
          • dailykoder 1 year ago
            The ARM Core?

            It's usually used for whatever software you need, while the FPGA fabric handles domain-specific circuits in hardware: signal processing and whatnot. The ARM core usually runs Linux or some other OS and handles communication with other devices. It can simply run at a higher frequency than most soft cores, because it's a fully integrated and optimized circuit. So in theory that's a nice thing, and it leaves more room for custom circuits in the programmable logic.

            PS: The latest PolarFire chips from Microchip use a RISC-V CPU. Since AMD has announced MicroBlaze V (their proprietary soft-core CPU using the RISC-V ISA), I assume they will soon(tm) release a Zynq range with a dedicated RISC-V core, too. But even if it's an open instruction set, it's still a closed-source CPU.

        • mrob 1 year ago
          I think backdooring the RAM would be easier. Modern DRAM has lots of complicated features (e.g. link training, targeted refresh, on-die ECC). I don't know exactly how it's implemented, but that's plenty of complexity to provide cover for backdoors.

          It should be possible to add something that watches for specific memory access patterns and provides arbitrary read/write capabilities when the correct pattern is detected. This could be used for privilege escalation from untrusted but sandboxed code, e.g. JavaScript. It could work with any CPU architecture or OS, because the arbitrary memory reads could be used to detect the correct place to write.

          This would be less effective with DIMMs or other multi-chip memory modules, but RISC-V computers are usually small single-board computers that only have a single DRAM chip.
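
          As a toy sketch of the kind of trigger logic I have in mind (the magic addresses and this software model are entirely made up for illustration; the real logic would live in the DRAM die itself):

              # Toy model of a backdoored DRAM: a tiny matcher watches the stream of
              # addresses and, after a hypothetical magic sequence, unlocks a backdoor.
              MAGIC = [0x1F2E, 0x3D4C, 0x5B6A, 0x7988]   # made-up trigger pattern

              class BackdooredDram:
                  def __init__(self, words=1 << 16):
                      self.mem = [0] * words
                      self.progress = 0        # how much of MAGIC has been seen in order
                      self.unlocked = False    # once True, the backdoor is live

                  def _watch(self, addr):
                      if self.unlocked:
                          return
                      if addr == MAGIC[self.progress]:
                          self.progress += 1
                          self.unlocked = (self.progress == len(MAGIC))
                      else:
                          self.progress = 1 if addr == MAGIC[0] else 0

                  def read(self, addr):
                      # Once unlocked, a real implementation could redirect a reserved
                      # address window to arbitrary rows, regardless of what the CPU's
                      # MMU thinks it mapped.
                      self._watch(addr)
                      return self.mem[addr]

                  def write(self, addr, value):
                      self._watch(addr)
                      self.mem[addr] = value

              # Sandboxed code (e.g. JIT'd JavaScript) only has to generate the right
              # sequence of cache-missing accesses; no CPU or OS privilege is involved.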

          • eternityforest 1 year ago
            Could you encrypt the data you send to the RAM?
            • mrob 1 year ago
              Yes, and hardware support for encrypted RAM already exists:

              https://en.wikipedia.org/wiki/Trusted_execution_environment

              However, this will never be perfectly secure against backdoored RAM in a multitasking environment, because the memory access patterns alone leak information. Additionally, I don't think any of these systems support authenticated encryption, which means you could do things like corrupt branch targets and hope to land on a big NOP slide you control.
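
              As a toy illustration of that integrity gap (a keyed XOR pad stands in for the real per-line cipher, which it emphatically is not; the only point is that tampering goes undetected):

                  import hashlib, os

                  KEY = os.urandom(16)   # stand-in for the memory-encryption key

                  def pad(addr):
                      # Hypothetical address-tweaked keystream; real engines use block ciphers.
                      return hashlib.sha256(KEY + addr.to_bytes(8, "little")).digest()[:8]

                  def encrypt_line(addr, data):
                      return bytes(a ^ b for a, b in zip(data, pad(addr)))

                  decrypt_line = encrypt_line   # XOR is its own inverse

                  branch_target = (0x00400AB0).to_bytes(8, "little")
                  ct = encrypt_line(0x1000, branch_target)

                  tampered = bytes([ct[0] ^ 0xFF]) + ct[1:]         # malicious RAM flips bits
                  recovered = decrypt_line(0x1000, tampered)
                  print(hex(int.from_bytes(recovered, "little")))   # silently corrupted value

                  # Nothing checks integrity, so the CPU consumes the corrupted branch
                  # target; authenticated encryption would have rejected the line instead.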

          • ooterness 1 year ago
            This sort of thing is analogous to the "Thompson hack" [1], where a malicious compiler has a self-propagating backdoor. It never shows up in the source code, but self-injects into the binaries.

            Thompson demonstrated this under controlled conditions. But realistically, the backdoor begins to approach AGI-level cunning to evade attempts at detection. It has to keep functioning and propagating as the hardware and software evolve, while still keeping a profile (size, execution time, etc.) low enough to continue evading detection.

            Work like this, which rebuilds modern computing on a completely different foundation, would seriously disrupt and complicate the use of this type of backdoor.

            https://en.wikipedia.org/wiki/Backdoor_(computing)#Compiler_...
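
            A heavily simplified sketch of the mechanism (nothing like Thompson's actual code; the string matching and payload below are invented purely for illustration):

                # Model "compiler": a function that turns source text into the text that
                # will actually be built. The compromise lives only in this binary step,
                # never in any source file a reviewer would read.
                LOGIN_BACKDOOR = '\nif password == "letmein": authenticated = True  # injected\n'

                def looks_like_login(src):
                    return "def check_password" in src

                def looks_like_compiler(src):
                    return "def evil_compile" in src

                def evil_compile(src):
                    if looks_like_login(src):
                        src += LOGIN_BACKDOOR      # quietly plant the login backdoor
                    elif looks_like_compiler(src):
                        src += "\n# (re-insert this whole recognise-and-inject step here)\n"
                    return src                     # then compile honestly

                clean_login = "def check_password(pw):\n    return lookup(pw)\n"
                print(evil_compile(clean_login))   # output carries the backdoor, yet the
                                                   # login source itself stays clean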

            • Taniwha 1 year ago
              FFS - when a smart person designs something clever, it "begins to approach AGI-level cunning"? AGI doesn't exist; at the moment it's purely mythical.
              • Asraelite 1 year ago
                I think they meant that a realistic attack would need to be an AGI, not that an AGI was built.
            • nxobject 1 year ago
              I wonder as well whether it wouldn't just be easier to snoop I/O and somehow exfiltrate the data. (This would be completely impractical for dragnet surveillance, of course – but I'm sure if a state actor knew that some organization was using this technique to avoid surveillance, _and_ was using a predictable software setup...)
              • duskwuff 1 year ago
                > I suppose in theory the FPGA could contain a hidden CPU that has full read/write access to the FPGA program.

                Even if it did, it would be exceptionally difficult for that CPU to identify which registers/gates on the FPGA were being used to implement which components of the soft CPU. The layout isn't fixed; there's no consistent mapping of hardware LUTs/FFs to synthesized functionality.

                • amelius 1 year ago
                  Even if the mapping changes, the network (graph of logic gates) will locally be similar. So a subgraph matching algorithm might be all that is needed.
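
                  A rough sketch with toy graphs (the node names, and the assumption that a gate-level graph can be recovered from the loaded bitstream at all, are made up; subgraph isomorphism is NP-hard in general, so this also hand-waves over scale):

                      import networkx as nx
                      from networkx.algorithms import isomorphism

                      # Toy stand-ins: G_design would be the gate graph recovered from the
                      # loaded bitstream; G_pattern, the known neighbourhood of the privilege
                      # bit taken from the published soft-core netlist.
                      G_design = nx.Graph([("lut_17", "ff_4"), ("ff_4", "lut_23"),
                                           ("lut_23", "ff_9"), ("lut_17", "lut_99")])
                      G_pattern = nx.Graph([("priv_lut", "priv_ff"), ("priv_ff", "next_lut")])

                      matcher = isomorphism.GraphMatcher(G_design, G_pattern)
                      if matcher.subgraph_is_isomorphic():
                          print(matcher.mapping)   # which physical LUTs/FFs play which role
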
                  • rowanG077 1 year ago
                    That would mean connecting your hidden CPU to essentially every wire inside the FPGA. Trivial to detect, extremely expensive, and probably even impossible given the timing model.
                • rowanG077 1 year ago
                  It's certainly non-trivial to put a hidden CPU in an FPGA that has full read/write access. The wire configuration inside the FPGA will be different for every design loaded into it; hell, even for the same design the place-and-route will do different things. So what would you connect your hidden CPU to?
                  • admax88qqq 1 year ago
                    Without having thought it through fully, I feel like the classic "trusting trust" attack could work at the FPGA/bitstream level.
                    • rwmj 1 year ago
                      For a nation state the most useful thing would be a "kill bit" where you can broadcast some signal or key and disable all your enemy's computers. That's fairly easy to do in an FPGA - the signal would be detected by the serdes block(s) and the kill bit could just kill the power or clock or some other vital part of the chip.
                      • 8372049 1 year ago
                        > For a nation state the most useful thing would be a "kill bit" where you can broadcast some signal or key and disable all your enemy's computers.

                        CNE (computer network exploitation) is generally considered to be far more valuable than CNA (computer network attack). First of all, keep in mind that all genuinely sensitive systems are air-gapped, so you can't effectively broadcast a signal to them.

                        Second, CNA is a one-off; after the attack you will typically lose access. CNE access, by contrast, can persist for years or even decades, and is useful both in "cold" scenarios, for political and economic maneuvering, and closer to a flashpoint. CNA is usually only relevant when a conflict is turning hot.

                        • glitchc 1 year ago
                          No, it's the opposite. The FPGA makes it much harder to hide a trojan in the silicon. If the LUTs were biased, it would be detected fairly quickly. A dedicated circuit with an RF interface would be equally obvious in terms of chip usage and power draw.
                          • rwmj 1 year ago
                            I didn't say anything about modifying the LUTs or adding RF interfaces, I don't know where you got that from.
                          • greggsy 1 year ago
                            Even better, a logical fuse that would make recovery impossibly expensive and time-consuming.
                        • buildbot 1 year ago
                          It’s really quite amazing to log in to a Linux shell on an OrangeCrab FPGA running a RISC-V soft core, built using an open-source toolchain. That was impossible not so long ago! At best you’d have something like Xilinx PetaLinux and all their proprietary junk.
                          • snvzz 1 year ago
                            The fun thing is that the OrangeCrab's FPGA isn't even a requirement.

                            A tiny iCE40 LP1K will fit SERV (and even QERV) no prob.

                            It's amazing how small a fully compliant RISC-V implementation can be.

                            • 3abiton 1 year ago
                              This is going to be a rallying moment for the community soon: open hardware and software finally working together! It will be huge by the end of the decade.
                              • taftster 1 year ago
                                I guess using some local definition of "huge".
                                • almostgotcaught 1 year ago
                                  > This is and will be a rallying moment soon for the community

                                  My guy this thing is 4 years old. Spoiler alert: it wasn't.

                              • ruslan 1 year ago
                                I'm kind of going in the same direction, but via a different route. My design is based on VexRiscv and all the hardware is written in SpinalHDL. It does not run Linux yet because of the limited SRAM (512KB) on my Karnix board, but it has Ethernet and HDMI. I have coded a CGA-like video adapter with an HDMI interface that supports graphics (320x240x4) and text (80x30x16) modes with hardware-assisted smooth scrolling. :)

                                If someone is interested, here's a rather brief README: https://github.com/Fabmicro-LLC/VexRiscvWithKarnix/blob/karn...

                                KiCAD project for the board: https://github.com/Fabmicro-LLC/Karnix_ASB-254

                                • dwheeler 1 year ago
                                  This is cool. I was happy to see the prominent reference to my work on diverse double-compiling (DDC), which counters the trusting trust attack. If you're interested in DDC, check out https://dwheeler.com/trusting-trust
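
                                  In outline (the compiler names and paths below are placeholders, and real DDC also has to control for nondeterminism in the build), the check looks something like:

                                      import filecmp, subprocess

                                      def build(compiler, source, output):
                                          subprocess.run([compiler, source, "-o", output], check=True)

                                      # Placeholders: trusted_cc is an independent, trusted compiler;
                                      # suspect_cc is the compiler binary under test; compiler.c is its
                                      # published source.
                                      build("./trusted_cc", "compiler.c", "stage1")   # trusted build of the source
                                      build("./stage1",     "compiler.c", "stage2")   # that result rebuilds the source
                                      build("./suspect_cc", "compiler.c", "regen")    # suspect compiler rebuilds itself

                                      # With a deterministic build, stage2 and regen are bit-for-bit identical
                                      # iff the suspect binary faithfully corresponds to its source.
                                      print("OK" if filecmp.cmp("stage2", "regen", shallow=False) else "MISMATCH")
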
                                  • musicale 1 year ago
                                    Rebuilding the system on itself and validating that the bitfile is the same is nice.

                                    I'm amazed that it could be rebuilt in 512MB (and in "only" 4.5 hours on a ~65MHz CPU). My experience with yosys (and vivado etc.) is that they seem to want many gigabytes.

                                    > A 65MHz Linux-capable CPU inevitably invokes memories of mid-1990s Intel 486 and first-generation Pentium processors.

                                    50-65MHz* and 512MB seems comparable to an early 1990s Unix workstation. Arguably better on the RAM side.

                                    *4.5 Mflops on double precision linpack for lowRISC/50MHz

                                    • mntmn 1 year ago
                                      I did something similar in 2022, also with LiteX, but not self-hosting, because it used a Kintex-7 FPGA which, at least at the time, required Vivado for the actual place-and-route. It did result in an open-gateware laptop running Linux and Xorg, though (thanks to Linux-on-LiteX-VexRiscV): https://mntre.com/media/reform_md/2022-09-29-rkx7-showcase.h...
                                      • rramadass 1 year ago
                                        Also see RISC-V based Shakti : Open Source Processor Development Ecosystem from IIT-Madras, India - https://shakti.org.in/

                                        Good overview at wikipedia - https://en.wikipedia.org/wiki/SHAKTI_(microprocessor)

                                        • dmarinus 1 year ago
                                          Hey, this is the same guy who did some work on running OS X on QEMU/KVM: https://www.contrib.andrew.cmu.edu/~somlo/OSXKVM/
                                          • robinsonb5 1 year ago
                                            This is very, very cool. I've been thinking for a while that a fully self-hosted RISC-V machine is sorely needed. The biggest limiting factor at the moment actually seems to be finding an FPGA board which has enough RAM on board. The target board here has 512 megabytes, I think - but FPGA toolchains are much happier with several gigabytes to play with.
                                            • bitcompost 1 year ago
                                              While I love the idea of self-hosting HW and SW, I can't even imagine the pain of building stuff like GCC on a 60MHz CPU. Not to mention the Rocket CPU is written in Scala. I recently stopped using Gentoo on a RockPro64 because the compile times were unbearable, and that's a system orders of magnitude faster than what they want to use.
                                              • sweetjuly 1 year ago
                                                You can definitely go considerably faster. A lot of these FOSS cores are either outright unoptimized or target ASICs, and so end up performing very badly on FPGAs. A well-designed core on a modern FPGA (not one of these bottom-of-the-barrel low-power Lattice parts) can definitely hit 250+ MHz with a much more powerful microarch. It's neither cheap nor easy, which is why we tend not to see it in the hobby space. That, and better FPGAs tend not to have FOSS toolchains, so it doesn't quite meet the libre spirit.

                                                But, yes, even at 250MHz trying to run Chipyard on a softcore would certainly be an exercise in patience :)

                                                • shrubble 1 year ago
                                                  People used 50MHz SPARC systems to do real work, and the peripherals were all a lot slower (10Mbps Ethernet, slower SCSI drives) with less and slower RAM. But it might take a week to compile everything you wanted, I agree; of course there is always cross-compiling as well.
                                                  • RobotToaster 1 year ago
                                                    That was before everything became a snap package in a docker image.
                                                    • musicale 1 year ago
                                                      > That was before everything became a snap package in a docker image.

                                                      A modern app should consist of dozens of docker images in k8s on remote cloud infrastructure, all running "serverless" microservices in optimized python*, connected via REST* APIs to a javascript front-end and/or electron "desktop" app, with extensive telemetry and analytics subsystems connected to a prometheus/grafana dashboard.

                                                      That is ignoring the ML/LLM components, of course.

                                                      If all of this is running reliably, and the network isn't broken again, then you may be able to share notepad pages between your laptop and smartphone.

                                                      *possibly golang/protobufs if your name happens to be google and if pytorch and tensorflow haven't been invented yet

                                                    • bitcompost 1 year ago
                                                      Oh, I believe in theory a 50MHz CPU is capable of doing almost everything I need, but it just lacks the software optimized for it. I think a week to compile everything is too optimistic.
                                                      • musicale 1 year ago
                                                        Old compilers/IDEs like Turbo Pascal or Think C were/are usably fast on single-digit MHz machines and emulators.

                                                        And even if the CPU is 50 MHz, modern DRAM and NVMe flash are very fast compared to memory and storage on 1990s (or older) machines.

                                                        Older versions of Microsoft Office (etc.) ran about the same on 50 MHz systems as Office 365 runs today.

                                                        • kwhitefoot 1 year ago
                                                          I did valuable work on a 2 MHz Apple II with a 4 MHz Z80 add-on running CP/M that I used to write the documentation. The documentation part was just as fast forty years ago as it is now, but assembling the code was glacially slow. The 6502 macro assembler running on the Apple took forty minutes to assemble code that filled an 8K EPROM.
                                                      • marssaxman 1 year ago
                                                        > I can't even imagine the pain of building stuff like GCC on a 60MHz CPU

                                                        Some of us remember what that sort of thing was like, not so very long ago...

                                                        • brucehoult 1 year ago
                                                          I remember when I got CodeWarrior on my PowerMac 6100/60 and suddenly I could answer questions online about weird MacApp problems by making a temporary project with their code and compiling the whole of MacApp in 5 minutes.

                                                          Previously that had taken about 2 hours (Quadra with MPW), and I did clean builds only when absolutely necessary.

                                                          Truly painful was trying to write large programs in Apple (UCSD) Pascal on a 1 MHz 6502.

                                                          • nineteen999 1 year ago
                                                            Back then GCC was much smaller, and only contained C code, not C++. But sure, let's compare apples and ... much bigger heavier apples.
                                                            • Gibbon1 1 year ago
                                                              I made a meme, titled 'The Build Failed Saturday and Again Sunday Night', with two guys from the office looking pensive, and sent it to my even older coworker.
                                                            • dwheeler 1 year ago
                                                              At one time many of us dreamed of having a computer that could run as fast as 60MHz. The first computers I used ran around 1MHz. Compilation will take longer on a slower machine, but that really isn't a big deal. If the computer is reliable and the build scripts are correct, you can just let the process run over days or weeks. I've run many tasks in my life that took days or weeks. Cue "compiling": https://xkcd.com/303/

                                                              The real problem is debugging. Debugging the process on a slow system can be unpleasant due to long turn-arounds. Historically the solution is to work in stages & be able to restart at different points (so you don't have to do the whole process each time). That would work here too. In this case, there's an additional option: you can debug the scripts on a much faster though less trustworthy system. Then, once it works, you can run it on the slower system.
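
                                                              A stamp-file pattern is one simple way to get that kind of restartability (the stage scripts below are placeholders):

                                                                  import os, subprocess

                                                                  # Each stage leaves a stamp file when it completes, so a crash, reboot,
                                                                  # or week-long pause only costs the stage that was interrupted.
                                                                  STAGES = [
                                                                      ("toolchain", ["./build-toolchain.sh"]),   # placeholder scripts
                                                                      ("kernel",    ["./build-kernel.sh"]),
                                                                      ("rootfs",    ["./build-rootfs.sh"]),
                                                                      ("bitstream", ["./build-bitstream.sh"]),
                                                                  ]

                                                                  for name, cmd in STAGES:
                                                                      stamp = ".done-" + name
                                                                      if os.path.exists(stamp):
                                                                          print("skipping", name)
                                                                          continue
                                                                      subprocess.run(cmd, check=True)            # stop hard if a stage fails
                                                                      open(stamp, "w").close()                   # mark the stage as finished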

                                                            • Guestmodinfo 1 year ago
                                                                Wow, I am starting to read all the reading material that you have put up.

                                                                It's really what I have always wanted to do, and it's more than that because you are using FPGAs. I am from India and I want to help you in any way I can, because I have also wanted to go on this journey. It's just amazing; I wish you all the blessings.

                                                              • indigodaddy 1 year ago
                                                                  Looks like all the good stuff is here (linked in the OP article):

                                                                https://github.com/litex-hub/linux-on-litex-rocket

                                                                  Cool, I didn’t know that you could use screen to connect to a tty/serial port. Neat.

                                                                • rurban 1 year ago
                                                                  (2020)
                                                                  • Rip_boy09929 1 year ago
                                                                    [flagged]
                                                                    • ranger_danger 1 year ago
                                                                      There's nothing free about FPGAs...