Programs and more Commands for the Ben Eater 8-Bit Breadboard Computer

So now that I have a fancy programmable computer working with its guts bare to the world, what can I do with it?

Well, not much.

The 8 bits of information the computer can shift around necessarily has to be split into 4 bits for the memory address location and 4 bits for the program counter step. This means that there are only 24 = 16 memory address locations for us to store all the commands and variables for our program – not much to work with at all! While all computers are only notionally Turing Complete since formal Turing Completeness requires infinite memory, rarely do you actually bump up against such restrictive limits. Even just doubling this to a 16-bit computer would let you write far more wasteful complex programs that work with far bigger numbers.

However even with the 8-bit constraint, it is possible with a little creativity to fit programs that are both interesting and useful in 16 lines of memory! I have copied over Ben’s multiplication program, and have also created two original programs: one which generates the Fibonacci Sequence, and one which checks if a number is prime (and spits out its largest factor if it isn’t).

And if you want to go even further, I thought of some ways that you could extend the instruction set to create programs that are effectively 2-3 times as long as is possible with Ben’s original list of instructions. If someone can think up a way to use this information to create a program that is interesting enough to elicit a “wow!” reaction, shoot me a comment or email and I’ll make a video showing it in action on my computer! I’ve listed a few challenges at the end of the post.



This takes two numbers x and y, multiplies them together, and displays the result. This comes straight from Ben’s last video in the how-to series.

0:  LDA 14
1:  SUB 12
2:  JC  6
3:  LDA 13
4:  OUT
5:  HLT
6:  STA 14
7:  LDA 13
8:  ADD 15
9:  STA 13
10: JMP 0
11: -
12: 1
13: product
14: x
15: y

Fibonacci Sequence

Calculates and displays the Fibonacci numbers less than 255: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233].

The Fibonacci sequence

Ben Eater implemented a Fibonacci sequence program in this video, but it had a bug that stopped 233 from being displayed. This is my re-work of a Fibonacci program, which fixes that bug and also introduces a feature to halt the program at the largest number, with re-running being as simple as pressing the reset switch on the computer. Alternatively, you can easily modify this program to reset automatically and print the sequence in a loop, by changing the “Jump on Carry” instruction on line 11 to point at the beginning of the program. I should note that this relies on knowing beforehand how many numbers in the sequence it will fit in 8 bits – if you didn’t know when the program would reach its max then you would want another JC 13 command in between lines 6 and 7 (but that doesn’t fit!).

0:  LDI 1
1:  STA 14
2:  LDI 0
3:  STA 15
4:  OUT
5:  LDA 14
6:  ADD 15
7:  STA 14
8:  OUT 
9:  LDA 15
10: ADD 14
11: JC  13  # to loop the program, replace with JC 0
12: JMP 3
13: HLT     # to loop the program, this line is not necessary
14: x
15: y

Largest Divisor (Primality Test)

This takes a number and calculates and displays the largest number that evenly divides it. If the output is 1, that means the original number was prime! The number x you are investigating has to be entered in both addresses 14 and 15 for the first run, and re-entered in 14 for each repetition. Note the new JNC command, which I’ll explain further on.

I’m pretty proud of coming up with this one – it may be the most interesting program that can physically fit on the computer without modifications. Any hey, it means that you could use this computer to help break really, really basic cryptography!

0:  LDA 15
1:  OUT
2:  LDA 14
3:  SUB 13
4:  STA 14
5:  LDA 15
6:  SUB 14
7:  JZ  10
8:  JNC  2
9:  JMP 6
10: LDA 14
11: OUT
12: HLT
13: 1
14: x       # must enter each run
15: x       # must enter first run

Smallest Divisor (Primality Test)

Similar to the above, but starts at 2 and works up. Should be faster than finding the largest divisor in all instances! Halts on the original number if the original number is prime.

0:  LDA 15
1:  OUT
2:  LDA 14
3:  ADD 13
4:  STA 14
5:  LDA 15
6:  SUB 14
7:  JZ  10
8:  JNC 2
9:  JMP 6
10: LDA 14
11: OUT
12: HLT
13: 1
14: 1       # must enter each run
15: x       # must enter first run


The computer as designed by Ben only uses 11 of the 16 slots available for commands, which leaves a lot of room to expand the base instruction set. Here is a list of commands that I think could be useful and would work directly with no hardware changes:

  • Increment (INC): Add a specified 4-bit number directly to what’s in the A register. Saves from having to use a memory address to store a 1 (or other number) to count up and down by.
  • Decrement (DEC): Similar to INC, subtract a specified 4-bit number directly from what’s in the A register.
  • Load B / Store B (LDB / STB): Duplicates the load and store functions for the A register onto the B register.
  • Display (DSP): Outputs the value at a specified memory address to the screen.
  • Display Immediate (DSI): Outputs a specified 4-bit number to the screen. For example, if your program results in a true/false statement, you could spit out a 1 or 0.
  • Jump if Not Carry / Jump if Not Zero (JNC / JNZ): Jump if the condition is not met. Thanks to Micheal Burke with his awesome CPU8 app for this one!

For example, with all these commands you could rewrite the Smallest Divisor program with four fewer lines, which would free you up for things like prepopulating address 14:

0:  DSP 15
-:  LDI 1   # to prepopulate addr. 14  
-:  STA 14  # to prepopulate addr. 14 
1:  LDA 14
2:  INC 1
3:  STA 14
4:  LDA 15
5:  SUB 14
6:  JZ  9
7:  JNC 1
8:  JMP 5
9:  DSP 14 / HLT
10: -
11: -
12: -
13: -
14: 1       # must enter each run if not prepopulated
15: x

The real room for extensibility however comes from using all 8 of the microcode steps, rather than truncating at 5 to save execution time. Since the “meat” of each command takes only 1-3 steps (after the necessary 2 steps of loading the command instructions from memory), it’s possible to join together the meat of any 2 or 3 of the commands into one line. A useful one for example would be a Jump if Equal (JEQ), which would chain together a subtraction with a Jump if Zero. Or as a simple example, the HLT command can be tacked on for “free” at the last cycle of your last command.

And then if you really want to dive off the deep end, you could tailor-make your commands for a specific program, and fit in 2-3x the number of instructions you could otherwise. Every single address of program memory could have it’s own custom command – essentially you would be coding directly in microinstructions. I’d be really interested to see how complex a program someone could make with this approach!

One last room for optimization: if you know that you are going to need a constant value somewhere in your program, you can make one of the commands without an argument be that value specifically. For example, I need to decrement by 1 in the largest divisor program. I could make the HLT command be command 0000, and when I call it on line 12, call it as HLT 1. This would have unchanged behavior when that line is run, but it would also serve as the value of 1 (0000 0001) on line 13, and line 13 could then be used for something else.


Challenges to the reader which may or may not be possible, and will very likely require modifying the basic instruction set. If you figure out how to do any of these in 16 lines of memory, let me know and I’ll make a video of it running on my computer!

  • Compute and display powers: ab = c
    • 35 = 243
  • Sort 3 (or more) numbers in memory from smallest to largest: sort(c, a, b) = [a, b, c]
    • sort(3, 2, 1) = 1, 2, 3
  • Display the number of 1’s in the binary representation of an input number: countbits(a) = b
    • countbits(15) = countbits(00001111) = 4
    • Done 2019-07-19! Thanks to collaboration with Tyler Richard and Micheal Burke. See the comments to this post.
  • Compute the dot product of two vectors: dot(a, b, c, d) = e
    • dot(1, 2, 3, 4) = [1, 2] · [3, 4] = 1*3 + 2*4 = 11
  • Add together two 16-bit numbers: add16(a, b, c, d) = [e, f]
    • add16(7, 208, 11, 184) = [19, 136] → (7*28 + 208) + (11*28 + 184) = 00000111 11010000 + 00001011 10111000 = 2000 + 3000  = 5000 = 00010011 10001000 = (19*28 + 136)
  • Something else!
Counting the number of bits in the binary representation of 203 (11001011, which has five 1’s)

This Post Has 20 Comments

  1. Tyler Richard

    A really simple way to do the count bits function would to be add the number to itself and, if the carry bit is set, add one to the answer.
    0000: LDA 1111
    0001: ADD 1111
    0010: StoreA 1111
    0011: Jump Carry 1000
    0100: LDA 1101
    0101: Sub 1
    0110: Jump Zero 1100
    0111: Jump 0000
    1000: LDA 1110
    1001: Add 1
    1010: StoreA 1110
    1011: Jump 0
    1100: HLT
    1101: Var Count (Starts at 8)
    1110: Var Ans (Starts a 0)
    1111: Var A

    1. Scott

      Awesome, I think that would work! You would need to use the INC/DEC commands I propose rather than ADD/SUB since you don’t have room to store 1, but I see what you mean and this flows really nicely when I work it out on paper. Will try it out on the machine when I get back home in a week. It would be great to cut off one line in order to display the final count with a DSP 1110 command after line 1011, but I can’t figure out a place to trim any lines. Perhaps if you brought in a new command that combined DSP and HLT.

      Edit: Actually, I think you still need a STA 1101 after line 0101, otherwise your var count doesn’t get updated. Could probably get there by combining more commands.

      1. Michael Burke

        I find when you have a JZ followed by a JMP, you can save one instruction using the JNZ instead. I think in this case it would look like:

        0110: Jump Not Zero 0000
        0111 HLT

        That would free address 1100 for something else.

        Obviously, a DSP with address or HLT with display could also free up some address space.

        I was thinking the DSP instruction could display the contents at the address but add a new option for the HLT instruction to display the actual operand (immediate) value like a return code.

        1. Scott

          Yep, looks like a JNZ will save a line and do it! I haven’t tested this on the machine yet, but here’s a program that should count and display the number of 1’s in the binary representation of a number.

          0000: LDA 1111
          0001: ADD 1111
          0010: STA 1111
          0011: JC 1001
          0100: LDA 1101
          0101: DEC 1
          0110: STA 1101
          0111: JNZ 0000
          1000: DSP 1110 / HLT
          1001: LDA 1110
          1010: INC 1
          1011: STA 1110
          1100: JMP 0000
          1101: Count # initialize to 8 (0000 1000)
          1110: Ans # initialize to 0 (0000 0000)
          1111: X # must enter each run

        2. Scott

          Just tried this program out on the computer and it works like a charm! See the end of the post here for a new video of that running.

  2. Eric B.

    The following program should compute x^2 then display the result (assuming you implement the DSP instruction.)

    0: LDA 15
    1: JZ 11
    2: STA 14
    3: LDA 13
    4: ADD 15
    5: STA 13
    6: LDA 14
    7: DEC 1
    8: JZ 11
    9: STA 14
    10: JMP 3
    11: DSP 13
    12: HLT
    13: 0 # must enter each run
    14: –
    15: x # must enter first run

    You can switch around 13 and 14 to group the two variables you need to enter together. 14 is used as a temp variable.

    1. Scott

      I like it! Seems like the multiplication program, but saves you from inputting the variable twice and has a speed up to check if your answer is 0 ahead of time.

      1. Eric Beller

        Yeah, basically it’s just pre-programmed to calculate x*x instead of having to put it in twice. But might still be useful 🙂

  3. Tim Zou

    I’ve been trying to do the exponent, where I’m outputting x^y. This is what I have so far:

    0. LDA 22
    1. SBI 1
    2. STA 22
    3. LDI x
    4. STA 20
    5. JZ 17
    6. LDA 20
    7. SBI 1
    8. STA 20
    9. JNZ 14
    10. LDA 19
    11. ADD 21
    12. STA 19
    13. JMP 6
    14. LDA 19
    15. STA 21
    16. JMP 0
    17. LDO 19
    18. HLT
    19. sum
    20. z
    21. x
    22. y

    The new instructions I have are SBI (subtract immediate, where you load in the IR output immediately into the B register and subtract), LDO (which is just a combination of LDA and OUT), and JNZ (jump not zero). I made these just to reduce down the instructions, but I’m still over by a lot.

    The variables are: x for the base, y for the exponent (I’m using JZ to test for y=0 because I’m multiplying the base y-1 times), z for the counter index (it goes from x to 0 but uses JZ to test for z=0 because I’m adding x to the sum x-1 times), and sum for the final sum

    I’m kind of stuck because I can’t seem to reduce any instructions down. 22 is way over 16, so it’s not looking too good right now. Anyone have any ideas?

    1. Scott

      Hmm, my first thought is to combine HLT into line 17 which gets you to 21 lines.

      I don’t see how right now, but there might be a clever way to combine the chunks in lines 0-2 and 6-8 and knock down another 4 lines, getting you to 17. If that’s possible you’ve only got one line to go, though I’m having a hard time seeing where else you could cut something.

      This new command might work or not, but in lines 10-12, perhaps you could rearrange things like this?
      10. LDA 21
      11. ADD / STORE 19

      1. Tim Zou

        Alright thanks for the suggestions, I’ll play around with it and see. I think the 0-2 and 6-8 combination would be huge, so I’ll try that.

  4. Mike Sutton

    I have an alternative solution which, I think, sticks to Ben’s original instruction set. I don’t have a Ben original to test on though.

    My solution is to do away with the counter since we can finish when the input value reaches zero. The problem is that we can’t test for zero after the counter has been increased because increasing the counter changes the zero flag.

    Instead I test if the addition (left shift) produces a carry. If so I branch to code to increase the count. If no carry then I test for zero. If so I jump to the final output and halt code, otherwise I go around the loop again.

    This does mean there’s an extra trip around the loop: in the penultimate loop value will be zero and carry will be set. The next and final loop sets both value to zero and carry unset.

    One the plus side the code can abort early if the least significant bit(s) are zero.

    0000: LD A,value
    0001: ADD A,value
    0010: STA value

    0011: JPC add
    0100: JPZ done
    0101: JMP start
    0110: LDA count
    0111: OUT
    1000: HLT
    1001: LDA count
    1010: ADD one
    1011: STA count
    1100: JP start

    1101: 00000001 ;Constant
    1110: 00000000 ;Initialise to zero

  5. Jeff Truck

    Scott, does the Fibonacci Sequence program (along with the rest of the programs) run on a version of the Ben Eater design as he originally published? I’ve entered Fibonacci Sequence program, verified what I wrote down on paper from above, verified that it was entered correctly, but I’m not getting a natural progression. I see the following sequence: 0,1,2,3,5,8,13,8,16, 24, 40, 64, 104, 168. I suspect I have a hardware issue.

    1. Scott

      Hey Jeff, I didn’t make any hardware changes to the computer from Ben’s original design, and the Fibonacci sequence program uses all original microcode. Some of the other programs use new microcode, but if you’re having issues with the Fibonacci program it’s likely a hardware bug on your end. Since you’re getting a “close” output, it’s probably something diabolical that’s going to need a close stepping through of things. 🙂

  6. vassilis papadopoulos

    here one sort version of 0-255-0 program of Ben’s 8bit cpu with your extra commands. very simple..

    0000 OUT
    0001 INC 1
    0010 JNC 0000
    0011 DEC 1
    0100 OUT
    0101 JNZ 0011
    0110 JMP 0000

  7. Jon Binder

    I came up with this program that divides using only Ben’s commands, figured you all might be interested in it. Due to memory restrictions I couldn’t add a jump, so it will only divide evenly below 63. Please let me know any improvements!

    0 LDA 14
    1 SUB 15
    2 JZ 4
    3 JMP 6
    4 LDA 12
    5 OUT
    6 LDA 14
    7 ADD 13
    8 STA 14
    9 LDI 1
    10 ADD 12
    11 STA 12
    12 0x01
    13 Divisor
    14 Divisor
    15 Dividend

    13 and 14 have to be the same and 12 has to be 1..

    Thanks -Jon, 17

  8. Tom

    new instructions: JN or jump if negative ; STO or store output reg
    I came up with this program that takes 2 numbers and finds the remainder if they were divided.

    0: LDA 9
    1: SUB 10
    2: JN 4
    3: JMP 1
    4: STO 11
    5: LDA 11
    6: ADD 10
    7: OUT
    8: HLT
    9: 11
    10: 5

    I haven’t really worked it out so be sure to reply if there are any bugs.

  9. Brian

    I implemented three of the instructions you mentioned.

    1001 ADI (add immediate, analogous to your INC)
    1010 SBI (subtract immediate, analogous to your DEC)
    1010 DSP (same as yours)

    This allowed me to rewrite the multiplication program to zero out location 13 so you don’t have to reset by hand after each run by using DSP 13, which shortens the program by 1 byte because you don’t have to LDA with the result for the OUT instruction. It also frees up location 12 in case you want to execute another instruction.

    00: LDI 0
    01: STA 13 //Zero out product
    02: LDA 14
    03: SBI 1 //Frees up location 12
    04: JC 7
    05: DSP 13
    06: HLT
    07: STA 14
    08: LDA 13
    09: ADD 15
    10: STA 13
    11: JMP 2
    13: product
    14: factor x
    15: factor y

    I have a hard time making my Eater EEPROM programmer work, so I just add the instructions straight onto the chip with a TL866:

    (each for x = 0 to 3, at each location and location + 1, all values hexadecimal)


    x4A: 08 02
    xCA: 20 81


    x52: 08 02
    xD2: 20 C1


    x5A: 48 10
    xDA: 00 10

    Hope found this interesting.


  10. martin

    to know that you solved the primality test I took as a challange. Here my solution (with extra-commands, no hardware-changes):

    0 SUB 0011 1111
    1 STA 0100 1110 DIGIT
    2 STA 0100 1101 VARIABLE
    3 LDAO 1101 1111 (LOADS AND OUTPUTS)
    4 LDA 0001 1101
    5 SUBI 1011 0001 (SAME AS DEC)
    6 STA 0100 1101
    7 LDA 0001 1110
    8 ADD 0010 1101
    9 JZ 1000 1100
    10 JC 0111 0100
    11 JMP 0110 1000

  11. martin

    My solution for x^y. With some smaller hardware modifications (Reset A register; decrement a given ram address with one command, needs to reset B register and to set the least significant bit directly by opcode). The code is tested and works:

    0000 LDA R begin adding loop
    0001 ADD X
    0010 STA R
    0011 DEC C (decrement counter)
    0100 JNZ 0 end adding loop
    0101 LDA R
    0110 STA C
    0111 AR (reset A-register)
    1000 STA R
    1001 DEC Y (decrement exponent)
    1010 JNZ 0
    1011 LDAOH R
    1100 COUNTER ( = X at start)
    1101 RESULT (empty at start)
    1110 EXPONENT (program exponent – 1)
    1111 BASIS

Leave a Reply