Machine Code

MachineCode popularly speaking, is programming the computer in its own language, or close to it. The CPU really doesn't understand much; all it understands is if a bit is on or not. Many 8 bit CPU's have 16 address bits, while some 8 bit microcontrollers have only 8 address bits, all 8 bit CPU's have 8 bit data bits [?]. Let's take a simple 8-bit CPU with 16 address bits as example.

So to write MachineCode for it, you could use 16 on/off switches to specify an address, and 8 to specify data. This method was used by some programmers to program an EPROM, even though we have easier ways nowadays.

There are several abstraction levels, convenient for humans, above machine code: (one that is implied, is using HexaDecimal instead of binary powerswitches)

 +--------------+-----------------+ 
 |Level| Example  |
 +--------------+-----------------+
 |LowLevel?| Asm  |
 |MidLevel| C  |
 |HighLevel| Lisp, Smalltalk |
 +--------------+-----------------+

( NthGenerationLanguage discusses the possibilities of even higher abstraction levels )

The following code is supposed to be a program for the CommodoreSixtyFour, that will make your screen flicker with colours. The code has not been checked, it's typed mostly from memory, so there might be an error ... after all, its hard to remember opcodes and stuff after 5-10 years of little or no use.

Note that assembler is very simple, each instruction does very little, and there are only few instructions. In order to learn assembly, you actually just need to know a few things about the CPU, and have a list of instructions with explanation, then you're set to go. Of course, if you want to use the graphics card, other hardware, or software like a library or driver, you need to know how to do that also.

The following example use 7 commands, which works like this:

SEI = SEt Interrupt disable bit. - stop the interrupt, which is an interrupt of the execution of code, that interrupts execution on a regular basis. The idea is that the machine should check keyboard, and other stuff on a regular basis.

LDA #$.. = LoaD Accumulator immediate. There are several LDA instructions, but this one loads the byte following right into the 8-bit accumulator register. (First $80, which is hex for 128 decimal, then $2D). Note that the accumulator only keeps the numbers temporary. Its like the CPU's hand, that picks up the numbers and puts them where you want them.

STA $.... = STore Accumulator. Stores the value in the accumulator in the memory (RAM) addressed by the two bytes following, in LittleEndian. The memory addresses $0314 and $0315 had a special meaning on the CommodoreSixtyFour. Those two addresses held the interrupt "vector" (also in LittleEndian). Usually this "vector" pointed to the address $EA31, where the standard interupt procedure resided in the "Kernel ReadOnlyMemory".

CLI = CLear Interupt disable bit. This allows the interrupt to run again.

RTS = ReTurn from Subroutine. This pops the last 2 bytes on the stack and jumps there. Usually, you use "JSR" (Jump SubRoutine), which pushes the address of the next instruction on the stack, so that you can "return" with "RTS".

INC $.... = INCrement the memory address mentioned in LittleEndian right after the instruction. In this case, $D020 which was MemoryMapped? to the VIC graphics chip on the CommodoreSixtyFour, and controlled the colour of the border of the screen. Only the lower nibble was used, the upper nibble was always $F.

JMP $.... = JuMP to the memory location right after the instruction, again in LittleEndian. The address $EA31 was, as mentioned earlier, the usual address for the interrupt vector. The idea is that we let the machine work as it did before, only executing our code first. The code isn't quite clean though, since the INC operation might taint the "Carry" flag, and possibly the "Overflow" flag. I don't think that it matters a lot in this case, but if you wanted real clean code, you would have wanted to push the flag register as the first thing, then pop it just before the jump.

In AssemblyLanguage:

    Start:.org $8020
    SEI
    LDA#$80
    STA$0315
    LDA#$2D
    STA$0314
    CLI
    RTS
    INC $D020
    JMP $EA31

FixMe - this is just an example, but it's not very far from how real 8-bit AssemblyLanguage for a 6510/6502 looks.

In machinecode:

    802078
    8021A9 80
    80238D 15 03
    8026A9 2D
    80288D 14 03
    802B58
    802C60
    802DEE 20 D0
    80304C 31 EA

(If you have never programmed in assembly before, you should know that the four-digit number at the left is the address and the two digit numbers to the right are the contents of the bytes starting at that address. The bytes are broken up one instruction per line, and some instructions use more bytes than others.)

In C64 basic: (Much like games were listed in magazines, at least in Denmark)

The idea with the basic code, is that on the C64 you could not enter MachineCode directly, so this program will first "poke" the program into memory, and then make a SYStem call sys(32768 +32) to activate the code just poked in. That lines with "DATA" in them, are all the same opcodes in base 10, except '-1' which is used to mark the end of the code. The check with "50" is a simple checksum .. if the number of data is ok, we assume that it's ok.

The example serves to show a possible real life example of how MachineCode was widely used in the days of the CommodoreSixtyFour, AppleTwo, VicTwenty?, ZxSpectrum? and so on.

     10 BASE = 32768 + 32
     20 READ BYTE
     30 IF BYTE = -1 THEN BASE = BASE -1 : GOTO 999
     40 POKE BASE, BYTE
     50 BASE = BASE + 1
     60 GOTO 20
    999 IF BASE = (50 + 32768) THEN SYS(32768 + 32) : END
   1000 DATA 120
   1010 DATA 169, 128
   1020 DATA 141, 21, 3
   1030 DATA 169, 45
   1040 DATA 141, 20, 3
   1050 DATA 88
   1060 DATA 96
   1070 DATA 238, 32, 208
   1080 DATA 76, 49, 234
   1100 DATA -1
   9999 PRINT"ERROR: CHECK IF THERE IS A TYPO?"

The program example has been tested with "vice", http://www.viceteam.org/ there was a bug (OffByOne) that is fixed now -> in line 30, subtract one from base before going to line 999.

As another note, you could add a few pokes and stuff, then change the "sys command" to "sys 64768" (soft reset), but remember to save the program, before you run it, in that case. Maybe someone who remember the CommodoreSixtyFour will add this to this page, and we can all hack it good.

The idea is that if the keycodes (The C=64 wasn't using ASCII) for 'CBM80' was in memory at $8004, the computer when reset, would jump to the address specified in $8000 and $8001 (Little Indian). The bytes in $8002 and $8003 would contain the address to hit, when the restore key was hit.


Please note: "machine code" is something of an abstraction. The CPU has no language per se; it is the instruction set that is being referenced here. The keywords LDA (load accumulator with immediate) and STA (store accumulator to memory) are mnemonic aids to describe the operation that the CPU performs when it encounters these instructions in the program sequence. This is not code - it is the actual operation that the CPU performs.

Another note: The code executed by a VirtualMachine (as opposed to directly executed by a physically constructed processor) is often called ByteCode. For any given ByteCode there might one day be a physical processor able to execute it directly.

(Further info on this in JavaCompiler)


This was a classic bit of x86 MachineCode - "B4 00 B0 13 CD 10". It calls the interrupt to enable the classic 320x200x256 color palette mode. To enter "Mode X", you'd first place the VGA card into 320x200x250 mode (as with the code shown below), then tweak the VGA's CRTC and memory controller registers so that you have access to a larger resolution display.

The code "B4 00 B0 13 CD 10" in AssemblyLanguage is:

 mov ah, 0   - set msb to zero
 mov al, 13h - set lsb to 13 hex
 int 10      - call interrupt 10

So "B400B013CD10" is the code that really enabled all those great games of the late eighties and early nineties.

-- LayneThomas

Incidentally (assuming a 16-bit code segment) B8 13 00 CD 10 is an obvious optimisation. And of course, as hinted at above, the CD 10 (instruction that causes software interrupt 10h) on PC-compatibles (as implied here) is a call to the (video) ROM-BIOS's interface, which requests it to do the necessary actual I/O to program the card.


See: AssemblyLanguage,WriteAssembler, ByteCode, JavaByteCode, JavaAssemblerCode, JavaCompiler

See also MachineProgrammingLanguage for a discussion of MachineCode as a ProgrammingLanguage.

Compare with MicroCode (mentioned briefly on the AssemblyLanguage page).


CategoryMachineOrientation


EditText of this page (last edited December 16, 2014) or FindPage with title or text search