“EVM” stands for “Ethereum Virtual Machine”. A virtual machine is a program that runs arbitrary instructions. However, what makes a virtual machine different from “normal” programs is that it tries to emulate an actual, physical machine.
What does that mean, exactly? To answer, first let’s talk about physical machines.
The term “physical machine” normally refers to a CPU – a computer’s Central Processing Unit. This is a physical chip that exists in your computer – desktop, laptop, phone, etc. – and is the part of your computer that does the most actual work. It’s how your computer computes computations.
It can look something like this:
Over the decades, CPUs have been heavily optimized for speed of execution. They can execute instructions fast – billions per second fast. That's a lot of instructions.
In contrast, the EVM is not designed for this high level of throughput. Because of gas costs, you'll run out of gas before getting anywhere near that magnitude of instructions within a transaction.
A virtual machine is a software program designed to behave like a physical machine. That is, it's software that – like a CPU – reads, interprets, and executes instructions.
The EVM is no different. It reads, interprets, and executes instructions encoded in a smart contract's bytecode. The tangible result from executing these instructions is updating contract state and returning data.
An instruction is a task you want your machine to perform. Each instruction is a small, low-level task.
For example, there is no instruction for high level tasks such as "make a network call" or "listen for this keystroke". Instead, you only have access to tasks like "add these two numbers together", "shift these bits", "write this value to memory", and so on.
On most platforms, the anatomy of an instruction consists of:
opcode + arguments
where an opcode is the task you want to perform, and the arguments are the relevant data you want to perform that task with.
But what do these two parts actually look like? Let's inspect and learn about each one.
First and foremost, machines operate on binary. Any instruction you send to a CPU is going to be sent as 0’s and 1’s in a pre-specified format.
What format? It depends on the platform! And since we’re talking about the EVM, let’s demystify an opcode right away:
Here is what the opcode for adding two numbers looks like:
00000001
That’s right, it’s just 8 binary digits – which is 8 bits, which is 1 byte. This is the entirety of an EVM opcode.
But as it turns out, reading binary isn’t that fun for humans. In our line of work, we rewrite the above as hexadecimal. Here are two examples of how that conversion happens:
00000001 - ADD opcode (binary)
=> 0000 0001
=> 0 1
=> 0x01 - ADD opcode (hex)
11111101 - REVERT opcode (binary)
=> 1111 1101
=> F D
=> 0xFD - REVERT opcode (hex)
This is much easier to read; it allows us to avoid counting and calculating which bits are which values, and read hex numbers and letters instead.
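If you want to check the conversion yourself, here is a minimal Python sketch of the same nibble-by-nibble steps shown above. The helper name `bits_to_hex` is made up for this illustration; it is not part of any EVM tooling.

```python
def bits_to_hex(bits: str) -> str:
    """Convert a string of exactly 8 binary digits into a 0x-prefixed hex byte."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")
    # Split the byte into two 4-bit halves; each half maps to one hex digit.
    high, low = bits[:4], bits[4:]
    return "0x" + format(int(high, 2), "X") + format(int(low, 2), "X")


print(bits_to_hex("00000001"))  # ADD    -> 0x01
print(bits_to_hex("11111101"))  # REVERT -> 0xFD
```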
And that's it! You just learned what the entirety of an EVM opcode looks like: a single byte of data.
You might be thinking, "but what about the arguments?".
Let’s take the ADD example again. While some platforms require you to specify where the two numbers are (e.g. registers), on the EVM, arguments are implicitly taken from the stack.
We will learn how exactly this works in 🧱 Working with the Stack.
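In the meantime, here is a rough Python sketch of the idea, assuming the stack is modeled as a plain list whose last element is the top. The function name `execute_add` is hypothetical, and the wrap-around at 2**256 reflects the EVM's 256-bit word size rather than anything covered so far in this chapter.

```python
UINT256_MOD = 2**256  # EVM words are 256 bits wide, so ADD wraps around


def execute_add(stack: list) -> None:
    """Pop the top two stack values and push their sum (mod 2**256)."""
    if len(stack) < 2:
        raise RuntimeError("stack underflow: ADD needs two values")
    a = stack.pop()
    b = stack.pop()
    stack.append((a + b) % UINT256_MOD)


stack = [2, 3]      # the end of the list is the top of the stack
execute_add(stack)  # the ADD byte itself carries no arguments
print(stack)        # [5]
```

Notice that nothing is passed to the opcode itself: both operands come from whatever is already sitting on the stack.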
So what does this have to do with machines vs virtual machines and the EVM? Well, to summarize:
How does the EVM execute binary?
Conceptually, you can think of execution as three parts:
0x62020f0960405260206040f3
0x62 is the PUSH3 opcode.
In 0x62020f09, the first byte (0x62) is a PUSH3 opcode, which reads the next 3 bytes as its argument (0x020f09).
PUSH32 will increment the program counter by 33 bytes.
JUMP and JUMPI opcodes allow you to set the program counter directly, to any position of a JUMPDEST opcode.
A major point you should learn is that the EVM, when given bytecode to run, attempts to run it as long as possible.
In other words, when executing bytecode, the EVM treats the first byte as an opcode and runs it, then does the same with the next byte (skipping past any PUSH argument bytes), and then the next, and then the next, and so on.
It continues to do this until one of the following happens:
Because of that last condition (running out of gas), EVM execution is guaranteed to halt. Even an infinite loop will eventually halt, because the amount of gas is always finite.
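To see why finite gas forces a halt, here is a toy Python interpreter that understands only three opcodes. It is a sketch under loud assumptions: the flat cost of 1 gas per instruction is made up (real EVM pricing varies per opcode), the halting labels are this toy's own, and the jump check ignores the real rule that a JUMPDEST byte inside PUSH data does not count.

```python
JUMPDEST, PUSH1, JUMP = 0x5B, 0x60, 0x56


def run_until_halt(code: bytes, gas: int) -> tuple:
    """Toy interpreter for three opcodes. Returns (reason, steps_executed)."""
    COST = 1  # hypothetical flat price per instruction, not real EVM pricing
    pc, stack, steps = 0, [], 0
    while pc < len(code):
        if gas < COST:
            return ("out of gas", steps)
        gas -= COST
        steps += 1
        op = code[pc]
        if op == JUMPDEST:
            pc += 1
        elif op == PUSH1:
            # Push the byte that follows the opcode (0 if the code ends early).
            stack.append(code[pc + 1] if pc + 1 < len(code) else 0)
            pc += 2
        elif op == JUMP:
            if not stack:
                return ("stack underflow", steps)
            dest = stack.pop()
            if dest >= len(code) or code[dest] != JUMPDEST:
                return ("invalid jump", steps)
            pc = dest
        else:
            return ("unsupported opcode", steps)
    return ("end of code", steps)


# JUMPDEST, PUSH1 0x00, JUMP: jumps back to byte 0 forever.
loop = bytes.fromhex("5b600056")
print(run_until_halt(loop, gas=1000))  # ('out of gas', 1000)
```

The loop itself never ends, but the gas counter does, so the run stops after exactly as many instructions as the gas pays for.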
Also note that execution may not always be strictly from left-to-right; the JUMP and JUMPI opcodes allow the program counter to jump to any other part of the bytecode – as long as the destination byte is a JUMPDEST byte.
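To make the program counter concrete, here is a small Python sketch that walks the example bytecode 0x62020f0960405260206040f3 from left to right and records where each instruction starts. It only moves the counter linearly and does not follow jumps, so treat it as a disassembler-style illustration rather than an executor. The function name `walk` and the tiny `NAMES` table are assumptions made for this example.

```python
# Only the non-PUSH opcodes that appear in the example; anything else is UNKNOWN.
NAMES = {0x52: "MSTORE", 0xF3: "RETURN"}


def walk(code: bytes) -> list:
    """Return (pc, mnemonic, argument_hex) for each instruction, moving pc linearly."""
    out = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1 (0x60) through PUSH32 (0x7F)
            n = op - 0x5F       # number of argument bytes that follow
            arg = code[pc + 1 : pc + 1 + n]
            out.append((pc, f"PUSH{n}", "0x" + arg.hex()))
            pc += 1 + n         # skip the opcode byte and its argument
        else:
            out.append((pc, NAMES.get(op, "UNKNOWN"), None))
            pc += 1
    return out


for pc, name, arg in walk(bytes.fromhex("62020f0960405260206040f3")):
    print(pc, name, arg or "")
# 0 PUSH3 0x020f09
# 4 PUSH1 0x40
# 6 MSTORE
# 7 PUSH1 0x20
# 9 PUSH1 0x40
# 11 RETURN
```

The jump from position 0 to position 4 is the PUSH3 opcode consuming its 3 argument bytes; by the same arithmetic, a PUSH32 moves the counter forward by 33.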
Below is an illustration of EVM execution with a program counter. Note that the rest of the execution environment (stack, memory, storage, gas) is omitted from this visual for simplicity – we will get to those details in future chapters.