In the previous chapters we learned how to work with the stack and how to work with memory.
In this chapter, we finally introduce the most fun and useful one of all: storage.
Unlike the stack and memory, contract storage is data that persists beyond the execution of a single EVM function call and parent transaction.
SLOAD and SSTORE are the only opcodes that interact with storage. They do what we’d expect: load the value for a given key, and write a value to a given key, respectively.
Storage is structured differently than memory or the stack. While memory is like an infinitely long string, storage is like a key-value database, where:
A storage key can be any 256-bit value. Unlike memory’s practical limitations, storage does not have any limitations on what keys you can use.
A storage value can only ever be exactly 32 bytes (256 bits).
The combination of a key and value is often referred to as a storage slot.
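As a minimal sketch of this key-value model, the hypothetical contract below exposes the two opcodes through Solidity's inline assembly. The contract and function names are made up for illustration, and leaving the write function open to any caller is only acceptable in a demo.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical demo contract: raw access to this contract's own storage.
contract RawStorage {
    /// Writes a 32-byte value to an arbitrary 256-bit key (SSTORE).
    /// Unrestricted on purpose for the demo; never do this in production.
    function write(uint256 key, uint256 value) external {
        assembly {
            sstore(key, value)
        }
    }

    /// Reads the 32-byte value stored at a key (SLOAD).
    /// A key that has never been written reads as 0.
    function read(uint256 key) external view returns (uint256 value) {
        assembly {
            value := sload(key)
        }
    }
}
```

Any key works here, including very large ones, because storage keys are not laid out contiguously the way memory offsets are.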
Each contract has its own isolated storage key namespace. This means two contracts can use the same storage key without conflict.
When running code on chain, nothing else can access a contract’s storage other than the contract itself. Specifically, other contracts cannot access a contract’s storage slots directly.
If you want your contract’s storage to be readable or writeable by other contracts, your contract will need to expose functions to do so. While it’s common to give external contracts ways to read and potentially modify your contract’s storage, it also introduces security risks, and should be done thoughtfully and carefully.
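One common shape for this is a public read function plus a restricted write function. The sketch below assumes a single-owner access rule; the contract name, the owner variable, and the function names are illustrative assumptions, not a recommended access-control design.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical example: other contracts can read the counter,
/// but only the deployer can change it.
contract GuardedCounter {
    address public immutable owner; // set once at deployment
    uint256 private count;          // "private" only blocks other contracts, not off-chain readers

    constructor() {
        owner = msg.sender;
    }

    /// Read access for any caller, including other contracts.
    function getCount() external view returns (uint256) {
        return count;
    }

    /// Write access restricted to the owner.
    function setCount(uint256 newCount) external {
        require(msg.sender == owner, "not owner");
        count = newCount;
    }
}
```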
Like most of the EVM, storage keys and values do not have “types”; the EVM treats them as raw 32-byte values of zeroes and ones. It’s up to your contract to assign meaning to them.
Storage is used for data that we want to persist across multiple transactions, since the contents of the stack and memory are lost at the end of a transaction. Below are some common use cases for simple types.
uint256 is astronomically large, so almost any use case we have for numbers will fit in a single uint256.
A smaller type such as uint32 works for a situation where we know our value won’t overflow the type.
block.timestamp (or equivalently, the TIMESTAMP EVM opcode) is a Unix timestamp, so we can store timestamps as numbers.
Storing a timestamp as a uint256 will result in the lowest gas costs.
If you use a smaller type such as uint32, be aware that a uint32 timestamp only supports time up to February 7, 2106. Once this date is reached, the uint32 will overflow. You should either use a larger type, write extra logic to take this case into account, or ensure your project won’t live beyond that time.
Most Solidity contracts will have storage variables defined at the top level. These will be put into slots, starting from slot 0. Consecutive items will generally be packed as efficiently as possible. For example:
contract MyContract {
    uint32 a;  // First slot
    uint64 b;  // First slot
    uint128 c; // First slot
    uint64 d;  // Second slot
    uint256 e; // Third slot
}
As shown in the animation below, a, b, and c fit together in a single slot (4 + 8 + 16 = 28 of the 32 bytes). d does not fit in the remaining 4 bytes, so it starts a new slot. e needs a full 32 bytes, so it also starts a new slot.
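To see the packing directly, we can load the first slot as one raw word and slice it apart. This is a sketch that assumes Solidity's documented behavior of placing the first item in a slot in the lowest-order bytes; the contract name, the initial values, and the unpackSlot0 function are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical example: same variable layout as above, with initial values.
contract PackedExample {
    uint32 a = 1;  // slot 0, bits 0..31
    uint64 b = 2;  // slot 0, bits 32..95
    uint128 c = 3; // slot 0, bits 96..223
    uint64 d = 4;  // slot 1
    uint256 e = 5; // slot 2

    /// Reads slot 0 once and extracts the three packed values by shifting.
    function unpackSlot0()
        external
        view
        returns (uint32 a_, uint64 b_, uint128 c_)
    {
        uint256 raw;
        assembly {
            raw := sload(0)
        }
        a_ = uint32(raw);        // keep the lowest 32 bits
        b_ = uint64(raw >> 32);  // skip a, keep the next 64 bits
        c_ = uint128(raw >> 96); // skip a and b, keep the next 128 bits
    }
}
```

Reading all three values costs a single SLOAD here, which is the main reason packing can save gas.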
We can often pack additional data into a slot alongside an address. An address takes 20 bytes, so in the example below the first three storage variables fit into a single slot (20 + 4 + 4 = 28 bytes). Since an address takes more than half a slot, we can’t fit two addresses in one slot.
contract MyContract {
    address a; // First slot
    uint32 b;  // First slot
    uint32 c;  // First slot
    address d; // Second slot
    address e; // Third slot
}
Solidity also packs struct members in a similar way. For more details, see 💠 How Solidity Storage Slots Work (coming soon).
Some familiar programming constructs, like strings and arrays, exist in languages that compile to EVM bytecode (such as Solidity and Vyper), but these constructs are not native to the EVM itself. Below is a brief introduction to how Solidity implements them.
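For a dynamic array declared at slot p, slot p itself holds the array’s length, and the items start at slot keccak256(p). From here, items are packed according to the same rules as for fixed length arrays.
For a mapping declared at slot p, the value at the key k can be found at slot[keccak256(H(k, p))], where H is a function that pads k and then concatenates it with p.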
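The slot derivations for dynamic arrays and mappings can be sketched in a few lines. The example below assumes a uint256 array (one item per slot, so no packing) and a mapping with an address key, where abi.encode plays the role of H by padding the key to 32 bytes and appending the slot number. The contract, its variables, and both functions are hypothetical, and itemAt does no bounds checking.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical example: recomputing Solidity's storage locations by hand.
contract LayoutExample {
    uint256[] items;                      // slot 0 holds items.length
    mapping(address => uint256) balances; // slot 1 is reserved and stays empty

    constructor() {
        items.push(11);
        items.push(22);
        balances[msg.sender] = 33;
    }

    /// Item i of a uint256[] at slot 0 lives at keccak256(0) + i.
    function itemAt(uint256 i) external view returns (uint256 value) {
        uint256 slot = uint256(keccak256(abi.encode(uint256(0)))) + i;
        assembly {
            value := sload(slot)
        }
    }

    /// The value for key `who` in a mapping at slot 1 lives at
    /// keccak256(pad(who) ++ 1).
    function balanceOf(address who) external view returns (uint256 value) {
        bytes32 slot = keccak256(abi.encode(who, uint256(1)));
        assembly {
            value := sload(slot)
        }
    }
}
```

If the layout assumptions hold, itemAt(1) should agree with items[1] and balanceOf(deployer) with balances[deployer].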
The gas costs of operating on storage are much higher than those of operations on the stack and memory. This makes sense, because changes to storage need to be persisted to disk by all nodes in the network after the transaction completes. In general, gas costs are higher for operations that require nodes to do more computation or use more resources.
To give a sense of how expensive the storage operations are:
SSTORE operations can be up to 7,000 times more expensive than commonly used opcodes like ADD or PUSH1, and
SLOAD operations can be up to 700 times more expensive.
While not the only factor, the real-world cost to node operators of persisting storage changes on disk is one of the primary factors used when deciding gas costs.
Note that to save space, node implementations of the storage data structure don’t retain keys when the value is 0. If the node’s storage data structure has no entry for a given key in a mapping, then the value is assumed to be 0.
The EVM’s rules for gas costs of storage operations are fairly involved, so let’s first define some relevant terms:
When a storage slot is written to (via SSTORE) or read from (via SLOAD), that slot is considered “touched” for the rest of the transaction. Note that here and throughout this chapter, “transaction” refers to the full external transaction initiated by an EOA.
There is a 2100 gas cost (the cold touch cost) when touching a storage slot for the first time in a transaction. A slot that has not yet been touched is “cold”; once touched, it is “warm”.
ADD has a static cost of 3. It has no dynamic cost.
MSTORE has a static cost of 3, and a dynamic cost based on how much memory expansion is necessary.
SSTORE has a static cost of 0, and a dynamic cost based on context, such as whether the slot is warm/cold, whether it’s clean/dirty, etc.
In some cases, an SSTORE operation has an associated refund (usually, there is a refund when the operation reduces the work required of nodes). Note that, as of the London hard fork, SSTORE is the only opcode that can have a nonzero gas refund.
The total cost of the instruction is: static_cost + dynamic_cost - gas_refund. As mentioned above, the static_cost is always 0 for the two storage-related opcodes, so we omit static_cost in the discussion below.
Now that we have some terms defined, let’s cover some useful, common gas cost scenarios.
Writing a non-zero value to a key that previously had a 0 value is the most expensive operation. The total cost is 22,100 gas, which includes the 2100 cold touch cost.
The gas cost of this operation is high because nodes now have to allocate space on disk for the key-value pair.
When writing to a non-empty slot, the gas cost is lower, because the node running your code has already allocated space on disk for the slot.
If the slot is “clean”, writing a non-zero value costs 2900 gas if the slot is warm, and 5000 if it’s cold.
Reading from a cold storage slot costs 2100 gas. Interestingly, this cost is the same even if that key has never been written to.
When we change a non-zero value to zero for a clean slot, the dynamic_cost is the same as when writing a new non-zero value to a non-empty slot (2900 if the slot is warm, 5000 if cold). However, in this case there is also a gas_refund of 4800. If the slot is cold, the gas cost of the SSTORE will be 5000 - 4800 = 200. If the slot is warm, the gas cost of the SSTORE actually ends up being negative (2900 - 4800 = -1900), but since the slot is warm, there must have been an earlier access to it in the same transaction, and the 2100 cold touch cost paid there more than outweighs the negative value.
If a storage slot is “warm”, i.e. has already been read or written to in the current transaction, the cost of reading from it is only 100 gas.
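One rough way to observe the cold and warm read costs is to wrap two reads of the same slot in gasleft() calls. The sketch below is a hypothetical probe, not a precise benchmark: the measured differences include a few gas of surrounding opcodes, compiler optimizations may change what is actually executed, and the first read is only cold if nothing earlier in the transaction (including an access list) has touched the slot.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical probe contract for illustration only.
contract SloadGasProbe {
    /// Reads the same slot twice and reports the gas consumed around each read.
    function probe(uint256 key)
        external
        view
        returns (
            uint256 firstReadGas,
            uint256 secondReadGas,
            uint256 firstValue,
            uint256 secondValue
        )
    {
        uint256 g0 = gasleft();
        assembly {
            firstValue := sload(key) // cold if the slot is untouched so far
        }
        uint256 g1 = gasleft();
        assembly {
            secondValue := sload(key) // warm: same slot, same transaction
        }
        uint256 g2 = gasleft();

        firstReadGas = g0 - g1;  // roughly the cold read cost plus overhead
        secondReadGas = g1 - g2; // roughly the warm read cost plus overhead
    }
}
```

Both values are returned so that neither read is unused; the interesting outputs are the two gas figures, which should differ by about the cold touch cost.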
If a slot is “warm” and the value we’re writing to the slot matches what’s currently there, the gas cost is 100. Note that since the cost to read from a warm slot is also 100, reading a slot to check that the value is different before writing to it will not result in any gas savings.
Another scenario that results in a gas refund is when we change a storage value back to its original value from the start of the transaction.
Suppose the value in our slot at the start of the transaction is 17. We use an SSTORE to change the value to 18. As discussed above, the cost of this SSTORE is 5000, because the slot is cold. Now suppose we use another SSTORE to change the value back to 17. The dynamic_cost is 100, and there is a gas_refund of 2800 (the 2900 warm write cost minus the 100 warm access cost), which cancels out more than half of the cost of the first SSTORE.
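The scenarios in this section can be collected into one small model. The library below is a sketch based on our reading of the SSTORE rules in EIP-2200, EIP-2929 and EIP-3529, not code taken from any client; the library, function, and constant names are assumptions chosen for illustration. Here “original” is the slot’s value at the start of the transaction and “current” is its value just before the write, and we assume a slot counts as clean when the two are equal. The model ignores the cap that limits total refunds per transaction.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical gas model for a single SSTORE (post-London rules, as assumed above).
library SstoreGasModel {
    uint256 internal constant COLD_TOUCH_COST = 2100;
    uint256 internal constant WARM_ACCESS_COST = 100;
    uint256 internal constant SET_COST = 20000;  // 0 -> non-zero on a clean slot
    uint256 internal constant RESET_COST = 2900; // other changes on a clean slot
    int256 internal constant CLEAR_REFUND = 4800;

    /// Returns the dynamic cost and the refund change for one SSTORE.
    function sstoreCost(
        uint256 original,
        uint256 current,
        uint256 newValue,
        bool isCold
    ) internal pure returns (uint256 cost, int256 refund) {
        if (newValue == current) {
            // No-op write: only the access is charged.
            cost = WARM_ACCESS_COST;
        } else if (current == original) {
            // Clean slot: first real change in this transaction.
            cost = original == 0 ? SET_COST : RESET_COST;
            if (original != 0 && newValue == 0) {
                refund = CLEAR_REFUND; // clearing a non-zero slot
            }
        } else {
            // Dirty slot: already changed earlier in this transaction.
            cost = WARM_ACCESS_COST;
            if (original != 0) {
                if (current == 0) {
                    refund -= CLEAR_REFUND; // undo an earlier clearing refund
                } else if (newValue == 0) {
                    refund += CLEAR_REFUND;
                }
            }
            if (newValue == original) {
                // Restoring the original value refunds most of the earlier write.
                refund += original == 0
                    ? int256(SET_COST - WARM_ACCESS_COST)
                    : int256(RESET_COST - WARM_ACCESS_COST);
            }
        }
        if (isCold) {
            cost += COLD_TOUCH_COST; // first touch of the slot in this transaction
        }
    }
}
```

Under these assumptions, sstoreCost(0, 0, 1, true) gives a cost of 22100 with no refund, and sstoreCost(5, 5, 0, true) gives a cost of 5000 with a refund of 4800, matching the figures for a new cold write and for clearing a cold clean slot.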