Today’s newsletter continues with chapter 1 from Bitcoin: A Work in Progress. But first, a few updates.
You may have forgotten about it with all the recent drama, but there were two major Lnd node crashes recently. Both were caused by specially crafted Taproot transactions. These transactions were perfectly valid, but some node software and libraries failed to process them correctly. Aaron van Wirdum and I explained all this in episode 66 and the first part of episode 67 of Bitcoin, Explained. For even more juicy details, including an explanation by the person who created these transactions, listen to Citadel Dispatch episode 79.
I reinstalled my Mastodon server, so you can now follow me by searching for @sjors@sprovoost.nl, or just click here. It’s nice to see more people show up outside the Bitcoin bubble. The filter feature is great too; I recommend adding one for "RT @".
Finally, I just received my Pixel 7 so I can start playing with GrapheneOS:
Yes, the box arrived damaged.
Last week we explained how an address is a representation of an encumbrance, i.e. a constraint that specifies who can spend a coin. Usually that someone is the owner of a private key, and the address is a hash of the public key corresponding to that private key.
An address is a convenient way to communicate which script needs to go on the blockchain. As we said above, the purpose of this script is to constrain the coin so that only the recipient can spend it. The address itself doesn’t exist on the blockchain. It doesn’t even contain the full script.
Of the two main types of scripts in use back in the day, addresses were only used for Pay-to-Public-Key-Hash (P2PKH). When a wallet sees this address, it produces a script for the Bitcoin blockchain, which requires that the person spending it has the public key belonging to the hash (chapter 10 on Miniscript contains the actual script). Only the hash is published, so the public key remains secret until the recipient spends the coins.
An address starts with the number 1, followed by the hash of the public key. It’s encoded using something called base58. Here’s an example:
1HLoFgMiDL3hvACAfbkDUjcP9r9veUcqAF
To understand what base58 is, it’s important to first understand more about base systems in general.
With base10, think about your hand. You have 10 fingers. So if you want to, for example, express the number 115 (1, 1, 5), you can make three gestures with your hands by showing 1, 1, and 5. That’s also how you write down numbers, which — since the invention of clay tablets and paper — is more convenient than using fingers. So base10 is a decimal system that uses 10 different symbols, in various combinations, to represent any number (integer).
However, there have been — and still are — different bases. For example, the Babylonians used base60. And to read machine code, typically you’d use hexadecimal, which is base16 — 0 to 9, and then A to F. Meanwhile, computers tend to use base2 internally — a binary number system — because transistors are either on or off. This translates to using two digits, either 0 or 1, to do everything, and you can express any number that way.
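To make the idea of bases concrete, here is a minimal Python sketch that writes the number 115 in the bases mentioned so far. The helper name `to_base` is made up for this illustration.

```python
def to_base(number, base):
    """Return the digits of a non-negative integer in the given base,
    most significant digit first."""
    if number == 0:
        return [0]
    digits = []
    while number > 0:
        # Peel off the least significant digit on each pass.
        number, remainder = divmod(number, base)
        digits.append(remainder)
    return digits[::-1]


for base in (10, 2, 16, 60):
    print(base, to_base(115, base))
# 10 [1, 1, 5]
# 2 [1, 1, 1, 0, 0, 1, 1]
# 16 [7, 3]
# 60 [1, 55]
```

The same value needs fewer digits as the base grows, which is one reason a larger base gives shorter addresses.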
Satoshi introduced base58, which uses 58 different symbols: the digits 1 through 9, and then most of the alphabet in both lowercase and uppercase. Some letters and numbers are skipped because they’re ambiguous and users could easily mistake one for another: the number 0 and the uppercase letter O, and capital I and lowercase l.
Have you ever seen email source code for an attachment or similar? There are a lot of weird characters. That’s base64, and base58 is based on that. But base64 includes characters like underscores, plus, equals, and slash. These are omitted in base58 to make visual inspection easier and to behave nicely as part of a URL.
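As a rough illustration, the base58 symbol set can be rebuilt by taking digits and letters and dropping the four ambiguous characters. The sketch below also compares the result with the standard base64 alphabet (A to Z, a to z, 0 to 9, plus and slash). The variable names are chosen for this example only.

```python
import string

# The four look-alike characters that base58 leaves out.
AMBIGUOUS = "0OIl"

# Digits, then uppercase, then lowercase, minus the ambiguous ones.
BASE58_ALPHABET = "".join(
    c
    for c in string.digits + string.ascii_uppercase + string.ascii_lowercase
    if c not in AMBIGUOUS
)

# The standard base64 alphabet, without the "=" padding character.
BASE64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Symbols that base64 has and base58 does not.
dropped = sorted(set(BASE64_ALPHABET) - set(BASE58_ALPHABET))

print(BASE58_ALPHABET)
print(len(BASE58_ALPHABET))  # 58
print(dropped)               # ['+', '/', '0', 'I', 'O', 'l']
```

Removing six symbols from 64 leaves exactly 58, which is where the name comes from.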
So how does this relate to P2PKH? Well, the address is expressed as a 1, followed by the public key hash, which is expressed in base58.
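Here is a minimal sketch of how a 20-byte public key hash could be turned into such an address. It assumes the usual Base58Check scheme, which also appends a four-byte checksum (a detail not covered in the text above), and it starts from a sample hash rather than computing one from a public key. The function name `base58check_encode` is made up for this example.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(version, payload):
    """Encode a version byte plus payload as a Base58Check string."""
    data = bytes([version]) + payload
    # Checksum: first four bytes of a double SHA-256 over version + payload.
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum

    # Treat the bytes as one big number and rewrite it in base58.
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded

    # Each leading zero byte is written as the symbol "1". The version
    # byte 0x00 is why these addresses start with a 1.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


# A sample 20-byte public key hash, used here purely for illustration.
pubkey_hash = bytes.fromhex("62e907b15cbf27d5425399ebf6f0fb50ebb88f18")
address = base58check_encode(0x00, pubkey_hash)
print(address)
```

A wallet performing the reverse operation would recompute the checksum and reject an address with a typo in it.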
That’s the information you send to somebody else when you want them to send you bitcoin. You could also just send them 0x00, and then the public key hash. And maybe they’d be able to interpret that, but probably not. (Note: a pair of hexadecimal digits, prefixed by 0x, is often used to denote bytes, which can take on one of 16 × 16 = 256 different values, so this represents one byte with the value 0.)
In theory, you could send somebody the Bitcoin script in hexadecimal, which is the format used on the blockchain, because that’s just binary information. The blockchain has this script that says, “If the person has the right public key hash and the public key belonging to this public key hash, then you can spend it.” To learn more about how Bitcoin scripts work, refer to chapter 10 of the book.
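For the curious, here is a sketch of what that hexadecimal script looks like for P2PKH. The opcode byte values come from the Bitcoin Script opcode table, not from the text above, and the helper name `p2pkh_script` and the sample hash are only for illustration.

```python
def p2pkh_script(pubkey_hash):
    """Build the P2PKH output script for a 20-byte public key hash."""
    if len(pubkey_hash) != 20:
        raise ValueError("expected a 20-byte public key hash")
    return (
        bytes([0x76])    # OP_DUP
        + bytes([0xA9])  # OP_HASH160
        + bytes([0x14])  # push the next 20 bytes
        + pubkey_hash
        + bytes([0x88])  # OP_EQUALVERIFY
        + bytes([0xAC])  # OP_CHECKSIG
    )


# A sample 20-byte public key hash, used here purely for illustration.
pubkey_hash = bytes.fromhex("62e907b15cbf27d5425399ebf6f0fb50ebb88f18")
script = p2pkh_script(pubkey_hash)
print(script.hex())
# 76a91462e907b15cbf27d5425399ebf6f0fb50ebb88f1888ac
```

The 25 bytes are compact, but they contain no checksum and are not pleasant to read aloud or copy by hand.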
But even with all these options, the convention is that you use this standardized address format, which explains why all traditional Bitcoin addresses start with a 1, and why they’re all roughly the same length.
In addition to using base58 for sending a Bitcoin address, you can also use it to communicate a private key. In such a scenario, the leading symbol is a 5, which represents 128 and is interpreted as a version byte. That’s then followed by the private key.
In the past, users had paper wallets they could print. If they were generated securely without a back door, then on one side of the piece of paper would be something starting with a 1, and on the other side would be something starting with a 5. The paper would then indicate that only the Bitcoin address should be shown, and that the private key shouldn’t be shared.
There are also addresses that begin with a 3, which is for coins encumbered by the hash of a script, rather than the hash of a public key. We’ll cover Pay-to-Script-Hash (P2SH) in chapter 10. Usually these are multi-signature addresses, but they could also be SegWit addresses.
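The leading 1, 3 and 5 all follow from the version byte that is placed in front of the data before encoding. The sketch below assumes the usual Base58Check scheme with a four-byte checksum and the version bytes 0x00 (P2PKH), 0x05 (P2SH) and 0x80 (private key). The payloads are dummy values and `base58check_encode` is a name made up for this example.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(version, payload):
    """Encode a version byte plus payload as a Base58Check string."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Leading zero bytes are written as "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


# Dummy payloads: a 20-byte hash and a 32-byte private key (NOT real keys).
dummy_hash = bytes.fromhex("11" * 20)
dummy_key = bytes.fromhex("22" * 32)

examples = {
    "P2PKH address": (0x00, dummy_hash),
    "P2SH address": (0x05, dummy_hash),
    "private key": (0x80, dummy_key),
}

prefixes = {}
for name, (version, payload) in examples.items():
    encoded = base58check_encode(version, payload)
    prefixes[name] = encoded[0]
    print(f"{name:14} version 0x{version:02x} -> {encoded}")

print(prefixes)
# {'P2PKH address': '1', 'P2SH address': '3', 'private key': '5'}
```

Because the payload lengths are fixed, the first symbol depends only on the version byte, which lets wallets tell the types apart at a glance.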
Although base58 addresses worked fine, there was room for improvement. We’ll talk about that next time. Can’t wait, or want to read the footnotes too? Just buy the book.