Bech32

Today’s newsletter continues with chapter 1 from Bitcoin: A Work in Progress.

I’ve also published this chapter as a blog post, mainly as an experiment to see if I can rank (my book) for "What is a Bitcoin address?".

Since last week I managed to install GrapheneOS on my new Pixel 7. I used command line tools for the job and found it pretty easy, but there’s also a WebUSB approach which is apparently even easier. I then proceeded to find equivalents of my iOS apps in the Play Store. This is the point where I have to admit I don’t really understand what GrapheneOS is doing, but it’s somehow keeping the Google beast at some distance, despite letting you use most of the good bits. Most apps work fine so far, even dinosaur banking ones. That said, having used iOS for over a decade, the Android way of doing things will take a while to get used to. It’s like learning to write with your other hand.

Any recommendations for a good YouTube video or podcast episode explaining GrapheneOS?

Can’t get enough of RBF discussions? Last week’s Bitcoin, Explained episode 68 has some!

Yesterday morning I attended the preliminary hearing of Tornado Cash developer Alexey Pertsev. My observations are in this Twitter thread.

Bitcoin Addresses

Last week we covered base58, which is used for "legacy" addresses like 1HLoFg... But can we do better?

Along Came Bech32

In March of 2017, Pieter Wuille gave a presentation about a new address format, bech32, and it’s been used since SegWit arrived on the scene. As the name suggests, it’s a base32 system, which means you have almost all the letters, and almost all the numbers, minus some ambiguous characters that you don’t want to have because they look too much like other numbers or letters.

One of the biggest differences between bech32 and base58 is that there isn’t a mixture of uppercase and lowercase letters. Instead, each letter is only in there once — either in all uppercase or all lowercase — which makes reading things out loud much easier. The precise mapping of which letter or number corresponds to which value is, like in base58, fixed but arbitrary: The fact that Q means 0 and P means 1 has no deeper meaning.
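To make the mapping concrete, here is a minimal Python sketch. The 32-character alphabet is the one defined in BIP 173; the helper name char_value is just something I made up for illustration.

```python
# The bech32 alphabet from BIP 173: a character's position is its value.
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def char_value(ch):
    """Return the 5-bit value (0..31) of a bech32 character, ignoring case."""
    return CHARSET.index(ch.lower())

# Letters and digits that were left out because they are easy to confuse.
excluded = sorted(set("abcdefghijklmnopqrstuvwxyz0123456789") - set(CHARSET))

print(char_value("q"), char_value("p"))  # 0 1
print(excluded)                          # ['1', 'b', 'i', 'o']
```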

A bech32 (BIP 173) address consists of two parts separated by 1, e.g. bc1q9kdcd08adkhg35r4g6nwu8ae4nkmsgp9vy00gf.

The first part is intentionally human readable, e.g. “bc” (Bitcoin) or “lnbc” (the Lightning network on Bitcoin). The values represented by “b,” “c,” etc. have no meaning. Rather, they’re there so humans can recognize, “OK, if the address starts with bc, then it refers to Bitcoin as the currency.” However, wallets will look for the presence of these characters as a sanity check, and they’re included in the checksum.

The 1 is just a separator with no value. If you look at the 32 characters of the alphabet, 1 isn’t among them; it simply means “skip this.”
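Splitting an address into its two parts could look roughly like this in Python. This is a sketch, not wallet code: the function name split_bech32 is hypothetical, and it does no checksum validation. It splits on the last 1, which is safe because 1 never appears in the data part.

```python
def split_bech32(address):
    """Split an address at the last "1" into (human-readable part, data part)."""
    hrp, separator, data = address.lower().rpartition("1")
    if not separator or not hrp:
        raise ValueError("missing separator or empty human-readable part")
    return hrp, data

hrp, data = split_bech32("bc1q9kdcd08adkhg35r4g6nwu8ae4nkmsgp9vy00gf")
print(hrp)      # bc
print(data[0])  # q
```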

The second part starts with the SegWit version number. Version 0 is represented with “Q” (bc1q…). Version 1 is what we call Taproot (see part @sec:taproot), and it’s represented with “P” (bc1p…). For version 0 SegWit, the version number is followed by either 20 bytes or 32 bytes, which means it’s either the public key hash or the script hash, respectively. And they’re different lengths now, because SegWit uses the SHA-256 hash (32 bytes) of the script, rather than the RIPEMD160 hash (20 bytes) of the script.

In base58, the script hash is the same length as the public key hash. But in SegWit, they’re not the same length. So by looking at how long the address is, you immediately know whether you’re paying to a script or you’re paying to a public key hash. As an aside, Taproot removes this length distinction, thereby slightly improving privacy.
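Here is a rough Python sketch of how software can read the version and tell the two version 0 cases apart by length. It assumes the last six characters of the data part are the checksum (as in BIP 173) and does not verify them; the function name describe is made up for illustration.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def describe(address):
    """Return (version, hash length in bytes, kind) without checking the checksum."""
    _, _, data = address.lower().rpartition("1")
    values = [CHARSET.index(c) for c in data]
    version = values[0]
    payload = values[1:-6]  # drop the version and the 6-character checksum

    # Regroup the 5-bit values into 8-bit bytes; leftover padding bits are dropped.
    acc, bits, program = 0, 0, bytearray()
    for v in payload:
        acc = (acc << 5) | v
        bits += 5
        while bits >= 8:
            bits -= 8
            program.append(acc >> bits)
            acc &= (1 << bits) - 1

    kinds = {(0, 20): "P2WPKH", (0, 32): "P2WSH"}
    return version, len(program), kinds.get((version, len(program)), "unknown")

print(describe("bc1q9kdcd08adkhg35r4g6nwu8ae4nkmsgp9vy00gf"))  # (0, 20, 'P2WPKH')
```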

So the new part is that there’s a set of 32 characters, but otherwise, things are very similar to base58. It’s again saying, “OK, here’s a P2PKH address.” In this case, it’s a Pay-to-Witness-Public-Key-Hash (P2WPKH), where witness refers to SegWit, but it’s the same idea. There’s a short prefix that tells both humans and the computer what the address is about, and this is followed by the hash of the public key or script.

Thirty-Two Dimensional Darts

However, conciseness isn’t the only benefit here. Another is error correction, or at least detection.

If there’s a typo in an address, then in the worst-case scenario, you’re sending coins to the wrong hash of a public key. When the recipient tries to spend the coin, they reveal the public key, but due to the typo, its hash won’t match what the blockchain demands. The coins are forever lost.

Fortunately, base58 addresses contain a checksum at the end. That way, if you make a typo, the checksum at the end of the address won’t work. Your wallet will alert you to this, and it’ll refuse to send the transaction (the blockchain won’t protect you; only your wallet will, hopefully). But if you’re really unlucky, a typo can be such that it produces a correct checksum by sheer coincidence.
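For comparison, here is a sketch of the base58 checksum check a wallet performs: the last four bytes of the decoded address should equal the first four bytes of a double SHA-256 over everything before them. The function names are made up for illustration, and the sample is the well-known address from the genesis block.

```python
import hashlib

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58decode(text):
    """Decode a base58 string to bytes; each leading "1" is a zero byte."""
    n = 0
    for ch in text:
        n = n * 58 + B58.index(ch)
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + raw

def base58_checksum_ok(address):
    """Compare the trailing 4 bytes with the double SHA-256 of the rest."""
    raw = b58decode(address)
    payload, checksum = raw[:-4], raw[-4:]
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return checksum == expected

good = "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"
typo = good[:-1] + "b"  # last character mistyped
print(base58_checksum_ok(good), base58_checksum_ok(typo))
```

Note that this only tells you something is wrong, not where.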

Bech32 was designed to make such a disastrous coincidence extremely unlikely. In addition, it won’t just tell you that there is a typo; it can tell you where the typo is. This is determined by running all the characters of the address through a checksum built on some sophisticated mathematical magic. Any mistake affecting up to four characters is guaranteed to be detected, and for a small number of typos it can still work out where the typo is and what the real value is. If you make more than that, it can’t.
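A sketch of what this looks like in Python. The generator constants and the polymod routine follow the reference code in BIP 173; the names checksum_ok and locate_single_typo, and the brute-force search for a single typo, are my own illustration rather than how a real wallet works. The sample is the P2WPKH example address from BIP 173.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]

def polymod(values):
    """Checksum computation from BIP 173 over a list of 5-bit values."""
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk

def checksum_ok(address):
    """True if the human-readable part plus data part yield a valid checksum."""
    hrp, _, data = address.lower().rpartition("1")
    expanded = [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]
    return polymod(expanded + [CHARSET.index(c) for c in data]) == 1

def locate_single_typo(address):
    """Try every single-character substitution in the data part.

    Returns a list of (position in data part, character that fixes it).
    """
    hrp, _, data = address.lower().rpartition("1")
    fixes = []
    for pos, old in enumerate(data):
        for new in CHARSET:
            if new != old and checksum_ok(hrp + "1" + data[:pos] + new + data[pos + 1:]):
                fixes.append((pos, new))
    return fixes

good = "bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4"
typo = good[:10] + "z" + good[11:]  # one character mistyped
print(checksum_ok(good), checksum_ok(typo))
print(locate_single_typo(typo))
```

BIP 173 recommends that wallets only point at where the mistake probably is, rather than silently fixing it, so you get a chance to double-check the address yourself.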

To illustrate this conceptually, imagine a wall on which you draw a bunch of non-overlapping circles. The bullseye of each circle represents a correct value, whereas any other spot within the circle represents a typo. If you’re a good dart player, most of the time you’ll hit the bullseye, i.e. you typed the correct value. If you slightly miss the bullseye but you’re still within the circle, the value will be slightly incorrect. Error detection is knowing that you missed the bullseye. Error correction is the equivalent of moving the dart to the nearest bullseye.

The idea is that you want the circles to be as big as possible, to accommodate even the sloppiest dart thrower, but you don’t want to waste too much space. Similarly, we don’t want Bitcoin addresses to be hundreds of characters long. That’s the kind of optimization problem mathematicians love.

In the case of bech32, instead of a two-dimensional wall, you have to somehow imagine a 32-dimensional “wall” with 32-dimensional hyperspheres. You’re hitting your keyboard, and somewhere in that 32-dimensional space, you’re slightly off, but you’re still inside this hypersphere, whatever that might look like. In that case, your wallet knows where the mistake is, and it prevents you from sending coins into the ether.