Bitcoin is an open source, peer-to-peer electronic currency created by Satoshi Nakamoto and maintained by a small team of developers. As part of what's turning into an ongoing series on the distributed Web, I talked to contributor Gavin Andresen about how the software works. This is a technical overview. If you're interested in an economic or political look at the software, you can read the Wikipedia entry or Niklas Blanchard's essay on the project.
Klint Finley: For the unfamiliar, could you give us a brief overview of what Bitcoin is?
Gavin Andresen: Sure. Bitcoin is the first peer-to-peer currency - it is money created by people instead of by a central bank or government.
And how does it work?
Everybody trying to create bitcoins and everybody trading bitcoins is connected by a peer-to-peer network. And the code everybody is running makes sure nobody else is cheating - that nobody is creating more bitcoins than are allowed, that nobody is trying to spend their bitcoins more than once, and that bitcoins are only being spent by their rightful owners.
The really novel idea is a mechanism for preventing bitcoins from being spent more than once WITHOUT relying on a central authority.
The other mostly new idea is limiting the supply of bitcoins without relying on a central authority.
How do you accomplish these things without a central authority? And how do Bitcoin clients and servers find each other?
Let me tackle the easy one first - how do Bitcoin clients find each other:
All p2p networks have "the bootstrapping problem" - without central servers, nodes (machines) on the network need to be able to find each other. Bitcoin solves it using three mechanisms:
- By default, Bitcoin clients join an IRC chat channel and watch for the IP addresses and ports of other clients joining that channel. The name of that channel (and the name of the IRC chat server) is hardcoded into the Bitcoin software.
- There is a list of "well known" Bitcoin nodes compiled into the software in case the IRC chat server is unreachable for some reason.
- You can manually add (via configuration file or command-line option) IP addresses of other machines running Bitcoin to connect.
Once you're connected to the Bitcoin p2p network, other machines send you messages containing IP addresses (and ports) of other machines they know about, so after bootstrapping you find other Bitcoin nodes via the Bitcoin network itself.
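The bootstrapping order described above can be sketched roughly as follows. This is an illustration only, not the client's actual code: the function name `bootstrap_candidates`, the `Peer` type, and the choice to try manually configured peers first are all assumptions, and the addresses used with it are documentation placeholders.

```python
from typing import Iterable, List, Optional, Tuple

Peer = Tuple[str, int]  # (IP address, port); hypothetical representation


def bootstrap_candidates(
    irc_peers: Optional[Iterable[Peer]],
    builtin_seeds: Iterable[Peer],
    manual_peers: Iterable[Peer] = (),
) -> List[Peer]:
    """Return the peers to try first, in order, without duplicates.

    irc_peers is None when the IRC server could not be reached; in that
    case the compiled-in "well known" list is used instead.
    """
    # Assumption: manually configured peers are tried before anything else.
    ordered: List[Peer] = list(manual_peers)
    if irc_peers is not None:
        ordered.extend(irc_peers)
    else:
        ordered.extend(builtin_seeds)

    # Drop duplicates while keeping the first occurrence of each peer.
    seen = set()
    unique: List[Peer] = []
    for peer in ordered:
        if peer not in seen:
            seen.add(peer)
            unique.append(peer)
    return unique
```

After this first step, a node would keep extending its list with the addresses that connected peers send it, so the initial sources matter only for getting the first few connections.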
There is a lot of discussion about alternative bootstrapping mechanisms, so I wouldn't be surprised if alternative Bitcoin implementations that use something else pop up in the next year or so.
I'm guessing you can also change the IRC server and channel manually?
No, actually, you can't - you'd have to recompile Bitcoin to do that.
Why is that?
The whole idea with bootstrapping is to define one place where you can go to find other nodes. Establishing a second IRC channel by yourself would be pointless, and there's no real limit to the number of clients that can connect to the one IRC channel, so there's really no reason to use more than one.
But what if the IRC server gets shut down?
Bitcoin actually did change which IRC server is used for bootstrapping a few months ago - a new client was released that used a different IRC network. If the IRC server is down temporarily, then brand-new Bitcoin clients will use the backup hard-coded 'well-known' list of nodes. Old Bitcoin clients will connect to nodes that they connected to before - IP addresses are remembered (in an 'addr.dat' database file) from the last time Bitcoin ran.
Is there any mechanism to require clients to use the most recent version of the software in order to connect?
Yes... and no.
Old clients aren't prevented from connecting, and we work hard to support old clients in an upward-compatible way. But there was a major bug a couple months back that prompted the lead developer (Satoshi) to implement a mechanism so that anybody running an insecure version could be notified that they should upgrade.
That didn't, of course, help people running the really old versions of Bitcoin. Hindsight is always 20/20.
So how about the harder part - preventing bitcoins from being spent more than once and limiting the supply of bitcoins without relying on a central authority?
Right: I'll start with limiting the supply.
Bitcoin is designed so that there will only EVER be 21 million bitcoins created. Right now, they are being created at a rate of 50 every ten minutes or so.
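The 21 million figure follows from a rule the interview does not spell out: in the Bitcoin protocol the block reward starts at 50 bitcoins and is cut in half every 210,000 blocks. A short sketch of that arithmetic, using integer units of one hundred-millionth of a bitcoin (the constant and function names here are illustrative):

```python
COIN = 100_000_000           # smallest units per bitcoin
INITIAL_REWARD = 50 * COIN   # reward per block at launch
HALVING_INTERVAL = 210_000   # blocks between reward halvings


def total_supply() -> int:
    """Sum the rewards of every block until the reward rounds down to zero."""
    total = 0
    reward = INITIAL_REWARD
    while reward > 0:
        total += HALVING_INTERVAL * reward
        reward //= 2  # integer halving, so the series eventually reaches zero
    return total


if __name__ == "__main__":
    print(total_supply() / COIN)  # slightly under 21 million
```

Because the halving uses integer division, the total ends up a little below 21 million rather than exactly on it.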
First, everybody who is trying to create bitcoins is racing to be the first to create what is called a 'block'. A 'block' is just all the transactions ("I give/sell/trade these bitcoins to you") that have happened since the last block. So to generate bitcoins, my machine sits and listens for new transactions, bundles them all up into a block, and then repeatedly runs a hashing algorithm on that data. It is running a hashing algorithm (which transforms that arbitrary-length data into 256 bits) to try to find a hash that is small enough to declare "I won the race to find the next valid block" to the rest of the network. If it does, it gets rewarded with 50 bitcoins.
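The "find a small hash" race can be illustrated with a toy miner. This is a simplified sketch, not the real block format: Bitcoin does apply SHA-256 twice, but the way the data and nonce are packed here, the function names, and the very easy target are assumptions chosen so the example finishes quickly.

```python
import hashlib
from typing import Tuple


def block_hash(block_data: bytes, nonce: int) -> int:
    """Double SHA-256 of the data plus a nonce, read as a 256-bit integer."""
    payload = block_data + nonce.to_bytes(8, "little")
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return int.from_bytes(digest, "big")


def mine(block_data: bytes, target: int) -> Tuple[int, int]:
    """Try nonces one by one until the hash is at or below the target."""
    nonce = 0
    while True:
        h = block_hash(block_data, nonce)
        if h <= target:
            return nonce, h
        nonce += 1


if __name__ == "__main__":
    # Toy target: roughly one hash in 4096 qualifies.
    easy_target = 1 << 244
    nonce, h = mine(b"alice pays bob 5; bob pays carol 2", easy_target)
    print(nonce, hex(h))
```

Lowering the target makes qualifying hashes rarer, which is the only knob the network needs in order to make blocks harder or easier to find.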
Right now, every new block is worth 50 bitcoins. All of the nodes on the network evaluate how many blocks have been created about every two weeks, and automatically (and independently) adjust the 'difficulty' so that, on average, across the whole network, one block will be found about every 10 minutes. That's how the supply is limited without any central server.
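The difficulty adjustment can be sketched as a simple proportional rule. The numbers below come from the Bitcoin protocol rather than from the interview: the adjustment happens every 2016 blocks (two weeks at ten minutes per block), and a single adjustment is limited to a factor of four in either direction. The function name and the omission of other details, such as the cap on the easiest allowed target, are simplifications.

```python
BLOCKS_PER_PERIOD = 2016                 # blocks between adjustments
TARGET_SPACING = 10 * 60                 # desired seconds per block
EXPECTED_TIMESPAN = BLOCKS_PER_PERIOD * TARGET_SPACING  # 14 days in seconds


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the hash target by how long the last period really took.

    A larger target means easier blocks. If blocks arrived too quickly,
    the target shrinks; if they arrived too slowly, it grows.
    """
    # Limit any single adjustment to a factor of four either way.
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN
```

Every node runs the same calculation on the same block timestamps, so they all arrive at the same new target without having to ask anyone.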
The only way to create a block is to solve the "find a small hash" problem, which, as far as we know, just takes a lot of brute-force computation (computing hashes over and over and over on slight variations on the block data). And everybody on the network can check to make sure you actually DID solve that problem (computing just one hash is really easy), can make sure you only included VALID transactions in the new block, and can make sure you only claimed 50 bitcoins for yourself in that new block.
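The asymmetry between finding a block and checking one can be shown in a few lines. This is a sketch under assumptions: the `Block` fields and `is_valid_block` are hypothetical, the hash covers only the data and nonce (a real block hash commits to every transaction, including the reward), and the check that each transaction is itself valid is left out.

```python
import hashlib
from dataclasses import dataclass

MAX_REWARD = 50  # bitcoins a block finder may claim, per the interview


@dataclass
class Block:
    data: bytes          # stand-in for the bundled transactions
    nonce: int           # value the miner varied while searching
    claimed_reward: int  # bitcoins the miner awarded itself


def block_hash(block: Block) -> int:
    """Double SHA-256 of the data plus nonce, read as a 256-bit integer."""
    payload = block.data + block.nonce.to_bytes(8, "little")
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return int.from_bytes(digest, "big")


def is_valid_block(block: Block, target: int) -> bool:
    """One hash computation and one comparison, versus many for the miner."""
    solved = block_hash(block) <= target
    honest_reward = block.claimed_reward <= MAX_REWARD
    return solved and honest_reward
```

A node that receives a block failing either check would simply ignore it, so a miner gains nothing by skipping the work or inflating the reward.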
How many transactions make up one block?
An arbitrary number - at least one (the first transaction is always a 50-bitcoin transaction to reward whoever found the block), up to an arbitrary limit that can be fairly easily changed in the future. The Bitcoin Block Explorer website will actually show you any of the 100,000 bitcoin blocks and what is in them.
And the other part? Making sure a coin is only spent once?
All transactions are broadcast to all the nodes on the p2p network... which is the first part of the solution.
If a double-spend is broadcast, then some of the nodes will see one of the spend transactions first, and others will see it second.
Which one 'wins' is determined by which node happens to solve the next block - whichever node finds the small hash will include the first valid transaction involving those bitcoins that it sees. That one becomes "THE" spend, and the other attempted double-spend is considered invalid.
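The "first one seen wins" rule can be sketched as a small bookkeeping structure that each node keeps for itself. The class name `PendingSpends` and the string identifiers for coins and transactions are hypothetical; a real node tracks transaction outputs and verifies signatures as well.

```python
from typing import Dict


class PendingSpends:
    """Tracks which transaction this node saw first for each coin."""

    def __init__(self) -> None:
        self._spent_by: Dict[str, str] = {}

    def accept(self, coin_id: str, tx_id: str) -> bool:
        """Return True if tx_id is the spend this node will put in a block."""
        first_seen = self._spent_by.setdefault(coin_id, tx_id)
        # Any later, different spend of the same coin is rejected.
        return first_seen == tx_id


if __name__ == "__main__":
    # Two nodes hear about the same double-spend in opposite orders.
    node_a, node_b = PendingSpends(), PendingSpends()
    print(node_a.accept("coin-1", "tx-X"), node_a.accept("coin-1", "tx-Y"))
    print(node_b.accept("coin-1", "tx-Y"), node_b.accept("coin-1", "tx-X"))
```

The two nodes disagree until one of them finds the next block; at that point the spend in that block is the one the whole network treats as final.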
Things get really mind-warping every once in a while when two nodes on the network happen to find valid blocks at about the same time.
Do you have any insight into whether many people are trying to cheat?
Good question! I wish somebody would write a tool to listen on the network for attempted double-spends.
Can double-spends happen accidentally?
It is really hard to double-spend accidentally.
There are several developers involved in Bitcoin - how is the project managed?
We're in the middle of a transition from a basically one-person project (Satoshi writing all the code and being a gatekeeper for all changes) to a more open-source, community-driven model. I've volunteered to try to 'herd the cats' and create a more distributed way of making Bitcoin-the-software better. We'll be trying a Linux-like process, where patches are submitted (using GitHub), reviewed, and then accepted or reworked.
Can you recommend any resources for developers interested in getting started with distributed computing?
Hmm... good question! So much of distributed computing is still cutting-edge research, I'm not sure where to start. My personal background is not in p2p networking, so I'm not the right person to ask.
Do you think distributed computing, combined with strong cryptography, can create a decent alternative to "cloud computing" for developers that can't afford to rent servers?
Yes, although it is VERY inexpensive to deploy applications in the cloud. I'm using Google's App Engine for the front-end to my Bitcoin-related projects, and VPSes and Amazon's EC2 for the back-end Bitcoin-network nodes.
Creating an application using App Engine is a good, free way to start learning about scalable, distributed computing.
Do you think Bitcoin's source code could be used as the foundation to build other types of peer-to-peer communities?
No - it is really designed to do one thing and do it really well. Other peer-to-peer communities will have different networking needs.