The development version of Debian GNU/Linux (“wheezy”) contains 17,141 software packages, or 419,776,604 lines of code. From that figure, James Bromberger estimates that Debian would cost about $19.1 billion to produce. Bromberger also looks at the cost of individual projects like PHP, Apache and MySQL. Even at more than $19 billion, the estimate likely falls far short of what it would actually cost to produce.
Bromberger follows the same methodology that was used back in 2001 to estimate the cost of developing Debian 2.2 (“potato”). Debian 2.2 was estimated at $1.9 billion, which demonstrates just how much Debian and the upstream software ecosystem have grown in 11 years.
The estimate uses the SLOCCount tools from David A. Wheeler, which “automatically estimate the effort, time, and money it would take to develop the software.” SLOCCount relies on the Constructive Cost Model (COCOMO) to turn line counts into costs. Bromberger's estimate assumes an average developer wage of about $72,533 a year.
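To give a feel for how a COCOMO-style estimate turns lines of code into dollars, here is a minimal sketch in Python. It assumes the textbook Basic COCOMO "organic" coefficients (2.4 and 1.05 for effort, 2.5 and 0.38 for schedule). The function names and the `overhead` multiplier are hypothetical, introduced only for illustration, and nothing here is Bromberger's actual script. A single call on Debian's total line count should not be expected to reproduce the $19.1 billion figure, since that depends on how the tool was run and configured.

```python
# Sketch of a Basic COCOMO ("organic" mode) cost estimate.
# Assumptions: textbook organic-mode coefficients; the salary default is the
# $72,533 wage quoted in the article; `overhead` is a hypothetical knob
# (1.0 means salary only, with no benefits or management cost added).

def effort_person_months(sloc: int) -> float:
    """Estimated effort in person-months for `sloc` source lines of code."""
    ksloc = sloc / 1000.0
    return 2.4 * ksloc ** 1.05


def schedule_months(effort_pm: float) -> float:
    """Estimated calendar time in months for a given effort."""
    return 2.5 * effort_pm ** 0.38


def estimate_cost(sloc: int, annual_salary: float = 72_533.0,
                  overhead: float = 1.0) -> float:
    """Estimated cost in dollars: person-years times salary times overhead."""
    person_years = effort_person_months(sloc) / 12.0
    return person_years * annual_salary * overhead


if __name__ == "__main__":
    lines = 100_000  # an arbitrary example size, not a figure from the article
    effort = effort_person_months(lines)
    print(f"effort:   {effort:,.1f} person-months")
    print(f"schedule: {schedule_months(effort):,.1f} months")
    print(f"cost:     ${estimate_cost(lines):,.0f}")
```

Because the exponent is greater than 1, the estimate grows faster than the line count: doubling the size of a project more than doubles its estimated cost.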
What about individual projects within Debian? Samba would cost about $101 million, Apache’s httpd about $33.5 million, and MySQL about $64.1 million. Considering that Sun forked over about $1 billion for MySQL alone, I suspect that these figures are rather on the low side.
The $19.1 billion doesn’t even count the work that Debian developers add on top of upstream software. Debian developers have to package the software, configure it all to work together, patch the upstream code, and so forth. The estimate also leaves out what it costs to run Debian’s infrastructure, such as its build machines. Nor do the figures seem to take into account the overhead of carrying fully loaded salaries, with benefits, management, etc. If those things were taken into account, the figure would likely be much higher.
But even if it’s not 100% accurate, it’s still interesting to have a rough idea of what all this development might cost if it had been done as paid software development.
As a side note, Bromberger also looked at the programming languages that make up Debian’s software. ANSI C and C++ are far and away the most used. ANSI C accounts for 40% of the Debian codebase, while C++ comes in with 20%. Next is Java, with 8% of the codebase. Compare that with language rankings according to GitHub and Stack Overflow or the TIOBE Index.
And what would you pay for all that software? No cost at all. Not a bad deal, that.