Randall Munroe

So, I have a feature on my website where every week people submit hypothetical questions for me to answer, and I try to answer them using math, science and comics.

So for example, one person asked, what would happen if you tried to hit a baseball pitched at 90 percent of the speed of light? So I did some calculations. Now, normally, when an object flies through the air, the air will flow around the object, but in this case, the ball would be going so fast that the air molecules wouldn't have time to move out of the way. The ball would smash right into and through them, and the collisions with these air molecules would knock away the nitrogen, carbon and hydrogen from the ball, fragmenting it off into tiny particles, and also triggering waves of thermonuclear fusion in the air around it. This would result in a flood of x-rays that would spread out in a bubble along with exotic particles, plasma inside, centered on the pitcher's mound, and that would move away from the pitcher's mound slightly faster than the ball. Now at this point, about 30 nanoseconds in, the home plate is far enough away that light hasn't had time to reach it, which means the batter still sees the pitcher about to throw and has no idea that anything is wrong. (Laughter) Now, after 70 nanoseconds, the ball will reach home plate, or at least the cloud of expanding plasma that used to be the ball, and it will engulf the bat and the batter and the plate and the catcher and the umpire and start disintegrating them all as it also starts to carry them backward through the backstop, which also starts to disintegrate. So if you were watching this whole thing from a hill, ideally, far away, what you'd see is a bright flash of light that would fade over a few seconds, followed by a blast wave spreading out, shredding trees and houses as it moves away from the stadium, and then eventually a mushroom cloud rising up over the ruined city. (Laughter)
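The timing in this scenario can be checked with a few lines of arithmetic. The sketch below assumes the regulation distance of 60 feet 6 inches (about 18.44 m) between the pitcher's plate and home plate, a figure the talk does not state, and it ignores the short stretch the ball covers before release. The function and variable names are made up for illustration.

```python
# Rough timing check for a pitch at 90 percent of the speed of light.
# Assumption (not from the talk): regulation mound-to-plate distance of 60 ft 6 in.

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
MOUND_TO_PLATE_M = 60.5 * 0.3048  # 60 ft 6 in converted to meters


def travel_time_ns(distance_m: float, fraction_of_c: float) -> float:
    """Time in nanoseconds to cover distance_m at the given fraction of light speed."""
    return distance_m / (fraction_of_c * SPEED_OF_LIGHT_M_PER_S) * 1e9


light_ns = travel_time_ns(MOUND_TO_PLATE_M, 1.0)  # light from the pitcher
ball_ns = travel_time_ns(MOUND_TO_PLATE_M, 0.9)   # the ball (or what is left of it)

print(f"light reaches the plate after about {light_ns:.0f} ns")
print(f"ball reaches the plate after about {ball_ns:.0f} ns")
```

Under that assumption, light needs a bit over 60 nanoseconds to cross the gap, so at 30 nanoseconds the batter really has seen nothing yet, and the ball arrives just under 70 nanoseconds in.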

So the Major League Baseball rules are a little bit hazy, but — (Laughter) — under rule 6.02 and 5.09, I think that in this situation, the batter would be considered hit by pitch and would be eligible to take first base, if it still existed.

So this is the kind of question I answer, and I get people writing in with a lot of other strange questions. I've had someone write and say, scientifically speaking, what is the best and fastest way to hide a body? Can you do this one soon? And I've had people write in asking, can you prove whether or not you can find love again after your heart's broken? And I've had people send in what are clearly homework questions they're trying to get me to do for them.

But one week, a couple months ago, I got a question that was actually about Google. If all digital data in the world were stored on punch cards, how big would Google's data warehouse be? Now, Google's pretty secretive about their operations, so no one really knows how much data Google has, and in fact, no one really knows how many data centers Google has, except people at Google itself. And I've tried, I've met them a few times, tried asking them, and they aren't revealing anything.

So I decided to try to figure this out myself. There are a few things that I looked at here. I started with money. Google has to reveal how much they spend, in general, and that lets you put some caps on how many data centers they could be building, because a big data center costs a certain amount of money. And you can also then put a cap on how much of the world hard drive market they're taking up, which, it turns out, is pretty sizable. I read a calculation at one point, I think Google has a drive failure about every minute or two, and they just throw out the hard drive and swap in a new one. So they go through a huge number of them. And so by looking at money, you can get an idea of how many of these centers they have. You can also look at power. You can look at how much electricity they need, because you need a certain amount of electricity to run the servers, and Google is more efficient than most, but they still have some basic requirements, and that lets you put a limit on the number of servers that they have. You can also look at square footage and see, of the data centers that you know, how big are they? How much room is that? How many server racks could you fit in there? And for some data centers, you might get two of these pieces of information. You know how much they spent, and also, say, because they had to contract with the local government to get the power provided, you might know what they made a deal to buy, so you know how much power it takes. Then you can look at the ratios of those numbers, and for a data center where you don't have all that information, where maybe you only have one of those, say you know the square footage, then you could figure out, well, maybe the power is proportional.
And you can do this same thing with a lot of different quantities, you know, with guesses about the total amount of storage, the number of servers, the number of drives per server, and in each case using what you know to come up with a model that narrows down your guesses for the things that you don't know. It's sort of circling around the number you're trying to get. And this is a lot of fun. The math is not all that advanced, and really it's like nothing more than solving a sudoku puzzle.
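The ratio trick described here can be sketched in a few lines: calibrate on sites where two quantities are known, then apply the ratio to a site where only one is known. Every number below is a hypothetical placeholder invented for illustration, not a real Google figure, and the function names are made up as well.

```python
# Ratio-based estimation: calibrate on sites where two quantities are known,
# then apply the ratio to a site where only one is known.
# All numbers below are HYPOTHETICAL placeholders, not real Google figures.

from statistics import median

# (square_feet, megawatts) for sites where both happen to be known (made up).
known_sites = [
    (200_000, 40.0),
    (350_000, 77.0),
    (500_000, 95.0),
]


def watts_per_square_foot(sites):
    """Median power density across the calibration sites."""
    return median(mw * 1e6 / sqft for sqft, mw in sites)


def estimate_megawatts(square_feet, sites):
    """Guess a site's power draw, assuming power is proportional to floor area."""
    return square_feet * watts_per_square_foot(sites) / 1e6


density = watts_per_square_foot(known_sites)
guess_mw = estimate_megawatts(300_000, known_sites)
print(f"{density:.0f} W per square foot, so roughly {guess_mw:.0f} MW for 300,000 sq ft")
```

Using the median rather than the mean is one possible choice here; it keeps a single unusual site from dragging the ratio around. The same pattern would apply to any other pair of quantities, such as drives per server or dollars per rack.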

So what I did, I went through all of this information, spent a day or two researching. And there are some things I didn't look at. You could always look at the Google recruitment messages that they post. That gives you an idea of where they have people. Sometimes, when people visit a data center, they'll take a cell-cam photo and post it, and they aren't supposed to, but you can learn things about their hardware that way. And in fact, you can just look at pizza delivery drivers. Turns out, they know where all the Google data centers are, at least the ones that have people in them.

But I came up with my estimate, which I felt pretty good about, that was about 10 exabytes of data across all of Google's operations, and then another maybe five exabytes or so of offline storage in tape drives, which it turns out Google is about the world's largest consumer of.

So I came up with this estimate, and this is a staggering amount of data. It's quite a bit more than any other organization in the world has, as far as we know. There's a couple of other contenders, especially everyone always thinks of the NSA. But using some of these same methods, we can look at the NSA's data centers, and figure out, you know, we don't know what's going on there, but it's pretty clear that their operation is not the size of Google's.

Adding all of this up, I came up with the other thing that we can answer, which is, how many punch cards would this take? And so a punch card can hold about 80 characters, and you can fit about 2,000 or so cards into a box, and you put them in, say, my home region of New England, it would cover the entire region up to a depth of a little less than five kilometers, which is about three times deeper than the glaciers during the last ice age about 20,000 years ago.
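The first half of that calculation, the number of cards and boxes, is simple to reproduce. This sketch assumes the combined 15 exabytes from the estimate (10 online plus 5 on tape), decimal exabytes of 10^18 bytes, and one byte per character; those last two are assumptions, not something stated in the talk. Turning the result into a depth would also need the physical size of a card and the area of New England, which the talk does not give, so that step is left out.

```python
# Punch-card count for the talk's storage estimate.
# Assumptions (not from the talk): decimal exabytes, one byte per character.

BYTES_PER_EXABYTE = 10**18
CHARS_PER_CARD = 80       # from the talk
CARDS_PER_BOX = 2_000     # from the talk

online_exabytes = 10      # from the talk
tape_exabytes = 5         # from the talk

total_bytes = (online_exabytes + tape_exabytes) * BYTES_PER_EXABYTE
cards = total_bytes // CHARS_PER_CARD
boxes = cards // CARDS_PER_BOX

print(f"cards needed: {cards:.3e}")
print(f"boxes needed: {boxes:.3e}")
```

With these inputs the count comes out on the order of 10^17 cards, or close to 10^14 boxes.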

So this is impractical, but I think that's about the best answer I could come up with. And I posted it on my website. I wrote it up. And I didn't expect to get an answer from Google, because of course they've been so secretive, they didn't answer any of my questions, and so I just put it up and said, well, I guess we'll never know.

But then a little while later I got a message, a couple weeks later, from Google, saying, hey, someone here has an envelope for you. So I go and get it, open it up, and it's punch cards. (Laughter) Google-branded punch cards. And on these punch cards, there are a bunch of holes, and I said, thank you, thank you, okay, so what's on here? So I get some software and start reading it, and scan them, and it turns out it's a puzzle. There's a bunch of code, and I get some friends to help, and we crack the code, and then inside that is another code, and then there are some equations, and then we solve those equations, and then finally out pops a message from Google which is their official answer to my article, and it said, "No comment." (Laughter) (Applause)

And I love calculating these kinds of things, and it's not that I love doing the math. I do a lot of math, but I don't really like math for its own sake. What I love is that it lets you take some things that you know, and just by moving symbols around on a piece of paper, find out something that you didn't know that's very surprising. And I have a lot of stupid questions, and I love that math gives the power to answer them sometimes.

And sometimes not. This is a question I got from a reader, an anonymous reader, and the subject line just said, "Urgent," and this was the entire email: "If people had wheels and could fly, how would we differentiate them from airplanes?" Urgent. (Laughter)

And I think there are some questions that math just cannot answer. Thank you. (Applause)