Narayanan Shivakumar on Google's Hardware
Submitted by Dale on July 30, 2006 - 3:15am
The following notes are part 1 of 3 from Narayanan 'Shiva' Shivakumar's presentation at the July 27th VanHPC meeting. These notes cover Shivakumar's discussion of the Google hardware infrastructure.
- Google's goal is to organize the world's information
- How do we keep scaling? We like to pre-compute as much as we can.
- How do we build the right computing platform?
- Focus on price/performance
- Ensure app developers can use the infrastructure for other things
- Early decision was to use lots of commodity PCs
- Decided early on they needed to partition data across a lot of machines because of data scale (a rough sketch of this idea follows this list)
- Lots of PCs requires the ability to swap out bad hardware and have things continue to work
- Lots of PCs means a big heat problem
- In-house racks, PC motherboards, low-end storage, Linux
- Buy things out of the backs of trucks if we can
- Key challenges: affordable high performance networking, power (in)efficiency
- Problem with networking is cost of fast NICs
- Networking doesn't scale in a good way
- We are always looking at how to hook up thousands of machines
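The partitioning and hardware-swap points above are only sketched in the notes. As a rough illustration, and not Google's actual design, here is a minimal Python sketch of hash-based partitioning with one replica per document; `NUM_SHARDS`, the in-memory `machines` list, and the `store`/`fetch` helpers are hypothetical names of my own.

```python
import hashlib

# Illustrative only: hash-partition documents across many cheap machines,
# keeping one replica so a dead machine can be swapped out without losing reads.
NUM_SHARDS = 8
machines = [dict() for _ in range(NUM_SHARDS)]  # each dict stands in for one PC's local store


def shard_for(doc_id):
    """Map a document id to a shard with a stable hash."""
    digest = hashlib.md5(doc_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def store(doc_id, contents):
    """Write to the primary shard and to the next shard as a replica."""
    primary = shard_for(doc_id)
    replica = (primary + 1) % NUM_SHARDS
    machines[primary][doc_id] = contents
    machines[replica][doc_id] = contents


def fetch(doc_id, dead=frozenset()):
    """Read from the primary unless that machine is marked dead."""
    primary = shard_for(doc_id)
    replica = (primary + 1) % NUM_SHARDS
    source = replica if primary in dead else primary
    return machines[source][doc_id]


store("doc-42", "the quick brown fox")
print(fetch("doc-42", dead={shard_for("doc-42")}))  # still served from the replica
```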
- Question: Do you know the mean time to failure on your PCs?
Answer: It's not just the machines, it's switches and other stuff too. We've built up a model of this. Can't give specific numbers because of variability, e.g., different hardware platforms.
- Question: On networking, have you done any experiments or analysis on multicast?
Answer: There are certain apps that this works for; it's not a large group. The search architecture doesn't need multicasting. For copying lots of data there is a limited application.
- Question: Have you looked at the hardware costs of multicasting?
Answer: Yes, but I can't say anything about that. Data copying is a big deal for us.
- Question: Using SNMP or custom apps for monitoring?
Answer: Early on we used Perl scripts. At our current scale we've gone through three generations of monitoring. It's a large-scale problem because of aggregation.
- Question: Doesn't the manpower of equipment management (hardware swapping) outweigh the benefit of using cheaper machines?
Answer: Look at the cost of Sun equipment; the human cost of swapping is not the significant cost.
- Question: In your statistics, have you found a product that never failed (hardware)?
Answer: I suspect not, but I don't have an answer for that.
- Question: You don't have Windows boxes; if you buy a company with a Windows product, what do you do?
Answer: We have a variety of client software on Windows, like desktop search. We have server products for the Windows market as well.
- We recognized that power was going to be an issue for us; performance per watt is an issue, even though other metrics are getting better
- Working with the chip manufacturers on cooling
- Lots of things going on in the area of general cooling
- Just providing the raw electric power can start to become an issue
- Question: Do you truly have heterogeneous systems, different machines all over the place?
Answer: Yes, we have a large Linux group to make sure things happen. We do try to keep hardware to the same specs (e.g., memory speed).
- Question: Are these desktop motherboards or workstation motherboards?
Answer: These are not desktop motherboards, because we use multithreading.
- Question: My inner hippy is screaming to ask what you do with dead computers. Does Google care about the environment?
Answer: First time I've been asked, and I'm not sure what we do with the machines. I will find out.
- Question: Are your data centers in different regions or in one place?
Answer: We have a number of data centers in different places. This helps keep network transit time down.
- Question: What is the scale of your production infrastructure?
Answer: If you read the newspapers, more than 10,000. Google doesn't comment publicly on this. It's a lot! The problem we have is that this is a competitive advantage. We talk about it as much as we can without giving anything away.
- Question: In terms of hardware/software faults, what is the typical ratio? Where are most of the faults?
Answer: How do you define a fault? Just failures? Slow response time on a query? Can't really answer the question.
- Question: Does an application ever migrate across data centers?
Answer: Yes. MapReduce will address this a little. (A toy sketch of the MapReduce model follows these notes.)
- Question: What happens if you see a lot of uptime? Is that a cause for concern (i.e., a preventative reboot)?
Answer: Pretty sure our application will crash before then.
- Question: Do you do preventive maintenance?
Answer: Yes, via routine replacement. If we know a batch is really bad, it gets replaced earlier. Sometimes we have to code around an issue. One engineer said this job was the first time he'd had to do preventive programming, because hardware memory error checking didn't work.
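The MapReduce answer above is terse. As a hedged illustration of the programming model it refers to, and not Google's implementation, here is a minimal single-process word-count sketch in Python; the `map_fn`/`reduce_fn` names and the `run_mapreduce` runner are my own simplifications.

```python
from collections import defaultdict

# Toy sketch of the MapReduce model mentioned above: the real system runs the
# map, shuffle, and reduce phases across many machines and handles failures,
# which this single-process runner does not attempt.

def map_fn(document):
    """Map phase: emit (word, 1) for every word in the document."""
    for word in document.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """Reduce phase: sum all counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(documents):
    """Group intermediate pairs by key (the shuffle), then reduce each key."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return dict(reduce_fn(key, values) for key, values in grouped.items())


print(run_mapreduce(["the quick brown fox", "the lazy dog"]))
# {'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'lazy': 1, 'dog': 1}
```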