Google data center near The Dalles on the Columbia River, Oregon

BACK IN 1993, in a midnight email to me from his office at Sun Microsystems, CTO Eric Schmidt envisioned the future: “When the network becomes as fast as the processor, the computer hollows out and spreads across the network.” His then-employer publicized this notion in a compact phrase: The network is the computer. But Sun’s hardware honchos failed to absorb Schmidt’s CEO-in-the-making punch line. In which direction would the profits from that transformation flow? “Not to the companies making the fastest processors or best operating systems,” he prophesied, “but to the companies with the best networks and the best search and sort algorithms.”

George Gilder, Wired Magazine

While it is rarely discussed, I believe that Google’s infrastructure assets are a major barrier to any competitor hoping to catch up. Google has been very secretive about the proprietary approaches it has developed for designing, building and operating its networks and data centers. To keep pace with the exponential growth of internet content and traffic, Google’s infrastructure must also grow exponentially, and accomplishing that feat both economically and reliably is not at all straightforward. Here’s one public example of the scale of Google’s expansion in 2006, which taps into plentiful, cheap hydropower, underutilized dark fiber and a 15-year Oregon tax holiday:

On the banks of the windswept Columbia River, Google is working on a secret weapon in its quest to dominate the next generation of Internet computing. But it is hard to keep a secret when it is a computing center as big as two football fields, with twin cooling plants protruding four stories into the sky.

The complex, sprawling like an information-age factory, heralds a substantial expansion of a worldwide computing network handling billions of search queries a day and a growing repertory of other Internet services.

…Google remains far ahead in the global data-center race, and the scale of its complex here is evidence of its extraordinary ambition.

…“Google has constructed the biggest computer in the world, and it’s a hidden asset,” said Danny Hillis, a supercomputing pioneer and a founder of Applied Minds, a technology consulting firm, referring to the Googleplex.

The design and even the nature of the Google center in this industrial and agricultural outpost 80 miles east of Portland has been a closely guarded corporate secret. “Companies are historically sensitive about where their operational infrastructure is,” acknowledged Urs Hölzle, Google’s senior vice president for operations.

Behind the curtain of secrecy, the two buildings here — and a third that Google has a permit to build — will probably house tens of thousands of inexpensive processors and disks, held together with Velcro tape in a Google practice that makes for easy swapping of components. The cooling plants are essential because of the searing heat produced by so much computing power.

The complex will tap into the region’s large surplus of fiber optic networking, a legacy of the dot-com boom.

Regarding the logic of Google’s secrecy, I can assure you that every competitor would kill (figuratively) to know how Google’s infrastructure works and what it costs. For example, here is an excerpt from one of the first articles published by a journalist allowed into the new Oregon center:

…Patchett addressed an area of puzzlement for some local people: Google’s unwillingness to divulge the number of employees in The Dalles. “How would that give Google’s competitors an advantage?” we asked him.

“It’s an extremely competitive industry we’re in,” he said. “The ability to determine how some things are done within the Internet space may also be determined by the number of people it takes to operate a particular facility. So when we talk about how much capacity we have to process the information, that is what’s valuable to our competitors; how fast and how well do we process our information. Some piece of that may [be] able to be derived by the number of people it takes to actually run and support that.”

In a Wired article, George Gilder explains a bit more of the significance of Google’s innovations. An excerpt:

…FOR THE MOMENT, at least, the power of massive parallelism has far outstripped the promise of alternative computing architectures. But reliance on massively parallel computing may come to define the limits of what can be accomplished by a computer-on-a-planet. Two decades ago, Carver Mead, the former head of computer science at Caltech and key contributor to several generations of chip technology, pointed out that a collection of chips arrayed in parallel can’t do everything a computer might be called upon to do. “Parallel architectures,” he noted, “are inherently special-purpose.”

Hölzle admits as much. Since he arrived at Google, he says, the company has been through six or seven iterations of its search software and perhaps as many versions of the hardware backend. “It’s impossible to decouple the two,” he explains. “The search programs have to fit with the hardware systems, and the hardware systems have to work with the software.”

All previous parallel architectures, from Danny Hillis’ Thinking Machines to Seymour Cray’s behemoth supercomputers to Jim Clark’s Silicon Graphics workstations, have fallen before this problem. Their software and hardware became too specialized to keep up with the pace of innovation in computing. Scalability is also an issue: As the number of processors grows, the balance of activity between communications and processing skews toward communications. The pins that connect chips to boards become bottlenecks. With all the processors attempting to access memory at once, the gridlock becomes intractable. The problem has an acronym – NUMA, for nonuniform memory access – and it has never been solved.

Google apparently has responded by replicating everything everywhere. The system is intensively redundant; if one server fails, the other half million don’t know or care. But this creates new challenges. The software must break up every problem into ever more parallel processes. In the end, each ingenious solution becomes the new problem of a specialized, even sclerotic, device. The petascale machine faces the peril of becoming a kludge.

Could that happen to Google and its followers?

Google’s magical ability to distribute a search query among untold numbers of processors and integrate the results for delivery to a specific user demands the utmost central control. This triumph of centralization is a strange, belated vindication of Grosch’s law, the claim by IBM’s Herbert Grosch in 1953 that computer power rises by the square of the price. That is, the more costly the computer, the better its price-performance ratio. Low-cost computers could not compete. In the end, a few huge machines would serve all the world’s computing needs. Such thinking supposedly prompted Grosch’s colleague Thomas Watson to predict a total global computing market of five mainframes.
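To make the arithmetic behind Grosch’s law concrete, here is a minimal Python sketch of my own (it is not from Gilder’s article). It assumes the simplest reading of the law, performance = k × price², where the constant `k`, the function names and the prices (in arbitrary units) are all hypothetical.

```python
def grosch_performance(price, k=1.0):
    """Performance under Grosch's law: power rises as the square of the price.

    `k` is an arbitrary scaling constant (an assumption for illustration).
    """
    return k * price ** 2


def price_performance(price, k=1.0):
    """Performance per unit of price; under the law it grows linearly with price."""
    return grosch_performance(price, k) / price


# Same total budget spent two ways: one big machine vs. ten small ones.
one_big = grosch_performance(10.0)          # a single machine costing 10 units
ten_small = 10 * grosch_performance(1.0)    # ten machines costing 1 unit each

print("one big machine:   ", one_big)
print("ten small machines:", ten_small)
```

Under this reading the single costly machine delivers ten times the computing power of the ten cheap ones for the same money, which is why the law seemed to point toward a handful of huge machines.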

Google’s Urs Hölzle spoke at EclipseCon, as reported by CNET, outlining some of Google’s infrastructure. In closing he touched on the power issue:

“The physical cost of operations, excluding people, is directly proportional to power costs,” he said. “(Power) becomes a factor in running cheaper operations in a data center. It’s not just buying cheaper components but you also have to have an operating expense that makes sense.”
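As a rough illustration of that proportionality, here is a back-of-the-envelope Python sketch. Every number in it is a made-up placeholder, not a Google figure, and the function name and the `overhead` multiplier (standing in for cooling and other facility load) are my own assumptions.

```python
HOURS_PER_YEAR = 24 * 365


def annual_power_cost(servers, watts_per_server, price_per_kwh, overhead=1.0):
    """Estimate yearly electricity cost for a server fleet.

    overhead: hypothetical multiplier for cooling and other facility load
              (1.0 means the servers are the only consumers).
    """
    kwh = servers * watts_per_server * overhead * HOURS_PER_YEAR / 1000.0
    return kwh * price_per_kwh


# Hypothetical fleet: 1,000 servers drawing 200 W each.
cheap = annual_power_cost(1000, 200, price_per_kwh=0.05)
pricey = annual_power_cost(1000, 200, price_per_kwh=0.10)

print("cost at cheap rate: ", round(cheap))
print("cost at pricey rate:", round(pricey))
print("ratio:", round(pricey / cheap, 2))
```

With the fleet held fixed, the bill scales directly with the electricity rate, which is the attraction of cheap hydropower on the Columbia.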


3 thoughts on “Google data center near The Dalles on the Columbia River, Oregon”

  1. Great post. I see where Amazon is rolling out a Data Center Project in Boardman, Oregon.

    Is there potential in converting the old Goldendale Aluminium Plant on the Washington side into a Data Center? I think that there is an amazing amount of green power under construction in the area including a potential large scale solar project near the plant that could be connected to this facility as well as additional wind capacity. I suppose that Google and Amazon selected the Oregon side of the River because of the Dark Fiber. Any thoughts?

  2. Marshall,

    Thanks for your comments.

    Re Goldendale, of which I know nothing: what would be the benefits of converting it to a data center?

    Re Oregon side — dark fiber is a good guess. OTOH, could it be so employees can live where there are no state sales taxes 🙂
