A laser beam intrusion detection system, customized electronic access cards and biometric iris scans are just some of the multilevel security measures that Google has implemented to control access to its data centers.
Other measures include dual authentication systems, vehicle-access barriers, high-resolution cameras, metal detectors, perimeter fencing and so-called circle-lock portals to prevent tailgaters from entering protected areas by following too closely behind someone with a valid access card. Access to the actual data center floor itself is often only possible through a security corridor featuring multifactor authentication systems.
Less than 1 percent of Google’s more than 60,000 employees are authorized to set foot in any of the company’s data centers. The measures are designed to ensure that only data center personnel, and no one else, have access to the facilities, Google’s Vice President of Data Center Operations Joe Kava said in a blog post that offered a rare glimpse at Google’s elaborate data center security measures.
In addition to the physical security measures, Google also employs what Kava described as a “strict end-to-end chain of custody” for data storage. From the time a hard disk goes into a machine until the time all data on it is completely expunged or the disk itself is destroyed, everything that happens to data on it is tracked and monitored, he said.
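A chain of custody like the one Kava describes amounts to an append-only audit trail of every event in a disk's life. The following sketch is purely illustrative and assumes nothing about Google's actual tooling; the record fields, serial numbers and operator IDs are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DiskEvent:
    """One entry in a hypothetical disk chain-of-custody log."""
    disk_serial: str
    action: str        # e.g. "installed", "reassigned", "wiped", "destroyed"
    operator: str
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# An append-only log: every event from installation through data
# erasure or physical destruction is recorded, never overwritten.
custody_log: list[DiskEvent] = []
custody_log.append(DiskEvent("SN-12345", "installed", "tech-007"))
custody_log.append(DiskEvent("SN-12345", "wiped", "tech-019"))
custody_log.append(DiskEvent("SN-12345", "destroyed", "tech-019"))
```

The append-only property is what makes the trail auditable: a disk whose last recorded action is neither "wiped" nor "destroyed" is still accounted for.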
To minimize vulnerabilities, Google ensures that its data center servers do not include any unnecessary features or components such as peripheral connectors, video cards and chipsets. For the same reason, all of the production servers at Google run a stripped-down and hardened version of Linux, and server resources are dynamically allocated with minimal human interaction.
Machine learning technologies are central to Google’s efforts to keep its data centers running optimally. “As you can imagine our data centers are large and complex, with electrical, mechanical and control systems all working together to deliver optimal performance,” Kava said.
Given the sheer number of interactions and potential settings for each system, it is impossible for a human to figure out how best to optimize performance. “However, it’s fairly trivial for computers to crunch through these possible scenarios and find the optimal settings,” he wrote.
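The kind of exhaustive search Kava alludes to can be illustrated with a toy example. The efficiency model below (two control knobs feeding a made-up PUE formula) is entirely hypothetical; only the search pattern, enumerating every combination of candidate settings and keeping the best, reflects the idea in the text.

```python
import itertools

def modeled_pue(chiller_setpoint_c, fan_speed_pct):
    """Hypothetical power-usage-effectiveness model: colder setpoints and
    faster fans cost energy; too warm with too little airflow is penalized."""
    cooling_power = (30 - chiller_setpoint_c) * 0.02
    fan_power = (fan_speed_pct / 100.0) ** 3 * 0.15
    overheat_penalty = 0.3 if chiller_setpoint_c > 27 and fan_speed_pct < 40 else 0.0
    return 1.1 + cooling_power + fan_power + overheat_penalty

# Crunch through every combination of candidate settings and pick the
# one that minimizes the modeled PUE.
candidates = itertools.product(range(18, 29), range(20, 101, 10))
best = min(candidates, key=lambda s: modeled_pue(*s))
```

Even this tiny two-knob model has dozens of interacting combinations; real facilities have far more systems and settings, which is why brute-force (or smarter, learned) search by machines beats hand tuning.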
Google periodically shares best practices and data on various aspects of its infrastructure so others can learn from the company’s experience in building and maintaining large, scalable Internet operations.
Earlier in March, for instance, Google publicly released details of a software-defined load balancer technology dubbed Maglev that it has been using for the past eight years to manage traffic on Google Cloud Platform. Last year, the company offered a similar peek at its Jupiter data center network fabric technology, which it has described as being capable of delivering more than 1 petabit per second of bandwidth.
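At Maglev's core is a consistent-hashing lookup table that spreads traffic evenly across backends while keeping a given connection pinned to the same backend. The sketch below follows the table-population scheme described in Google's published Maglev paper, but it is a simplified illustration: the table size, hash function and backend addresses are all stand-ins, and a production table would be much larger.

```python
import hashlib

def _h(s, seed):
    # Stand-in hash; the real system uses different hash functions.
    return int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)

def maglev_table(backends, size=13):
    """Build a Maglev-style lookup table. `size` should be prime."""
    # Each backend gets a pseudo-random permutation of table slots,
    # defined by a per-backend offset and skip.
    offsets = [_h(b, 0) % size for b in backends]
    skips = [_h(b, 1) % (size - 1) + 1 for b in backends]
    next_idx = [0] * len(backends)
    table = [None] * size
    filled = 0
    while filled < size:
        for i, b in enumerate(backends):
            # Walk backend i's permutation to its next empty slot.
            while True:
                slot = (offsets[i] + next_idx[i] * skips[i]) % size
                next_idx[i] += 1
                if table[slot] is None:
                    table[slot] = b
                    filled += 1
                    break
            if filled == size:
                break
    return table

table = maglev_table(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
# A connection's hash picks a slot, which maps to a backend; the same
# connection always lands on the same backend.
backend = table[_h("client-conn-example", 2) % len(table)]
```

Because backends take turns claiming slots, each ends up with a nearly equal share of the table, and removing one backend disturbs only the slots it owned.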
Like many of the technologies in Google’s data centers worldwide, Maglev and Jupiter were developed internally and have gone through multiple upgrades and improvements for as long as the company has been using them. Unlike many other large companies, which outsource data center management to third parties, Google says it uses only its own staff to operate and manage the company’s cloud infrastructure.