I've recently spent some time researching how Microsoft is adapting its Datacenters for Environmental Sustainability or how we are transforming to "Green Datacenters".
One thing I never realized was how much more $$ is spent on cooling and powering the Datacenter than on the actual hardware (servers, network devices, etc.).
In the past, the cost of physical space was a primary consideration in data center design. More recently, the cost of power and cooling has risen to prominence. Data center managers must now prioritize investment in efficient power and cooling systems to lower the total cost of ownership (TCO) of their facilities.
Microsoft responded to this shift by adopting the following top 10 best practices for energy efficiency in Microsoft data center operations:
Microsoft is planning to publish further specifics on this subject. Anyone interested in the topic in general should check out HP distinguished engineer Christian Belady's post on Datacenter power consumption and the relevance it now has for CFOs and CIOs.
You can check it out here: http://www.electronics-cooling.com/articles/2007/feb/a3/
Microsoft's approach to designing a data center is to look at the building as if it were a big computer that must run 24 hours a day, seven days a week. Computers work best when they are tailored to the specific needs of their users. The same principle applies to data center design; the most resource-efficient designs meet the requirements of the data center's users and specific site conditions.
Microsoft continually evaluates many different technologies for power distribution, cooling systems, and server rack/container systems. To optimize the data center environment, Microsoft uses tools like Computational Fluid Dynamics to test different configurations.
Caption: Illustration showing hot and cool areas in a server row. Data centers visualize this information to ascertain areas of excessive heat and overcooling. Overcooled areas are a waste of energy and areas that are too hot jeopardize equipment performance.
To create an optimal environment, designers must take into account all the costs: building, land, power equipment, cooling equipment, electricity, water, network, and staff. Microsoft uses software tools to create a heat map to help determine ideal locations for its data centers. After a location is selected, Microsoft evaluates building design and equipment to create efficient configurations with low TCO over the life of the facility. Rather than spreading ownership across multiple teams, Microsoft has created a single organization responsible for site selection, building design, and operations. This creates a single point of accountability for the data center and helps lower TCO over its life cycle.
Most data centers run for years under a partial load. It is possible to run only the part of a data center's infrastructure that is required. This is an energy-efficiency opportunity that many miss. Microsoft implements a modular design in which only parts of the data center infrastructure run when the data center is under partial load. It's not efficient to run 100% of your infrastructure when demand is less than that.
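As a rough illustration of the modular idea, the sketch below works out how many equally sized power and cooling modules need to run for a given load. The function name, the module capacity, and the load figures are all hypothetical; they are not taken from Microsoft's designs.

```python
import math


def modules_needed(load_kw: float, module_capacity_kw: float, total_modules: int) -> int:
    """Return how many infrastructure modules must run to serve the load.

    Assumes every module has the same capacity (a simplification).
    """
    if load_kw < 0 or module_capacity_kw <= 0:
        raise ValueError("load must be >= 0 and module capacity must be > 0")
    needed = math.ceil(load_kw / module_capacity_kw)
    if needed > total_modules:
        raise ValueError("load exceeds installed capacity")
    return needed


# Hypothetical facility: ten 500 kW modules, currently serving 1,800 kW.
active = modules_needed(load_kw=1800, module_capacity_kw=500, total_modules=10)
print(f"Run {active} of 10 modules")  # Run 4 of 10 modules
```

In this made-up case, six of the ten modules can stay off until demand grows.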
Another technique is to design power and cooling systems to divert power from areas where it cannot be used. Stranded power can result in millions of dollars of unused capacity. For example, you may have an area set up to receive a specific amount of power, but the equipment installed there does not use the capacity. While this power is unused, a power shortage might exist nearby. To ensure that power goes where it is needed, Microsoft develops flexible designs that allow power and cooling systems to be reconfigured and share power.
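To make stranded power concrete, here is a minimal sketch that compares provisioned capacity with peak draw per zone. The zone names and kilowatt figures are invented for the example, and treating "provisioned minus peak draw" as the stranded amount is a simplifying assumption.

```python
def stranded_power_kw(zones: dict) -> dict:
    """Map each zone to its unused (stranded) capacity in kW.

    zones maps a zone name to a (provisioned_kw, peak_draw_kw) tuple.
    """
    return {
        name: max(provisioned - peak_draw, 0)
        for name, (provisioned, peak_draw) in zones.items()
    }


# Hypothetical zones: A is over-provisioned, B is running at its limit.
zones = {"zone-a": (400, 250), "zone-b": (300, 300)}
stranded = stranded_power_kw(zones)
print(stranded)                 # {'zone-a': 150, 'zone-b': 0}
print(sum(stranded.values()))   # 150
```

Here zone-a strands 150 kW that zone-b could use if the design allowed the two to share power.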
Another method is to locate hardware where it is most efficient for power and cooling. In some situations, it's impossible to put a piece of equipment in the ideal location, but wherever possible Microsoft removes physical barriers. Microsoft business units are charged for their true operating costs, including energy consumption and cooling costs, and not by the space they occupy.
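One way to picture charging by energy rather than floor space is the sketch below. It is only an assumed model: the overhead factor (extra energy for cooling and power distribution per unit of IT energy), the electricity price, and the usage figures are hypothetical, not Microsoft's actual chargeback formula.

```python
def energy_charge(it_energy_kwh: float, overhead_factor: float, price_per_kwh: float) -> float:
    """Charge a business unit for IT energy plus its share of facility overhead.

    overhead_factor is an assumed multiplier (>= 1.0) that covers cooling
    and power-distribution energy on top of the IT energy itself.
    """
    if overhead_factor < 1.0:
        raise ValueError("overhead factor cannot be below 1.0")
    return it_energy_kwh * overhead_factor * price_per_kwh


# Hypothetical month: 10,000 kWh of IT energy, 1.5x overhead, $0.08 per kWh.
charge = energy_charge(it_energy_kwh=10_000, overhead_factor=1.5, price_per_kwh=0.08)
print(f"${charge:,.2f}")
```

Under this model, a unit that halves its servers' energy use halves its bill, even if it occupies the same number of racks.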
Caption: Schematic of power and cooling systems in a data center. Illustrates the areas where data center operators need to monitor power consumption and equipment efficiencies.
To increase efficiency, you must first develop and deploy monitoring tools to capture performance, temperature, and power throughout the data center. Measuring server temperatures in real time throughout the data center provides information about how well the cooling system is working.
Overcooling is an energy drain in many data centers. Microsoft maintains close control over inlet temperatures to eliminate wasteful cooling.
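A simple check on inlet temperatures might look like the following sketch. The 18 to 27 °C band is used here as an assumed default (it matches the range ASHRAE recommends for server inlets, but your own limits may differ), and the rack names and readings are made up.

```python
def classify_inlet(temp_c: float, low_c: float = 18.0, high_c: float = 27.0) -> str:
    """Label an inlet temperature reading against an assumed target band."""
    if temp_c < low_c:
        return "overcooled"   # cooling energy is being wasted here
    if temp_c > high_c:
        return "too hot"      # equipment performance is at risk
    return "ok"


# Hypothetical readings from three racks, in degrees Celsius.
readings = {"rack-01": 15.5, "rack-02": 22.0, "rack-03": 29.3}
for rack, temp in readings.items():
    print(rack, classify_inlet(temp))
# rack-01 overcooled
# rack-02 ok
# rack-03 too hot
```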
In addition, create a historical archive of your data, which you can mine to develop a comprehensive understanding of how to improve operations.
Caption: Graph of how many servers are in the data center and how much power is being consumed. This information helps educate the data center operator on historical capacity and corresponding power consumption, which can be leveraged in future data center planning.
One of the first steps in an energy efficiency effort is to create awareness, and that responsibility falls on a team whose job it is to create systems that monitor, report, and analyze the data center efficiencies. Microsoft has made data center metrics part of regular communication in running its Web services, and has developed internal tools to communicate information about data center operations. Microsoft's Web services decision makers now receive energy efficiency reports about data center performance in a closed feedback loop that tracks improvements and changes.
Caption: Screenshot of Microsoft data center monitoring and analysis information, which includes carbon emission footprint and PUE.
Power usage effectiveness (PUE) is a metric Microsoft has used for years to help improve the efficiency of the data center.
Caption: A description of power usage factors and the PUE calculation from The Green Grid.
Note that PUE is a dynamic number that can vary, owing to a variety of factors such as outside temperatures, equipment changes, and the load on the servers. Without monitoring and instrumentation, it is impossible to determine the cause and effect of PUE changes. PUE enables datacenter operators to quickly estimate the energy efficiency of their datacenters, compare results against other datacenters, and determine if any energy efficiency improvements need to be made.
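For reference, The Green Grid defines PUE as total facility power divided by IT equipment power, so a value of 1.0 would mean every watt goes to the IT equipment. The sketch below applies that ratio to made-up readings; the function name and the sample numbers are illustrative only.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical readings: (total facility kW, IT equipment kW) at two times of day.
samples = {"03:00": (1300.0, 1000.0), "15:00": (1500.0, 1000.0)}
for time_of_day, (total_kw, it_kw) in samples.items():
    print(time_of_day, pue(total_kw, it_kw))
# 03:00 1.3
# 15:00 1.5
```

The two samples show why a single reading is not enough: with the same IT load, the afternoon value is higher in this example because more facility power goes to cooling.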
Caption: A graph of a Microsoft data center and its PUE over a three-year period. This demonstrates PUE performance over a period of time, which enables further analysis of the energy efficiency of the data center when compared to other data centers.
These techniques can help improve temperature control and airflow distribution.
Caption: Graphic of cold and hot air flow in a data center. Visualizing the layout aids data center operators in the design of efficient cooling systems.
The mixing of hot and cold air is a sign of inefficiency and cooling costs that can be reduced. The following techniques eliminate the mixing of hot and cold air:
Caption: Example of a hot and cold aisle data center design.
One factor to consider with regard to a site location is whether you can run an economizer to cool the data center. Two types of economizers are available: water-side economizers, which use outside air to cool chilled water, and air-side economizers, which bring outside air directly into the data center. Microsoft uses both types in different data center locations.
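A first-pass way to compare candidate sites is to count the hours when outside air is cool enough for an economizer. The sketch below assumes a single temperature cutoff and ignores humidity and air quality, which a real assessment would also need; the cutoff and the sample temperatures are hypothetical.

```python
def economizer_hours(hourly_outdoor_temps_c: list, max_usable_temp_c: float = 20.0) -> int:
    """Count the hours in which outside air is at or below the assumed cutoff."""
    return sum(1 for temp in hourly_outdoor_temps_c if temp <= max_usable_temp_c)


# Hypothetical outdoor temperatures (degrees Celsius) for eight hours at one site.
temps = [12.0, 14.5, 18.0, 21.0, 24.5, 23.0, 19.5, 16.0]
usable = economizer_hours(temps)
print(f"{usable} of {len(temps)} hours usable")  # 5 of 8 hours usable
```

Run over a full year of weather data for each candidate location, a count like this gives a rough ranking of where an economizer would pay off most.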
Microsoft participates in and shares best practices with The Green Grid; Climate Savers Computing; the Environmental Protection Agency; Lawrence Berkeley National Laboratory; the American Society of Heating, Refrigerating and Air-Conditioning Engineers; and the Association for Computer Operations Management. Membership in these organizations promotes knowledge sharing within the industry and provides an information exchange on different data center strategies and best practices.