Brian Puhl's Weblog

These postings are provided "AS IS" with no warranties, and confer no rights...WHEW...glad we got that over with, let's get to the good stuff now...


How Does MSIT Do...DC Placement?


In the past couple of months, I've been asked at least 3 or 4 times how MS IT determines where on their network to place domain controllers.  The questions usually come from larger, enterprise-type customers and usually sound something like this:

  1. How does MS IT determine the number of users or workstations resident on a given subnet (which lets you know how many are in the site, based on site/subnet associations)?
  2. Based on #1, how does MS IT use this info to determine how many DCs are needed to service the site?
  3. How do you determine over time that you need to add or remove DCs?

The short answers to these are:

  1. We don't
  2. See #1
  3. Based on performance

But short answers really miss the whole point: what these customers are really asking is how we do DC placement.  To start with, we review our DC placement twice yearly, and our capacity planning (performance reviews) 4 times yearly.  Based on experience, we should actually do both more often, but it's just not practical.  For example, in a single 6-month period, over a dozen sites in South America whose WAN links went to North Carolina changed to use Redmond as their hub...  (this is where you might ask why the network guys aren't talking to the AD guys...good question...)

So during our DC placement reviews, we're looking at the following network criteria:

WAN availability - Greater than 99.5% uptime between the end user and their nearest DC

Max Average Latency - Less than 500ms...this is a loose target, based on feedback from our users in the regions.  It tends to change with our environment, but the feedback we get is that if a user enters their username/password, goes to get coffee, comes back and it's still on "Apply Policies"...they'll call help desk and let us know.

Max 95th-percentile utilization - Less than 90% - Typically this isn't an issue, although some of the sites in Africa and the Caribbean come close...
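To make the three network criteria concrete, here's a minimal sketch of how they could be checked for a single site.  This is only an illustration, not our actual tooling: the `WanLink` shape, the constant names, and the `network_failures` function are all hypothetical, and it assumes you already have the measurements from your network monitoring.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the criteria above; the constant names are illustrative.
MIN_AVAILABILITY_PCT = 99.5      # uptime between the end user and their nearest DC
MAX_AVG_LATENCY_MS = 500.0       # loose target, based on user feedback
MAX_P95_UTILIZATION_PCT = 90.0   # 95th-percentile link utilization


@dataclass
class WanLink:
    """Measurements for the link between a site and its nearest DC (hypothetical shape)."""
    site: str
    availability_pct: float
    avg_latency_ms: float
    p95_utilization_pct: float


def network_failures(link: WanLink) -> List[str]:
    """Return the criteria this link fails; an empty list means it meets all three."""
    failures = []
    if link.availability_pct <= MIN_AVAILABILITY_PCT:
        failures.append("availability")
    if link.avg_latency_ms >= MAX_AVG_LATENCY_MS:
        failures.append("latency")
    if link.p95_utilization_pct >= MAX_P95_UTILIZATION_PCT:
        failures.append("utilization")
    return failures
```

A site whose link comes back with an empty list is fine from a network standpoint; anything else gets a closer look during the placement review.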

With the network topology understood, we then consider some other factors:

Site Classification - This is one of our general categories that tells us whether the building has a secure physical location for a DC, and what the primary type of user is in the site (e.g. Sales, PSS, Dev, etc...)

Critical Applications - There was a time when this category was called "Exchange".  However, our Exchange team has done a massive amount of consolidation, and we've uncovered some other applications which are business critical.

With all of this information in hand, we throw it together in a pot, sprinkle a little sweat on the keyboard and a dash of Excel, and come up with any places where we either need to add or remove DCs to support our users.

This is normally the part where people say..."Yeah, but what about the number of users in a site?  Doesn't that matter?"  Not for us.  We don't actually count the number of users in a site when determining DC placement.  There are two great examples.  One is a call center in Texas that has several thousand users, redundant links, high bandwidth, low latency, etc...  No local DC.  By contrast though, Microsoft Game Studios has a development center in the same area with 50 developers who are using applications that have extremely sensitive authentication requirements, and their business has incredibly aggressive deadlines such that they couldn't tolerate ANY possible network outage impact.  They get a DC.  Does the number of users matter to us?  No, but how the users leverage the DCs does matter when it comes to determining where to deploy the servers.
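If it helps to see the two examples side by side, here's a rough sketch of the decision.  To be clear, this is not the real process (that's the pot, the sweat and the Excel); the `Site` fields, the `needs_local_dc` rule, and the idea that a site without a secure location never gets a DC are all assumptions made for illustration.  The user counts are placeholders too: 3000 stands in for "several thousand".

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical summary of a site for a DC placement review."""
    name: str
    user_count: int          # recorded, but deliberately NOT used in the decision
    network_ok: bool         # WAN meets the availability/latency/utilization targets
    physically_secure: bool  # site classification: is there a secure spot for a DC?
    has_critical_apps: bool  # apps that can't tolerate any network outage impact


def needs_local_dc(site: Site) -> bool:
    """Illustrative rule only: decide from how users depend on the DC, not how many there are."""
    if not site.physically_secure:
        # Assumption: no secure location means no local DC, whatever the need.
        return False
    return site.has_critical_apps or not site.network_ok


# The two examples from the text; network_ok for the studio is an assumption.
call_center = Site("Texas call center", 3000, True, True, False)
game_studio = Site("Game Studios dev center", 50, True, True, True)

for s in (call_center, game_studio):
    print(s.name, "-> local DC" if needs_local_dc(s) else "-> no local DC")
```

Notice that `user_count` never appears in `needs_local_dc`.  That's the whole point: the call center with thousands of users comes out with no local DC, and the 50 developers get one.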

Comments
  • Very interesting post!

    Regarding the "acceptable" latency in startup/logon times
    Do you only use helpdesk calls as a benchmarking reference? I expected that you would use a script to gather startup or logon information and draw conclusions about GPO application behavior. BTW, what is the average number of GPOs applied? Is the AD team responsible for that, or is it a cross-team task to evaluate, design and implement GPOs?

    Regarding DNS Service
    I suppose that MS IT uses AD-integrated zone(s). How do you take that into account when it comes to DC placement and DNS service availability?