Gauging Size Differences in AD Databases

We occasionally receive support calls that revolve around the question “Why is the Active Directory database on DC A a different size than the one on DC B?” It’s easy to dismiss the question out of hand, but there are real-life scenarios where it matters, and real-life uses of AD that can bring you to the point of asking it.

In previous blog posts, for different reasons, I’ve occasionally touched upon the fact that AD is stored essentially in one monolithic file, NTDS.DIT. Each domain controller in a domain and forest holds its own copy of that database, which is continually updated with changes from its peers. That updating process is termed AD replication.

What causes occasional consternation is noticing that the Active Directory database files differ in size from domain controller to domain controller, or that the database is suddenly and unexpectedly growing quickly on all DCs. In itself, a difference in database size is not a bad thing; the files will never be precisely the same size on disk. There is actually a term for excessive database growth, which can be applied to extreme cases where one or more DCs are growing in database size much more quickly than others: DIT bloat. For those times when there is a larger discrepancy, this blog post will give you a few techniques you can apply to get more information about what is happening and why.

Since most data in AD needs to be kept consistent from DC to DC, we’ll start with AD replication. Naturally repadmin.exe is our weapon of choice here, as it is the Swiss Army knife of AD replication. For sizing questions, repadmin.exe is primarily useful for verifying that the different replicas are actually in sync.

Consider a forest in which a few global catalogs are noticed to have AD databases that are substantially (say fifty percent) smaller than their peers. That is certainly worth looking into. The discrepancy can occur when those GCs have been out of sync for an extended period, or following a migration or provisioning. In other words, the databases aren’t the same size either because some have not received all of the updates that would make them larger, or because some hold updates that should have been garbage collected but have not been.

As in other instances, we would simply use the command below to see whether the GCs whose databases differ in size are in sync with their peers:

Repadmin /showrepl * /csv >repl.csv
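
Once repl.csv exists, a short script can pick out the replication links that are reporting failures. The following is a minimal sketch, not a supported tool: the column names ("Number of Failures", "Destination DSA" and so on) are assumed from typical repadmin CSV output and should be checked against the header row of your own file, and the sample rows are made up for illustration.

```python
import csv
import io

# Column name assumed from typical `repadmin /showrepl * /csv` output;
# verify it against the header row of your own repl.csv.
FAILURES_COL = "Number of Failures"


def failing_links(csv_text):
    """Return the CSV rows whose failure count is greater than zero."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad = []
    for row in reader:
        count = (row.get(FAILURES_COL) or "").strip()
        if count.isdigit() and int(count) > 0:
            bad.append(row)
    return bad


# Made-up sample data, for illustration only.
SAMPLE = """showrepl_COLUMNS,Destination DSA Site,Destination DSA,Naming Context,Source DSA Site,Source DSA,Transport Type,Number of Failures,Last Failure Time,Last Success Time,Last Failure Status
showrepl_INFO,SiteA,DC1,"DC=example,DC=com",SiteA,DC2,RPC,0,0,2024-01-01 10:00:00,0
showrepl_INFO,SiteA,GC1,"DC=example,DC=com",SiteB,DC3,RPC,12,2024-01-01 09:00:00,2023-12-01 08:00:00,1722
"""

for row in failing_links(SAMPLE):
    print(row["Destination DSA"], "<-", row["Source DSA"],
          row["Naming Context"], "failures:", row[FAILURES_COL])
```

To run it against a real export, read the file and pass its text in, for example failing_links(open("repl.csv", newline="").read()). Any GC that shows up here with a long-standing failure count is a good candidate for the out-of-sync explanation above.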

Some data in Active Directory is not replicated, however. That may sound strange but is true nonetheless. One example is the indexes used to assist database searches. Though the data referenced in an index is replicated, the index itself is compiled on, and to a certain extent unique to, each domain controller replica. This is true even when an attribute is arbitrarily indexed by someone for their own business needs. Marking an attribute to be indexed, and how it is to be indexed, is a change to the schema object for that attribute and is replicated via AD replication of the schema naming context. The actual index itself, once built, is never replicated but is instead compiled and maintained separately on each individual DC in the forest.

This may lead to a certain amount of difference in each DC’s database size on disk. In most cases the index sizes should not be enormously different, and if they are, it could indicate that the database needs to be checked for problems.

To do that we should use NTDSUTIL’s handy Files-->Integrity command after booting to Directory Services Restore Mode. When run, this command produces a file, %systemroot%\ntds\NTDS.INTEG.RAW, which may contain interesting data about the database’s health.

In addition, it can help to use the Semantic Database Analysis “Go Fixup” command. It too puts out a little more information, which is stored in a sequentially numbered file named DSDIT.DMP.X.

There may be a few people champing at the bit, waiting for me to talk about another situation: when the database suddenly begins growing at such a rate as to raise concerns about filling up even very large hard disks. How do you find out what is taking up so much space and what is causing that behavior?

Of course you could always monitor replication, but you already knew that. A different method is to use the DSASTAT.EXE tool. DSASTAT has been around as long as AD has, but it has a limited range of uses and is not well known. The syntax you can use is fairly self-evident:

C:\>dsastat -loglevel:debug -output:both

Most of the DSASTAT output from that command will not be useful for this concern, but the final section, which appears under the header “DSA Diagnostics”, may be. Here’s a snippet from that:

-=>>|*** DSA Diagnostics ***|<<=-

Objects per server:

Obj/Svr             ADFSACCOUNT Total
builtinDomain                 1     1
classStore                    1     1
computer                      2     2
container                    82    82
dfsConfiguration              1     1
<snip>
organizationalUnit            2     2
rIDManager                    1     1
rIDSet                        1     1
rpcContainer                  1     1
samServer                     1     1
secret                        5     5
user                          8     8
---
Total:                      201   201

. . . . . . . . . . . . . .

Bytes per object:

Object               Bytes
builtinDomain          161
classStore             155
computer              1164
container            15225
<snip>
organizationalUnit     465
rIDManager             153
rIDSet                 135
rpcContainer           164
samServer              153
secret                 956
user                  2328

. . . . . . . . . . . . . .

Bytes per server:

Server               Bytes
ADFSACCOUNT          49586

Information from DSASTAT is a snapshot of the state of a domain controller at a particular time. For runaway growth issues it’s going to be more useful to take a sequence of DSASTAT snapshots over the period in which the growth occurs. Once you have them, it’s a simple matter to compare them and see which number is getting progressively bigger over time.
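
Comparing snapshots by eye gets tedious, so here is a rough sketch of how the comparison could be scripted. It assumes each snapshot has been saved as text and that the “Bytes per object” section looks like the snippet in this post (a class name followed by a byte count, ending at “Bytes per server”). The function names and the numbers in the second snapshot are hypothetical, made up purely to show growth.

```python
import re


def bytes_per_object(report_text):
    """Extract {object class: bytes} from the 'Bytes per object' section."""
    sizes = {}
    in_section = False
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith("Bytes per object"):
            in_section = True
            continue
        if in_section and line.startswith("Bytes per server"):
            break
        if in_section:
            # Data rows are "<class> <bytes>"; headers and separators do not match.
            match = re.fullmatch(r"(\S+)\s+(\d+)", line)
            if match:
                sizes[match.group(1)] = int(match.group(2))
    return sizes


def growth(older, newer):
    """Return (class, byte increase) pairs, largest increase first."""
    deltas = ((name, size - older.get(name, 0)) for name, size in newer.items())
    return sorted((d for d in deltas if d[1] > 0), key=lambda d: d[1], reverse=True)


# First snapshot uses figures from the post; the second is invented.
OLDER = """Bytes per object:
Object Bytes
computer 1164
user 2328
. . . . . . .
Bytes per server:
Server Bytes
ADFSACCOUNT 49586
"""
NEWER = OLDER.replace("user 2328", "user 9000")

for name, delta in growth(bytes_per_object(OLDER), bytes_per_object(NEWER)):
    print(name, "grew by", delta, "bytes")
```

With a series of saved snapshots, running growth() on each consecutive pair shows which object class keeps climbing.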

We have another tool in our arsenal for AD sizing concerns: ESENTUTL /MS. I believe I’ve mentioned before that NTDSUTIL.EXE is the AD-specific version of ESENTUTL.EXE, and that it’s generally a bad idea to use ESENTUTL rather than NTDSUTIL. This is an exception to that rule. The value of the ESENTUTL /MS command is that it shows you the size of the indexes, which is not something the DSASTAT command above gives. Here’s a sample from that tool:

Microsoft(R) Windows(R) Database Utilities
Version 5.2
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating FILE DUMP mode...

Database: c:\windows\ntds\ntds.dit

******************************** SPACE DUMP ***********************************

Name                     Type ObjidFDP PgnoFDP PriExt Owned Available
===============================================================================
c:\windows\ntds\ntds.di  Db          1       1  256-m  1536       210
datatable                Tbl         8      35   90-m  1126        45
<Long Values>            LV         66      86    1-m   126        41
Ancestors_index          Idx        15      42    1-m    19         0
clean_index              Idx        28      46    1-s     1         0
deltime_index            Idx        12      39    1-s     1         0
DRA_USN_CREATED_index    Idx        14      41    1-m    13         0
DRA_USN_CRITICAL_inde    Idx        30      48    1-s     1         0
DRA_USN_index            Idx        29      47    1-m    13         0
INDEX_00000003           Idx       118     693    1-m    41        10
<snip>
MSysUnicodeFixupVer1     Tbl         6      33    2-s     2         0
secondary                Idx         7      34    1-s     1         0
quota_rebuild_progress_  Tbl       125     911    2-s     2         1
quota_table              Tbl       124     909    2-s     2         1
sdproptable              Tbl        19     237    2-m     6         1
clientid_index           Idx        21     241    1-s     1         0
trim_index               Idx        20     238    1-s     1         0
sd_table                 Tbl        22     243    2-m    36        11
<Long Values>            LV        123     713    1-m    18         5
sd_hash_index            Idx        23     244    1-s     1         0
-------------------------------------------------------------------------------
                                                                  463

Operation completed successfully in 1.372 seconds.

More data than you were likely hoping for appears in the above result, but the key takeaway is to look for growing numbers in the Owned column and perhaps decreasing numbers in the Available column. Owned is the total number of pages in the database that belong to that index or table. Available is the number of pages left free for growth. As with DSASTAT, we can run the ESENTUTL /MS command repeatedly over a period of database growth to see which Owned value is increasing over time.
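
The same snapshot-and-compare idea can be sketched for the space dump. The code below assumes each saved dump has rows that end with the columns shown in this post (Type, ObjidFDP, PgnoFDP, PriExt, Owned, Available), and it keys each row on name plus ObjidFDP because names such as <Long Values> repeat. The 8 KB page size is an assumption about the AD database that you should confirm for your own environment, the function names are hypothetical, and the second dump is invented to show growth.

```python
PAGE_KB = 8  # assumed database page size in KB; confirm for your database
ROW_TYPES = ("Db", "Tbl", "Idx", "LV")


def owned_pages(dump_text):
    """Map (name, ObjidFDP) -> Owned page count for each row of a space dump."""
    pages = {}
    for line in dump_text.splitlines():
        parts = line.split()
        # Data rows end with: Type ObjidFDP PgnoFDP PriExt Owned Available
        if len(parts) < 7 or parts[-6] not in ROW_TYPES:
            continue
        if not (parts[-5].isdigit() and parts[-2].isdigit()):
            continue
        name = " ".join(parts[:-6])  # names may contain spaces
        pages[(name, int(parts[-5]))] = int(parts[-2])
    return pages


def owned_growth(older, newer):
    """Return (name, pages gained, approximate KB gained), largest first."""
    rows = []
    for key, owned in newer.items():
        delta = owned - older.get(key, 0)
        if delta > 0:
            rows.append((key[0], delta, delta * PAGE_KB))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# First dump uses rows from the post; the second is invented.
OLDER = """Name Type ObjidFDP PgnoFDP PriExt Owned Available
datatable Tbl 8 35 90-m 1126 45
<Long Values> LV 66 86 1-m 126 41
INDEX_00000003 Idx 118 693 1-m 41 10
"""
NEWER = """Name Type ObjidFDP PgnoFDP PriExt Owned Available
datatable Tbl 8 35 90-m 1500 12
<Long Values> LV 66 86 1-m 126 41
INDEX_00000003 Idx 118 693 1-m 45 6
"""

for name, gained, kb in owned_growth(owned_pages(OLDER), owned_pages(NEWER)):
    print(name, "gained", gained, "pages (about", kb, "KB)")
```

Keying on ObjidFDP rather than on name alone keeps the two <Long Values> rows in a real dump from overwriting each other.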

A veritable flood of ESE data can be found here. Be careful not to overdose on database specific information when reading that article.

In this post we’ve gone over a few different things that can easily be done to get a handle on, or better understand, your Active Directory database. It’s important to keep in mind that how you use your database is the most relevant piece of information you can apply to any database concern you see. The ‘classic’ example is placing photos of users into the user objects in AD. This is sure to increase the size of each object, and to greatly increase each replica’s NTDS.DIT.

Until the next post, take care out there, and a belated Happy Valentine’s Day!
