Goatee PFE

Blog of Microsoft Premier Field Engineer Ashley McGlone featuring PowerShell scripts for Active Directory.

Using PowerShell to Find Stale and Duplicate Active Directory Groups

PowerShell… “My mop!”

One of the undisputed greatest movies of all time was UHF in 1989.  Stanley Spadowski and his mop were an amazing force for good.  He gave us all an important life lesson… “Life is like a mop.  Sometimes it gets dirty, and you have to clean it out.”

I have often told customers…

“Most companies clean up stale users,
a few companies clean up stale computers,
but no one cleans up stale groups.”

Generally it is easy enough to tell if a computer or user account is stale, but how do we do that for groups?  Today’s post is going to give you two reports to analyze group staleness, population, and duplication.  (If you would like to report on the dates individual group members were added to a group, then see this previous post.)

The Problem

I was recently working with a customer that wanted to compare group memberships across the entire company. Like many companies they had merged other domains into their environment, and that came with a bunch of groups that did not share the same naming standard. Then repeat that a couple times. Eventually these groups stepped on each other and created duplication in the domain. This is a fairly common scenario, so I thought I would share a script that can help.

My goal is to find groups whose membership is close enough that one can be eliminated. In the TechNet Script Gallery there are a couple scripts to compare membership of two groups. But none of these give you a report comparing all groups in the domain. I want statistics about how closely all groups match. Since we have to get a list of all groups at the start we’ll go ahead and dump out a bonus report for staleness as well.

The Math of the Problem

I remember when I took a C++ class at the local community college over 10 years ago. The professor introduced us to a concept called Big O Notation. Essentially it means that you need to pay attention to the iteration and processing time of your functions. Things can get out of hand quickly with poor code.

In this case we need to compare the group membership of each group in the domain to the group membership of every other group in the domain, which is n * (n-1) comparisons. Ouch! That is a lot of processing. In order to reduce that crazy number of comparisons I applied four steps of elimination:

  1. We don’t need to compare the group to itself. (n * (n-1))
  2. After we compare GroupA to GroupB, then later we don’t need to compare GroupB to GroupA again.  This cuts the total number of comparisons in half.  ((n * (n-1)) / 2)
  3. We skim off a large number of comparisons by ignoring groups of 5 members or less.  Empty or sparsely populated groups are likely out of scope. (((n-x) * (n-1-x))/2)
  4. Two groups with more than a 25% difference in group membership count will likely not have enough commonality to eliminate one of the groups.  In other words a group with 1000 members is not a close enough match to a group with 750 members. (((n-x-y) * (n-1-x-y))/2)

Let’s see how this works out when we do the math.  Take a mid-to-large company with 5000 groups to compare:

  • 5000 groups total
  • Don’t compare a group to itself (5000 * 4999)
  • Don’t compare the same two groups twice (5000 * 4999) / 2
  • Less approximately 100 (random guess) groups of 5 members or less
  • Less approximately 2000 (random guess) groups where the membership counts are more than 25% different
  • Before optimization: 5000 * 4999 = 24,995,000
  • After optimization: (((5000 - 100 - 2000) * (5000 - 1 - 100 - 2000))/2) = 4,203,550
  • That is approximately an 83% savings on processing time!
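
The arithmetic above is easy to re-run with your own guesses. Here is a minimal sketch; Get-PairCount and the variable names are hypothetical and are not taken from the full script.

```powershell
# Hypothetical helper: number of unique pairs among $n groups, n * (n-1) / 2.
function Get-PairCount {
    param([long]$n)
    [long](($n * ($n - 1)) / 2)
}

$TotalGroups     = 5000
$SmallGroups     = 100    # guess: groups of 5 members or less
$MismatchedCount = 2000   # guess: groups ruled out by the 25% size rule

# Every group against every other group, in both directions.
$Before = $TotalGroups * ($TotalGroups - 1)
# Unique pairs among the groups that survive both filters.
$After = Get-PairCount ($TotalGroups - $SmallGroups - $MismatchedCount)
$SavingsPercent = [math]::Round((1 - $After / $Before) * 100, 1)

"Before: {0:N0}  After: {1:N0}  Savings: {2}%" -f $Before, $After, $SavingsPercent
```

Change the two guesses to match your own domain and you get a rough idea of how long the overnight run will be.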

Yes, 4 million comparisons is still huge, but we’re going to run this overnight anyway.

In the script for this post you can tweak the minimum number of members to ignore and the percentage difference between counts. Using these two numbers you can tune the comparisons to your own needs. This will obviously have an impact on the total number of computations.
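
To make the two tuning numbers concrete, here is a rough sketch of the pairing logic against stand-in data. The names $MinMembers and $MaxPercentDiff are hypothetical, and measuring the difference against the larger group is my assumption for this sketch; check the full script for how it actually calculates the percentage.

```powershell
# Hypothetical tuning knobs; the real script's variable names may differ.
$MinMembers     = 5    # ignore groups with this many members or fewer
$MaxPercentDiff = 25   # skip pairs whose member counts differ by more than this

# Stand-in data so the sketch runs without Active Directory.
$Groups = @(
    New-Object PSObject -Property @{ Name = 'GroupA'; MemberCount = 1000 }
    New-Object PSObject -Property @{ Name = 'GroupB'; MemberCount = 900 }
    New-Object PSObject -Property @{ Name = 'GroupC'; MemberCount = 600 }
    New-Object PSObject -Property @{ Name = 'GroupD'; MemberCount = 3 }
)

# Elimination step 3: drop empty and sparsely populated groups.
$Candidates = @($Groups | Where-Object { $_.MemberCount -gt $MinMembers })

$Pairs = @()
for ($i = 0; $i -lt $Candidates.Count; $i++) {
    # Starting at $i + 1 covers steps 1 and 2: a group never meets itself,
    # and each pair is visited only once.
    for ($j = $i + 1; $j -lt $Candidates.Count; $j++) {
        $a = $Candidates[$i].MemberCount
        $b = $Candidates[$j].MemberCount
        # Step 4 (assumption: difference measured against the larger group).
        $diff = [math]::Abs($a - $b) / [math]::Max($a, $b) * 100
        if ($diff -le $MaxPercentDiff) {
            $Pairs += "$($Candidates[$i].Name)/$($Candidates[$j].Name)"
        }
    }
}

$Pairs   # the pairs that would go on to Compare-Object
```

With this sample data only one pair survives the filters, so only one expensive membership comparison would run.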

The Solution

Scripting this solution involves two components:

  1. Get a list of all groups in the domain, including group membership and staleness-related properties. Pipe this list out a number of ways into assorted staleness reports.
    1. Empty groups
    2. Groups not modified in X days
    3. Etc.
  2. Compare each group’s membership to the membership of all other groups, looking for matches.
    1. Use Compare-Object to do the heavy lifting.

 

List All Groups

For this task we simply use Get-ADGroup.  Then we pipe it into a Select-Object that calculates some staleness vectors.  Then we can pipe all of that out to a couple different Where-Object filters to find empty and old groups.

# Pull every group along with the attributes needed for the staleness report.
$GroupList = Get-ADGroup -Filter * -Properties Name, DistinguishedName, `
        GroupCategory, GroupScope, whenCreated, whenChanged, member, `
        memberOf, sIDHistory, SamAccountName, Description |
    Select-Object Name, DistinguishedName, GroupCategory, GroupScope, `
        whenCreated, whenChanged, member, memberOf, SID, SamAccountName, `
        Description, `
        @{name='MemberCount';expression={$_.member.count}}, `
        @{name='MemberOfCount';expression={$_.memberOf.count}}, `
        @{name='SIDHistory';expression={$_.sIDHistory -join ','}}, `
        @{name='DaysSinceChange';expression=`
            {[math]::Round((New-TimeSpan $_.whenChanged).TotalDays,0)}} |
    Sort-Object Name

# Export the report columns only; the raw member and memberOf lists stay
# in $GroupList for the comparison step later.
$GroupList |
    Select-Object Name, SamAccountName, Description, DistinguishedName, `
        GroupCategory, GroupScope, whenCreated, whenChanged, DaysSinceChange, `
        MemberCount, MemberOfCount, SID, SIDHistory |
    Export-CSV .\GroupList.csv -NoTypeInformation

Notice the calculated columns for staleness filtering:

  • MemberCount – How many members are in this group?
  • MemberOfCount – This group is immediately nested in how many other groups?
  • SIDHistory – Does this group have one or more SID history entries?
  • DaysSinceChange – How many days have gone by with no changes to the group?

Now filter this for fun either in PowerShell or Excel:

$GroupList | Where-Object {$_.MemberCount -eq 0}            
$GroupList | Where-Object {$_.DaysSinceChange -gt 90}            
$GroupList | Where-Object {$_.SIDHistory}            
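
These filters can also be combined. The sketch below uses stand-in rows with the same property names as the report so that you can try it without a domain; the 365-day threshold and the $StaleDays name are arbitrary assumptions on my part.

```powershell
# Stand-in rows shaped like the entries in $GroupList.
$SampleGroups = @(
    New-Object PSObject -Property @{ Name = 'OldEmpty';  MemberCount = 0;  MemberOfCount = 0; DaysSinceChange = 800 }
    New-Object PSObject -Property @{ Name = 'NewEmpty';  MemberCount = 0;  MemberOfCount = 0; DaysSinceChange = 3 }
    New-Object PSObject -Property @{ Name = 'OldNested'; MemberCount = 0;  MemberOfCount = 2; DaysSinceChange = 900 }
    New-Object PSObject -Property @{ Name = 'Active';    MemberCount = 40; MemberOfCount = 1; DaysSinceChange = 12 }
)

$StaleDays = 365   # hypothetical threshold

# Empty, not nested in any other group, and untouched for over a year.
$StaleCandidates = @($SampleGroups | Where-Object {
    $_.MemberCount -eq 0 -and
    $_.MemberOfCount -eq 0 -and
    $_.DaysSinceChange -gt $StaleDays
})

$StaleCandidates | Select-Object Name, DaysSinceChange
```

Swap $SampleGroups for $GroupList to run the same filter against the real report.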

Compare All Group Memberships

This is really the heart of the solution.  First we do some juggling to eliminate the unnecessary comparisons described above.  Then we pair up the groups and send them into the battle arena using Compare-Object.  There they duke it out to see where they agree.  This handy cmdlet should be in your PowerShell toolbelt for any time you need to see the differences or similarities between two items.

Using the data from this report you can then go investigate groups for consolidation based on high match percentages. Groups of all types are compared against each other in order to give a complete picture of group duplication (Domain Local, Global, Universal, Security, Distribution). If desired, mismatched group category and scope can be filtered out later in Excel when viewing the CSV output.
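
If you have not used Compare-Object before, here is a minimal sketch of what -IncludeEqual and the SideIndicator property give you. The member names are made up for illustration and do not come from any real group.

```powershell
# Two made-up membership lists.
$MembersA = 'alice', 'bob', 'carol', 'dave'
$MembersB = 'bob', 'carol', 'dave', 'erin', 'frank'

# Without -IncludeEqual only the differences come back.
$Result = Compare-Object -ReferenceObject $MembersA `
    -DifferenceObject $MembersB -IncludeEqual

# '==' in both lists, '<=' only in the reference, '=>' only in the difference.
$InBoth = @($Result | Where-Object { $_.SideIndicator -eq '==' }).Count
$OnlyA  = @($Result | Where-Object { $_.SideIndicator -eq '<=' }).Count
$OnlyB  = @($Result | Where-Object { $_.SideIndicator -eq '=>' }).Count

$Result | Sort-Object SideIndicator, InputObject
```

Here three members are common to both lists, one appears only in the first, and two appear only in the second. The report only needs the count of the '==' rows.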

Here is a simplified view of the key script lines.  You can get the full script on the TechNet Script Gallery.

# For each pair of groups ($GroupA, $GroupB) that survives the filters:
$CountA = $GroupA.MemberCount
$CountB = $GroupB.MemberCount

# -IncludeEqual also returns the members common to both groups,
# which are marked with a SideIndicator of '=='.
$co = Compare-Object -IncludeEqual `
    -ReferenceObject $GroupA.Member `
    -DifferenceObject $GroupB.Member
$CountEqual = ($co | Where-Object {$_.SideIndicator -eq '=='} |
    Measure-Object).Count

# Add one row per pair to the report.
$report += New-Object -TypeName PSCustomObject -Property @{
    NameA = $GroupA.Name
    NameB = $GroupB.Name
    CountA = $CountA
    CountB = $CountB
    CountEqual = $CountEqual
    MatchPercentA = [math]::Round($CountEqual / $CountA * 100,2)
    MatchPercentB = [math]::Round($CountEqual / $CountB * 100,2)
    ScopeA = $GroupA.GroupScope
    ScopeB = $GroupB.GroupScope
    CategoryA = $GroupA.GroupCategory
    CategoryB = $GroupB.GroupCategory
    DNA = $GroupA.DistinguishedName
    DNB = $GroupB.DistinguishedName
}

# After all pairs are compared, write the report with the columns in a fixed order.
$report |
    Sort-Object CountEqual -Descending |
    Select-Object NameA, NameB, CountA, CountB, CountEqual, MatchPercentA, `
        MatchPercentB, ScopeA, ScopeB, CategoryA, CategoryB, DNA, DNB |
    Export-CSV .\GroupMembershipComparison.csv -NoTypeInformation

Disclaimers

  • You can modify this basic Get-ADGroup query according to instructions in the script comments to target the group queries more specifically.  For nuances of cross-domain group scripting see this post.
  • This script is compatible with PowerShell 2.0 as long as you have the Active Directory module installed (contained in the RSAT).  If you are running it against Windows Server 2003 domain controllers see this post for necessary prerequisites.
  • Only immediate group members are compared.  Nested group memberships are not in scope for this script.

The Results

Here is a trimmed sample of the final output:

[Image: trimmed sample of the final output]

The output lists all the compared groups twice: once as GroupA/GroupB and once as GroupB/GroupA.  This makes it easier when viewing the list of all groups in the first column. For each group we list the following:

  • CountA – How many members are in GroupA?
  • CountB – How many members are in GroupB?
  • CountEqual – How many members are common to both groups?
  • MatchPercentA – CountEqual is what percentage of CountA?
  • MatchPercentB – CountEqual is what percentage of CountB?

Additionally we list other key attributes of the group like scope, category, and distinguished name.

Using this data we can now see which group memberships match 100% or are close.  Keep in mind that a 90% match is pretty good if there are only 10 members total.  A 90% match for a group of 5000 members is not going to be close enough, though.  Also, it is entirely possible that GroupA and GroupB have different match percentages.  Pay attention to the scope and category as well; it is very likely there is overlap between distribution groups and security groups, for example.
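
A quick sketch of why the percentages need to be read alongside the raw counts. Get-MatchSummary is a hypothetical helper written for this illustration; it reuses the same rounding as the MatchPercent columns.

```powershell
# Hypothetical helper: match percentage plus the number of unmatched members.
function Get-MatchSummary {
    param([int]$Count, [int]$CountEqual)
    New-Object PSObject -Property @{
        MatchPercent = [math]::Round($CountEqual / $Count * 100, 2)
        Unmatched    = $Count - $CountEqual
    }
}

# Same percentage, very different cleanup effort.
$Small = Get-MatchSummary -Count 10 -CountEqual 9
$Large = Get-MatchSummary -Count 5000 -CountEqual 4500

# GroupA (10 members) sits entirely inside GroupB (20 members),
# so the two sides of the pair report different percentages.
$SideA = Get-MatchSummary -Count 10 -CountEqual 10
$SideB = Get-MatchSummary -Count 20 -CountEqual 10

$Small, $Large, $SideA, $SideB | Select-Object MatchPercent, Unmatched
```

Both 90% matches look the same in the percentage column, but one leaves a single member to reconcile and the other leaves 500. The last two rows show a 100% match on one side of a pair and a 50% match on the other.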

Note that this does not actually show you the matching members. This is intended as a first pass to see how much group member duplication exists in the domain. You can then use Import-CSV to read the data back in and then run Compare-Object against group pairs to get the specific details.
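
That follow-up could look something like the sketch below. Both function names and the 90% threshold are hypothetical, and the second function assumes the Active Directory module is loaded. Import-CSV returns every column as a string, so the percentages are cast back to numbers before comparing.

```powershell
# Hypothetical helper: read the report and keep pairs that match closely
# on both sides.
function Get-CloseMatch {
    param([string]$Path, [double]$MinPercent = 90)
    Import-Csv -Path $Path | Where-Object {
        [double]$_.MatchPercentA -ge $MinPercent -and
        [double]$_.MatchPercentB -ge $MinPercent
    }
}

# Hypothetical helper: list the members that differ for one report row.
# Requires the Active Directory module. Returns nothing for a 100% match.
function Show-MemberDifference {
    param($Pair)
    $MembersA = (Get-ADGroup -Identity $Pair.DNA -Properties member).member
    $MembersB = (Get-ADGroup -Identity $Pair.DNB -Properties member).member
    Compare-Object -ReferenceObject $MembersA -DifferenceObject $MembersB
}

# Example usage against the report written earlier:
# Get-CloseMatch -Path .\GroupMembershipComparison.csv |
#     ForEach-Object { Show-MemberDifference -Pair $_ }
```

In the output, '<=' marks members found only in GroupA and '=>' marks members found only in GroupB.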

Get Out The Mop

Using these reports you are now armed with data to start mopping up stale and duplicated groups.  Group cleanup usually involves significant effort.  My next post will show you how to find where these groups are used on your file servers.  Happy Scripting!

Download the full script at the TechNet Script Gallery.

Can you help me?  Yes!

If you would like to have me or another Microsoft PFE visit your company and assist with the ideas presented in this blog post, then contact your Microsoft Premier Technical Account Manager (TAM) for booking information.

For more information about becoming a Microsoft Premier customer email PremSale@microsoft.com.  Tell them GoateePFE sent you.

Comments
  • Great article. I especially like the part where you describe the thinking process and your approach.

  • Thanks

  • Outstanding Ashley! Never looked at it like that before. I agree with Bjorn.

  • cheers for that Ashley. 10/10 for the powershell stuff, but minus several million for the UHF reference. I'll not be able to settle now until i find a dvd/download copy (or get my old vhs working) and show it to the kids. SUPPLIES!

  • 30 hours later script still running.

  • Let me clarify, I have 10,000+ distribution lists, many with several hundred recipients. I have 100,000 mailboxes that could be mixed up into each of these dls. So, 30 hours to consider that many combinations is understandable.
