Hi Everybody!
I just wanted to write a quick note to let you all know that the RODC compatibility pack for Windows Server 2003 and Windows XP clients is available for download.
Description of the Windows Server 2008 read-only domain controller compatibility pack for Windows Server 2003 clients and for Windows XP clients
http://support.microsoft.com/kb/944043
Not familiar with RODC?
Check out these links:
Read-only Active Directory Features
Step-by-Step Guide for Read-only Domain Controllers
Thanks!
Justin
Hi Everybody!
This one comes at the request of several customers. Many of you out there are trying to determine which version of Server 2008 you will deploy. For most it comes down to deciding between two of the five major versions: Server 2008 Standard edition and Server 2008 Enterprise edition. Given the number of features included in the OS and all of the different versions we shipped, trying to determine which version includes which feature can be confusing.
Fortunately we have released a very detailed treatise on this very subject in the form of a 247-page document appropriately titled Windows Server 2008 Reviewers Guide.
Inside you will find a pretty thorough support matrix and technical nuggets like:
- In Standard Edition you are limited to one standalone DFS Namespace. (DFS Root) This limit does not apply to domain-based DFS implementations.
- Cross-File Replication for DFS-R is not available in the Standard or Web editions.
- Server Core is available in all editions except for Itanium.
- Hyper-V is included in Enterprise, Datacenter and Standard editions as long as you don't buy the version that says "without Hyper-V".
- TS Licensing in Windows Server 2008 now allows you to track per-user CALs
- Still no support for Cluster (failover) in Standard Edition, but you can now have 16 nodes with the Enterprise and Datacenter editions (8 with Itanium)
Here are some screen snags taken right from the guide:
Enjoy!
It can be a little tedious to verify replication status in a large Active Directory environment via the Sites and Services snap-in. Here is a command I use quite frequently to check the replication status of all domain controllers:
REPADMIN /SHOWREPL * /CSV >showrepl.csv
View the file in Microsoft Excel and perform the following filtering options to get a good quick overview of replication health:
1. Hide columns A and B
2. Select the row just under the column headers and choose Window / Freeze Panes (In Excel 2007: View tab, Window, Freeze Panes, Freeze Top Row)
3. Highlight the entire spreadsheet and choose Data / Filter / AutoFilter
4. Click on the down-arrow for the "Last Failure Status" column, and choose "does not equal" then type in "0" (In Excel 2007: Uncheck the box next to "0")
You are left with a list of domain controllers having replication problems. To identify a replication error, run the following from a cmd prompt:
net helpmsg ErrorCodeNumber
(e.g. net helpmsg 1396)
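If you would rather not open Excel, the same "Last Failure Status does not equal 0" filter can be sketched in a few lines of Python. Treat this as a rough sketch only: the function name and the sample rows are made up for illustration, and apart from "Last Failure Status" (the column named in step 4) the column headers shown are assumptions, so check them against your own showrepl.csv.

```python
import csv
import io

# Illustrative sample only. "Last Failure Status" is the column named in the
# steps above; the other headers and all of the values are assumptions.
SAMPLE = """Destination DSA,Source DSA,Last Failure Status
DC1,DC2,1396
DC1,DC3,0
"""


def failing_rows(csv_text, status_column="Last Failure Status"):
    """Return the rows whose status column is anything other than 0."""
    failures = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        status = row.get(status_column)
        if status is None:
            continue  # the row has no value in that column, so skip it
        if status.strip() != "0":
            failures.append(row)
    return failures


if __name__ == "__main__":
    for row in failing_rows(SAMPLE):
        # The status printed here is the number you would pass to "net helpmsg".
        print(row["Source DSA"], "->", row["Destination DSA"],
              "status", row["Last Failure Status"])
```

To run it against real output, read the file that repadmin produced and pass its text to failing_rows in place of SAMPLE.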

Hi Everybody!
This is just an update to let you all know about two software releases:
We just released Virtual Server R2 SP1 on Monday.
You can download it for free from here.
Some of the updates:
- Support for hardware-assisted virtualization
- Supports up to 256 GB of physical memory, and up to 512 virtual machines
- Quick Migration
Today we released the Release Candidate of Windows Home Server.
From the Home Server site:
"With Windows Home Server, you can store your music, photos, and other files on a central hub-like hard drive, accessible from every PC in your house. Protect your files and your PCs with automatic backup and a simple restore process—even gain access to files on your PCs from anywhere with an Internet connection through secure Web access."
You can sign up to get it here. (also free, but not RTM code)
Thanks!
Justin
For this tip you will need a somewhat newer version of ntfrsutl.exe.
You can grab a version out of the Service Pack 2 Support Tools download here.
Beginning with the version of ntfrsutl.exe in KB 823230 we have the ability to force FRS replication to occur across site boundaries immediately instead of waiting for the schedule to open up.
Here is the command's syntax:
ntfrsutl forcerepl [computer] /r SetName /p PartnerDnsName
= Force FRS to start a replication cycle ignoring the schedule
The PartnerDNSName is the FQDN of the server that you want to source from.
Here is an example using a DC Name of ContosoDC1 and a PartnerDNSName of ContosoDC2:
ntfrsutl forcerepl contosodc1 /r "domain system volume (sysvol share)" /p ContosoDC2.Contoso.com
Running the command initiates replication, and returns the following information:
LocalComputerName = contosodc1
ReplicaSetGuid = (null)
CxtionGuid = (null)
ReplicaSetName = domain system volume (sysvol share)
PartnerDnsName = ContosoDC2.Contoso.com
As you can see there are two additional parameters that you can specify, ReplicaSetGuid and CxtionGuid, but neither is required.
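If you need to run this against several partners, it can help to build the command line from its parts. Here is a minimal sketch in Python that only assembles and prints the command shown above; it does not run ntfrsutl. The helper name build_forcerepl_args is hypothetical, and the argument order simply follows the syntax line from the tool's usage text.

```python
import subprocess


def build_forcerepl_args(computer, set_name, partner_dns_name):
    """Assemble: ntfrsutl forcerepl [computer] /r SetName /p PartnerDnsName"""
    return ["ntfrsutl", "forcerepl", computer,
            "/r", set_name,
            "/p", partner_dns_name]


if __name__ == "__main__":
    args = build_forcerepl_args(
        "contosodc1",
        "domain system volume (sysvol share)",
        "ContosoDC2.Contoso.com",
    )
    # list2cmdline applies Windows quoting rules, so the replica set name
    # (which contains spaces) ends up wrapped in double quotes.
    print(subprocess.list2cmdline(args))
```

Keeping the arguments in a list means the replica set name is quoted for you, which is the part that is easiest to get wrong by hand.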
I've added a new tag called "Quick Tips." These are going to be smaller posts where I offer some time-saving tip, or some other similar type of goodie.
Look out for the first one soon!
Some time ago I did a webcast presentation on Active Directory User and Group Restore.
I've included the link for those of you that may have missed it.
Check out the on-demand presentation here:
http://www.msusapartnerreadiness.com/WS_abstract.asp?eid=15004864
(Unfortunately registration is required, but that takes only a few seconds)
Let me know if you would like to see more like this one.
Thanks!
I see a lot of customers unnecessarily using the boot.ini /3GB switch. Explaining when and when not to use it takes quite some time. Thanks to the Platforms Performance team, I now have a very nice quality post to point them to.
The helpdesk phone had been ringing incessantly all day. Many people throughout the AD forest were unable to log in to their respective domains. It seems that accounts throughout the forest had somehow been deleted. John, tired from having been up all night watching "White and Nerdy", was called in to help identify what was going on. Fortunately he had recently enabled auditing for account deletions due to a recent problem that he had. After some serious filtering he was able to find the following event in the Security event log:
Event Type: Success Audit
Event Source: Security
Event Category: Account Management
Event ID: 630
Date: 1/17/2007
Time: 12:30:44 AM
User: Contoso\JuniorAdmin
Computer: DisgruntledXP
Description:
User Account Deleted:
Target Account Name: JustinTurner
Target Domain: Contoso
Target AccountID: Justin Turner []DEL:3f4567f2-f90b-493e-81a3-dcfc75596cd7
Caller User Name: JuniorAdmin
Caller Domain: Contoso
This was a little unsettling to say the least. "JuniorAdmin" was the name of the account for one of his Junior Network Administrators that they had just fired for getting them into that last mess. He quickly disabled the account, and then attempted to identify what kind of mess they were in now. His heart sank into his stomach when he discovered that JuniorAdmin was a member of the Schema and Enterprise Admins security groups...
I had planned on providing an in-depth discussion about forest recovery, and then realized that there is already more than enough information on this topic. Since I have already advertised this, I will go ahead and provide what I hope will serve as a good general overview, and then point you to a few good resources for the process. There is now a Server 2003 specific forest recovery whitepaper, but the process is unchanged from Windows 2000. There are some additional Server 2003 specific goodies added, however (like repadmin /removelingeringobjects).
Before we dive right into the process I want to point out a couple of reasons for why you might have to perform an Active Directory forest recovery.
There are a few reasons that I won't mention, but the two most common I see are:
1. The security of your directory has been compromised either through virus, hacker, or disgruntled employee.
2. A change was made to the schema which needs to be undone.
This really is a big deal, and is not something you want to jump straight to without first consulting Microsoft PSS/CSS/EPS/Platforms Support. (We've had so many different names, I don't remember the current one :-) ) The team you would be dealing with for this particular issue would be Platforms Directory Services. We want to try to determine what caused the forest failure, and also to ensure that a forest recovery is the best recovery option. An entire forest recovery is obviously one of the last steps you would want to try, so it really is best to explore all other recovery options first.
The five hundred thousand foot overview of the process is:
1. Recover one dc from the forest root domain first from backup.
2. Recover one dc from each of the remaining domains from backup.
3. Restore additional DC's by promoting them via dcpromo.
What follows is a general overview of the process that is outlined in both the Windows 2000 and Server 2003 forest recovery whitepapers referenced earlier. Please reference the particular whitepaper for the specific steps.
There are three major stages of a forest recovery:
Pre-recovery, Recovery, and Post Recovery
Pre-Recovery:
1. Determine the current forest structure/topology
2. Find one trusted backup to use per domain
3. Shut down, and disconnect if possible, all DC's in the forest
Recovery:
1. Isolate the server (unplug the network cable) and perform a system state restore. Ensure you choose the Advanced option to perform a Primary restore of Sysvol, and only choose this option for the first DC in a domain.
2. Verify DC was successfully restored after rebooting
3. Configure DNS
4. Disable Global Catalog (if enabled)
5. Raise RID pool by 100,000
6. Seize FSMO roles
7. Perform metadata cleanup of all other DC's in the forest root domain (also delete DC computer objects for dc's that will not be restored from backup in this domain)
8. Reset the machine account password twice
9. Reset the krbtgt account password twice
10. Reset all trust passwords twice
11. Restore the first DC in each of the remaining domains from backup (perform Recovery steps 1-10 to recover one dc in each of the remaining domains)
As you restore each DC, you will want to point them to the recovered forest root DC for DNS.
12. Connect the restored DC's back to the network (prior to performing this step ensure that no old dc's are still online)
13. Perform a full replica set sync of AD
14. Enable forest root dc as a GC
15. Seize schema master on forest root dc (if the schema master wasn't the dc that was restored)
16. Recover additional DC's in each of the domains using dcpromo
Post-Recovery:
1. Revert forest back to original DNS configuration
2. Redistribute FSMO roles
3. Enable additional Global catalog servers
4. Get a good system state backup from at least two dc's in each domain
As you can see, this is a very lengthy process. The whitepaper walks you through each step in detail. There is a good index in the paper that has step by step instructions for every single process as well.
Finally I just want to expand on a couple of the items listed above.
Some considerations to take when identifying which DC's to restore:
You will only be restoring one DC per domain. The recovery process will go much quicker if the restored DC was a DNS server, and was not a GC at the time the backup was taken. For some of you this may be an easy choice as you may only be able to find one good backup. I find that when it comes to these situations, many have trouble locating a decent system state backup. (But maybe my view is skewed because the customers that have tested their disaster recovery plan don't call us?) Additionally, the process will go quicker if the DC that you restore in the forest root domain was the Domain Naming and/or Schema master. Selecting one that was a RID master will also help. If you are unable to locate a backup from one of these FSMO masters then you will just need to seize the role after the server is restored. To help you out with this there is a cool repadmin command that shows you the last time a DC's system state was backed up: repadmin /showbackup DCName
Don't try to shortcut this process by leaving out steps:
For example: when it says to shut down and/or disconnect each DC, do exactly that. We want to ensure that a restored DC does not replicate in bad data from a DC that we forgot to (or couldn't) shut down. So at the very least ensure that the servers you are restoring are disconnected from the network. Also ensure that you reset each of the passwords listed twice. Be very thorough with your metadata cleanup stage. Otherwise you will have a not so fun time troubleshooting why your DC's aren't replicating.
There is a typo several times in both whitepapers that greatly changes the meaning of the step:
"Delete server objects and computer objects for all domain controllers in the forest root domain that you are restoring from backup..."
This should read "...that you aren't restoring from backup." I will attempt to get this changed in the whitepapers.
Repadmin is your friend:
There are a few steps where you will use various repadmin commands. Learning repadmin syntax ahead of time will aid in the process. It is also very useful for performing day-to-day AD operations.
Some options that you will need to use:
/showbackup
/syncall
/showreps
/options
You may also end up having to use /add, /sync, and /removelingeringobjects as well. However, if you follow the step where it says not to restore a DC that was a GC (or just uncheck that after the restore) then you shouldn't have to worry about lingering objects.
Well that's all I have to say about that. :-) I'll add more later if I think of something else that I left out.
Post any comments or questions you have about this or any other topic that I have blogged about.
Up next: Cluster service failure troubleshooting
Thanks for reading!
Justin
Check out this cool poster size image now available for download from our website.
Link courtesy of Michael Kleef.
The "Active Directory Component Jigsaw" picture is a pretty cool overview of AD that was originally only available to TechNet Magazine subscribers. Full picture size is over 8 MB.
Enjoy!
Hi Everybody!
I just wanted to drop a quick note wishing you all a happy new year, and to let you know that I haven't given this up yet. :-)
I just got back into the office from vacation, and plan to have a technical post up here by the end of the week.
I didn't really make any resolutions this year other than to post a photo a week to my photography blog. I started something like this last year---posting a picture a day. (lasted till end of Feb)
Up next: Active Directory Forest Recovery!
****EDIT Hey Guys, I goofed on this post: This post discusses a utility used during the course of a Microsoft support call. It is not available to send to customers, and is not available for download as I had originally thought.
The version posted on the download site does not contain the same functionality referenced here. If you email me through the blog I will do my best to help out. Due to my tremendous workload my response may be delayed. If this is an urgent matter then you may want to consider opening up a paid incident with Microsoft Support: http://support.microsoft.com/ ****
This is part 2 of my earlier post on the whole "missing or corrupt system hive" issue. Okay, so we have a copy of the bloated/corrupt registry hive. Now what do we do with it? Chkreg.exe is your friend. Chkreg is a command line utility that you can use to repair a corrupt registry hive. You can also use it to just display registry key size. The majority of the issues that I see are not due to a corrupt system hive, so I use chkreg to help me identify what is taking up all of the hive size.
The ability to view registry key size wasn't added until a later version of chkreg than what is available at Microsoft.com.
The main version that you will find is actually used along with the XP Setup disks. In that version it is placed on disk 6, and after you boot to the recovery console it automatically attempts to repair the system hive. This version does not let you run it from the GUI. You will get this message if you try: "chkreg.exe application cannot be run in Win32 mode."
I thought the newer version was available on our site. Unfortunately it looks like you have to call us in order to get this special version of chkreg. With this version of chkreg you get the /S, /O, and /D options.
/S Displays space usage for the bin. When bin is not specified, displays usage for the entire hive.
/O Ordered by size
/D Dump subkeys
I typically put the bloated hive in a folder such as c:\temp, and so my command would be:
chkreg.exe /F c:\temp\system /S /O /D >regbloat.txt
This will output the keys, listed largest to smallest, to a file called regbloat.txt.
Here is an example of two such bloated keys from the txt file:
Size Subkeys
552027 ControlSet002\Control\DeviceClasses\{28d78fad-5a12-11d1-ae5b-0000f803a8c2}\##?#Root#RDPDR#0000#{28d78fad-5a12-11d1-ae5b-0000f803a8c2}
547031 ControlSet001\Control\DeviceClasses\{28d78fad-5a12-11d1-ae5b-0000f803a8c2}\##?#Root#RDPDR#0000#{28d78fad-5a12-11d1-ae5b-0000f803a8c2}
In this example, the same key in both ControlSet keys is causing the registry size problem. This is a known issue that occurs when you have the Spooler service disabled on a Terminal Server.
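When the text file is long, spotting a key that is bloated in more than one control set is easier with a small script. What follows is only a sketch in Python, assuming every data line looks like the two lines above (a size, whitespace, then the key path); the function names are hypothetical and your chkreg output may differ.

```python
import re
from collections import defaultdict

# The two sample lines quoted above, including the header row.
SAMPLE = r"""Size Subkeys
552027 ControlSet002\Control\DeviceClasses\{28d78fad-5a12-11d1-ae5b-0000f803a8c2}\##?#Root#RDPDR#0000#{28d78fad-5a12-11d1-ae5b-0000f803a8c2}
547031 ControlSet001\Control\DeviceClasses\{28d78fad-5a12-11d1-ae5b-0000f803a8c2}\##?#Root#RDPDR#0000#{28d78fad-5a12-11d1-ae5b-0000f803a8c2}
"""

SIZE_LINE = re.compile(r"^\s*(\d+)\s+(\S.*)$")
CONTROL_SET = re.compile(r"^(ControlSet\d+)\\(.*)$")


def parse_sizes(text):
    """Return (size, key path) pairs, largest first. Other lines are ignored."""
    entries = []
    for line in text.splitlines():
        match = SIZE_LINE.match(line)
        if match:
            entries.append((int(match.group(1)), match.group(2).strip()))
    return sorted(entries, reverse=True)


def group_across_control_sets(entries):
    """Group keys by their path below ControlSetNNN."""
    grouped = defaultdict(list)
    for size, path in entries:
        match = CONTROL_SET.match(path)
        if match:
            grouped[match.group(2)].append((match.group(1), size))
    return dict(grouped)


if __name__ == "__main__":
    for key, hits in group_across_control_sets(parse_sizes(SAMPLE)).items():
        if len(hits) > 1:
            print("Bloated in several control sets:", key, hits)
```

A key that shows up under more than one control set, as in the example above, is a good first candidate to investigate.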
I remove the bloated keys, and then run chkreg again, but this time with the /C switch to compress the hive. The last step is to swap the hive back out via recovery console in order to boot off of it.
There is a utility that you can use to correct the problem called scrubber.exe, but it only corrects the issue if it is due to the issue mentioned here: KB 277222
Tune in next time when I will discuss: Active Directory Forest recovery or something else equally exciting. :)
Thanks for viewing!
Justin
Just a quick note to say that they did update KB 269229 with my comment about requiring the SERVICE account to be included in the "Impersonate client after authentication" user right. (reference this post for background info)
From the article:
"Note If you create a Group Policy setting to update the Impersonate a client after authentication rights policy setting, make sure that the Cluster service account is listed in the policy setting in addition to the Local Administrators group and the account that is called SERVICE."
It is still easy to overlook this in the article, so I don't anticipate an end to these issues. If any of you find this requirement missing from other MSFT documentation then please comment on the article, or post a comment here and I will get it corrected.
Thanks,
Justin
Part 1 of 2
The on-call pager went off at two in the morning. John rushed in to discover that one of their main Windows 2000 file and print servers was sitting at a black screen with the following error displayed:
Windows 2000 could not start because the following file is missing or corrupt:
\WINNT\SYSTEM32\CONFIG\SYSTEMced

Well was it missing or was it corrupt!? He booted the server into Recovery Console, and went to the location mentioned in the error. It appeared to be missing---he couldn't find a file called "Systemced" anywhere...
There really isn't a file called SYSTEMced. The error message has just overwritten the message that normally appears there during system boot: "For troubleshooting and advanced startup options for Windows 2000, press F8."
Here is the message that is normally displayed at this point in the boot process: (notice that the ced from SYSTEMced is actually the last part of the word "advanced")
The method of recovery for this issue is actually documented fairly well here: KB 269075, here: KB 323148, and here: KB 302829. There are several other articles that describe various methods of correcting the problem, but these cover the basic steps required.
Usually when I see the issue it is because the system hive is too large to load into memory. In Windows 2000 (and NT 4) we are limited to 16 MB of memory at boot time. You will likely first run into this problem when the system hive reaches just a little over 10 MB. Thankfully this memory limitation has been greatly increased in Server 2003.
Essentially what you do is boot using an alternate system hive, and then either restore the hive from backup, (in the case of a corrupted hive) or clean up space in the system hive if the boot failure is caused by the system hive being too large.
This is a very common problem. I've seen it three times in the past week. (and countless times in the last few years) Some customers have this problem so often that they have a process in place to check the size of the system hive before they reboot a server. (you know who you are ;-) ) Hopefully with this and the next post, I can convince some of you to correct the problem that causes the bloated hive in the first place so that you never have to see this error on reboot.
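A pre-reboot check like the one those customers use can be sketched in a few lines of Python. This is only an illustration: the 10 MB warning threshold comes from the "just a little over 10 MB" figure above, while the function names and the default hive path are assumptions that you should adjust for your own servers.

```python
import os

# Assumption: warn at 10 MB, roughly where the post says trouble begins.
WARN_BYTES = 10 * 1024 * 1024


def hive_too_large(size_bytes, warn_bytes=WARN_BYTES):
    """True when the hive is at or above the warning threshold."""
    return size_bytes >= warn_bytes


def check_hive(path, warn_bytes=WARN_BYTES):
    """Return (size in bytes, whether that size is risky) for a hive file."""
    size = os.path.getsize(path)
    return size, hive_too_large(size, warn_bytes)


if __name__ == "__main__":
    # Hypothetical location for a Windows 2000 install under C:\WINNT.
    hive = r"C:\WINNT\system32\config\system"
    if os.path.exists(hive):
        size, risky = check_hive(hive)
        print(hive, size, "bytes", "- do NOT reboot yet" if risky else "- OK")
    else:
        print("Hive not found at", hive)
```

Running something like this before every planned reboot gives you a chance to clean up the hive while the server is still up.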
In the next post I will go over the chkreg.exe utility that I use to correct this problem, and ways to prevent it from happening in the future.
Stay tuned...
Justin
Users were unable to connect to their shares. John discovered that the Cluster service wasn't started, and that any attempts to start it resulted in an error 1068. He attempted to ping the virtual server's IP address and it returned a "request timed out" message. He got the same error when trying to ping the cluster node's public adapter.
When he got to the node he found the Cluster service in a Starting state. He soon discovered that he had no network connectivity to or from either Cluster node, and that their network cards were missing from "Network Connections". The only changes made to the network were just a few minor group policy settings to lock down permissions a bit. Maybe that had something to do with this? It looked like it was going to be a long night...
This is another fairly common problem. This is not really just a Cluster problem, but that is usually how it is presented to me. Of course if networking is not functional, then Cluster isn't going to work either. :) I have worked at least three of these issues in the last two months, and thought it warranted discussion since there isn't a public KB article on this particular scenario yet. I hope to fully document every error encountered here, so that others may find this post when they run into this situation. (KB articles sometimes take a while to get published)
System event log:
SAM event ID: 12291 "SAM failed to start the TCP/IP or SPX/IPX listening thread"
IPSec event ID: 4292 "The IPSec driver has entered Block mode."
DfsSvc event ID: 14523 "DFS could not contact any DC for Domain DFS operations."
Application event log:
EventSystem event ID: 4609 "The COM+ Event System detected a bad return code during its internal processing. HRESULT was 80004015 from line 142 of d:\nt\com\complus\src\events\tier2\service.cpp."
Other problems discovered with this node:
The COM+ Event System, Network Connections and Shell Hardware Detection services were in a Starting state.
The following services failed to start:
Cluster Service: Error 1068: The dependency service or group failed to start.
File Replication: Error 1068: The dependency service or group failed to start.
---dependencies opens up a window titled "Service Dependencies" and the message is: Win32: Access is denied.
IPSEC Services: Error 1899: The endpoint mapper database entry could not be created.
System Event Notification: Error 1068: The dependency service or group failed to start.
--trying to view the dependencies on the server returns the following message: Win32: Access is denied
Task Scheduler: "The endpoint mapper database could not be loaded"
We have three services failing with "the dependency service or group failed to start."
When we try to view the dependencies we get an access denied message.
Let's look in the registry to see what each of these services depend on:
Cluster service:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ClusSvc
DependOnService:
ClusNet
RpcSs
W32Time
NetMan
File Replication:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs
DependOnService:
EventLog
RpcSs
EventSystem
System Event Notification:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\SENS
DependOnService:
EventSystem
So the common dependencies are RpcSs and EventSystem.
RpcSs is the Remote Procedure Call (RPC) service, and EventSystem is the COM+ Event System service. We know from earlier that COM+ Event System is one of the services stuck in a Starting state, so that is why the File Replication and System Event Notification services haven't started. One of the other dependencies for the Cluster service is NetMan, which is the Network Connections service. Network Connections is also one of the services stuck in a Starting state.
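The reasoning above can be written down as a tiny Python sketch. The DependOnService lists are copied from the registry values shown earlier; the function names are hypothetical, and "shared" here just means a dependency that appears under more than one of the failing services.

```python
from collections import Counter

# DependOnService values for the three services that fail with error 1068.
DEPENDS_ON = {
    "ClusSvc": ["ClusNet", "RpcSs", "W32Time", "NetMan"],
    "NtFrs": ["EventLog", "RpcSs", "EventSystem"],
    "SENS": ["EventSystem"],
}

# Services seen stuck in a Starting state that appear in the lists above.
STUCK = {"EventSystem", "NetMan"}


def shared_dependencies(depends_on):
    """Dependencies listed by more than one service, sorted by name."""
    counts = Counter(dep for deps in depends_on.values() for dep in set(deps))
    return sorted(dep for dep, count in counts.items() if count > 1)


def blocked_by(depends_on, stuck):
    """Map each service to the stuck services it is waiting on."""
    return {svc: sorted(set(deps) & stuck)
            for svc, deps in depends_on.items()
            if set(deps) & stuck}


if __name__ == "__main__":
    print("Shared:", shared_dependencies(DEPENDS_ON))
    for service, waiting_on in blocked_by(DEPENDS_ON, STUCK).items():
        print(service, "is waiting on", waiting_on)
```

Every one of the three failing services turns out to be waiting on something that is itself stuck, which is why the investigation moves on to those two services next.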
So now the real question is: Why are the COM+ Event System and Network Connections services not starting?
If we view the dependencies for these two services, we just find RpcSs listed. So it all boils down to RPC. However, the Remote Procedure Call (RPC) service is actually started.
If you do a search in the knowledge base on these errors, you are likely to come across this article:
909444 Systems that have changed the default Access Control List permissions on the %windir%\registration directory may experience various problems after you install the Microsoft Security Bulletin MS05-051 for COM+ and MS DTC
This discusses changes made by a hotfix that would cause these problems. The fix is to correct NTFS permissions on the %SystemRoot%\Registration directory. However, the permissions on this server already matched the ones listed in the article.
You may also come across this one:
916254 COM+-related events may be logged in Event Viewer when you install Windows XP Service Pack 2 and join the computer to a domain
Most would come across this second article and instantly dismiss it since it says "Windows XP Service Pack 2." However, we have a lot of the same symptoms, and since XP SP2 and Server 2003 SP1 include a lot of the same security changes it warrants further investigation.
One of the security changes in SP1 for Windows Server 2003 was to change the Logon Account used for RPC.
RPC used to log on as Local System and now uses an account with fewer privileges: Network Service.
The article states that this issue occurs if the SERVICE account is missing from the policy setting "Impersonate a client after authentication".
We can see if SERVICE is missing from this policy by performing the following steps:
1. Open up Local Security Policy in order to see what the effective settings are:
Start, Run, secpol.msc
2. Expand Local Policies, User Rights Assignment and then open up "Impersonate a client after authentication"
At minimum the following should be listed: Administrators and SERVICE
The problem that I have seen recently happens when someone decides to change the "Impersonate a client after authentication" user right in group policy. Typically how it goes is they decide to lock down their servers, and only give specific accounts certain privileges. However, after incorrectly removing the SERVICE account from this privilege the server loses all network connectivity. Fortunately this problem doesn't show up until after a reboot. (You have an opportunity to identify that the problem exists before causing a major outage of all servers in a large OU.)
The fix is simple for the servers that haven't been restarted:
1. Correct the policy and then force group policy to be reapplied. (gpupdate /force)
(To correct the policy: just add SERVICE and Administrators to this policy setting in addition to the other ones defined)
If you have already rebooted the servers after applying the incorrect policy settings, simply changing the policy back will not correct them, since they have already lost network access (unless the policy change was made locally to begin with). For those servers:
1. Export the following registry key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\RpcSs
2. In the services snap-in: Change Remote Procedure Call (RPC) to start up with the Local System account instead of Network Service, and then reboot
3. At this point the majority of the services should be started and we should now have network access. Ensure that the offending group policy has been corrected with the proper accounts, force group policy to apply, (gpupdate /force) and then reboot.
4. Change the logon account for Remote Procedure Call (RPC) service back to Network Service by importing the reg file that you exported in step one, and then reboot. Alternatively, navigate to the following reg key, make the change below, and then reboot:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\RpcSs
Change the ObjectName value from LocalSystem to: NT Authority\NetworkService
For more information regarding this security setting see the article on TechNet: SeImpersonatePrivilege
I have commented on KB 269229 to reflect the requirement for SERVICE to be included in this User Right.
Please let me know if you like the format of this post or if you have any questions.
Until next time.
Thanks,
Justin Turner
This posting is provided "AS IS" with no warranties, and confers no rights.