Given that the NAC (Network Access Control) market is one of the hottest segments in the industry (though I think virtualization has that distinction at the moment), it is fitting to take a look at the variety of options available from Microsoft's Network Access Protection (NAP). NAP supports a variety of what we call enforcement methods. In the NAP space, an enforcement method is simply a term that defines the way a machine connects to a network. In NAP, these are DHCP, 802.1x (wired or wireless), VPN, IPsec, or a Terminal Services Gateway.
The most common method on the list is 802.1x, for a variety of reasons. First, the industry has been selling 802.1x network authentication for the last 10 years. 1x gained tremendous popularity as wireless networking became prevalent in the late 90s and early 2000s and has proven to be a viable solution for identifying assets and users on your network. For customers that have invested in 802.1x capable switches and access points, NAP can very easily be implemented to complement what is already in place. The Network Policy Server (NPS) role in Windows Server 2008 has been dramatically improved to make 802.1x policy creation much simpler. However, what many people don't realize is that there really are 2 rather distinct ways to deploy 802.1x based NAP, and this is what we will be discussing today. These 2 methods are commonly referred to as VLANs and Port ACLs.
VLAN
Since we are talking about this in the context of NAP, this is a good time to point out that taking the VLAN approach essentially requires that you involve the folks that own your switching infrastructure in your NAP plans. Why, you ask? Because you will now be asking them to touch all the switches and APs on the network to create the VLAN structure that you will need for your NAP deployment. At a minimum, you would want to create 3 different VLANs: one for 'healthy' or compliant computers, one for 'unhealthy' or non-compliant computers, and a third for guests or unknown devices that cannot meet the port's requirement to do 802.1x authentication.
In the VLAN scenario, on your RADIUS server (i.e. our NPS server) you would create a policy that has a set of attributes with values that match the VLAN you have created on the switch. The most common attributes used are Tunnel-Private-Group-ID, Tunnel-Tag and Filter-ID. The values for these attributes usually match the VLAN name or number you created on the switch.
As an example, let's say on your switch VLAN 100 is the compliant VLAN and VLAN 200 is the non-compliant VLAN.
To make this work, when you walk through the wizard in NPS to create 802.1x policies you will create a compliant and a non-compliant policy. When prompted to insert values for these attributes, you will enter "100" for your compliant policy (i.e. Tunnel-Private-Group-ID = 100) and "200" for the non-compliant policy. Our wizard based configuration makes this very easy.
Once completed, when a machine comes onto your network and meets the criteria of one of the policies you created, the NPS server will send this tunnel information back to the switch to instruct the switch to put that machine in the proper VLAN. Pretty simple and straightforward.
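To make the mapping concrete, here is a minimal Python sketch of the decision the policy encodes. It is not how NPS is implemented; it only models which attributes go back to the switch. VLANs 100 and 200 come from the example above. The guest VLAN 300, the dictionary, and the function name are assumptions for illustration. The Tunnel-Type and Tunnel-Medium-Type values follow the usual RFC 3580 convention for dynamic VLAN assignment, which your switch may or may not require.

```python
# Hypothetical model of the VLAN policy described above (not NPS code).
VLAN_BY_STATE = {
    "compliant": "100",      # from the example in the text
    "non-compliant": "200",  # from the example in the text
    "guest": "300",          # assumed value, not from the text
}


def vlan_attributes(health_state: str) -> dict:
    """Return the RADIUS attributes sent back to the switch for a health state.

    Unknown states fall back to the guest VLAN.
    """
    vlan_id = VLAN_BY_STATE.get(health_state, VLAN_BY_STATE["guest"])
    return {
        "Tunnel-Type": "VLAN",               # value 13 per RFC 3580
        "Tunnel-Medium-Type": "802",         # value 6 per RFC 3580
        "Tunnel-Private-Group-ID": vlan_id,  # the VLAN created on the switch
    }


if __name__ == "__main__":
    for state in ("compliant", "non-compliant", "unknown-device"):
        print(state, vlan_attributes(state))
```

The point of the sketch is that the RADIUS side only names a VLAN; everything that VLAN actually permits is still defined on the switch.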
Port ACLs
There are 2 approaches here.
- You send the switch a 'reference' to an ACL you have already created on the switch
- You send the switch vendor specific attributes with values that tell the switch how to ACL the port
In scenario 1, you would do the heavy configuration on the switch by creating the ACLs you want for compliant and non-compliant machines. Most likely those ACLs would restrict protocols and ports and limit access to only certain IP addresses. For this example, let's say you have named your ACLs "compliant" and "non-compliant".
In your RADIUS server you would use something like the Filter-ID attribute (this is the most commonly supported attribute) with a string value of "compliant" or "non-compliant". When the switch receives it, the switch will know which ACL to apply to that port.
In scenario 2, instead of configuring and sending the Filter-ID attribute, you would create Vendor Specific Attributes (VSAs) (this is a common concept in the RADIUS protocol) that tell the switch explicitly what ACLs to apply to that port. For example, the HP ProCurve line of switches will accept the following Vendor Specific Attribute (VSA):
permit in udp from any to 10.10.10.2 53
This essentially says 'allow any DNS traffic on this port to IP address 10.10.10.2'. The assumption is that 10.10.10.2 is your DNS server.
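As a rough illustration of how a rule like this breaks down, the following Python sketch parses the one rule shape shown above into named fields. It is an assumption-laden toy: the AclRule class and both function names are hypothetical, and the parser only understands the exact "action direction protocol from source to destination port" layout of the example, not the full syntax any switch vendor supports.

```python
from dataclasses import dataclass
import ipaddress


@dataclass
class AclRule:
    action: str       # e.g. "permit"
    direction: str    # e.g. "in"
    protocol: str     # e.g. "udp"
    source: str       # "any" or an IP address
    destination: str  # "any" or an IP address
    port: int


def parse_rule(text: str) -> AclRule:
    """Parse '<action> <dir> <proto> from <src> to <dst> <port>'."""
    parts = text.split()
    if len(parts) != 8 or parts[3] != "from" or parts[5] != "to":
        raise ValueError(f"unrecognized rule: {text!r}")
    action, direction, protocol, _, source, _, destination, port = parts
    for address in (source, destination):
        if address != "any":
            ipaddress.ip_address(address)  # raises ValueError if malformed
    return AclRule(action, direction, protocol, source, destination, int(port))


def explain(rule: AclRule) -> str:
    """Render a rule as a sentence a helpdesk technician could read."""
    return (f"{rule.action} {rule.protocol.upper()} traffic {rule.direction}bound "
            f"from {rule.source} to {rule.destination} port {rule.port}")


if __name__ == "__main__":
    print(explain(parse_rule("permit in udp from any to 10.10.10.2 53")))
```

A helper like this can be handy when a policy carries many such lines and someone needs to read them quickly.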
Each of the 2 port ACL approaches has its own pros and cons:
- Scenario 1. Pros: simplified RADIUS server configuration, less prone to mistakes in the RADIUS server configuration. Cons: required to touch your entire switching infrastructure, ACL configuration isn't centralized
- Scenario 2. Pros: doesn't require you to touch your entire switching infrastructure, configuration can be centralized on your RADIUS servers. Cons: more complex RADIUS server configuration, prone to mistakes in ACL configuration on the RADIUS server
Comparing the 2 approaches
Now that everyone understands what is required for each approach, let's take a look at some of the pros and cons of each.
VLAN
+ The concept of VLANs is so easy to explain that even a manager can figure it out!
+ Doesn't require extensive knowledge of the RADIUS protocol to set up and anyone who's anyone at a switch CLI could get this set up pretty easily
+ Makes helpdesk troubleshooting a bit simpler by being able to quickly find out why a machine can't connect to (insert your answer here). It would go something like "Oh, you can't get to your mail because you're in VLAN 200!"
- The user experience can be very poor if the machine is being dynamically moved from VLAN to VLAN (which is essentially what NAP does). This is because when a machine changes VLANs, the interface on the machine is torn down and the machine essentially does an ipconfig /release followed by an ipconfig /renew
- If not properly designed, this can be a real helpdesk nightmare. A common mistake here is to ACL down the non-compliant VLAN so that it has no corporate access at all; that machine may need to re-authenticate itself with the network after NAP has remediated it
- Requires you to touch all of your switches and APs to do the VLAN creation and management.
- For NAP, your APs and switches will need to support dynamic VLAN assignment, and not all switches and APs do. In fact, not all firmware versions from the same manufacturer support this, so an upgrade may be required.
Port ACL
+ Can possibly be implemented without having to touch all your switches and APs, since the configuration would reside on the NPS server. This can also be seen as a political positive, since infrastructure folks and server folks are commonly separate teams with separate objectives that rarely overlap.
+ The actual enforcement of the ACL is done at the switch or AP and thus offers the user a more pleasant experience: even if the machine is moving from a compliant to a non-compliant state (or vice versa), the change is handled at the switch and not on the client machine (no ipconfig /release and /renew)
+ The attributes and values required in your NPS policy to make this scenario work are commonly supported and have been for some time, so the chance of having to do a hardware upgrade in this scenario is lower
- To make this work effectively in an enterprise, you really need to know the ins and outs of your switches and what is and is not supported, not to mention you must be a pretty good RADIUS geek as well to get this working (we are a dying breed these days…)
- Troubleshooting and helpdesk support in this scenario is a bit more complicated, since your NPS policy could have multiple ACLs in it that look like this (permit in udp from any to 10.10.10.2 53). It would not be uncommon to have 10-12 lines like this in your policy, which makes it harder to figure out why a machine can't connect to a resource on the network
- Finding accurate documentation on exactly what attributes and values are supported for your device(s) can be a challenge
In conclusion
Hopefully you now have a better understanding of what 802.1x authentication support in NAP can offer you. 1x is a very powerful means of maintaining a safe and healthy network, but it's not the ultimate solution by any means. Network security and health is an ongoing exercise that may require multiple solutions to achieve your business goals (like using 1x and IPsec together, for instance).
NAP is a very compelling feature with a lot of moving parts, so in future posts we will be visiting more topics around NAP. In the meantime, please visit the NAP development team's blog at http://blogs.technet.com/nap
One of the most compelling capabilities being added in IAG SP2 (which will also be available in UAG) is the 'virtual appliance' installation option. A virtual appliance is a preconfigured, ready to use Virtual Machine that already has Windows Server and IAG / UAG installed. Microsoft will build the Virtual Hard Drive (VHD) and make it available for customers to download. Customers will then take the VHD and drop it into a child partition on a Hyper-V host. At this point, the VM functions like a classic IAG installation, with all the normal features and capabilities customers have come to expect. The reason we've added this capability in IAG is to give customers options for how they want to deploy IAG in their networks. For many customers, the pre-tuned, dedicated hardware appliances available from our partners are a great option that fits in well with their overall management methodology. Other customers prefer a more standardized hardware platform in their datacenters, and for them the virtual appliance on Hyper-V is the better fit. Note that it's not a question of which is 'better'; the two options allow customers to choose the solution that best fits their environment.
For customers looking at deploying the virtual appliance, a common question is what is the best way to provide a secure virtualization environment for the IAG/UAG VM? There are three primary design options to choose from. Again, it's not a question of what option is best; rather, customers should look at each model and decide which best aligns with their management approach.
Option 1: Classic Physical Appliance
It may seem strange to list a physical appliance as an option here, but arguably the dedicated physical appliance is the most hardened configuration out of the box. The reason for this is that the OEM appliance vendors take Windows Server and IAG and really mold the entire hardware platform around them. In doing so, they reduce the attack surface of the machine by disabling services not critical to IAG, ensure necessary updates are installed, and then put that image on top of a hardware platform designed for them. Because IAG is built on top of Windows Server, it's possible for a customer to take many of the same software steps the OEMs do, but the benefit of the appliance is that it's all been done and tested for you. For customers looking for the most secure out of the box experience with IAG, physical appliances provide some unique benefits.
Pros: minimal configuration; pre-hardened operating system; hardware designed specifically for remote access gateway
Cons: limited hardware choice; potentially non-standard device and software configuration in an otherwise rationalized datacenter
Option 2: VM on Dedicated Hardware
While one of the key benefits of virtualization is the ability to run multiple operating systems simultaneously on the same physical hardware, it's by no means a requirement that a Hyper-V machine have more than 1 child partition. In other words, it's fully supported to run a Hyper-V system with only a single child. Why would you do this? If you want the manageability benefits of virtualization, but have workloads that can scale up and use an entire physical server, this approach is an effective model for getting the best of both worlds. Particularly when you use the Server Core option of Windows Server 2008 to run the parent partition, the overhead incurred by doing so is minimal. In fact, key Microsoft web sites like TechNet and MSDN use this exact model in their production environments. When you think about this model for hosting IAG, the benefits are that you don't have concerns about resource contention between VMs (though Hyper-V has resource management controls available) and you don't have to worry about sharing the remote access gateway physical platform with any other workloads. Because Hyper-V supports the same huge catalog of server hardware that Windows Server 2008 does, you have great flexibility in what the physical layer looks like. Whether you prefer 1U, 2U, or blades, and regardless of OEM, you'll be able to easily integrate the Hyper-V host and its IAG child partition into your existing datacenter. Finally, because you can use whatever hardware you prefer, it's easy to place the server wherever it needs to go within your network. For example, it is often easier to provision a new blade into the DMZ network to host IAG than it is to securely route traffic from the DMZ to a larger virtualization system in the internal network.
Pros: great choice in hardware; can use existing organization standards for hardware and operating system images; with Server Core, very low overhead for parent partition; great flexibility in network placement
Cons: may require greater setup effort to configure hardware and parent partition operating system
Option 3: VM on Existing Virtualization Environment
Customers that already have a Hyper-V environment may wish to simply add the IAG VM to the existing hosts. This is particularly true if a customer has already invested in building a highly reliable, well tuned hosting environment, using tools like Failover Clustering. In these cases, there's no problem with running IAG in a child partition on an existing physical server already running other VMs. So long as the traffic is properly routed to the VM, IAG can function perfectly well in such a configuration. However, when sharing physical resources with other child partitions, it's particularly important to allocate sufficient capacity to the IAG VM. This should be done both by allocating enough memory and CPU capacity to the VM and by ensuring that Hyper-V prioritizes requests through the IAG VM appropriately. Additionally, there are significant performance and security benefits to dedicating physical network adapters solely to the IAG VM, rather than sharing them with other VMs. Having dedicated NICs ensures that IAG will not need to compete for network IO and simplifies the routing of remote access traffic to and from the VM.
Pros: efficiency of reusing existing investments in Hyper-V physical platform, such as Failover Clustering
Cons: more planning required to ensure sufficient resources for IAG child partition; potentially more complex network routing needs if the existing environment does not already receive traffic from internet hosts
Virtual appliances are all about customer choice: providing you with the right options for security and placement while allowing you to choose your own hardware platform or reuse one you already have. There's no right choice that applies to all situations, so think about your environment and goals, and choose the option that fits your network best.
One of the least known yet most powerful management features to ship with Windows Vista and Windows Server 2008 is built-in Event Forwarding, which enables large scale health and state monitoring of a Windows environment (assuming health and state can be determined from Windows Events, which they usually can). Not only is this feature built into the latest versions of Windows, but it's also available for down-level OSs like Windows XP SP2+ and Windows Server 2003 SP1+.
Note: True enterprise class Windows eventing is included with enterprise monitoring solutions like System Center Operations Manager.
This new Windows Event Forwarding (also known as Windows Eventing 6.0) is exceptional for the following reasons:
- Standards Based: No really! It leverages the DMTF WS-Eventing standard which allows it to interoperate with other WS-Man implementations (see OpenWSMAN at SourceForge).
- Agentless: Event Forwarding and Event Collection are included in the OS by default
- Down-Level Support: Event Forwarding is available for Windows XP SP2+ and Windows Server 2003 SP1+
- Multi-Tier: Forwarding architecture is very scalable where a "Source Computer" may forward to a large number of collectors and collectors may forward to collectors
- Scalable: Event Collection is very scalable (available in Windows Vista as well) where the collector can maintain subscriptions with a large number of "Source Computers" as well as process a large number of events per second
- Group Policy Aware: The entire model is configurable by Group Policy
- Schematized Events: Windows Events are now schematized and rendered in XML which enables many scripting and export scenarios
- Pre-Rendering: Forwarded Windows Events can now be pre-rendered on the Source Computer negating the need for local applications to render Windows Events
- Resiliency: Designed to enable mobile scenarios where laptops may be disconnected from the collector for extended periods of time without event loss (except when logs wrap) as well as leveraging TCP for guaranteed delivery
- Security: Encryption via Kerberos or certificate based HTTPS
This walkthrough implements the following example design, in which a domain computer group is configured via Group Policy to forward Windows Events to a single collector:

Implementation steps are as follows:
- Step 1: Create Event Forwarding Subscription
- Step 2: Configure WinRM Group Policy
- Step 3: Configure Event Forward Group Policy
- Step 4: Test
Step 1: Create the Event Forwarding Subscription on the Event Collector
In the Windows Event Forwarding architecture, the subscription definition is held and maintained on the Collector in order to reduce the number of touch-points in case a subscription needs to be created or modified. Creating the subscription is accomplished through the new Event Viewer user interface by selecting the 'Create Subscription' action when the 'Subscriptions' branch is highlighted. The Subscription may also be created via the "WECUTIL" command-line utility.
Note: Both Windows Vista and Windows Server 2008 can be event collectors (this feature is not supported for down-level). Although there are no built-in limitations when Vista is a collector, Server 2008 will scale much better in high volume scenarios.

Although the above subscription is configured to leverage Group Policy, the subscription can be configured in a stand-alone mode (see the "Collector Initiated" option). In addition, this subscription is designed to gather all events from the "Application" and "System" logs that have a level of "Critical", "Error", or "Warning". This event scope can be expanded to gather all events from these logs or even add additional logs (like the "Security" log).
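For readers who prefer to see the event scope as text, the following Python sketch builds an XPath-style query of the kind Event Viewer shows on its XML tab for a filter on Critical, Error, and Warning events. Treat it as an approximation: the function name and the exact layout of the query are assumptions, and the numeric values are the standard Windows event levels (Critical = 1, Error = 2, Warning = 3).

```python
import xml.etree.ElementTree as ET

# Standard Windows event level numbers.
LEVELS = {"Critical": 1, "Error": 2, "Warning": 3, "Information": 4, "Verbose": 5}


def build_query(logs, level_names):
    """Build a QueryList that selects the given levels from each log."""
    level_expr = " or ".join(f"Level={LEVELS[name]}" for name in level_names)
    selects = "\n".join(
        f'    <Select Path="{log}">*[System[({level_expr})]]</Select>'
        for log in logs
    )
    return f'<QueryList>\n  <Query Id="0">\n{selects}\n  </Query>\n</QueryList>'


if __name__ == "__main__":
    query = build_query(["Application", "System"], ["Critical", "Error", "Warning"])
    ET.fromstring(query)  # sanity check: the result is well-formed XML
    print(query)
```

Expanding the scope, as described above, amounts to adding a log name such as "Security" to the first list or more level names to the second.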
Lastly, the subscription is configured to forward events as quickly as possible with the advanced settings delivery option of "Minimize Latency". The default setting of "Normal" would only forward events every 15 minutes (which may be more desirable depending on the Collector and Source Computer resources).

If Group Policy is not being used, configure the "Subscription type" to be "Collector Initiated". In this case Source Computers will need to be manually added to the Subscription either through the Subscription configuration or the "WECUTIL" command-line utility (which can also be scripted using PowerShell, but that's another topic).
Note: In cases where the Source Computer is generating a large volume of forwarded events (e.g. Security events from a Domain Controller), use WECUTIL on the collector to disable event rendering for the subscription. The task of pre-rendering an event on the source computer can be CPU intensive for a large number of events.
Step 2: Configure Group Policy to enable Windows Remote Management on the Source Computers (clients)
Group Policy can be used to enable and configure Windows Remote Management (WinRM or WS-Man) on the Source Computers. WinRM is required by Windows Event Forwarding as WS-Man is the protocol used by WS-Eventing. The following shows the Group Policy branch locations for configuring both WinRM and Event Forwarding:

The following GP setting will enable WinRM on the client as well as configure a Listener that will accept packets from ANY source.

Note: This Listener configuration should only be used in a trusted network environment. If the environment is not trusted (like the Internet), then configure only specific IP Addresses or ranges in the IPv4 and IPv6 filters.
To configure WinRM outside of Group Policy, run the following command on the Source Computer (also see the above Note):
winrm quickconfig
Step 3: Configure Group Policy to enable Windows Event Forwarding on the Source Computers
As with WinRM, Group Policy can be used to configure Source Computers (Clients) to forward events to a collector (or set of collectors). The policy is very simple. It merely tells the Source Computer to contact a specific FQDN (Fully Qualified Domain Name) or IP Address and request subscription specifics. All of the other subscription details are held on the Collector.

If Group Policy is not being used, then there is nothing to do here as the "Collector Initiated" Subscription will proactively reach out to the Source Computer.
Step 4: Test Event Forwarding
If all of the Event Forwarding components are functioning (and there's minimal network latency), a test event created on the Source Computer should arrive in the Collector's "Forwarded Events" log within 60 seconds. Create a test event with the following command:
eventcreate /id 999 /t error /l application /d "Test event."

This event should appear on the Collector as follows:

If the event doesn't appear, perform the following troubleshooting steps:
Troubleshooting Step 1: Has Policy Been Applied to the Source Computer?
This can be forced by running the following command on the Source Computer:
gpupdate /force
Troubleshooting Step 2: Can the Collector Reach The Source Computer via WinRM?
Run the following command on the Collector
winrm id /r:<Source Computer> /a:none
Troubleshooting Step 3: Is the Collector Using the Right Credentials?
Run the following command on the Collector
winrm id /r:<Source Computer> /u:<username> /p:<password>
Note: These are the credentials defined in the Subscription on the Collector. The account doesn't need to be in the local Administrators group on the Source Computer; it just needs to be in the "Event Log Readers" group on the Source Computer (local Administrators will also work).
Troubleshooting Step 4: Has the Source Computer Registered with the Collector?
Run the following command on the Collector
wecutil gr <subscription name>
This will list all the registered Source Computers (note if the Subscription is "Collector Initiated" then this will list all configured Source Computers), their state (from the Collector's perspective), and their last heartbeat time.
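The four troubleshooting steps can also be kept as a small checklist in a script. The Python sketch below only assembles the commands shown above as argument lists and notes where each should be run; it does not execute anything. The function name and the example computer, subscription, and user names are hypothetical, and the password is deliberately left as a placeholder.

```python
def troubleshooting_commands(source_computer, subscription, username=None):
    """Return (where_to_run, argv) pairs for the troubleshooting steps above."""
    steps = [
        # Step 1: force Group Policy to apply on the Source Computer.
        ("source computer", ["gpupdate", "/force"]),
        # Step 2: check that the Collector can reach the source via WinRM.
        ("collector", ["winrm", "id", f"/r:{source_computer}", "/a:none"]),
    ]
    if username:
        # Step 3: check the subscription credentials (password left as a placeholder).
        steps.append(("collector", ["winrm", "id", f"/r:{source_computer}",
                                    f"/u:{username}", "/p:<password>"]))
    # Step 4: list the Source Computers registered with the subscription.
    steps.append(("collector", ["wecutil", "gr", subscription]))
    return steps


if __name__ == "__main__":
    for where, argv in troubleshooting_commands(
            "client01.example.com", "Example Subscription", "EXAMPLE\\svc-events"):
        print(f"[{where}] {' '.join(argv)}")
```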
Enjoy!
My name is Allen Stewart, Team Lead of the Windows Server Customer Advisory Team in the Windows Server Division. We are a team of Program Managers who are experts in specific workloads/technologies. We work in the Windows Server Engineering organization, and our main goal is helping to ship a Windows Server that incorporates requirements and scenario based feedback directly from customers, gathered via Engineering focused customer councils and direct technical projects that my team engages in, specifically focused on early, new, complex scenarios. My team looks forward to working with the technical community and sharing best practices from our learnings across a range of topics for specific server workloads. Our blog will be the main way we communicate some of the interesting things we discover. Each month we will communicate a new topic (see John Morello's blog post on clustering CAs): http://blogs.technet.com/wincat/archive/2008/07/21/to-cluster-or-not-to-cluster-cas.aspx
I would like to introduce the members of my team and their focus areas:
Allen Stewart | System, Management & Application Virtualization
Xavier Pillons | High Performance Computing
John Morello | Windows Security, Anywhere Access
Otto Helweg | Windows Management, Datacenter Automation, Sysinternals
Pat Fetty | Network Health and Policy, Forefront Suite
Robert DeLuca | Identity
Allen Stewart - Principal Program Manager Lead
One of the many enhancements in Active Directory Certificate Services in Windows Server 2008 is support for 2 node active / passive clustering. We have a great whitepaper, Configuring and Troubleshooting Certification Authority Clustering in Windows Server 2008, which walks you through the setup process. Because we just leverage the Failover Clustering already in Windows, the supported hardware and software configurations for running a highly available CA are the same for running other applications on a cluster. Many of the customers I work with have recently asked about whether or not they should implement clustered CAs and the answer really depends on what you're trying to achieve.
The first thing to understand is that having a highly available CA does not mean the same thing as having a highly available PKI. While it performs a critical role, the CA itself is only one part of the overall PKI and it could be argued that other components, such as CRL Distribution Points, are actually more sensitive to outages. In most PKIs, end entities will only talk directly to a CA to enroll for or renew certificates. If a computer enrolls for a certificate with a 2 year validity period, that computer will talk to the CA once to get the initial certificate and then not again until 98 weeks later (assuming a 6 week re-enrollment window). During that long interval, the client doesn't know or care if the CA is online, only that it can find and download a fresh revocation list. Thus, clustering CAs solely to support continuous enrollment services in the case of an outage is often inefficient; it would likely be cheaper and simpler to have 2 separate issuing CAs instead.
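The 98 week figure is simple arithmetic, shown here as a short Python sketch. The function name is hypothetical, and a year is taken as 52 weeks, as the text implies.

```python
WEEKS_PER_YEAR = 52


def weeks_until_renewal(validity_years: int, renewal_window_weeks: int) -> int:
    """Weeks between initial enrollment and the start of the renewal window."""
    return validity_years * WEEKS_PER_YEAR - renewal_window_weeks


if __name__ == "__main__":
    # 2 year certificate, 6 week re-enrollment window: 104 - 6 = 98
    print(weeks_until_renewal(2, 6))
```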
During an outage, the most critical capability to restore is that of the Certificate Revocation List (CRL). CRLs are used to ensure that certificates used by end entities are still valid and, depending on the application, the inability to retrieve a CRL with a current validity period can cause significant problems. For example, CRL retrieval issues are by far the most common root cause of smart card logon issues. Fortunately, there is no need to rely on clustering to keep CRLs fresh during an outage. So long as you have access to the CA's private key material, you can manually sign and publish CRLs while your CA is offline and ensure service continuity to your users.
None of this is meant to dissuade customers from deploying ADCS clusters, but rather to provide some context about the right scenarios in which to use them. The two primary needs for which I recommend clusters are autonomous failover and geo-dispersal. While manual CRL signing and multiple issuing CAs can ensure that your PKI continues to work during the outage of a CA, some customers prefer failover to be an autonomous activity. In other words, rather than having to manually re-sign and republish the CRLs, they'd prefer for one CA to just take over for the other with no administrator interaction required. This is a great use case for Failover Clustering, and many customers find autonomous recovery to be worth the investment.
The other major use case is geo-dispersion of CAs to increase survivability in the case of a major disaster. Consider an organization that has multiple datacenters around the world. They may be pursuing a strategy such that one of these datacenters is able to take over for another in the case of a major disaster. Or, the organization may have a dedicated 'hot site' whose sole purpose is to take over operations in the case of the loss of the primary site. In both of these cases, CA clustering provides a great way to ensure that a failure of one site will not interrupt enrollment or CRL signing services for the clustered CA. Typically this style of clustering, known as Multi-Site Clustering, leverages partner solutions to replicate the data between sites.
In most if not all enterprise customers, each technology area is driven by a team responsible for that area. Some enterprise customers have more integrated technology teams than others. So how does this affect the common approach today of infusing virtualization into datacenters and building out a virtualization utility/service? To date, in most companies this task has been assigned to a virtualization team or a virtualization expert: folks that understand the virtualization technology well. This reflects the initially disruptive nature of virtualization technology, to which companies reacted by creating teams to work on virtualization. The virtualization projects started out small with Test/Development environments, expanded into production, and then grew into full blown virtualized datacenter projects.
Of course, the datacenter, with its various technology and process areas, existed long before the project to create a virtualization utility/service. Some will say the datacenter was lumbering along (no offense to present datacenter operations managers) with various issues and challenges and that virtualization is going to cure all the ills of the datacenter. I am a virtualization person, I love the virtualization technology, and I happen to believe that virtualization capabilities have the potential to change how we build and use capabilities in the datacenter and beyond. That said, without a reference architecture for these new datacenters where virtualization is a key pillar, we may miss some opportunities to fully realize that potential across datacenter service areas.
A virtualization datacenter project should seek to review present datacenter capabilities, determine gaps, leverage existing capabilities, and redesign those that do not fully leverage or that hinder the virtualization effort. What are some of the areas that have to be taken into account when embarking on this journey to build a virtualized datacenter? Well, let's look at some of the services that a present day datacenter offers:
- Power and Cooling
- Systems/Service Management
- Backup services
- Disaster Recovery services
- Security Services
- Capacity services
- Storage services
- Equipment Provisioning
  - Servers
  - Network devices
  - Storage
  - Backup devices
- Compliance services
So it is pretty clear that today's datacenters offer a range of complex services, and you thought all they did was keep servers from being homeless. A project to infuse virtualization as a core pillar has to seek to understand how virtualization impacts the various datacenter areas. I know what people are thinking: I already have virtual machines in production, so this is a moot point. There is some production virtualization, but very few projects have truly looked at all of the datacenter areas and designed each area to fully leverage virtualization. This is an opportunity for datacenter architects and datacenter service owners to think and rethink how virtualization can affect some of the service areas in the datacenter. This exercise will help unlock the potential that re-examining these services with virtualization capability in mind will provide. Some core areas to start with are Storage and Business Continuity, areas that receive a lot of interest, especially from the virtualization community, but have very few architecture patterns and practices. I look forward to your comments on this and your experiences looking deeper at the datacenter services that virtualization affects.
Allen Stewart
Principal Program Manager
Windows Server Group
Hopefully you get better presents than this, but if not, here are a couple of web casts I just did that you might be interested in:
Certificate Services Updates
Amesh Mansukhani, Senior Product Manager, speaks with John Morello, Program Manager, Windows Server Division, about Windows Server 2008 and what's new with Certificate Services. John lists the four major pillars of Server 2008 for cryptography and certificate services, and they continue their conversation around the two biggest features, manageability and revocation, that will be impactful for customers.
http://www.virtualteched.com/pages/videos.aspx
TechNet Webcast: Windows Server 2008 Terminal Services Security and Authentication (Level 300)
Are you looking to extend remote access to users on the road or at home? Learn how Windows Server 2008 Terminal Services can provide a solution. Join Microsoft Program Manager John Morello in this webcast as he provides an overview of Terminal Services Gateway and Terminal Services Authentication, and shows you how they can help take your business to the next level. Additionally, he demonstrates anywhere access to remote programs using single sign on (SSO).
http://msevents.microsoft.com/CUI/WebCastEventDetails.aspx?culture=en-US&EventID=1032355426&CountryCode=US
Because CRL validity is so important to the functionality of your PKI, it's important to have proactive monitoring of the validity of your CRLs. Particularly in situations where you're using smart cards for logon, having a CRL go stale can be a very disruptive experience to end users.
In my session at IT Forum in Barcelona this week, I'm showing some of the new manageability features of Certificate Services in WS08. One of the major investments for CertSrv in this release is manageability, including integration with MOM/SCOM. One of the features I'm demonstrating is the extensibility of SCOM to help with monitoring the freshness of CRLs. The following scripts can be used to alert you to the status of CRLs before they go stale, allowing you to avoid outages before they happen. They can be integrated with MOM/SCOM for centralized alerting and reporting or just run by themselves.
Here's how the scripts work:
- checkCrlFreshness.cmd calls wget.exe (you'll need to download this separately) and downloads the CRL from whatever location you specify
- checkCrlFreshness.cmd then calls compareNextUpdateToNow.vbs
- compareNextUpdateToNow.vbs calls certutil "nameOfCrl" and scrapes the NextUpdate: line from its output
- compareNextUpdateToNow.vbs then compares the number of hours remaining until NextUpdate to the threshold value that checkCrlFreshness.cmd passes when it calls compareNextUpdateToNow.vbs
- if the number of hours until NextUpdate is < the threshold value (i.e. the CRL's remaining validity is below your minimum value), an alert is written to the application log on the machine where the script is run
SCOM can then be leveraged to watch the event logs for this particular event and then alert or take whatever other action you've specified. Note that if you're running the scripts on Vista or WS08, you can also take advantage of the new Windows eventing infrastructure to run a task or send an email when the event occurs, without having to have MOM/SCOM involved at all.
The nice thing about this script is that it can easily work against any HTTP CDP. So, you can use it to monitor CRLs inside or outside of your organization, including those of partners or service providers that you might depend on.
Scripts are provided as is, use at your own risk, no support, etc, etc…
checkCrlFreshness.cmd
:: CRL freshness monitoring master script
:: downloads all relevant CRLs with wget, calls VBScript to compare NextUpdate time with current system time to determine freshness
:: John Morello, Windows Server Division
:: uses wget to download CRL; -r (recursive) makes wget overwrite an existing copy, -nH --cut-dirs=1 says to save in current directory (drops the host name and the first directory component of the URI path)
:: wget must be in same directory as this script or in %Path%
wget.exe -r -nH --cut-dirs=1 "http://ws08rc0.pki.test/certenroll/pki-ws08rc0-ca.crl"
:: calls compareNextUpdateToNow.vbs to compare time difference between Now() and NextUpdate value in CRL
:: if difference is < the second parameter (in hours), compareNextUpdateToNow.vbs writes warning to AppEvent
cscript compareNextUpdateToNow.vbs "pki-ws08rc0-ca.crl" 336
compareNextUpdateToNow.vbs
' CRL date comparison script
' takes 2 command line parameters, the name of the CRL and the acceptable age threshold
' if NextUpdate time from CRL is not > acceptable threshold (or other error condition exists), writes event to AppEvent
' Robert Deluca, Windows Server Division
' John Morello, Windows Server Division
Option Explicit
' check if command line parameters were passed
If WScript.Arguments.Count < 2 Then
wscript.echo "Usage: cscript compareNextUpdateToNow.vbs <CRL filename> <age threshold in hours>"
wscript.quit
End If
' assign params to variables
Dim CRLFilename
CRLFilename = WScript.Arguments.Item(0)
Dim ThresholdHours
ThresholdHours = CInt(WScript.Arguments.Item(1))
' Create wscript.shell object for exec
Dim WshShell
Set WshShell = CreateObject("WScript.Shell")
' Execute certutil
Dim oExec
Set oExec = WshShell.Exec("certutil """ & CRLFilename & """")
' No separate wait loop: reading StdOut below blocks until certutil writes output or exits,
' and waiting for exit before reading can hang if the output fills the pipe buffer
' Read certutil output until end of stream or found the NextUpdate: line
Dim DateString
Dim Found
Found = false
Do While (Not Found) And (Not oExec.StdOut.AtEndOfStream)
DateString = oExec.StdOut.ReadLine
Found = instr(DateString,"NextUpdate: ") = 1
Loop
' exit if the proper line wasn't found
Dim NoNextUpdateFoundMessage
NoNextUpdateFoundMessage = "Failure to read Next Update time from CRL."
If Not Found Then
wscript.echo "Failure to read Next Update time from CRL."
WshShell.LogEvent 2, NoNextUpdateFoundMessage
wscript.quit
End If
' remove the header from the line
DateString = Replace(DateString,"NextUpdate: ","")
' exit if the rest of the string isn't recognized as a date
If Not IsDate(DateString) Then
wscript.echo "Date not recognized as valid."
wscript.quit
End If
' convert the string to a date variable
Dim d
d = CDate(DateString)
' calculate hours until NextUpdate
Dim HoursUntilUpdate
HoursUntilUpdate = DateDiff("h",Now,d)
' display update and threshold information
Dim Message
Message = "Next CRL update is in " & HoursUntilUpdate & " hours. Threshold is " & ThresholdHours & " hours."
wscript.echo Message
' update time is below acceptable threshold, write message to screen and event log
If HoursUntilUpdate < ThresholdHours Then
wscript.echo "Time is below acceptable threshold! Writing warning to event log."
WshShell.LogEvent 2, Message
Else
' CRL is still fresh, so there is nothing to log
wscript.echo "Time is within acceptable threshold. No event written."
End If
wscript.quit
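If you want to prototype the same freshness check outside of VBScript, the comparison logic can be sketched in a few lines of Python. This is only a sketch under assumptions: the function names are made up for illustration, the sample text is a hand-written stand-in for certutil output, and the date format is assumed to be a US-style one (certutil's real format follows the machine's regional settings). Unlike DateDiff("h", ...), which counts hour boundaries, this version returns fractional hours.

```python
from datetime import datetime

# Assumption: US-style date such as "11/21/2007 10:00 AM". The real format
# depends on the regional settings of the machine running certutil.
ASSUMED_DATE_FORMAT = "%m/%d/%Y %I:%M %p"


def hours_until_next_update(certutil_output, now, date_format=ASSUMED_DATE_FORMAT):
    """Return hours from `now` until NextUpdate, or None if the line is missing."""
    for line in certutil_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("NextUpdate:"):
            value = stripped[len("NextUpdate:"):].strip()
            next_update = datetime.strptime(value, date_format)
            return (next_update - now).total_seconds() / 3600.0
    return None


def crl_is_stale(hours_left, threshold_hours):
    """Treat a missing NextUpdate or a value below the threshold as stale."""
    return hours_left is None or hours_left < threshold_hours


# Hand-written sample standing in for certutil output (hypothetical values).
sample = (
    "ThisUpdate: 11/13/2007 10:00 AM\n"
    " NextUpdate: 11/21/2007 10:00 AM\n"
    "CRL Entries: 0\n"
)
now = datetime(2007, 11, 14, 10, 0)
hours = hours_until_next_update(sample, now)
print(hours, crl_is_stale(hours, 336))
```

With the same 336 hour threshold used in checkCrlFreshness.cmd, a CRL that expires in a week is flagged, which is the early-warning behavior the scripts are after.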
Pete Rivera is the Windows Team Lead on one of our DoD support teams and we've been working together on a NAP project. In addition to being a master of style and male fashion, Pete also puts together some great guidance for his customers. Recently, he wrote a detailed description of all the various logging capabilities that you might ever need to use to debug a NAP problem. Thanks, Pete!
1. NPS logs in several places. First, it does IAS accounting logging of NPS status and network connection process data in %windir%\system32\LogFiles by default, although it can be configured to use an alternative location. The log is:
IN<date>.log
2. Second, NPS can log to a SQL Server 2000 or SQL Server 2005 database. User authentication and accounting requests are logged through a stored procedure in the database. Request logging is used primarily for connection analysis and billing purposes. It is also useful as a security investigation tool, providing a method of tracking down the activity of an attacker.
3. Likewise, you can enable debug trace logging via netsh, which provides detailed information about Network Policy Server operation when NAP policies are configured: Netsh ras set tr * en
%windir%\Tracing\IASNAP.log
4. In addition, this enables a slew of other IAS/RAS related logs in the same folder (e.g. IASSAM.LOG, IPSEC, etc.):
%windir%\Tracing\*.log
5. You also have the event logs. These provide a lot of information about the operation of NAP and connecting clients, but they are used primarily for auditing and troubleshooting connection attempts. Depending upon your build, they are in the SYSTEM log (B3) and/or the Security log (RC0). There is also the Network Access Protection event log, which you'd find on NAP clients.
6. On the client side we can enable NAP client debug tracing logs as well. This is enabled either via netsh or via the NAP Client Configuration snap-in. The output is an ETL file which is generated only by using logman… so you'll need to do a logman start QAgentRt -p {b0278a28-76f1-4e15-b1df-14b209a12613} 0xFFFFFFFF 9 -o %systemroot%\tracing\nap\QAgentRt.etl -ets in order to start .etl generation.
7. Likewise, we can do WSHA tracing for NAP… the trace GUID is 789e8f15-0cbf-4402-b0ed-0e22f90fdc8d
8. DHCP QEC tracing…
Netsh dhcpclient trace enable. This command enables QEC tracing, and the trace files are generated at %WINDIR%\System32\LogFiles\WMI\DHCP*.*
9. EAPHost Tracing for 802.1x
Trace logs containing debugging information can assist users in finding the root causes of issues that occur during the EAP authentication process. The debugging information can include API calls performed, internal function calls performed, and state transitions performed. Tracing can be enabled on both the client side and the authenticator side.
When EAPHost tracing is enabled, logging information is stored in an .etl file in a user-specified location.
10. EAPHost Tracing for 802.1x (client side)
To enable tracing on the client side:
Start tracing with the following command: logman start trace EapHostPeer -o .\EapHostPeer.etl -p {5F31090B-D990-4e91-B16D-46121D0255AA} 0x4000ffff 0 -ets
After reproducing the problem, stop tracing with the following command: logman stop EapHostPeer -ets
Convert the etl file into text using the following command: tracerpt EapHostPeer.etl -pdb <pdbpath> -tp <tracemessagefilesdirectorypath> -o EapHostPeer.txt
11. EAPHost Tracing for 802.1x (Authenticator side)
To enable tracing on the authenticator side:
Start tracing with the following command: logman start trace EapHostAuthr -o .\EapHostAuthr.etl -p {F6578502-DF4E-4a67-9661-E3A2F05D1D9B} 0x4000ffff 0 -ets
After reproducing the problem, stop tracing with the following command: logman stop EapHostAuthr -ets
Convert the etl file into text using the following command: tracerpt EapHostAuthr.etl -pdb <pdbpath> -tp <tracemessagefilesdirectorypath> -o EapHostAuthr.txt
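Since the client and authenticator steps in items 10 and 11 differ only in the session name and provider GUID, the three commands can be generated rather than retyped. The Python sketch below only assembles the argument lists from the commands quoted above; the helper name is made up, and the pdb and trace message file paths are placeholders you would have to supply, just as in the original commands.

```python
def eaphost_trace_commands(session, provider_guid, pdb_path, tmf_dir):
    """Build the start, stop, and convert commands for one EAPHost trace session.

    Returns three argument lists that mirror the logman/tracerpt commands
    quoted in the post. Nothing is executed here.
    """
    etl_file = session + ".etl"
    start = [
        "logman", "start", "trace", session,
        "-o", ".\\" + etl_file,
        "-p", "{" + provider_guid + "}", "0x4000ffff", "0",
        "-ets",
    ]
    stop = ["logman", "stop", session, "-ets"]
    convert = [
        "tracerpt", etl_file,
        "-pdb", pdb_path,   # placeholder path, as in the post
        "-tp", tmf_dir,     # placeholder path, as in the post
        "-o", session + ".txt",
    ]
    return start, stop, convert


# Client-side session, using the name and GUID from item 10.
peer_start, peer_stop, peer_convert = eaphost_trace_commands(
    "EapHostPeer",
    "5F31090B-D990-4e91-B16D-46121D0255AA",
    "<pdbpath>",
    "<tracemessagefilesdirectorypath>",
)
print(" ".join(peer_start))
print(" ".join(peer_stop))
print(" ".join(peer_convert))
```

On a Windows machine the lists could be handed to subprocess.run one at a time, with the problem reproduced between the start and stop calls.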
12. Then we have the SCCM related logging specific to the SCCM SHA and SHV. The Configuration Manager 2007 client computer log files are found, by default, in %windir%\CCM\Logs. For client computers that are also management points, the log files are found in %ProgramFiles%\SMS_CCM\Logs.
13. Ccmcca.log
This file logs the processing of compliance evaluation based on Configuration Manager NAP policy processing. It also contains the processing of remediation for each software update required for compliance.
14. locationservices.log
This log is used by other Configuration Manager features (for example, information about the client's assigned site), but it also contains information specific to Network Access Protection when the client is in remediation. It records the required remediation servers (management point, software update point, and distribution points that host content required for compliance), which are also sent in the client statement of health.
15. SMSSha.log
This is the main log file for the Configuration Manager Network Access Protection client, and it contains a merged statement of health information from the two Configuration Manager components: location services (LS) and the configuration compliance agent (CCA).
This log file also contains information about the interactions between the Configuration Manager System Health Agent and the operating system NAP agent, and also between the Configuration Manager System Health Agent and both the computer compliance agent and location services. It provides information about whether the NAP agent successfully initialized, the statement of health data, and the statement of health response.
16. CIAgent.log
This tracks the process of remediation and compliance. However, the software updates log file, Updateshandler.log provides more informative details on installing the software updates required for compliance.
17. SDMAgent.log
This log file is shared with the Configuration Manager feature desired configuration management, and it also contains the tracking process of remediation and compliance. However, the software updates log file, Updateshandler.log provides more informative details about installing the software updates required for compliance.
18. On the server side, for the System Health Validator point, you should first check the Windows Application event log on the Windows Network Policy Server computer. This log will record any failure categories and errors with the source being SMS_SYSTEM_HEALTH_VALIDATOR. These are also raised as Configuration Manager status messages. More detailed logging information can be found in the Configuration Manager logs; the System Health Validator point log files are located in %systemdrive%\SMSSHV\SMS_SHV\Logs.
19. Ccmperf.log
This log contains information about the initialization of the System Health Validator point performance counters.
20. SmsSHV.log
This is the main log file for the System Health Validator point. It logs the basic operations of the System Health Validator service, such as the initialization progress.
21. SmsSHVADCacheClient.log
This log file contains information about retrieving Configuration Manager health state references from Active Directory Domain Services.
22. SmsSHVCacheStore.log
This log file contains information about the cache store used to hold the Configuration Manager NAP health state references retrieved from Active Directory Domain Services, such as reading from the store and purging entries from the local cache store file.
23. SmsSHVRegistrySettings.log
This log is used to record any dynamic changes to the System Health Validator component configuration while the service is running.
24. SmsSHVQuarValidator.log
This log file records client statement of health information and processing operations. To obtain full information, change the registry key LogLevel from 1 to 0 in the following location:
HKLM\SOFTWARE\Microsoft\SMSSHV\Logging\@GLOBAL
25. <InstallationPath>\Logs\SMSSHVSetup.log
This log file records the success or failure (with failure reason) of installing the System Health Validator point.
A customer I work with recently wanted to have a scriptable method to take any given CRL and determine the total number of revoked objects it contains. Luckily, certutil combined with your favorite findstr / grep / regex application can do this quite easily:
certutil.exe -dump <CRLFileName> | findstr "Entries"
The output will be a numerical count of the total number of revoked items. Note that this doesn't need to be run from a CA, nor does it have any dependencies on the issuer's chain.
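If the count needs to feed into another script rather than be read off the screen, the same line can be parsed programmatically. The Python sketch below assumes the dump contains a line of the form "CRL Entries: N", which is what the findstr filter above matches on; the function name and the sample text are made up for illustration.

```python
import re


def parse_crl_entry_count(certutil_dump_text):
    """Return the revoked-entry count from certutil dump text, or None if absent."""
    # Assumption: the dump reports the count on a line like "CRL Entries: 42".
    match = re.search(r"CRL Entries:\s*(\d+)", certutil_dump_text)
    return int(match.group(1)) if match else None


# Hand-written sample standing in for real certutil output (hypothetical values).
sample_dump = "X509 Certificate Revocation List:\nVersion: 2\nCRL Entries: 42\n"
print(parse_crl_entry_count(sample_dump))
```

On a machine with certutil available, the text passed in would be the captured output of the dump command shown above.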
This is Allen Stewart from the Windows Server Division (WinCAT) team. I spend a lot of time with companies that are deploying a virtualized architecture for datacenters and branch offices. Some of the technologies leveraged in these scenarios are capacity planning tools, workload migration technologies, P2V, V2V, high availability, virtualization management, virtual machine backup/snapshots, and service-oriented management. While the technologies are well understood, one thing that has been pretty fluid is the administration model for the virtualized environment. Let's dive in. There does not seem to be a consistent model: some companies have taken the approach of keeping things the same way as in the physical environment, while others have created a virtualization group and assigned it the task of managing the virtual world. I understand both schools of thought: either the virtual world should not change the administration model, or the virtual world is so disruptive, and demands such new skills and approaches, that we need a group dedicated to the technology. What I am really interested in is which administration model will win out, and your thoughts on the topic, because this should drive flexible administration models in virtualization products.
In the centralized approach, Tier 1 and Tier 2 have complete rights to the environment and handle activities like VM creation from templates, deleting VMs, starting/stopping, and workload migration. The Engineering team handles virtualization product evaluation, environment build-out, and creating standard VM builds/templates.
Tier 1 Support – Initial call support and case management
Tier 2 Support – Escalation, deep troubleshooting
Windows Server Engineering/Dedicated Virtualization Team – Escalation, deep troubleshooting, environment design changes
Certain environments, like branch offices and test/dev labs, may dictate a different model in which virtualization tasks and activities are delegated to business units or to IT in branch offices. I am not going to cover the scenario where IT is decentralized and each business unit does its own engineering (that to me still looks like the model above, albeit more political). In the branch office scenario you can still manage centrally, but where you have an IT person in the branch, this forces delegation of activities. In this case, the activities that get delegated the most seem to be the same ones handled by Tier 1 and Tier 2 personnel. So this starts to look and feel like the Active Directory Organizational Unit delegation model: a person is able to handle virtualization tasks in a single site or multiple sites, with the central IT group having complete access. Please send in comments on how your virtualization environment is structured and any ideas you have on how the administration model should be structured. My next blog will cover virtualization administration roles, another fun topic on this path, the thought being: do we create specific in-box roles or just leave it completely flexible? Also next on that path is assigning roles to tasks in the new virtual datacenters.
Allen Stewart
Principal Program Manager
Windows Server Division
Jeff Sigman (NAP Release Manager) just added a post I wrote to the NAP blog about an upcoming customer webcast. If you weren't able to make it to TechEd, you probably missed the session that fellow WinCAT PM Pat Fetty did with Hunter Ely, who is in charge of Louisiana State University's IT security technology and is one of our NAP TAP customers. Pat and Hunter's session covered LSU's architecture, deployment process, and lessons learned. So, if you've wanted to hear some real-world experiences with NAP, tune into the webcast.
Morello
Most of the WinCAT team will be at TechEd next week delivering sessions and answering questions. Here's an overview o