Hey NAP fans, I’m Alex Chalmers from Ball State University with a guest post about NPS logging.
If you made it to one of Jeff’s TechEd IT Pro presentations, you’ll remember me discussing our NAP implementation and some of the challenges that we’ve faced along the way. Gathering accounting data across the NPS implementation for reporting is one of the largest we’ve faced so far. With multiple NPS servers around our campus for redundancy, trying to trace a session from login to logout can be tough without some type of centralization. There are a few possible solutions, but there are several gotchas to be aware of.
A natural reaction to the challenge might be to point to using SQL logging using a single, central data source for all of the NPS servers. It isn’t a bad solution for a small/medium sized site and is relatively simple to manage. But there is a pretty large problem with using this scenario when using RADIUS authentication (like with NAP)… if logging the event fails during authentication, the authentication will fail (refer to the Deploying SQL Server Logging with Windows Server 2003 IAS Guide, as it is still relevant). This means that if the central database server is ever down or otherwise unreachable, end users can no longer authenticate (or re-authenticate) to the network. At my site where we use 802.1X enforcement with session timeouts it would probably cause a flurry of helpdesk calls, depending on the length of the outage, and guarantee persona non grata status for me with my client services staff for a few days.
In trying to work out an alternative solution, other options were rejected for various reasons. Logging to a flat file didn’t solve any problems; it has the same problems of the current design with the added issue of trying to get the data into a reporting format. Trying to use SQL replication was out as we would have had to license SQL Server Standard or Enterprise for all of our NPS servers, as SQL Server Express can’t act as a publisher. Running independent SQL Server Express instances on each NPS system on its own could have worked, but you are limited to a 4GB database and still have to manually centralize the logging. Luckily, as we were looking at this last option some very knowledgeable people suggested we look at using SQL Service Broker.
Service Broker is a communications framework built into SQL Server 2005, and unlike replication in this case it could be used to send data from a SQL Server Express instance to a central data warehouse. The framework design is nearly tailor-made to be used in this situation. In its most basic sense, it enables two entities to send messages to each other while ensuring the messages are reliably received only once and in the same order they were sent. Reliable delivery, even across system restarts and network outages, and delivery without repetition are the two keys for this particular implementation.
I’ve created a set of SQL scripts that will help you to create the necessary objects for a basic, but functional, solution. But before we can get to implementing anything, we need some prerequisites. You will need to download and install SQL Server Express and Management Studio Express on each of your NPS servers. You will also need to have a server capable of storing your aggregate logging data running SQL Server 2005 Standard or Enterprise edition. Unfortunately, Service Broker will not communicate between to Express edition server instances. Each server must be addressable by DNS name. Configuring Service Broker on an instance will open a TCP port for communication, which will need access through any firewalls if present. The de facto default port is 4022, but it can be changed if needed. You will also need to have some paths pre-created for each server’s database and transaction logs, as well as a working directory to store certificate and key backups for disaster recovery purposes. Once you have these prerequisites complete, you can move on to running the scripts.
These scripts are sample code and assume that objects do not exist. Please take the time to analyze what is going on in each of them, and run through the scenario at least once in a test environment to be certain that the configuration is exactly what you want before moving into production. The script files use the Template Parameter feature of Management Studio Express to allow you to tune certain items in the scripts to fit your environment. Before running the scripts, please fill in the template information by selecting Query->Specify Values for Template Parameters… from the menu bar. Inside the script zip file, I have included a worksheet with the parameters used in each script to help you prepare.
Starting on the central SQL server, the first script to run is the ConfigureConsolidationServer.sql script. This will configure the Service Broker endpoints, create the consolidated accounting database, and create the basic broker service that each NPS server will connect to. While you should be able to execute the whole script in one batch after configuring the template parameters, I would suggest running it one section of code at a time to see the steps in action. When the script completes, you should have several files in the working directory you specified as a parameter. While you should store each of the files securely for disaster recovery purposes, you will need to copy the two certificates to each NPS server before running the next script.
Once the central server is configured, we can move to each NPS server. Open ConfigureNPSServer.sql in Management Studio Express and configure the necessary parameters. The certificates that you copied over in the previous step should reside in the working directory specified here. Those certificates will be used to identify and secure the remote broker service to the NPS system. This script will generate two similar certificates used to identify the NPS server to the central SQL server. You will need to copy them over before proceeding.
Now that servers have a baseline configuration, running the ConnectNPSServer.sql script on the central SQL server will authorize the NPS server to communicate on the Service Broker service. In a multi-server NPS configuration, you will need to run a version of ConnectNPSServer.sql for each NPS server in the environment. Once the scripts have run successfully, you should configure your NPS logging to log to the local SQL server instance. You will know if the scripts work by examining the RADIUS_Events_XML table on the central SQL server. If events are being stored, the configuration is successful. If you are getting data stored locally, but not to the central server, check that the addresses you’ve used in the scripts are valid and that all the ports are listening as expected. The majority of issues that I've run into with this configuration have been caused by either a bad address or a firewall blocking the Service Broker port that was configured for each server.
The magic of this configuration happens in two stored procedures: Report_Event on the NPS server and Collect_Events on the central SQL server. Report_Event is called whenever NPS logs an event. NPS sends an XML fragment to the stored procedure, which is then assigned a timestamp and GUID and stored in a local table. Additionally, the stored procedure transmits the data, including the additional timestamp and GUID, via Service Broker to the central SQL server. Collect_Events is called whenever data is logged to the central SQL server's Service Broker queue. It contains the logic to receive messages via the Broker service. The raw XML data is then stored in the RADIUS_Events_XML table, along with the previously assigned GUID and event timestamp. All of the remaining script code is used to create the infrastructure to allow these two procedures to work effectively.
Since I've said that these scripts are sample code, what could be improved upon for your environment? The first item I would look at is managing the local logs stored on the NPS server. These are stored there only as a safeguard until you can be certain that the data is accessible on the central system. You could deal with this issue in many ways, including not bothering with the local cache. The second major thing to look at is the data format on the central server. While centralizing the data is the main goal of this post, working directly with the XML data for reporting isn't necessarily the most elegant of solutions. You will probably want to either extend the Collect_Events procedure or create a scheduled job that will process the RADIUS_Events_XML table and transform the data into table form. Depending on the data that you're most interested in, you may find that a given event will have multiple entries for a given attribute (SHA SoH data is one) so you might need multiple tables with relationships. Key them off the event GUID assigned in the Report_Event procedure so that you can track an event's data where ever it may reside. The last item that I would look at directly is whether there is any organizational data that you might need to store at or near the time of the event. If your user population has somewhat frequent name changes, as an example, you may want to extend the data to include not only the username of the account used to login but the user object's AD GUID, SID, or some other unique identifier so that you can track a user's activities over a period of time without needing a list of usernames.
As you can see, this solution provides quite a bit of flexibility to design a system that will work for your needs. The downside to the solution is it does require a fair amount of knowledge about SQL Server to pull data from the logs and design queries that can later be used in a reporting solution, using Reporting Services or some other mechanism. The scripts I've implemented are really only the backbone of the solution, providing the necessary infrastructure and "glue" to allow the servers to communicate effectively.
I know that you will most likely have questions about our deployment or how these scripts function beyond the small novel of a post I have here. I'll happily answer any question in the comments of this post, or you are most welcome to send me email. If you have ideas, suggestions, or tips on how you've implemented something please share them as well!
- Alex B Chalmers
Not too long ago, I posted that I would be creating a guest post for the NAP blog...
http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=4000002748 This detailed case study provides