Update Dec 3 2012 : Please see the blog post on the Service Manager blog which is up to date. This blog post is remaining here only to increase the visibility via search results.
Introduction and Problem Statement
There has been quite a bit of confusion over the requirements for SQL Server collation for the various components of System Center. We didn’t make a coordinated effort here to make this as easy as it could and should be. The documentation has conflicts within itself in some cases. We will be cleaning up the documentation over the next few days so that it is consistent and clear. Since the documentation is still component focused, we want to take this opportunity to provide a holistic view of the SQL Server collation requirement across the suite. This will hopefully clear things up and over time it will become even clearer as we make changes in the product itself to be consistent on this requirement.
SQL Server collation for those of you that are unfamiliar with the concept is how SQL Server controls the sorting, storage, indexing, and comparisons of characters in a database. It is essential to use the right collation depending on the type of data that you intend to store in order to ensure that the system will function correctly. If you use the wrong collations, searches may be less effective or not work at all, sorting might produce unexpected results, and other problems can happen when inserting or retrieving data.
There are two types of collations – SQL Server collations and Windows collations. Generally speaking, SQL Server collations start with “SQL_” and Windows collations do not. SQL Server can use either a SQL Server collation or a Windows collation. Collation names are things like “Latin1_General_100_” or “Chinese_PRC_”. Following that name there are typically some indicators in the name for case sensitivity and accent sensitivity. CI = Case Insensitive, CS = Case sensitive. AI = Accent Insensitive, AS = Accent sensitive. Thus a complete collation name might be something like Latin1_General_CI_AS. This is a Windows collation which would stand for Latin1 (language) case insensitive and accent sensitive. Some of the newer collation names also have a number in them like 90 or 100. These refer to the version of SQL Server they were built for – 90 = 9.0 = SQL Server 2005, 100 = 10.0 = SQL Server 2008. You can read more about collations here and here.
Historically, System Center products have been tested and supported with only the SQL_Latin1_General_CP1_CI_AS collation. It is the default collation when you install SQL Server on an EN-US Windows Server. When you install SQL on all other languages of the Windows OS the collation default will be something different, typically a Windows collation. The SQL Server collations are no longer being updated and will eventually be phased out in favor of the Windows collations. Therefore, the SQL_Latin1_General_CP1_CI_AS collation doesn’t have some of the latest Unicode characters in its codepage which can cause some problems for those languages which have recently added characters to Unicode. Historically this has been a minimal to non-existent issue. With the introduction of Service Manager to the System Center suite we had our first component that has (a) a full-text searchable knowledge base and (b) a user base that included not just IT people but really the entire organization. The result is that data is stored in newer Unicode characters more often. We started hitting search and sort issues with customers that were using the SQL_Latin1_General_CP1_CI_AS collation during the beta testing of Service Manager 2012 because they were storing some of the newer Unicode characters in the database that weren’t in the SQL_Latin1_General_CP1_CI_AS collation codepage.
The SQL Server team advised us that we should start switching to the Windows collations to minimize these issues from happening. That is why when you are installing SCSM on a SQL Server that is using the SQL_Latin1_General_CP1_CI_AS collation you will see this warning message:
You can bypass this warning message and continue to install using the SQL_Latin1_General_CP1_CI_AS. Further, the SQL_Latin1_General_CP1_CI_AS collation was not listed as a supported collation on the Language Support for System Center 2012 - Service Manager page on the TechNet Library. The required collation for SCSM per the documentation currently is Latin1_General_100_CI_AS (note the 100).
Because all other System Center components (except SCOM which can also be installed on a few other additional collations – more on that in a minute) are required to be installed on SQL_Latin1_General_CP1_CI_AS, most people thought they would need a dedicated SQL Server instance for SCSM. We would generally recommend having a separate instance of SQL Server for running SCSM for performance and scalability reasons, but understand that there are probably plenty of cases where it makes sense to collocate SCSM and other System Center components on a single SQL Server instance.
Further, in one place the SCSM product documentation indicated that the SCOM DB and the SCSM DBs had to be exactly the same collation. This effectively meant that there was a conflict in the documentation.
The rest of this blog will describe how we are going to resolve this and what steps we are taking to make this more clear.
Resolution for System Center 2012
We have recently updated the documentation to add support for the following additional collations for SCOM:
German, Brazil (Portugese), Italian (new)
Chinese CHS (new)
We will also be updating the SCSM documentation to add support for the following additional collations:
I will now attempt to clarify what is meant when the documentation says that “You must use the same supported language collations if you intend to import data from Operations Manager into Service Manager.” This is true only for the OM OperationsManager database and the SM DWStagingAndConfig database in the scenario of creating an OM Data Source for the DW. To be clear this is known in the UI as a Data Warehouse Data Source. This does not affect either the SCOM -> SCSM Configuration Item connector or the SCOM -> SCSM Alert -> Incident connector. Visually – this is what I am talking about:
The problem with using different locales is because we use a special .Net class called SqlBulkCopy to copy the data from the OperationsManager database to the SM DWStagingAndConfig database. This particular class will throw an exception that looks like this if the Locale ID of the collations of the source and target DBs don’t match.
ETL Module Execution failed:
ETL process type: Extract
Batch ID: 594
Module name: Extract_ManagedType_CTI_OM12_CHS
Message: The source Column "TypeName" locale ID "1033" and the target column "TypeName" locale ID "2052" do not match.
Thus, the OM OperationsManager database and the SM DWStagingAndConfig database must have matching collation locales (not necessarily the exact same collation). All other databases in OM and SM can have mismatched collations (and even locales) if needed so long as you are using a collation which is supported by that particular component. This is because in other points of integration we are not using this SqlBulkCopy class to copy the data over.
If you have databases with collations that don’t match then you will not be able to use the SCOM -> SCSM DW integration which brings over alerts from the OM OperationsManager DB (not DW) to the SM DWStagingAndConfig.
We have also produced a support matrix which indicates the supported combinations of using SM DWStagingAndConfig databases and OM OperationsManager databases together in cases where the locale IDs are compatible.
Any of the green squares will be supported in combination with each other. The ‘Y’ squares have been specifically tested and passed the tests. The Y*, Y**, Y*** squares have not been specifically tested, but because they are very similar to other combinations tested should be fine and we will deal with any issues on a case by case basis through support. Any combination outside of a green box may or may not work and are not supported. The green combinations should cover the majority of common use cases. If there is a combination that is currently on the unsupported list and you would like to see it supported, please leave a comment below and we’ll take a look at the request. The documentation will be updated to reflect this matrix in the next couple of weeks but you can use this as a guide for now.
This matrix indicates that you may use SQL_Latin1_General_CP1_CI_AS for SCSM but please be aware that storage of the newer Unicode characters may cause your database to function incorrectly and require a support case to attempt to resolve the issue. Using SQL_Latin1_General_CP1_CI_AS in combination with storing Unicode characters which are not supported by the codepage for SQL_Latin1_General_CP1_CI_AS is not supported. If you need to store those newer characters then use a Windows Latin collation - either Latin1_General_CI_AS or Latin1_General_100_CI_AS.
You may be asking “Why doesn’t SCOM support Latin1_General_100_CI_AS?” It should, but right now the setup is designed to block installing SCOM on any other collation besides those listed above for SCOM. Without taking a code change to setup and re-releasing the product we can’t change this until the next release.
We will make this simpler going forward and be more progressive in supporting and using the most up to date collations. To do this we will be modifying setup of all the SC components in the next release to require the latest Windows collations available at that time for that version of SQL Server that you are attempting to install on. The collation versions will likely be updated over time by the Windows/SQL Server team so our requirements will stay in lockstep with them. That means for example if when you are installing any component of System Center on SQL Server 2008 in the next release we will require the Latin1_General_100_CI_AS collation for an English system. If you are installing on SQL Server 2012 we will require Latin1_General_110_CI_AS (assuming there is such a thing as _110_ - I don’t know right now). This will only apply to new installations. Upgrade installations will not be subject to this and we will continue to support the older collations for those upgraded databases.
We hope this provides more flexibility and clarity in how you deploy System Center 2012. We will update the documentation accordingly. We apologize for any confusion and inconvenience. We will make this better going forward by being more consistent across the suite while at the same time improving the quality of the product by using the latest collations.
perfect - thanks for clarification the SQL Collation and usage in System Center 2012. What I miss is the used Collation for Analysis Services as they are needed for SM12. Currently we use Latin1_General - it would be great to have that article updated to reflect also the supported collation for Analysis Services. Thanks!