I’ve been fairly quiet on the blogging-front recently, but I have been busy working with SQL Server Reporting Services (SSRS) ! I've also been fortunate enough to present a session to customers on SSRS scalability at a recent Business Intelligence Operations Day in Reading, UK. One thing that struck me from that day was that very few people are aware of the caching options available in Reporting Services and how they can be used to improve report processing performance. A quick show of hands showed that no-one in my session was making use of caching, and only one person required real-time reporting on their data.
When you create a report and deploy it to your report server, the default behaviour is to always run the report with the most recent data and not to implement any caching. Actually, SSRS does implement session level caching for each user request “under the covers” to improve report performance and to provide a consistent experience across a single browser session. Data remains consistent during the report session and is unaffected by changes in the underlying data source. This article provides a good discussion of report session caching.
Unless you need real-time data in your reports, then hitting the underlying data sources each time you run the report is unnecessary and can hurt performance. In most cases, the business requirement is to report on historical data (whether that is from the last 15 minutes, hour, day or week) so we can implement caching to improve the report processing times based on these requirements. If we have no caching, a new instance of the report is created for each user who opens or requests the report; each new instance contains the results of a new query. With this approach, if ten users open the report at the same time, ten queries are sent to the data source for processing. If the report server load is heavy, or if the reports take a significant time to retrieve data and process, then this can have a negative impact on the performance of the Report Server and the underlying databases executing the report queries.
We can implement 2 forms of caching in Reporting Services: temporary cached reports and report snapshots. A temporary cached instance of a report is based on the intermediate format of a report (report data merged with the layout information before any rendering is applied). The report server will cache one instance of a report based on the report name. However, if a report can contain different data based on query parameters, multiple versions of the report may be cached at any time. For example, suppose you have a parameterised report that takes a region code as a parameter value. If four different users specify four unique region codes, four cached copies will be created. The first user who runs the report with a unique region code creates a cached report that contains data for that region. Subsequent users who request a report using the same region code get the cached copy.
With this approach, if ten users open the report, only the first request results in report processing. The report is subsequently cached, and the remaining nine users view the cached report. Cached reports are removed from the cache at intervals that you define. You can specify intervals in minutes, or you can schedule a specific date and time to empty the cache. Not all reports can be cached. If a report prompts users for credentials or uses Windows Authentication, it cannot be cached unless you change the data source to use stored credentials instead. There are certain other events that will invalidate the cache and cause it to be flushed for temporary cached copies of a report. These include if the report definition is modified, report parameters are modified, data source credentials change, report execution options change or if the cache expiration timeout is reached.
A report snapshot is a report that contains layout information and data that is retrieved at a specific point in time. A report snapshot is usually created and refreshed on a schedule, allowing you to time exactly when report and data processing will occur. If a report is based on queries that take a long time to run, or on queries that use data from a data source that you prefer no one access during certain hours, you should run the report as a snapshot. A report snapshot is stored in the intermediary form in the ReportServer database, where it is subsequently retrieved when a user or subscription requests the report. When a report snapshot is updated, it is overwritten with a new instance. The report server does not save previous versions of a report snapshot unless you specifically set options to add it to report history.
Not all reports can be configured to run as a snapshot. For example, you cannot create a snapshot for a report that prompts users for credentials or uses Windows integrated security to get data for the report. Also, if you want to run a parameterised report as a snapshot, you must specify a default parameter to use when creating the snapshot. In contrast with reports that run on demand, it is not possible to specify a different parameter value for a report snapshot once the report is open.
So how do we identify which reports to cache to improve performance? By default, Report Server saves all the details of reports that have executed in the Report Server database. This information can be accessed via the ExecutionLog2 view in SSRS 2008 or by this method in SSRS 2005. Robert Bruckner has written an excellent blog post detailing the information contained in the ExecutionLog2 view, which is highly recommended reading. Using this data, we can analyse where these reports are spending most time during report execution (data retrieval, report processing or rendering). There’s a great article on the SQLCAT site on Reporting Services Performance Optimisations which can help you to address some of these issues. Long-running reports or frequently accessed reports are going to give us greater performance improvements, so these would be good candidates for caching. The figure below shows an extract of data after running the following statement on the Report Server database:
SELECT * FROM [ReportServer].[dbo].[ExecutionLog2]
So we can see that this report is taking 30 seconds to retrieve the data from the underlying data source. We can specify that this report is cached and that the cache will expire every 30 minutes by adjusting the Report Execution properties as shown below:
The next time someone requests the report, the data will be retrieved from the data source and then cached in the ReportServerTempDB. Any subsequent requests from any user for that report (with the same parameter values) will be rendered from the cached copy. The output from the ExecutionLog2 view will now be as follows:
As we can see, the report is now rendered from cache and the data source not queried, resulting in a huge performance improvement. If we change the parameters of the report, then we will go back to the data source to retrieve the new data and a temporary cached copy of the report will be created for the new combination of parameters:
I worked recently with a customer who had set up caching for a report but they were using dynamic date functions in the values for their parameters. Every time the report was executed they had different values for the start and end date parameters, so they were constantly hitting the database with new queries even though they only required to report on the data from the previous evening. We were able to schedule a snapshot of the report to run each evening and then render the reports from that snapshot throughout the day… report execution time went from 20 minutes to a few milliseconds as a result ! The benefit of scheduling a snapshot is that the data is already pre-loaded in cache ready for users the next morning. We don’t suffer the initial performance hit that the first user would experience (in order to refresh data) if we used a temporary cached report.
As you can see, utilising SSRS's caching mechanisms can provide a very powerful way to improve report processing performance and to lighten the load on your database servers !
Fantastic article and has helped us reduce the workload on the servers :) 10/10
thanks for giving your time to explain all this... this helps :)
Thanks for your posting and very good explanation, but I am still wondering if there is any way using cache if parameter values get changed we will not hit our underline data source.
Very good article and help me alot understanding the concept.