This is a topic that has received plenty of attention in the past, but it is something I find myself repeating periodically. It sounds like a topic I should write something about. Consider this part 1 of random performance tuning.
Recently, I encountered a performance issue that arose from introducing a new calendar dimension into a Multidimensional cube. This cube has had a lot of performance tuning in the past, much of it centered on appropriate partitioning of the fact tables. Effective partitioning has 2 key elements:
1) Approx 25-50 million rows per partition
2) A partition has a slice, typically based on the calendar dimension.
So if you have a fact table with 1 billion rows covering 3 years, and every month has a fairly even spread of data, then you might adopt a partitioning strategy based on months, ending up with 36 partitions of approx 28 million rows each. Each partition would then get a slice that looks like Calendar.Calendar.Month.202001 for Jan 2020 (202002 for Feb 2020, and so on).
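As a sketch, the slice on each partition is just an MDX member expression referencing the month key; the dimension, hierarchy, and key values below are illustrative:

-- Slice for the Jan 2020 partition
[Calendar].[Calendar].[Month].&[202001]
-- Slice for the Feb 2020 partition
[Calendar].[Calendar].[Month].&[202002]

At query time, the storage engine compares the subcube being requested against each partition's slice and skips partitions that cannot contain relevant data.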
The problem: a new calendar dimension is created; let's call it "Secondary Calendar". Now some user queries are created that don't use Calendar, but instead use "Secondary Calendar". Suddenly, you get calls that performance is poor.
While the subject header is the big clue here, let's not jump to conclusions, but follow a process. My process is:
1) Setup a SQL Profiler Query Trace
2) Clear Cache
3) Warm the cache (reconnect to the cube)
4) Ask the user to run the query.
5) Capture the MDX via profiler
6) Note the execution time. (This is the cold cache execution time)
7) Clear Cache
8) Warm the cache (reconnect)
9) Paste the query into Management Studio
10) Run the query while the profiler trace is still running.
11) Check whether the performance matches what the user experiences.
Assuming the execution time at step 11 is similar to step 6, you have reproduced your issue. The actual query time may differ, since you might be running on different hardware or the server may be under a different load, but you should see comparable performance. If the user experiences a 30+ second execution time and you experience 40 seconds, then all is good; you probably have a correct reproduction. However, if your execution time is only a few seconds, or stretches into many minutes, then check those steps again.
To do this, you will need the following:
SQL Server Management Studio
SQL Server Profiler
Let's look at each step.
1) Setup a SQL Profiler Query Trace. Hopefully, you have this on your desktop. When you connect to the server, the minimum event you need is Query End, but initially I like to select a broader set, typically something like Query Begin, Query End, Progress Report Begin, Progress Report End, and Query Subcube Verbose.
2) Clear Cache. This is essential; you can't performance tune anything on a warm cache, so you need to clear the cache before you run any queries. The following XMLA command will clear your cache when executed against the server (an XMLA query window in Management Studio works well for this). Change DBNAME to match your database ID, which is usually the same as the database name but can differ if the database has ever been renamed or redeployed.
<ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>DBNAME</DatabaseID>
  </Object>
</ClearCache>
3) Warm the cache. Once you have cleared the cache, you will need to reconnect to your cube. In SQL Server Management Studio, click on the cube and select New Query, or right-click and select Browse.
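If you want to confirm the session is established before timing anything, you can run a trivial query that touches no fact data; something like this (the cube name is illustrative):

-- Trivial query to initialise the session without reading fact data
SELECT {} ON 0
FROM [Sales]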
4) Ask the user to run their query via their normal interface. NB, to make this easier, keep your SQL Profiler trace paused while they set up their query, log in, etc., as you don't want to clutter up your trace with unnecessary statements. Just before they press go/execute, enable the trace.
5) & 6) Capture the MDX via profiler. If all is going well, you should see lines of data begin to appear in SQL Profiler, and this will continue until the server has finished the query. You may end up with hundreds or even thousands of lines here. The MDX for your query will be located in the Query Begin and Query End events. Personally, I like using the Query End event, as its Duration column also gives me the official server-side execution time. In this case, my query is running in 67 seconds.
Steps 7-11 are a repeat of the above, except you are now running the user's query from your own environment. Assuming the query execution times are similar, you now have a query you can start to analyze.
So what next? Understanding what the different lines in the profiler trace mean is a matter for a different day, but to shortcut my issue here: when reviewing the trace, I can see that while my query's date range is only a few days, there are scans across multiple fact table partitions. This is a sign that partition elimination is not happening, which isn't surprising, as the dimension "Secondary Calendar" isn't part of the slice that was defined when the partitions were created.
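To make that concrete, the problem query had roughly this shape (the cube, measure, and hierarchy names here are hypothetical). Because the slicer only references "Secondary Calendar", the storage engine has nothing to compare against the Calendar-based partition slices, so every partition gets scanned:

-- Hypothetical user query filtering only on Secondary Calendar;
-- it cannot be matched against the Calendar-based partition slices
SELECT
  { [Measures].[Sales Amount] } ON 0,
  NON EMPTY { [Product].[Category].Members } ON 1
FROM [Sales]
WHERE (
  { [Secondary Calendar].[Calendar].[Date].&[20200106]
    : [Secondary Calendar].[Calendar].[Date].&[20200110] }
)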
Resolving this isn't so easy. I don't want to change my slice, but these queries have to use the secondary calendar for good reasons. In this case, as we control the source MDX query, I can come up with a "modeling" answer and create a new attribute on the Primary Calendar table that eliminates the historical partitions. This works in this circumstance because the historical data was not needed, but it isn't always an option.
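As a sketch of the idea, assuming a hypothetical new attribute such as [Calendar].[Is Recent] that is only Yes for current periods, adding it to the slicer ties the query back to the Calendar attributes the slices are built on, so the historical partitions can be skipped:

-- Same hypothetical query with an extra filter on a new Primary Calendar
-- attribute; restricting Calendar allows historical partitions to be eliminated
SELECT
  { [Measures].[Sales Amount] } ON 0,
  NON EMPTY { [Product].[Category].Members } ON 1
FROM [Sales]
WHERE (
  { [Secondary Calendar].[Calendar].[Date].&[20200106]
    : [Secondary Calendar].[Calendar].[Date].&[20200110] },
  [Calendar].[Is Recent].&[Yes]
)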
Key takeaway: it is very common to have slices set by a calendar. If you must have a second calendar, you have to call out to users that queries using it may not perform as well as queries using the primary calendar. You may be able to find "tricks" to help, but there are no guarantees.
Thanks for reading.
Mark