In the first part of this series on application optimization techniques, we described how to identify the bottlenecks in slowly performing dashboards.
We divided the key factors that may impact application performance into four categories:
- Underlying datasets
- Application design
- Data blending of underlying datasets
- Incorrect configuration settings
In this article, we will give you some tips on how to optimize the underlying datasets.
- Use as few datasets as possible when designing the dashboard.
For example, one dataset report with 1,000 rows displays faster than ten smaller dataset reports. Use a single dataset/data source if possible. However, be aware that combining dataset reports can create a Cartesian join, which inflates the size of the combined dataset and results in slower performance.
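To see why a Cartesian join is so costly, here is a minimal sketch in plain Python (the row counts and names are made up for illustration): two datasets with no shared attribute degenerate into every row of one paired with every row of the other.

```python
from itertools import product

# Toy stand-ins for two dataset reports; sizes are illustrative only.
region_rows = [f"region_{i}" for i in range(10)]
product_rows = [f"product_{i}" for i in range(1000)]

# Joining datasets that share no common attribute degenerates into a
# Cartesian join: every row pairs with every row of the other dataset.
cartesian = list(product(region_rows, product_rows))

print(len(region_rows) + len(product_rows))  # 1010 rows held separately
print(len(cartesian))                        # 10000 rows after the join
```

A 10-row dataset combined with a 1,000-row dataset this way yields 10,000 rows, which the engine must then hold and process, which is why unrelated datasets should not be combined.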
- Remove unused datasets by:
- Consolidating them. When there are similarities across the different datasets, such as common attributes and metrics, it is recommended to consolidate these datasets into fewer units. This results in fewer dataset executions, potentially smaller virtual datasets, and better overall performance for the document execution.
- Leveraging view filters (note: this applies only when the unfiltered report already performs well). To avoid the negative performance impact of using multiple datasets, use a single dataset and create grids/graphs with different view filters to extract the data each component needs. Reducing the total number of dataset instances in this manner improves the data preparation time for your document.
- If you must use multiple datasets instead of a single one, understand how the joins between the datasets are resolved by activating and analyzing the MCE trace.
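The view-filter idea above can be sketched as follows (the dataset, attribute names, and helper function are hypothetical, purely to illustrate the pattern): one dataset execution feeds several components, each of which simply re-filters the rows in memory.

```python
# One in-memory dataset reused by several grid/graph components,
# each applying its own view filter; field names are made up.
dataset = [
    {"category": "Bikes", "region": "EMEA", "revenue": 120},
    {"category": "Bikes", "region": "APAC", "revenue": 80},
    {"category": "Helmets", "region": "EMEA", "revenue": 30},
]

def view_filter(rows, **criteria):
    """Return only the rows matching every attribute in criteria."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]

# Two components, one dataset execution: each grid re-filters in memory
# instead of triggering a second dataset execution.
emea_grid = view_filter(dataset, region="EMEA")
bikes_graph = view_filter(dataset, category="Bikes")

print(len(emea_grid), len(bikes_graph))  # 2 2
```

The point is that the expensive step, producing `dataset`, happens once, while each grid or graph only pays the cheap in-memory filtering cost.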
- Reduce the amount of data in each dataset by removing unused metrics and attributes. A dataset should have enough data to be useful as a rich source of analysis for many users, but it should not have extra data that is not needed on the dashboard. For example, do not include product item information when you want to display only product category information.
- Define dataset joins, or disable joins between datasets on unrelated attributes. We will discuss this in more detail in the next article of this blog.
- Switch all outer joins to inner joins where possible.
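To illustrate why inner joins are cheaper downstream, here is a minimal sketch (the tables and names are made up): an outer join keeps unmatched rows as NULL-padded results that every later step must carry, while an inner join drops them.

```python
# Hypothetical product and sales tables; contents are illustrative only.
products = [("P1", "Bikes"), ("P2", "Helmets"), ("P3", "Locks")]
sales = {"P1": 120, "P2": 45}  # P3 has no matching sales row

# Left outer join: every product appears; unmatched rows carry None,
# and all downstream steps must still process them.
outer = [(pid, name, sales.get(pid)) for pid, name in products]

# Inner join: only products with a matching sales row survive.
inner = [(pid, name, sales[pid]) for pid, name in products if pid in sales]

print(len(outer), len(inner))  # 3 2
```

When the NULL-padded rows carry no analytical value for the dashboard, the inner join produces a strictly smaller result set for the same information.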
- Turn on caching
- Use derived metrics and attributes only if necessary. Review all your derived metrics and attributes to see if there are any complex functions in them.
- Build simpler reports. The simpler the reports, the smaller the XML and the faster the overall response time.
- Avoid unnecessary Working Set (OLAP) reports. The following knowledge base article explains how to convert OLAP reports to regular reports: KB19829: How to convert an OLAP report to a standard report in MicroStrategy Developer 9.x and 10.x.
- Use only the required prompts. It is important to understand that each prompt in a report requires additional Intelligence Server resources, because Intelligence Server has a process, called the Resolution step, that matches all prompt answers in a report. Even though this process is fast and represents a very small percentage of the overall report execution time, in high-concurrency scenarios it can become a costly bottleneck.
- Refrain from using search inside prompts. The behind-the-scenes process to search for objects is very expensive for both the Intelligence Server and the data warehouse, which in turn impacts the report response time.
- Limit the use of custom groups. If the report comes from data import, you can group data using data wrangling instead (for best practices on data wrangling, refer to the Analyst Architect Methodology Document).
- Limit the use of smart metrics; they must be dynamically aggregated and subtotaled at all levels by the Analytical Engine. For documents with large datasets that require aggregation and subtotaling at multiple levels, this can significantly impact performance and overall Intelligence Server memory utilization.
- Keep aggregations and joins to a minimum. When a report executes, the more tables in the data warehouse it has to access, the longer the report takes to run. Therefore, keeping to a minimum the number of joins and aggregations that either the data warehouse or the Analytical Engine must perform will have a positive impact on performance.
- Use in-memory cubes as datasets.
- If the dashboard is based on an MTDI (Prime) cube, follow the best practices for documents and dossiers that can be found here.
- Datasets Sourced from Multiple Cubes vs. Single Cube as Dataset
When processing a document that contains several reports as datasets, each sourced from a different cube, Intelligence Server must go through an initial data processing/aggregation step that takes the data from each cube and brings it to the level of the dataset report. After all the datasets have been generated, Intelligence Server creates the virtual dataset. Finally, through a second data processing/aggregation step, it generates the data necessary to populate the different document components supported by those datasets.
These two initial steps can be skipped entirely by using a single cube as the dataset. Not all document designs support a single cube as the dataset, but whenever this is possible, it is a recommended approach to improve document execution performance. Performance improvements ranging from 50% to 100% have been observed when switching from datasets sourced from multiple cubes to a single cube as the dataset. Although this technique improves document performance, the time required to publish an Intelligent Cube increases as the cube grows in size. Use the table below as a guide to determine whether switching to a single Intelligent Cube as the dataset can improve performance in your environment:
- View Reports vs. Cubes as Datasets
Avoid using view reports; performance-wise, it is better to use cubes directly within the application. Use view reports only if the application needs to be prompted. You can break down the view reports based on the requirements or components of your dashboard to gain a performance benefit. While breaking down the view reports, be careful with any selectors. For example, if there is a Product selector for all the dossier components, then this attribute needs to be included in all the datasets used in the dossier. To check the performance, we recommend creating the view reports first and executing them to verify the response times. Only after successful verification should you start building the application.
Additionally, view reports with their prompts and filters can be reused; however, it is recommended that every application have its own dedicated dataset, making it easier to introduce changes without affecting other applications.
These are just basic tips to improve performance. If users are experiencing general performance problems, a deeper system checkup or training is recommended.
In the next blog article, we will cover application design optimization, data blending, and configuration tweaks that might help you improve the performance of your applications. So stay tuned!