
Implement Power BI Composite Models with Aggregate feature


This article was originally published on LinkedIn Pulse by Tarun Agarwal, who leads the Data and AI Practice at Motifworks and shares his thoughts here on implementing Power BI Composite Models with the Aggregate feature.

A Power BI semantic model for large data sets (multiple TBs) typically requires some sort of data warehouse implementation. Data warehouses are needed to consolidate, cleanse, transform, and create a curated or certified copy of the data.

A data warehouse implementation gives Power BI a performance boost for analysis. However, many organizations struggle to choose the right query mode in Power BI.

Power BI provides two query modes when connecting to data sources: Import and DirectQuery.

Import mode caches the data in Power BI. All data is exported from the data sources, and the cached data is held in memory in a compressed format. This provides faster query performance and a higher level of concurrency. However, caching the data also means it must be refreshed before new data can be reflected in reports, and a larger data set can take a long time to refresh.

Further, unless you are using Power BI Premium, you are limited to a 1 GB data set size. A large data set in Import mode may not even fit.

Data sets are large mainly because of large amounts of detailed data. That size slows down summary query performance, and drill-through to detailed data also runs slower.

DirectQuery mode, on the other hand, queries the data directly from the source. The data is as current as the latest data warehouse refresh, and there is no separate refresh cycle. Large data sets can be accommodated without upgrading to Premium. However, aggregate queries take longer, so there are performance implications.

What is the right strategy?

The right strategy depends on data access patterns. Typically, most analytical reports show summary-level data, and self-service features like drill-down and slicing are also used on summary-level data, which is low in data volume.

Drill-through reports, by contrast, require a larger amount of data at a lower-level grain, but they are relatively low in usage volume.

Until now, the right Power BI strategy for large data sets depended on these access patterns: with many drill-through reports, DirectQuery mode might give the better experience; otherwise, Import mode would boost summary-level performance.

Introducing Power BI Composite Models with Aggregates

A recently added feature in Power BI enables us to use the best of both worlds: it allows connecting in both Import and DirectQuery mode within the same model. Power BI Composite Models provide this balancing act.

The aggregate feature supercharges performance. Summary information can be configured to use Import mode, giving fast performance for most report visuals, while detailed information in drill-through visuals can come via DirectQuery straight from the data warehouse.

As a result, the Power BI data set is smaller, and the scheduled refresh completes faster, allowing more frequent refresh cycles.
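As a sketch, the summary table imported into a composite model is often just an aggregate view in the warehouse, which is then mapped to the detail table via Power BI's "Manage aggregations" dialog. All table and column names below are hypothetical:

```sql
-- Hypothetical summary view over a detailed fact table.
-- This view is imported into Power BI as the aggregation table,
-- while dbo.FactSales itself stays in DirectQuery mode.
CREATE VIEW dbo.vSalesAgg
AS
SELECT
    DateKey,
    ProductKey,
    SUM(SalesAmount) AS SumSalesAmount,  -- maps to SUM of the detail column
    COUNT(*)         AS CountSalesRows   -- maps to COUNT of detail table rows
FROM dbo.FactSales
GROUP BY DateKey, ProductKey;
```

Because the view groups out the detail-level keys, its row count is a small fraction of the fact table's, which is what makes it practical to cache in Import mode.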

Additional best practices for performance improvement

A combination of a few other tuning techniques can further boost performance.

  • A page is only as fast as the slowest visual on it. Avoid placing visuals that query detail data on a summary page.
  • Power BI relationships should assume referential integrity so that inner joins are used as much as possible.
  • Implement a layered information stack to get fast Power BI performance against a large data set: run the highest level of summary data in Import mode using aggregates, build indexed or materialized views for intermediate-level aggregated data, and use DirectQuery against the fact tables for detailed data.
  • Separate the layered summarization onto different pages.
  • Use a clustered columnstore index and hash distribution for large tables in the data warehouse (Synapse Analytics).
  • Replicate small tables (Synapse Analytics).
  • Leverage data warehouse result set caching.
  • Ensure that data types match between aggregate and detail tables.
  • If using Power BI Premium and Azure Synapse Analytics for the data warehouse, co-locate both tenants in the same region.
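For the Synapse Analytics items above, a rough T-SQL sketch could look like the following; all object names are hypothetical, and the exact schema would depend on your warehouse:

```sql
-- Large fact table: hash-distributed with a clustered columnstore index.
CREATE TABLE dbo.FactSales
(
    SaleKey     BIGINT        NOT NULL,
    DateKey     INT           NOT NULL,
    ProductKey  INT           NOT NULL,
    SalesAmount DECIMAL(18,2) NOT NULL
)
WITH (DISTRIBUTION = HASH(SaleKey), CLUSTERED COLUMNSTORE INDEX);

-- Small dimension table: replicated to every compute node.
CREATE TABLE dbo.DimProduct
(
    ProductKey  INT          NOT NULL,
    ProductName NVARCHAR(50) NOT NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);

-- Intermediate-level aggregate as a materialized view
-- (Synapse requires COUNT_BIG(*) in the select list).
CREATE MATERIALIZED VIEW dbo.mvSalesByProductDate
WITH (DISTRIBUTION = HASH(ProductKey))
AS
SELECT ProductKey, DateKey,
       SUM(SalesAmount) AS SumSalesAmount,
       COUNT_BIG(*)     AS CountRows
FROM dbo.FactSales
GROUP BY ProductKey, DateKey;

-- Enable result set caching so repeated identical queries are served
-- from cache (run from the master database).
ALTER DATABASE MySynapseDW SET RESULT_SET_CACHING ON;
```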

Power BI allows creating enterprise-level large data sets. Working with a Power BI partner, organizations can deliver the performance needed for enterprise-wide adoption through the right data strategy and best practices.

Tarun Agarwal

Data and AI Practice Lead, Motifworks

Tarun has been focused on building digital and data platforms that provide the innovation needed to bridge today’s business realities to future opportunities.
