I was so excited to see the release of Hybrid Tables in Preview with last month update of Power BI. But just like you, I was wondering if Hybrid Tables were for me. I did some investigating and came back with some answers.
Understanding Import versus DirectQuery
It seems like a simple topic, but it is important to understand the difference between the two data connectivity methods because of Hybrid Tables.
When you select Import for you connectivity method, the data is going to be imported into your Power BI report. When properly used, it will quickly render visualizations. You also have the ability to perform transformations on the data by adding columns. (Yes – best practice is to push that as far upstream as you can, but that is not always possible for citizen developers)
When you select DirectQuery, you will be connecting to your data source, but never adding your data to the report file. You will query it every time you need to render visualizations. It keeps the report file size down and eliminates the need to refresh your data. However, this can be slow and does not allow for any transformations.
Why is this review so important? The answer is simple – Hybrid Tables are built on top of Incremental Refresh which uses the Import data connectivity method. If you have never used Incremental Refresh before, Microsoft has provided documentation on how to enable it for your report.
Are Hybrid Tables different from Incremental Refresh?
Hybrid Tables are not different from Incremental Refresh. In fact, Hybrid Tables augment Incremental Refresh. To understand how they work together, let’s look at how the two compliment each other.
To make it easier, I want you to think about your data in three segments – historical, current, and real time.
Incremental Refresh
Incremental Refresh is the base of Hybrid Tables and handles the historical and current data. Incremental Refresh is setup to simplify how much data is imported every time the data model is refreshed. First you will determine how much data you want to bring into the model. In my example, I have set my model up to return the last ten years of data. Then I will select how much of that data will be changing and need to be refreshed. This is generally a shorter window of time. My example is set to refresh the last ninety days worth of data.
Looking at this image, you can see a bar chart that is showing you the breakdown of how your data will be managed in the refresh process. The last ninety days of will be incrementally refreshed – our current data. The remainder of the selected data will be archived – our historical data.
If you have been using Incremental Refresh, this is how your data refreshes have been managed. The only problem with Incremental Refresh is that data is still missing. This excludes the most current data from the refresh and results.
Hybrid Tables to the Rescue!
Enabling Hybrid Tables brings you the most current data using DirectQuery. When you check the box, you will see a new segment called Real Time appear. At the same time, the “Only refresh complete day” is now checked and greyed out. What is going on?
Simply put, incremental refresh now will bring in all data before today. With this data loaded, DirectQuery will now bring in all new data that comes in today. This means the data does not need to be refreshed multiple times in the day. Now a single refresh loads the majority of the data model and any changes that come in today are handled in real time.
What else do I need to know?
If you are excited to try out Hybrid Tables, there are a few things you should know before jumping in.
Import becomes DirectQuerty
If you have been using Incremental Refresh and made transformations like adding columns, you will run into issues. We explained above how Import connections allow you to make those transformations. However Hybrid Tables use DirectQuery. You will need to play by those rules in order to use this feature.
Power BI Premium is a Must
Hybrid Tables can only be used with Power BI Premium. This can be Power BI Premium Per User (PPU) or Premium Capacity. If you already have Power BI Premium, you will be fine. If not, consider leveraging Power BI Premium Per User trial to see if Hybrid Tables work for you.
Refresh Costs
When using DirectQuery, you might be incurring additional costs whenever you view reports. If your data warehouse charges you per query, this could start to add up. You will need to monitor costs to make sure your queries are running efficiently and not costing more than forecasted.
So Should I use Hybrid Tables?
If you have a rapidly changing data set that you need to access in real time, you should try using it. Hybrid Tables work well and are easy to setup. If your data does not change that frequently, it might not be as valuable.
Have you tried using Hybrid Tables? Is there anything I am missing? What was your experience? Tell me in the comments below!
Leave a Reply
You must be logged in to post a comment.