Making sense of challenging data topics one step at a time.

Category: Incremental Refresh

Which Tool When – Parameters in Power BI

The word parameters is a little ambiguous when it comes to Power BI. I have written a number of articles that reference parameters, but never took the time to clarify their use cases. I wanted to dive in to the different types of parameters and understand the best way to use them. Paired with some tips and tricks, you will have a better idea of how to use all kinds of parameters in your reports.

Parameters in Power Query

In my opinion, parameters in Power Query are the most in your face. When you launch the Power Query editor, there is a button on the home ribbon that helps you manage them. But have you ever used them?

Parameters in Power Query allow you to add reusable values in your queries. The premise is simple – if you keep using the same value over and over again, why not make it easy on yourself?

How to Leverage It

To get started, open the Manage Parameters window in Power Query. From there, create a new parameter and give it a meaningful name. From there, you just need to specify the type of value, add your value, and click OK. Since I am creating a parameter for a reusable API key, I am just using text and adding in my key.

Creating a Parameter in Power Query

Now that I have the parameter in place, I can use it in an existing query. Since I was looking to easily reuse an API key, I can go and update my queries.

Leveraging my Parameter with Another Query

As an added bonus, I can share my full query without having to remember to hide my API key!

Common Use Cases

While the API key use case above is one of my favorite stories to tell, there are a few other key use cases.

One of the most famous use cases is when you need to create parameters for Incremental Refresh or Hybrid Tables. You need to create a RangeStart and RangeEnd parameters to setup the filtering on the report.

Another great use case is when you need to quickly change values to filter a report. In a past life, I worked for a company that had APIs for external reporting. Since customers always wanted similar reports, I created them with parameters so I could quickly update the API key and customer id. When a new customer wanted a custom report, I just updated the parameters to get their data in, made some adjustments, and sent the report out.

Tips and Tricks

Tip #1 – If you save your Power BI Desktop file (PBIX) as a template (PBIT), you will be prompted to enter the parameters when you use it. This helps save you a few clicks.

Tip #2 – All of your parameters can easily be updated in the Power BI Service without having to open the desktop file. Go to your data source settings and you will find the parameters section. This allows you to update your parameters like API keys with minimal effort.

Tip #3 – If you are using a deployment pipeline, you can easily swap between your Dev-Test-Prod databases. Just set parameters for your server and database fields in the advanced editor window and update them in the Power BI Service.

Function Parameters

Not as common of a use case, but functional parameters allows you to reuse a query quickly. I go into great detail on why to use functions with my article on how to use REST APIs in Power BI, but in short, they allow us to scale a base query rapidly.

How to Leverage It

One of my favorite endpoints pulls stock information by ticker symbol. I could build multiple queries and append them, but a function makes it easier. I just need to add a little code at the top to convert my base query to accept a parameter.

Parameters in a Power Query Function

I specified the parameter that I wanted to gather and then apply it in the query. When I go to use the function, I will be prompted to specify the parameter so the query works properly.

Common Use Cases

I frequently use function parameters with APIs. Because some API endpoints require dynamic values, you will need to iterate with a function like I did with the stock symbols.

Another common use case for me is building common elements I might use in a report. If I am running into queries that are using paging , I can grab a function and invoke it to build out my base. Another common use case is to build out a date table when one is not available in my data source. Of course these are things we want to do as far upstream as possible, but if you are not in control of your database, you might need to create your own.

Tips and Tricks

Tip #1 – Invoking a function as a column is an easy way to specify your parameters with existing column values. The function will run on each row of your query using the values in the columns identified. It is a quick and easy way to apply a function!

Tip #2 – You can edit a query and add a function by hand. Just make sure you specify your parameter in the function or it will not work. I use this all the time for token based authentication with APIs.

Tip #3 – If you are using a few different API endpoints, you can create a base function to handle the majority of your query. Just add a parameter to the function to specify the endpoint. From there, you can add your endpoint or table name in the parameter field and invoke the function which speeds up your connection time.

What If Parameters

What if parameters are different than the other two parameters we discussed. Power Query and function parameters really help us with querying data efficiently. What if parameters are all about what you could do with your data.

How to Leverage It

We often find what if parameters when you want to run some simulated scenarios. I recently did a deep dive on what if parameters, but at a high level, we have two components.

The first component is the creating the parameter itself. We will go to the modeling ribbon and select new parameter. A new window will open and I can specify key components of my parameter which include the name, data type, minimum, maximum, increment value, and default. Once I click OK, a new slicer will appear.

But that was the easy part. We now have a new measure that is <Parameter Name> Value. This brings back the value you selected in your parameter. We now can use this measure in a measure to get our value. You will need to use a little DAX, but it is worth the extra effort.

Common Use Cases

The most common use case is to simulate revenue increases. I add the parameter to a formula that takes their current revenue and shows the change. Everyone is trying to make more money, so it makes sense.

One of my favorite use cases is to help filter data. I built a report that compared property assessments in my town. I used a what if parameter to help filter 3,500 properties to only a handful. Using key values such as year built, livable square footage, and lot size, the parameters found similar houses to the one I selected. Instead of having to fool around with filters, I could just quickly adjust my report parameters instead.

Tips and Tricks

Tip #1 – Always have the parameter make the slicer when you can. You can hide it in the selection panel if you don’t want to use it, but the slider is so nice. It makes the mobile experience even better!

Tip #2 – If you are using a number of parameters, don’t be afraid to create a page for your parameters. You can just sync the slicers to other report pages. This prevents clutter and speeds up load time for a page sine there are less objects in the view.

Tip #3 – You can create your own what if parameters with a custom table. This is nice when you want to use a non-numeric parameter. This is a common practice for filtering reports on things like top/bottom views.

Parameters on Parameters

You might have gotten to this point and said “Wow – I had no clue that this term is so widely used!”. I think it is important to understand the different kinds of parameters because they really can make an impact on your reports. Some of my reports maybe only use one of these while others use all three. The key is understanding how to properly use them.

Have you used any of these parameters before? Do you find them helpful in your report design and data models? If so, tell me in the comments below!

Sample Data Sets You Can Use for Trying New Ideas

We all want to hone our skills, but sometimes struggle to find good sample data sets to try out new ideas. Sometimes it is about specific data structure or maybe you want to show off an idea but cannot use production data. A lot of what I share comes from real scenarios I have encountered. To share these tips, I have had to use different data sources over the years.

This week, I want to share some data sets that I find fun and helpful for trying ideas out in Power BI. Some will be easier to use than others. You might even need to have a SQL Server to make them work. Regardless, you should be able to find something you can use.

The Most Basic of All Sample Data Sets

If you are brand new to Power BI, the Contoso Financial Sample workbook is a great place to start. It is a free and easy to use data set for beginners. While it does not a great resource for data modeling, it does serve as a quick and easy model to learn the basics of Power BI.

When I was a Power BI trainer, I liked using this data set for basic DAX calculations as some of the key measures such as cost of goods sold (COGS) where included in the model. I can perform some basic calculations which result in net profit.

Check out the Contoso Financial Sample data set here.

Learn How to Find Insights

Another one of my favorite sample data sets that is easy to use is the Pima Indians Diabetes Database from Kaggle.com. Like some of you, I cringe typing out the name of this data set. If published more recently, it would have likely been given a more culturally sensitive name. However, I use this data set for demonstrating the key influencers visual. I have also used it for predictive modeling with Azure Machine Learning, but that is for another day.

This data set was assembled by the National Institute of Diabetes and Digestive and Kidney Diseases. The purpose of the data set was to perform predictive modeling on diabetes in the Pima community. Identified as a high risk population for Type 2 Diabetes, this data represents the Pima community through years of research. Kaggle provides this data set for free. You just need to sign up for an account to access it.

Kaggle is a great resource for other data sets. There are so many to choose from, it is hard to just pick one. However, you are welcome to peruse their catalogue as you might find something interesting. With a little searching, you will find a data set which you can use to build a report on Settlers of Catan!

Check out the Pima Indians Diabetes data set here.

Simplest of SQL Sample Data Sets

Adventure Works is likely the world’s best know SQL database.. A common data set used for training, it is easy to implement. Experience with SQL Server Management Studio will serve you well as you implement this data set. Microsoft provides clear instructions on restoring the database but I find a little extra know how helps. It is wise to make friends with a database administrator if you don’t have one. Offer to buy them a drink or two at happy hour for their help and you will probably make a new friend out of the experience.

Download the Adventure Works data set here.

Binge The Office While Building Reports

Fans of The Office rejoice! TDMitch created a Dunder Mifflin sales data set from the Northwind Traders data base by Microsoft. Just like Adventure Works, this is a SQL data set. Implementing this data set requires additional effort compared to the Adventure Works database. You must follow instructions and run a few SQL scripts to finalize the setup of this data set.

I recommend this data set for someone who is trying to make something that connects with end users. I also recommend this data set for people who are expanding their transact SQL knowledge.

Check out the Dunder Mifflin data set here.

Simplest of REST API Sample Data Sets

REST APIs are great resources for 3rd party data. They work well but you might find frustration with implementing them. I have used this data set before with my series on the Basics of REST APIs in Power BI. While each API endpoint is unique, you can capture the basics using the Yahoo Finance API.

Offered for free up to 100 calls per day, it is an effortless way to learn the basics with no costs. If you are really into stocks, you might even consider purchasing a paid subscription. Spend some time digging through the endpoints and become comfortable with how you can use APIs with Power BI.

You can review the Yahoo Finance API documentation here.

Big Data Sample Data Sets

Sometimes you want to throw a lot of data to test out a solution. The New York City Taxi data set is a massive trove of data that is free for use. Available as CSVs or APIs, you can choose how you want to access the data. I used it to benchmark refresh speeds between various Azure data sources such as blob, table, data lake, and Azure SQL storage solutions.

The Taxi and Limousine Commission provides quality documentation around the data set. It even provides clear descriptions in the data dictionary, maps of taxi zones, and dimension tables. It even explains the difference between yellow taxi, green taxi, and for hire car services.

Check out the NYC Taxi and Limousine Commission data mart here.

Did Not Find Something To Fit Your Needs?

No fear about that! There are tons of free data sources out there for you to use. My favorite place to go is to data.gov and check out different data sets available from the US Federal Government. You can also search for open data from many states and cities. You might even be able to use it for some of your solutions.

Google also has a data set search that will help you find some samples. Search for different topics such as commodities or labor statistics and see what comes back. My only caution is that not every result will be free. However, if you are looking for something specific, this search will help you find what you data you need.

How about you? What are some of your favorite sample data sets? If you have a good one or used one of these, tell me in the comments below!

What do I need to know about Hybrid Tables in Power BI?

I was so excited to see the release of Hybrid Tables in Preview with last month update of Power BI. But just like you, I was wondering if Hybrid Tables were for me. I did some investigating and came back with some answers.

Understanding Import versus DirectQuery

It seems like a simple topic, but it is important to understand the difference between the two data connectivity methods because of Hybrid Tables.

Selecting Import or DirectQuery has an impact on how you build your reports in Power BI

When you select Import for you connectivity method, the data is going to be imported into your Power BI report. When properly used, it will quickly render visualizations. You also have the ability to perform transformations on the data by adding columns. (Yes – best practice is to push that as far upstream as you can, but that is not always possible for citizen developers)

When you select DirectQuery, you will be connecting to your data source, but never adding your data to the report file. You will query it every time you need to render visualizations. It keeps the report file size down and eliminates the need to refresh your data. However, this can be slow and does not allow for any transformations.

Why is this review so important? The answer is simple – Hybrid Tables are built on top of Incremental Refresh which uses the Import data connectivity method. If you have never used Incremental Refresh before, Microsoft has provided documentation on how to enable it for your report.

Are Hybrid Tables different from Incremental Refresh?

Hybrid Tables are not different from Incremental Refresh. In fact, Hybrid Tables augment Incremental Refresh. To understand how they work together, let’s look at how the two compliment each other.

To make it easier, I want you to think about your data in three segments – historical, current, and real time.

Incremental Refresh

Incremental Refresh is the base of Hybrid Tables and handles the historical and current data. Incremental Refresh is setup to simplify how much data is imported every time the data model is refreshed. First you will determine how much data you want to bring into the model. In my example, I have set my model up to return the last ten years of data. Then I will select how much of that data will be changing and need to be refreshed. This is generally a shorter window of time. My example is set to refresh the last ninety days worth of data.

Looking at this image, you can see a bar chart that is showing you the breakdown of how your data will be managed in the refresh process. The last ninety days of will be incrementally refreshed – our current data. The remainder of the selected data will be archived – our historical data.

Incremental Refresh setup to bring in the last 10 years of data, but only refresh the last 90 days

If you have been using Incremental Refresh, this is how your data refreshes have been managed. The only problem with Incremental Refresh is that data is still missing. This excludes the most current data from the refresh and results.

Hybrid Tables to the Rescue!

Enabling Hybrid Tables brings you the most current data using DirectQuery. When you check the box, you will see a new segment called Real Time appear. At the same time, the “Only refresh complete day” is now checked and greyed out. What is going on?

Getting latest data in real time adds a section to the bar chart at the bottom and greyed out “Only refresh complete days”

Simply put, incremental refresh now will bring in all data before today. With this data loaded, DirectQuery will now bring in all new data that comes in today. This means the data does not need to be refreshed multiple times in the day. Now a single refresh loads the majority of the data model and any changes that come in today are handled in real time.

What else do I need to know?

If you are excited to try out Hybrid Tables, there are a few things you should know before jumping in.

Import becomes DirectQuerty

If you have been using Incremental Refresh and made transformations like adding columns, you will run into issues. We explained above how Import connections allow you to make those transformations. However Hybrid Tables use DirectQuery. You will need to play by those rules in order to use this feature.

Power BI Premium is a Must

Hybrid Tables can only be used with Power BI Premium. This can be Power BI Premium Per User (PPU) or Premium Capacity. If you already have Power BI Premium, you will be fine. If not, consider leveraging Power BI Premium Per User trial to see if Hybrid Tables work for you.

Refresh Costs

When using DirectQuery, you might be incurring additional costs whenever you view reports. If your data warehouse charges you per query, this could start to add up. You will need to monitor costs to make sure your queries are running efficiently and not costing more than forecasted.

So Should I use Hybrid Tables?

If you have a rapidly changing data set that you need to access in real time, you should try using it. Hybrid Tables work well and are easy to setup. If your data does not change that frequently, it might not be as valuable.

Have you tried using Hybrid Tables? Is there anything I am missing? What was your experience? Tell me in the comments below!

Powered by WordPress & Theme by Anders Norén