Data with Dom

Making sense of challenging data topics one step at a time.

Cognitive Services in Power Query

You might have heard the announcement last week that Microsoft is deprecating the Azure Cognitive Services and Machine Learning functions in Power Query. For many, it was news that this functionality existed inside of Power Query! Others mocked its existence. For me, it was a favorite diamond in the rough within the Power Query editor.

As a reminder, not everyone has access to the same robust resources. While I would agree that this might not be the most efficient option, it meets a need. It is like choosing between a hammer and a pneumatic nail gun. If I am building a house, the powered option is far more efficient. But for a small project, the old-fashioned hammer makes more sense.

So, with the deprecation of these features in Power Query, what are we to do? What if you have no access to Azure to leverage cognitive services? What if you want to try other models instead? With a little bit of hard work, we can put together a solution that might resolve this for you and even open doors for others!

Why This Change with Cognitive Services Makes Sense

For some, there is justified anger. I get it – it can be frustrating. However, I do not disagree with this move. If you leverage cognitive services for a single refresh, it is not a big deal: we get the analysis we need and move on. But the problem arises when people refresh the data over and over again.

A common use case for this feature would be to analyze survey results. If you left an open text field at the end of a survey for comments, we could perform some analysis. As results roll in, people refresh the dataset. However, each refresh prompts cognitive services to re-assess entries that were processed in previous refreshes. This adds overhead to the analysis and adds demand on Azure cognitive services.

As a result, I believe this is the right choice. Instead of re-assessing an entire column on every refresh, the better pattern is to structure your refresh to only assess new records within the dataset. That being said, there are still use cases where this feature made life a lot easier. Hence the purpose of this article.
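
To make that incremental pattern concrete, here is a minimal pandas sketch. The Id and Comments columns are hypothetical stand-ins for whatever your survey export provides; the idea is that only rows that have not already been scored get sent to the model:

```python
import pandas as pd

# Hypothetical prior results and a fresh pull of survey responses
scored = pd.DataFrame({"Id": [1, 2], "label": ["POSITIVE", "NEGATIVE"]})
incoming = pd.DataFrame({"Id": [1, 2, 3],
                         "Comments": ["Love it", "Not great", "It's fine"]})

# Only unscored rows need to go through the (expensive) model call
new_rows = incoming[~incoming["Id"].isin(scored["Id"])]
print(new_rows["Id"].tolist())
```

Running the model against `new_rows` only, then appending the results to the previously scored set, keeps each refresh from re-processing old entries.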

Enter Python, Hugging Face, and KeyBERT

Another feature many people do not know about is the ability to run a Python script within Power Query. And with the rise of Artificial Intelligence, there are plenty of options out there for you to use! In doing some research, I landed on Hugging Face as an easy solution. If you choose to use another provider, that is up to you – the code we use can serve as a blueprint to make it work.

To prepare for this, you will need to install Python 3.9 or newer on your device. From there, you will need to install the following packages to enable the functionality we need:

  • Transformers (Hugging Face)
  • Matplotlib
  • Torch (for running analysis on your local machine)
  • KeyBERT

To install these packages, open a terminal window as an administrator and run the following commands:

pip install transformers
python -m pip install -U matplotlib
pip install torch
pip install keybert
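
If you want to confirm everything installed correctly before heading into Power Query, a quick sanity check like this can save a debugging session later (the package list mirrors the installs above):

```python
import importlib.util

def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

required = ["transformers", "matplotlib", "torch", "keybert"]
print("Missing packages:", missing_packages(required))
```

An empty list means you are ready to go; anything printed here needs another pass with pip.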

With these packages installed, we can replace the key text analysis tools from Azure cognitive services. Once we have some data inside of Power Query, we can transform the query by running a Python script under the transform ribbon:

Screenshot of Power Query with arrows directing you to the Transform tab and then the Run Python Script button to replace Azure Cognitive Services in Power Query.
Navigating to the Run Python Script button

From there, we can put whatever code in the window we want to replicate the Azure cognitive services experience.

Use Case #1 – Sentiment Analysis

A common need is to analyze the sentiment of a block of text. This is not easy, but Hugging Face makes it easier for us to accomplish this goal. In our scenario, we have a dataset that has a “Comments” column that needs to be analyzed. When we click the button for Python, we will use the following code:

# 'dataset' holds the input data for this script

import pandas as pd
from transformers import pipeline

# 'dataset' is automatically provided by Power BI
df = dataset.copy()

# Clean and prepare text input
text = df["Comments"].astype(str).tolist() # Update column name to match your dataset

# Load the model
sentiment_pipeline = pipeline("sentiment-analysis")

# Analyze sentiment for each line of text
results = sentiment_pipeline(text)

# Add the sentiment label and score back to the output table
dataset[['label','score']] = pd.DataFrame(results)

To update this code for your use, you just need to replace the column name found within the data frame under the clean and prepare text input section. You can also adjust the model you wish to use under the load the model section. Once you click OK, you can drill down on the table on the dataset line:

Drilling down on the sentiment analysis results from Hugging Face
Drilling down on Sentiment Analysis results

Once that is completed, you will see the sentiment label and score from the entries:

Screenshot of the sentiment label and score from Hugging Face in place of Azure cognitive services.
Sentiment label and score is now available

It is that easy! But the next challenge is getting keywords out of these comments.

Use Case #2 – Keyword Extraction

The next challenge is pulling keywords from a block of text. This is where KeyBERT comes in and assists. Just like we did for sentiment analysis, we will click the Run Python Script button and use this block of code:

# 'dataset' holds the input data for this script

import pandas as pd
from keybert import KeyBERT

# 'dataset' is automatically provided by Power BI
df = dataset.copy()

# Clean and prepare text input
texts = df["Comments"].astype(str).tolist() # Update column name to match your dataset

# Load the model
kw_model = KeyBERT(model='all-MiniLM-L6-v2')

# Extract keywords for each text
keyword_results = []
for text in texts:
    keywords = kw_model.extract_keywords(text, keyphrase_ngram_range=(1, 2), stop_words='english')
    keyword_phrases = [kw[0] for kw in keywords]  # Extract just the phrases
    keyword_results.append(", ".join(keyword_phrases))  # Join into a single string

# Add keywords back to the DataFrame
df["keywords"] = keyword_results

# Output for Power BI
results = df

As before, you can specify the column in your dataset that will be analyzed under clean and prepare text input. Once you click OK, the script will run and you can drill down on the results output:

Screenshot of the results from the Python script using KeyBERT.
Drill down on the results to see the keywords found

Now we can see the keywords that were extracted from the text:

Screenshot of the keyword results from KeyBERT in Power Query.
Keyword results from KeyBERT

From here, you can split out the words and use them as you deem appropriate!
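
If you prefer to do that split in the Python step itself rather than in Power Query, a small pandas sketch (assuming the comma-joined keywords column produced above) could look like this:

```python
import pandas as pd

# Sample of the comma-joined keywords column produced by the KeyBERT script
df = pd.DataFrame({"keywords": ["power query, sentiment analysis",
                                "keyword extraction"]})

# One row per keyword: split the joined string into a list, then explode
exploded = df.assign(keyword=df["keywords"].str.split(", ")).explode("keyword")
print(exploded["keyword"].tolist())
```

Each original row fans out into one row per keyword, which is handy if you want to count keyword frequency in your report.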

Cognitive Services versus Python

There are advantages and challenges with both solutions. But results are results, so let’s break it down together so you can make the right decisions.

The Good

First and foremost, I get the desired outcomes from both solutions. While it takes more work than with cognitive services, the use of Python gets the job done. In fact, in some ways it might be better. With Python, you get the label in addition to the score, whereas cognitive services only gives you the score. That is quite nice!

Second, this is a huge win for people who do not have access to a premium Power BI experience. To use cognitive services, you must have a Fabric capacity, a Power BI Premium capacity (if in a sovereign tenant), or a Premium Per User license. If you are an independent who uses Power BI Desktop for free, this is an easy way to get this functionality at no additional cost. It also means the query can be refreshed in the Power BI service in a Pro workspace. That is pretty cool!

The Bad

While it is not the most difficult process, you will need to install Python on your device. You might not be allowed to do that, or you might not feel comfortable doing it. You also might run into errors and need to work through them (pro tip – Copilot Chat is great for helping you with this). But I promise it is not that bad.

Next, you are running the models on your device. If your machine is not beefy enough to handle a large volume of data, it could really bog you down. However, you can work around this by limiting the rows of your table with a parameter and then publishing to the service when you are ready to refresh the whole query.

What About Performance?

If it is hard to run on my machine, what does true performance look like? Glad you asked! I ran both use cases with cognitive services and Python within the Power BI service. This allowed for an equal footing and provided a little control over the analysis.

Tests were run on the same query of 500 entries stored in a CSV file. This was to emulate the survey results use case with a tool like Microsoft Forms. I ran ten refreshes for each to verify performance.

Use Case            Cognitive Services                Python
Sentiment Analysis  Avg: 1:28, Min: 1:19, Max: 2:21   Avg: 1:20, Min: 0:49, Max: 2:25
Keyword Extraction  Avg: 0:37, Min: 0:21, Max: 1:42   Avg: 1:28, Min: 1:19, Max: 2:21

Initially, as I was running this on my device, I was worried about overall performance, but sentiment analysis turned out to be pretty similar overall. For the most part, my refreshes were close on timing, and Python even outperformed cognitive services by a slight bit.

However, keyword extraction proved to be not as performant, with refreshes taking roughly three times as long to complete. With a small dataset, I do not think this would be the end of the world. However, if I were to increase the number of records to 5,000, there could be an issue.

I think with some exploration, you could find models that extract keywords faster. Keep in mind, however, that this solution should not be applied to massive amounts of data, as it would not be the most efficient option. Handling this process upstream, as data is added to the source, would be a far better experience from a processing standpoint.

Conclusion

Like I said earlier, this is not a solution for every situation. But it does fill a need you might have. It is worth checking out if you are currently leveraging cognitive services or need a fix for your existing solution. If you have never tried using Python in Power Query, it might be an easy way for you to jump in! Let me know how it works for you in the comments below!

Should I Go Outside? Real Time Intelligence in Action – Part 1

Real-Time Intelligence in Microsoft Fabric is a transformative capability that empowers organizations to process and respond to data as it flows in. This enables faster, more informed decision-making. By combining streaming data ingestion, event processing, and analytics in a unified platform, Microsoft Fabric allows businesses to monitor operations, detect anomalies, and trigger actions in real time. This was powerful in a previous life as I worked for a manufacturer, but applies to other industries such as retail, finance, and healthcare. At its core, Real-Time Intelligence is about turning raw, fast-moving data into actionable insights without delay.

Despite its potential, I have found that many current demonstrations of Real-Time Intelligence fall short of showcasing its full capabilities. They often rely on abstract or overly simplified scenarios that do not reflect the complexity of real-world use cases. As a trainer, I totally understand the need to provide a solution that guides workshop attendees to success. But as a result, these demos tend to overlook the integration of artificial intelligence, which helps extract value from real-time data to make decisions. Without demonstrating how machine learning models can be embedded into these pipelines to make predictions or automate decisions, these demos miss a critical opportunity to show the true impact of what Real-Time Intelligence can do.

This multi-part series aims to bridge that gap by offering a hands-on, in-depth exploration of how to integrate real-time data with machine learning in Microsoft Fabric. We will walk through the end-to-end process, starting with setting up streaming data sources and moving on to training and deploying predictive models. When finished, we will operationalize those models in real-time workflows. Along the way, we address common challenges such as data latency, model accuracy, and some of the areas where we need to stitch elements of the solution together. Whether you are a data engineer, data scientist, or solution architect, this series will equip you with the tools and insights needed to build intelligent, responsive systems that deliver real value.

The Use Case for Real Time Intelligence

If you work from home like me, sometimes you end up trapped at your desk all day. This solution is designed to predict whether or not you should go outside to enjoy some fresh air. Based upon the temperature, humidity, and barometric pressure, you can predict whether or not it is a nice time to get outside.

To make this work, we will use a Raspberry Pi to serve as our weather station and collect data. That device will be connected to an Azure IoT Hub, which will collect data from our device and make it available inside of Real-Time Intelligence within Microsoft Fabric. The way the code is written, you can use just about any hardware you have laying around, but you might need to make some adjustments to get it to work with your hardware. I have commented the relevant details in the code so you can adjust it to suit your needs.

Hardware Requirements

To join in, you will need the following items. I am putting links to Amazon, but you are welcome to purchase these items from whatever retailer works best for you.

You may find your sensors in a kit that contains all three items. You may also already have variant sensors, such as the DHT-22 or BMP180, in your possession. With some adjustments to the sample code, you can integrate them into the solution and save some additional cost.

Also, if you are more of an Arduino fan, this same process will work for you. I just do not have any code available at this time, but I encourage you to try writing it out yourself.

Deploy an Azure IoT Hub

To create an Azure IoT Hub using the Free Tier, begin by signing into the Azure Portal. Once logged in, click on “Create a resource” from the left-hand menu, then search for and select “IoT Hub”. Click “Create” to begin the setup process. Under the “Basics” tab, choose your subscription and either select an existing resource group or create a new one. Choose a region close to your location, and enter a globally unique name for your IoT Hub, such as YourName_WeatherStation. Proceed to the “Networking” tab and leave the default settings for public access. In the “Management” tab, select the F1: Free Tier, which allows up to 8,000 messages per day. Leave the remaining settings as default, then click “Review + create.” After reviewing your configuration, click “Create” and wait for the deployment to complete. Once it is done, click “Go to resource” to open your new IoT Hub.

Screenshot of a deployed Azure IoT Hub for our Weather Station
Screenshot of our deployed IoT Hub

With your IoT Hub created, the next step is to add a device. In the IoT Hub resource page, look for the “IoT devices” option under the “Explorers” section in the left-hand menu and click on it. Then, click the “+ New” button at the top to add a new device. Provide a unique device ID, such as your city, and leave the authentication type set to “Symmetric key.” You can keep the auto-generated keys and other settings as they are. Click “Save” to create the device. Once the device is added, click on its name to open its details, and copy the Primary Connection String. This is how you will connect your physical or simulated device to the IoT Hub. At this point, your IoT Hub is ready to receive data.

Screenshot of device specific information from the Azure IoT Hub
Copy the Primary Connection String as we will need it shortly

With the Azure IoT Hub in place, it is time to set up our device!

Staging your Raspberry Pi

Once you have your hardware, it is time to get everything ready. We need to image our Raspberry Pi, attach sensors, and then install software. We will start with imaging your SD card.

Applying an Image to an SD Card

To image an SD card for a Raspberry Pi, begin by downloading the Raspberry Pi Imager. After installing the software, insert your SD card into your computer using an SD card reader. Launch the Raspberry Pi Imager and click “Choose OS” to select the operating system you want to install. Next, click “Choose Storage” and select your SD card from the list of available drives. Be sure to choose the correct drive, as the process will erase all existing data on the card.

Selecting your Raspberry Pi device, operating system, and storage for imaging your SD card.
Configuring your Raspberry Pi Imager

Before you start the imaging process, edit the custom settings for your image. This will allow you to assign a custom host name, user, and password for your device. Also confirm that SSH is enabled under the services tab. These small items will ensure you can access your device without a keyboard or monitor attached. If you prefer to do it that way, that is fine. But if you are like me, you probably have too much stuff on your desk already.

Customizing your Raspberry Pi WIFI, user, and password settings before imaging.
Customizing your image before writing

Once your selections are made, click yes with your customized settings to begin the imaging process. The Imager will download the OS image, write it to the SD card, and verify the image. This may take several minutes depending on your system and internet speed. When the process is complete, safely eject the SD card from your computer. You can now insert it into your Raspberry Pi, connect your peripherals, and power it on. The Raspberry Pi will boot into the newly installed operating system, ready for initial setup and use.

Connecting your Sensors

With your image in place, we want to connect our sensors next. If you are using a T-Cobbler, this should be pretty easy for you. However, if you forgo that, you will need to look up a pin reference for your device.

First, we want to connect our DHT-11 sensor:

Wiring diagram for the DHT-11 sensor to a T-Cobbler board
Wiring diagram for the DHT-11

Next, we will want to connect our BMP280 Pressure Sensor:

Connecting our BMP280 Pressure Sensor to our T-Cobbler
Wiring diagram for the BMP280

Lastly, we need to hook up our DS18B20 Temperature Sensor:

Connecting the DS18B20 Temperature Sensor to our T-Cobbler
Wiring Diagram for the DS18B20

If you decided to forgo the T-Cobbler, you can connect the VCC wire to a 3V3 pin. The DS18B20 sensor works with both 3.3v and 5v.

With our sensors hooked up, we can turn the power on to your Raspberry Pi.

Installing your Software

It may take a few minutes for your Raspberry Pi to come online. Eventually, you can connect to it with PuTTY using your hostname, user, and password. Once you are online, you can start running updates and software installs. The first thing we need to do is enable I2C:

sudo raspi-config

From here, go to interface options, then I2C, and enable. Next, we need to update a config file:

sudo nano /boot/config.txt

Scroll to the bottom of the file and add “dtoverlay=w1-gpio”. Then use CTRL+X, then Y, then Enter to save the changes. Once saved, reboot your device:

sudo reboot
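
If you would rather not edit config.txt by hand, the same change can be scripted so it only appends the overlay when it is missing. This sketch works on a scratch copy so it is safe to try anywhere; on the actual Pi you would point CONFIG at /boot/config.txt and run with sudo:

```shell
# Append the 1-Wire overlay only if it is not already present (idempotent)
CONFIG=$(mktemp)                      # on the Pi: CONFIG=/boot/config.txt
echo "# existing settings" > "$CONFIG"
grep -q '^dtoverlay=w1-gpio' "$CONFIG" || echo 'dtoverlay=w1-gpio' >> "$CONFIG"
grep -c 'dtoverlay=w1-gpio' "$CONFIG"
```

Because of the `grep -q` guard, running the script twice will not add a duplicate line.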

Once you are back online and signed in, we have a list of commands that need to be run in sequence:

sudo apt-get update
sudo apt-get -y upgrade 
sudo apt-get install -y git cmake build-essential curl libcurl4-openssl-dev libssl-dev uuid-dev
sudo apt-get install -y nodejs npm vim
sudo npm install -y -g npm
sudo npm cache clean -f
sudo npm install -y -g n
sudo n stable
sudo apt-get install -y python3-full
sudo python3 -m pip config set global.break-system-packages true
sudo pip3 install azure-iot-device  
sudo pip3 install azure-iot-hub
sudo apt-get install -y i2c-tools build-essential
sudo pip3 install adafruit-circuitpython-bmp280
git clone https://github.com/adafruit/Adafruit_Python_DHT.git
cd Adafruit_Python_DHT
sudo python setup.py install
sudo reboot

However, if you do not want to babysit the device, you can run the script I put together in my GitHub Repository for this project. You can move that file across with WinSCP. Once this is complete, it is time to add the python files to send our data to Fabric and Real Time Intelligence!

Deploying the Weather Station Script

There are two python files required to make this script work. The configWeatherStation.py file identifies the device name and connection string. All you have to do is update the values in the sample code below:

#!/usr/bin/python
#For use with the WeatherStation.py script

#Paste the connection string for your device here:
DeviceConnectionString = 'Connection String Here'

#Enter your Device ID here:
DeviceName = 'Device Name Here'

Once you have updated configWeatherStation.py, upload it with WeatherStation.py. You can then run the WeatherStation.py file with the following command:

python WeatherStation.py

Once this has started, it should print sensor data to the terminal, and you should see messages starting to accumulate on the Azure IoT Hub page.
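
For reference, the core loop of a script like WeatherStation.py boils down to something like the sketch below. The sensor reads here are simulated stand-ins (the real script reads the DHT-11, BMP280, and DS18B20), and the azure-iot-device client is only created when the script is actually run:

```python
import json
import random
import time

def read_sensors():
    # Simulated stand-ins; replace with real DHT-11 / BMP280 / DS18B20 reads
    return {
        "temperature": round(random.uniform(15.0, 30.0), 1),
        "humidity": round(random.uniform(30.0, 70.0), 1),
        "pressure": round(random.uniform(990.0, 1030.0), 1),
    }

def build_message(device_name, readings):
    """Package one set of readings as the JSON payload sent to IoT Hub."""
    return json.dumps({"deviceId": device_name, **readings})

if __name__ == "__main__":
    from azure.iot.device import IoTHubDeviceClient, Message
    from configWeatherStation import DeviceConnectionString, DeviceName

    client = IoTHubDeviceClient.create_from_connection_string(DeviceConnectionString)
    while True:
        payload = build_message(DeviceName, read_sensors())
        print("Sending:", payload)
        client.send_message(Message(payload))
        time.sleep(60)  # one reading per minute stays well under the F1 daily quota
```

The one-minute sleep is a deliberate choice: the F1 Free Tier allows 8,000 messages per day, and one message per minute works out to 1,440.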

Next Steps

Well, that was not very data centric. I know, it is a little annoying to some degree. However, I have found that having a physical device makes it easy to demonstrate capabilities. Real Time Intelligence sounds good, but until you see data being manipulated in real time, it feels fictitious. I saw this in action in a previous life and hope this will help you as well!

On a personal note, I am already enjoying this series as it is quite nostalgic. I did this demonstration about eight years ago with stream analytics and Azure ML Studio. The difference was I got to do it with a good friend and colleague, Greg Hicks. Unfortunately, he passed away last year. While it would be more fun to do this project with him, it does bring back some great memories. I only hope you enjoy following this series as I am while walking down memory lane.

Extending Power BI with Power Automate

While Power BI has always been a part of the Power Platform, it never feels that way. So often the focus has been integrating Power BI into Power Apps. But what about Power Automate? Is there a way for us to integrate some automation with our Power BI content?

The short answer is yes! There are a lot of components to consider with your solution. Spend a little time brainstorming and you will discover a world of possibilities! The trick is to consider what value that automation can bring with just a little innovation!

What is Power Automate?

Power Automate is the automation pillar within the Microsoft Power Platform. Focused on repetitive and menial tasks, Power Automate is designed to free your time to focus on more value-add activities. You create “flows” that start with a trigger and then have subsequent actions that are executed within the context of the flow.

To create a flow, you can navigate to Power Automate and sign in. From there, you can start typing into Copilot to tell it what you want to do. However, you might want to just explore some ideas. For that, you can go to the templates section in the left-hand navigation, or just click Create to start from scratch!

Screenshot of the Power Automate home page with Copilot in the center and the navigation bar on the left for templates and creating solutions from scratch.
Screenshot of the Power Automate web interface

To help you with your journey, let’s take a moment to understand what triggers and actions are.

Flow Triggers

When you create a new flow, you must initiate it with a trigger. There are three primary types of triggers:

  1. Automated Cloud Flows
  2. Instant Cloud Flows
  3. Scheduled Cloud Flows
Screenshot of how to create a flow from scratch in Power Automate
Screenshot of how to create a flow from scratch

While there are some other options on the create screen, we will focus on these three. The difference between the triggers is pretty simple. Automated cloud flows initiate when something happens, like a new file being created or an email arriving. Scheduled flows happen on a particular cadence, just like a scheduled refresh in Power BI. Instant flows are initiated with the click of a button.

Flow Actions

Now that you have created a flow and assigned a trigger, you must assign at least one action. Actions are essentially automation steps that can be taken within the flow. You can click the plus sign under your trigger to add an action:

Screenshot of adding an action inside of a flow with Power Automate
Adding an action in a flow

On the left-hand side of the editing screen, you can search for different actions or applications. You can select an application to see what actions are available. A great option to look at is Microsoft Teams. Just a few examples: you can create a message or chat, schedule a Teams meeting, or post an adaptive card and await a response:

Screenshot of actions available for Microsoft Teams within the editing window of Power Automate
List of actions that can be taken with Microsoft Teams

One other benefit within Power Automate is that we can call data from the initial trigger and previous actions. In our example, this flow is being manually initiated with a button from the Power Automate app. By initiating the flow in the mobile app, we are able to capture some location data. I can call that data dynamically within my flow:

Screenshot showing the list of location data elements available from the flow trigger within an action for Power Automate
Calling dynamic data from the flow trigger in an action

At this point, I hope you are already thinking about what actions you could take with Power BI inside of Power Automate!

What about this “Premium” label?

This always comes up when I talk about anything with the Power Platform. What does the premium label mean? In short, it means you need to pay for some additional licensing to use this action or trigger.

When you look at the actions and triggers available to you, there are generally three types. Built-in items are always available. Standard items are available assuming you have the application licensed and enabled for the user. This generally applies to the entire Microsoft 365 stack in addition to some third-party SaaS solutions. Examples include Google Drive, Gmail, Dropbox, Box.com, and others of that ilk.

The last group, premium items, require you to pay for some form of premium licensing. This includes SQL connectors, Azure DevOps, Dataverse, AWS services, and others. If you want to use data out of your SQL server, you need to have a premium connector.

Or do you?

Power BI Actions in Power Automate

If you know me, you know that I like doing things as efficiently as possible. I hate watching my customers pay for licensing they do not need. I also do not condone the circumvention of licensing and recommend you invest to do things the right way every time. With that said, sometimes we can get creative with our solutions to obtain a goal.

This is why I feel getting a plan together early helps. If you understand the actions and touch points you are looking to use, you might be able to save a few dollars here and there. Not every organization has funds available to achieve goals. However, a little strategy could get you there quickly!

Querying your Semantic Model

If you take a look at the actions you have available to you for Power BI, you will note that there are several options. One of those options is to query a Power BI dataset. Yes, it should say semantic model instead, but Power Automate has not caught up yet. If you leverage this action, you can write a query in DAX and run it against your model. Below is a sample query I have that I can use with my Van Arsdel Sales model:

DEFINE
	VAR _SalesSummary = 
		SUMMARIZECOLUMNS(
			'bi_manufacturer'[Manufacturer], 
			'bi_product'[Segment],
			"Quantity", 'ReportMeasures'[Quantity],
			"Revenue", 'ReportMeasures'[Revenue]
		)

EVALUATE
	_SalesSummary

ORDER BY
	[Revenue] DESC, 
		'bi_manufacturer'[Manufacturer], 
		'bi_product'[Segment]

This is a simple query that I wrote, but I can now apply it against my model:

Screenshot of the Run a query against a dataset action being customized to use a DAX Query inside of the Power Automate editor.
DAX Query from above to be run against a dataset

Now think about this for a minute – I just used Power Automate to query data that is being refreshed from my SQL Server! Am I violating the licensing terms? No! Instead, I am leveraging the existing investment I have made in Power BI licensing. That is pretty cool!

Now, this might not always work for everyone. Real time data would still need to connect to a SQL server directly. However, if that is not a requirement, you might be able to avoid buying additional licenses by thinking through how you connect to your data in Power Automate.
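
To my understanding, this Power Automate action is backed by the Power BI REST API's executeQueries endpoint, so the same idea is reachable outside of Power Automate too. Here is a minimal sketch of building that request body; the endpoint URL pattern is from the Power BI REST API, and the dataset ID and query are placeholders:

```python
import json

# Power BI REST endpoint pattern; substitute your real dataset ID
POWER_BI_URL = "https://api.powerbi.com/v1.0/myorg/datasets/{dataset_id}/executeQueries"

def build_executequeries_body(dax_query):
    """Build the JSON body for the Power BI executeQueries REST endpoint."""
    return {
        "queries": [{"query": dax_query}],
        "serializerSettings": {"includeNulls": True},
    }

body = build_executequeries_body("EVALUATE VALUES('bi_product'[Segment])")
print(json.dumps(body, indent=2))
```

You would POST this body with an Azure AD bearer token; Power Automate handles that authentication for you, which is a big part of its appeal here.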

But what else could I do?

Export Reports to a File

I can already hear the groans on this one. However, hear me out. Many organizations would like a way to share reports and dashboards with customers. Exporting this data to a file does take away the interactive piece. But is there a valid use case for this? There might be!

The first thing I can do is identify the report I want to export to a file. You select the workspace and report that will be exported and define the parameters for the job. You can select the file format: PDF, PNG, or PowerPoint. In addition, you can provide a filter like I did on the manufacturer. I am not going to bore you with the filtering details, but you can reference this Microsoft Learn article for more information. I also suppressed my hidden pages for the time being:

Screenshot of the Export to File for Power BI Reports action being customized within the Power Automate editor.
Exporting a Power BI Report to a File

Now that I have an action exporting the data, I can attach it to an outbound email. For that, I just need to place the dynamic content from the export into the attachment field and I am good to go:

Customizing the Send an email (V2) action for Outlook to send an exported report in the Power Automate editor.
Attaching the exported report to an email

Now I know some of you are already thinking about ways to reduce your Power BI or Fabric capacity licensing. Remember, you are leveraging existing licensing. Unfortunately, this one still requires you to maintain capacity licensing on the Power BI side. But why would I do this instead of sharing the report directly with others or embedding it into a web portal?

What I like about this approach is that you could put together a list of customers and contacts and simply send them these reports. If you are not a web developer or not comfortable with the idea of external sharing, you could set this up pretty quickly. As long as your report is within a workspace that has a capacity assigned, you could do this on your own with just a few clicks! Pretty cool!

Any Other Interesting Actions?

Of course! For me, I think the dataset refresh and add a row to a dataset actions are the most helpful. They have specific uses and are not for everyone. But they have served me well in the past for certain edge cases or challenges that I am trying to solve.

In addition to those, there is a full complement of actions and triggers for goals. I find that not a lot of organizations are leveraging goals, so they ignore these actions, especially since they are officially in “preview” status as of the writing of this article. However, you might find a solid use case for them.

Power Automate Inside Power BI

Alright – we finally arrived at the moment you have been waiting for. How do I leverage Power Automate inside of Power BI? Well, it is quite simple. Just start by adding the Power Automate visualization to your report canvas like so:

Screenshot of adding the Power Automate visual to the canvas using the Visualization Pane inside of the Power BI Desktop interface.
Power Automate visual inside of the Power BI Report canvas

From there, the steps are spelled out for you. Make sure you have the right environment selected, then start adding data. Once that is complete, you can click on the three dots and edit your flow:

Screenshot of editing the Power Automate flow to take the Power BI data, convert it to a CSV table, and then create a CSV file in my OneDrive with the transformed data within the Power Automate classic editor.
Editing the flow

Now there are a few things to note. First, you have to use the classic editor for this. I do not have a solid answer as to why, but it is just the reality of the situation. While the interface is not as nice, the same principles apply.

Next, you will note that I have a data operation step. This simply extracts the data I care about and strips away the additional metadata from the JSON body that comes in. I am then able to take that body and do whatever I want with it. As a simple example, I am exporting it to a CSV file inside of my OneDrive.
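To make the shape of that data operation concrete, here is a small Python sketch of the same idea: take the incoming JSON body, keep only the data rows, and render them as CSV text. The payload shape (`entity.rows`) and the field names are hypothetical stand-ins for what the real trigger body delivers:

```python
import csv
import io
import json

# Hypothetical payload shape; the real trigger body from the Power BI
# visual nests the rows under its own metadata, which we strip away.
payload = json.loads("""
{
  "entity": {
    "rows": [
      {"Manufacturer": "Contoso", "Units": 120},
      {"Manufacturer": "Fabrikam", "Units": 85}
    ]
  }
}
""")

def rows_to_csv(body: dict) -> str:
    """Keep only the data rows and render them as CSV text."""
    rows = body["entity"]["rows"]          # drop the surrounding metadata
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv(payload)
print(csv_text)
```

In the flow itself, the resulting text is simply handed to the "Create file" action for OneDrive.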

Once I am finished customizing the flow, I can save and apply. And while I like the Power Automate blue, it does not match my theme. So, I make a quick adjustment to the fill color so it matches. I also update the button text to identify what will happen when the button is clicked. This way everyone knows what will happen when they press it:

Screenshot of the Power Automate visualization on the report canvas being updated with the proper matching theme color and button text stating "Run Export in SAP Job".
Updating the button to match the theme

Once again, I am hearing some grumbling from people reading this article. Yes, I am with you that exporting to Excel or CSV from a Power BI report is not what we want our consumers to do. However, there are valid use cases for this type of process.

I had a customer recently that would export data from a legacy system into an Excel file so it could be ingested into their ERP solution. Now I think we would both agree, if you are using Power Automate, why not write a real integration? 100% agree with you on that. However, remember what I said about leveraging your existing investments?

This customer created this custom ingestion method almost ten years ago. And while it is probably time for an upgrade, someone was able to build this solution in about three hours. The time savings to the user was minimal – the old process might have taken ten minutes once every few days – and likely not enough to justify rebuilding it as a proper integration. So, while it was ugly, it got the job done while providing a quick return on investment.

Conclusion

These are only some small ideas of what you can accomplish with Power Automate within Power BI. You are only limited by your own imagination. Spend some time thinking about what you might be able to do with this solution as this could really extend the value you are creating within Power BI!

Also, if you are looking to learn more about how to use Power Automate, check out my friend Micheal Heath’s Get Automating YouTube channel. You can also check out more resources from Microsoft Learn as you start your automation journey!

Copilot Reports with Microsoft Fabric – Part 3

We have been on a journey the past two weeks to deploy Copilot Reports with the new user activity reporting endpoint. While the journey has taken some time, we have arrived at the final step: deploying our reports! You might have already performed some of this, but I developed a template to make your life easier.

At the end of this article, you will have a report deployed to help you track week over week utilization of Copilot for Microsoft 365 within your organization. As a bonus, we will implement a few other reports to help you along the way on your adoption journey.

By the way, if you have not deployed Copilot for Microsoft 365 yet, you still want to deploy this report. There is a special page that identifies potential top users of Microsoft 365 within your organization. They are generally primed to participate in a Copilot for Microsoft 365 pilot program and could potentially serve as champions for your organization.

First Things First – I Made an Error

If you have been following this series, I discovered an error in our dataflow code. I resolved this error on November 16th, 2024 at 9:30 AM Eastern Time (New York City time) and updated my GitHub repository. If you were following this series before that correction was made, you will need to edit the CopilotData_Raw table. You can open the advanced editor and replace it with the following code:

let
  GetCopilotData = (Params) =>
    let
      Path = if Text.Length(Params) > 0 then Params else "$top=1000", 
      Source = Json.Document(
        Web.Contents(
          "https://graph.microsoft.com/beta/reports/getMicrosoft365CopilotUsageUserDetail(period='D30')", 
          [RelativePath = "?" & Path, Headers = [Authorization = "Bearer " & #"Get-AccessToken"()]]
        )
      ), 
      NextList = @Source[value], 
      result = try
        @NextList & @GetCopilotData(Text.AfterDelimiter(Source[#"@odata.nextLink"], "?"))
      otherwise
        @NextList
    in
      result, 
  CopilotData = GetCopilotData(""), 
  #"Converted to table" = Table.FromList(
    CopilotData, 
    Splitter.SplitByNothing(), 
    null, 
    null, 
    ExtraValues.Error
  ), 
  #"Expanded Column1" = Table.ExpandRecordColumn(
    #"Converted to table", 
    "Column1", 
    {
      "reportRefreshDate", 
      "userPrincipalName", 
      "displayName", 
      "lastActivityDate", 
      "copilotChatLastActivityDate", 
      "microsoftTeamsCopilotLastActivityDate", 
      "wordCopilotLastActivityDate", 
      "excelCopilotLastActivityDate", 
      "powerPointCopilotLastActivityDate", 
      "outlookCopilotLastActivityDate", 
      "oneNoteCopilotLastActivityDate", 
      "loopCopilotLastActivityDate", 
      "copilotActivityUserDetailsByPeriod"
    }, 
    {
      "reportRefreshDate", 
      "userPrincipalName", 
      "displayName", 
      "lastActivityDate", 
      "copilotChatLastActivityDate", 
      "microsoftTeamsCopilotLastActivityDate", 
      "wordCopilotLastActivityDate", 
      "excelCopilotLastActivityDate", 
      "powerPointCopilotLastActivityDate", 
      "outlookCopilotLastActivityDate", 
      "oneNoteCopilotLastActivityDate", 
      "loopCopilotLastActivityDate", 
      "copilotActivityUserDetailsByPeriod"
    }
  ), 
  #"Removed columns" = Table.RemoveColumns(
    #"Expanded Column1", 
    {"displayName", "copilotActivityUserDetailsByPeriod"}
  ), 
  #"Renamed columns" = Table.RenameColumns(
    #"Removed columns", 
    {
      {"lastActivityDate", "Last Activity Date"}, 
      {"copilotChatLastActivityDate", "Copilot Chat"}, 
      {"microsoftTeamsCopilotLastActivityDate", "Copilot for Teams"}, 
      {"wordCopilotLastActivityDate", "Copilot for Word"}, 
      {"excelCopilotLastActivityDate", "Copilot for Excel"}, 
      {"powerPointCopilotLastActivityDate", "Copilot for PowerPoint"}, 
      {"outlookCopilotLastActivityDate", "Copilot for Outlook"}, 
      {"oneNoteCopilotLastActivityDate", "Copilot for OneNote"}, 
      {"loopCopilotLastActivityDate", "Copilot for Loop"}
    }
  )
in
  #"Renamed columns"

In short, I missed the paging requirement for this endpoint. I assumed that this endpoint worked like the other Graph reports endpoints. But then again, we all know what happens when we assume! Regardless, with this small change in place, you are ready to collect data for your entire organization.
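If you want to see the paging pattern outside of M, here is a minimal Python sketch of the same idea: keep requesting pages and following @odata.nextLink until the endpoint stops returning one. The `fetch` callable and the fake pages stand in for Web.Contents and Json.Document:

```python
def get_all_pages(fetch, first_url):
    """Follow @odata.nextLink until the endpoint stops returning one,
    mirroring the recursive paging in the M query above."""
    rows, url = [], first_url
    while url is not None:
        page = fetch(url)                      # one Graph call per page
        rows.extend(page["value"])
        url = page.get("@odata.nextLink")      # absent on the last page
    return rows

# Stand-in for the real HTTP calls: two fake pages of results.
fake_pages = {
    "page1": {"value": [1, 2], "@odata.nextLink": "page2"},
    "page2": {"value": [3]},
}
print(get_all_pages(fake_pages.__getitem__, "page1"))  # [1, 2, 3]
```

The M version does the same thing recursively with `@GetCopilotData`, passing along only the query string portion of the next link.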

If you are not up to speed, go back to my first post in this series and start collecting data to build your Copilot reports.

Deploying the Copilot Reports Template

To get started, go to my GitHub repository to download the Copilot Utilization Report template. It is a pbit file that we will connect to the semantic model built off your Lakehouse. Once you have downloaded the template, you can open it. You will immediately receive an “error”, but that is simply because you need to connect to your semantic model:

Screenshot of connecting our Copilot Reports to our semantic model attached to our Lakehouse.
Connecting to our semantic model associated with our Lakehouse.

With our template activated, we can save and deploy the report by publishing it to Power BI! Before we get too far, let’s do a quick tour of the report!

Copilot Reports – High Level Overview

The template comes with nine pages. Three are focused on Copilot activity, four are focused on collaboration behaviors within Microsoft 365, and one is the Collaboration Score page. There is also a report template page that is available for you to create additional custom reports.

Each page has a slicer panel to filter based upon your user information from Entra ID. With the caveat that your user information must be accurate, you can easily filter to see how users leverage tools within Microsoft 365. You can filter by department, company, office location, or even job title. You can also filter based upon the licenses assigned to users!

In addition to slicers, each page has a help button that, when activated, provides an overlay to help you understand the report visualizations. This will help you and any future consumers understand how the report is interpreted and how values are calculated. Because of this, I will not go into great detail, but will give you a high-level overview of the pages themselves.

How to Use the Copilot Reports

There are three pages dedicated to Copilot utilization. The first page, Copilot Activity, dives into the utilization of Copilot over the past week. It also allows you to trend utilization over time. Remember, we can only capture the date when someone last used Copilot. Due to inconsistencies in how Microsoft releases the data, it is best to compare week over week, smoothing the results out over a week instead of looking at individual days.

The Active Copilot Users page allows us to quickly identify which users used Copilot over the past week. In addition, it shows how many applications it was leveraged within. For example, if you used Copilot for Microsoft 365 within Microsoft Teams, Word, and PowerPoint, you would have an app count of three. This can help you find users for a focus group to learn more about use cases.

The Inactive Copilot Users page allows you to identify who is not using their assigned license. During a pilot, we want to make sure that everyone is leveraging Copilot. This report helps us track and activate dormant licenses. It also helps us find users who have used Copilot for Microsoft 365 in the past but have not used it in over two weeks. This might prompt you to reassign licenses to users who have more time to spend exploring Copilot.

While not a perfect science, these three reports will help you drive a quality pilot program. Beyond your pilot, it will help you drive broader adoption across the organization as well.

How to Use the Activity Analysis Reports

Why do I care about utilization of Microsoft 365 tools? Remember, Copilot for Microsoft 365 is using data from your tenant. The better you use the platform, the better your experience with Copilot for Microsoft 365 will be!

Each page is dedicated to the core utilization metrics available from Microsoft Graph. While these are high level metrics, it is enough to understand general trends over the past thirty days. Our goal is to see balance between applications and drivers of quality collaboration.

How to Use the Collaboration Score Report

There is a difference between using software and adopting software. Even though Microsoft Teams replaced Skype for Business as the standard in 2020, it is still not well adopted in many organizations. The concept of the Collaboration Score is to see how well people are using their collaboration tools. But how does it work?

The Collaboration Score focuses on behaviors that lead to sharing knowledge. You could have people who work in their OneDrive and private Teams Chats all day, but they are siloed. Even if they send 500 messages a day, they will have a low score. The reason? They do not collaborate with others in shared spaces!

The better users leverage centralized collaboration repositories, the broader the corpus of knowledge will be. In addition, when new people join your organization, they will be able to draw upon that knowledge right away with Copilot. If it is siloed, it might take a while for that knowledge to be shared through individual collaboration tools like private chats and OneDrive.
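To make the idea concrete, here is a toy Python sketch that weights shared-space behaviors heavily and siloed ones barely at all. The metric names and weights are purely illustrative – the real Collaboration Score formula is not published here:

```python
def collaboration_score(activity: dict) -> float:
    """Toy scoring: shared-space behaviors earn weight, siloed ones little.
    Weights are illustrative assumptions, not the real formula."""
    weights = {
        "channel_messages": 3.0,      # shared Teams channels
        "files_in_sharepoint": 2.0,   # shared document libraries
        "viva_engage_posts": 2.0,     # open community posts
        "private_chats": 0.05,        # siloed: barely moves the score
        "onedrive_edits": 0.05,       # siloed: personal storage
    }
    raw = sum(weights[k] * v for k, v in activity.items())
    return min(100.0, raw)            # cap the score at 100

siloed = {"private_chats": 500, "onedrive_edits": 40, "channel_messages": 0,
          "files_in_sharepoint": 0, "viva_engage_posts": 0}
shared = {"private_chats": 20, "onedrive_edits": 5, "channel_messages": 15,
          "files_in_sharepoint": 10, "viva_engage_posts": 4}
print(collaboration_score(siloed), collaboration_score(shared))  # 27.0 74.25
```

Note how the siloed user scores far lower despite sending 500 private messages a day, which is exactly the behavior the score is designed to surface.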

What the Heck is a Collaboration Score?

I credit the Collaboration Score to my friend and colleague Kristen Hubler. She recently wrote a great article on using versus adopting software. We collaborated to find a way to identify top users of Microsoft 365. What started as a hunch has proven out time and again when it came to Copilot.

The concept was simple – the more diverse a user’s Microsoft 365 utilization, the better their Copilot experience. The higher the Collaboration Score, the higher their self-reported use case count and time savings. In fact, my favorite thing to do is temper “Copilot sucks” feedback with Collaboration Scores. If a user says that with a Collaboration Score of 10, it is likely because they do not use their tools well. However, if they collaborate well, there is a better corpus of knowledge for Copilot to draw upon.

In addition to the data they will have access to, it shows a user’s propensity to adopt new technology. It has been four years since Microsoft Teams became the official standard for collaboration. If some of your users have not adopted these tools, they might struggle with Copilot.

What does this all mean? You might need to work on your adoption of Microsoft 365 in addition to Copilot. However, that can be part of your overall rollout of Copilot for Microsoft 365 as you go forward.

Next Steps

With these reports in place, you can now track utilization of Copilot for Microsoft 365 to drive adoption. If your adoption is low, start working with your organizational change management team to get the most out of your investment. Technology professionals often do not consider the behavior changes a tool like Copilot requires. However, having the data can help shape that roadmap.

After you have deployed this report, drop some comments below on how it is going! I want to know what you discovered with your data and some changes you might make! Also, please do not hesitate to provide feedback as I will likely be releasing new versions of this report template over time.

Copilot Reports with Microsoft Fabric – Part 2

Copilot Reports have been a challenge for a while, but Microsoft Fabric has made life so much easier! In last week’s article, we went through the process to start harvesting data from our endpoints. This week is about getting the model in place so we can deploy an easy-to-use report to track utilization trends.

While this process is a little tedious, this should make your life much easier. With a report that is already designed with the key metrics and outcomes built for you, it should only take minutes for you to find some key insights. Just make sure you pay attention to the details and everything should work out smoothly!

Add Tables to your Semantic Model

The first step is adding tables to your default semantic model. This is a critical step to ensure you have the right data available to you. We will start by going to your workspace and opening the SQL analytics endpoint tied to your Lakehouse:

Screenshot highlighting the SQL analytics endpoint for your Lakehouse
SQL analytics endpoint within your Fabric Workspace

Once opened, we can go to the Modeling tab and manage your default semantic model. When the new window appears, select the tables holding your data within the Lakehouse and click confirm:

Screenshot of the steps to add tables to your default semantic model within Microsoft Fabric
Adding tables to your Semantic Model

Now that we have your tables in place, it is time to set up some relationships to enable our reports.

Building Your Relationships

Now that we have our tables selected, we need to connect them together. If you have been using Power BI already, you know how this works. In our situation, we will need to drag our tables to the model layout so we can use it for our Copilot Reports:

Screenshot of the model layout for Copilot Reports within Microsoft Fabric
Tables added to our model layout

You will note that I added all tables except one. The collaboration_levels table was intentionally omitted because it will not have any relationships. This table is purely for labeling and will not directly interact with the tables displayed in this layout.

Connecting the Date Dimension for Copilot Reports

To get started, we will connect our date_table to our different activity tables. To make this simple, all of our relationships use the same column name; we just need to connect the dots. We will connect date_table[date] to the following columns with a one-to-many relationship:

  • onedrive_user_activity[Report_Refresh_Date]
  • teams_user_activity[Report_Refresh_Date]
  • sharepoint_user_activity[Report_Refresh_Date]
  • viva_engage_user_activity[Report_Refresh_Date]
  • copilot_last_activity[Report_Refresh_Date]
Screenshot of setting up relationships in the model layout for Copilot Reports
Creating a relationship between tables

You will note that I left one table out of the mix. We will need to create a relationship with the copilot_data table, but this one will be a little different. For this one, we need to make sure the relationship is inactive:

Screenshot of the special relationship between the date_table and copilot_data
Make sure your relationship with copilot_data is inactive!

This might not make sense at the moment, but it will work out later. I promise! With these relationships in place, your model layout should look something like this:

Screenshot of our model layout with our date table in place.
Model view with date_table connected to all activity tables

With our date dimension in place, it is time to build our user data.

Connecting the User Dimension for Copilot Reports

Our graph_users table holds a lot of valuable information for deeper analysis. For example, we can provide dimensional analysis based on a user’s department, office, company, manager, or even job title! This will only be as effective as the data associated with your users within Entra ID. However, many organizations find this a great place to start.

The process of building these relationships is far easier. Both sides of the relationship use the [User_Principal_Name] column, so we just need to build out our connections. To make this happen, we will create a one-to-many relationship between graph_users[User_Principal_Name] and the following columns:

  • onedrive_user_activity[User_Principal_Name]
  • teams_user_activity[User_Principal_Name]
  • sharepoint_user_activity[User_Principal_Name]
  • viva_engage_user_activity[User_Principal_Name]

Wait a minute! What about the Copilot activity data tables?! Yes, I did exclude them for the moment. The reason is that we need to create a bi-directional relationship between our user dimension and these tables, as it will aid in some of our reporting:

Screenshot of a bi-directional relationship for our Copilot Reports model layout.
Creating a bi-directional relationship for our Copilot utilization tables

We are getting close, but we have one last relationship to create.

Connecting the License Dimension for Copilot Reports

The last dimension we are going to provide is the ability to filter users based upon what licenses they have assigned to them. For some organizations, there is a split in users between enterprise and frontline licenses. For others, they want to be hyper focused on only Copilot for Microsoft 365 users.

Regardless of your use case, this could provide some additional insight. To make things easier, we will simply create a relationship between the licensing_data and graph_users tables using the [User_Principal_Name] column on both sides. This will also be a bi-directional relationship.

With our relationships in place, our model should look something like this:

Screenshot of our final model layout for our Copilot Reports
Our final model layout for our Copilot Reports

At this point, please double and triple check your relationships to make sure nothing is missing or incorrectly mapped. If you do not have this in place, next week’s article will be painful, as you will be coming back to this one.

Modeling Choices for our Copilot Reports

So why did I make some of the choices I did with relationships? Part of this is because of the challenges I have already experienced with the stock labels. For example, I experienced some volatility in the labeling of copilots in the copilot_data table prior to the release of the API endpoints. To avoid having to make these changes again, I have taken a few steps to ensure this is not an issue in the future.

I also fully expect this solution to be obsolete at some point in the future. When the announcement of the Copilot Reports endpoint was made, I was hoping this would provide some more granular results. For whatever reason, this has not happened yet. However, I fully expect this to eventually fall in line with the other reporting endpoints in the future. No clue on the timeline and I won’t speculate on it, but I have to imagine that this will happen eventually. Therefore, I hope to be writing an article that says this series is obsolete and you should be using the new version of the endpoint.

Fun Fact About This Article

I think one of the biggest aspects of leveraging Microsoft Fabric is the power to build everything within a web browser. My favorite part of writing this article is that I did it completely at 37,000 feet in the air! If you are a fan of Chick Corea, I had 500 Miles High on repeat in my head as I wrote this article.

While the soundtrack is fun, I just wanted to highlight the ability to manage this project entirely while in the air. While working on an airplane might not be your thing, it is still amazing to see what we can get done with a limited internet connection. Naturally, your mileage may vary by airline, but still pretty cool in my book!

Next Steps

At this point, we are harvesting our data and we have our model set. By next week, we should have three weeks of data that we can trend and start to provide better Copilot for Microsoft 365 data points to review. I am so excited to share this and look forward to publishing that article soon!

Copilot Reports with Microsoft Fabric – Part 1

Over the past year, I have been helping customers with their Copilot for Microsoft 365 deployments. The number one complaint I have received from customers has to do with the ability to provide quality Copilot Reports. If you are reading this, you are probably not alone. However, some recent changes have come about that will help us get better reporting.

Microsoft recently released reporting APIs for Copilot user activity. While these endpoints are in preview with a target release of November 2024, we can start taking advantage of them today. The biggest challenge is that in order to provide quality reporting, we need to start populating data. To make this a reality, this week’s post will be 100% focused on populating data in your Fabric Lakehouse.

It is important to note that I cover a lot of the mechanics of how to make this happen in previous posts. I have linked to the appropriate posts where applicable but wanted to highlight this now. There are a lot of moving pieces to make this happen and there is some prerequisite knowledge for you to be successful. Please take the time to reference these posts when applicable.

Elephant in the Room with Copilot Reports

Before we get started, we need to address the headwinds with Copilot Reports. If you know me personally, you know I am a balls and strikes kind of person. I call things as I see them and sometimes that means calling out some of the issues that come with new solutions.

Challenges with the Data

If you are a data person like me, you find the reports available from Microsoft to be frustrating. They are detailed with specific behaviors inside of Viva Insights, but you cannot drill down enough to see where use cases are being leveraged. The only way to look at individual user activity has been with the Microsoft 365 Admin Reports for Copilot. This data is what the API endpoints are pulling at this time. While the individual detail is nice, it does not resolve the frustration we have been seeing.

The reason for this frustration is twofold. Unlike the endpoints we explored in my previous series, these endpoints have two drawbacks. First, I cannot identify activity for a specific date; I can only pull a period of activity. Second, I do not get a count of activity for each use of Copilot. Instead, I just get a date for the last time Copilot was used. Most of my customers have been vocal about this second point.

Overcoming the Challenges

I will not speculate on why Microsoft has chosen this path. Instead, I pushed to find a way to work around the challenge and make something happen. Instead of focusing on individual days of data, we will look at data week over week. Is this an imperfect system? Absolutely! But I would rather focus on what I can do than what I cannot. Previously, I did this by exporting the data on a weekly basis. Annoying? Yes, but now we can automate it with the advent of the new reporting endpoint. I was even able to enhance other elements of my analysis as a result!

This all means our reporting will simply look at user activity week over week with Copilot. This will help us drive the value we seek in some meaningful way instead of guessing at how people are leveraging Copilot. My customers have found this insight valuable in driving better adoption which results in stronger utilization.

Foundations of Copilot Reports

If you followed along with my series on Graph API reports, you should already be familiar with the key steps for this project. While we did a lot of this in that series, we will be revisiting parts of it again. To simplify things, we will go through the steps at a high level. If you need a primer, please go back to my previous series for the details on how to prepare a Lakehouse in Microsoft Fabric.

Build a Fabric Lakehouse

To get started, sign into Microsoft Fabric and create a workspace for your Copilot Reports. You will likely want to share these reports with others, so it might make sense to create a dedicated workspace. The key point here is that we want a new workspace to streamline the data collection process, as it will be changing.

Screenshot of my new Fabric Lakehouse for my Copilot Reports
New Fabric Lakehouse for my Copilot Reports

With our Lakehouse in place, we are ready for the next step in the process.

Creating Tables in the Lakehouse

Just like we did before, we will use a Fabric Notebook to create our tables. To get started, download the Create Graph Tables notebook I created for you. Remember that you must swap your context to the Data Engineering view before you can import the notebook.

Screenshot of the Data Engineering view inside of Microsoft Fabric where we will import our notebook.
Swapping to the Data Engineering view to import your Notebook

Once imported, you can open the notebook and connect it to your new Lakehouse. From there, you can run it. This process will take a few minutes, but it will hopefully make parts of this process much easier for you long term.

Screenshot of our notebook creating tables in our Fabric Lakehouse for our Copilot Reports
Creating tables in our Fabric Lakehouse

With our tables created, we need to start harvesting data from Microsoft Graph.

Harvesting Data for our Copilot Reports

Now that we have our foundation in place, it is all about liberating our data from Graph and getting it into the Lakehouse. To get this process started, download the Copilot Reporting Template I created for your Dataflow. Once downloaded, create a Dataflow Gen2 and import the template:

Screenshot of the process to import a Power Query template in a Dataflow.
Importing my Power Query template in a Dataflow

Once imported, you will need to update your parameters with your Service Principal information. If you have never done this before, follow the steps I outlined in my Reporting with the Graph API post a few weeks back to ensure you have one configured properly. If you do, just update the parameters:

Screenshot of the Dataflow with the required parameters that need to be updated for your Copilot Reports
Update your parameters with your Service Principal details

Now that you have updated your Service Principal information, you just need to update your connection properties, ensuring all endpoints use the anonymous method since we are using the Service Principal to authenticate. From there, you just need to map your data to your new Lakehouse:

Screenshot of us mapping data from our Dataflow to our new Fabric Lakehouse
Mapping our data to our Lakehouse

We will need to map the following queries to tables with the following behaviors:

Query Name               Lakehouse Table Name         Behavior
GraphUsers               graph_users                  Replace
Date_Table               date_table                   Replace
LicensingData            licensing_data               Replace
TeamsUserActivity        teams_user_activity          Append
OneDriveUserActivity     onedrive_user_activity       Append
SharePointUserActivity   sharepoint_user_activity     Append
VivaEngageUserActivity   viva_engage_user_activity    Append
CopilotData              copilot_data                 Append
CopilotLastActivity      copilot_last_activity        Append
Table mappings for our Dataflow to our Fabric Lakehouse

Once this is completed, we can publish our Dataflow.
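If the Replace versus Append behaviors are new to you, this tiny Python sketch shows the difference: dimension tables are overwritten on each refresh, while activity tables accumulate one snapshot per weekly run. The data values are made up for illustration:

```python
# Replace: dimension tables keep only the latest snapshot.
graph_users = [{"upn": "a@contoso.com", "dept": "Sales"}]
graph_users = [{"upn": "a@contoso.com", "dept": "Sales"},
               {"upn": "b@contoso.com", "dept": "HR"}]   # overwrites the prior load

# Append: activity tables accumulate one snapshot per weekly refresh,
# which is what lets us trend beyond Graph's thirty-day window.
copilot_data = []
copilot_data += [{"upn": "a@contoso.com", "refresh": "2024-11-10"}]
copilot_data += [{"upn": "a@contoso.com", "refresh": "2024-11-17"}]

print(len(graph_users), len(copilot_data))  # 2 2
```

This is why getting the behavior column right in the mapping table above matters: appending a dimension would duplicate users, and replacing an activity table would throw away your history.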

Next Steps

Now that we have all of this in place, it is time to set up the refresh. If you remember from the beginning, we will be looking at this data on a weekly basis. Therefore, we will want to set up the refresh to pull the data weekly. I set mine to run Sunday mornings at 9 AM, as that is a quiet time of the week. It also gives me the best shot at pulling the data on the same day of the week. Remember, we are at the mercy of Microsoft and when they release the data, so we have to somewhat hope for the best.

With the refreshes running weekly, we will start to collect some meaningful data we can use. Next week, we will work on preparing the semantic model to allow us to connect our Power BI report. Until then, we will keep harvesting data to help bring better value to our report!

Microsoft Graph API Reporting in Fabric – Part 3

Welcome back to my series on the Microsoft Graph API. We have spent a lot of time laying the groundwork for our data. In this article, it is all about getting our data into our Lakehouse so we can start to trend our data.

As a quick reminder, the reason we must go through this exercise is that we have to retain data on our own. The Microsoft Graph API will only furnish detailed data for the past thirty days. Anything further back is lost. Therefore, if we want to trend data over three months, we must warehouse it.

In this article, we will be finalizing our queries and loading them into our Lakehouse. To get there, we are going to go through some deep levels of M Query. Hang in there until the end – I should be able to make it easier for you!

Getting Teams User Activity Detail

To get started, we are going to have to write out some code. While you can just copy/paste the code from this article, I think it is worth walking through it together to help you understand the method behind the madness. It seems complicated – and it really is. But understanding the rationale behind what was done will help you with your own projects. To help us on this journey, you will need to create three more parameters to simplify the process:

Parameter Name | Parameter Type | Parameter Value
LakehouseId | Text | Value of Lakehouse ID from URL
WorkspaceId | Text | Value of Workspace ID from URL
HistoricalDayCount | Decimal Number | List of values (5, 15, 27) with the default set to 5
Parameters required for capturing Microsoft Graph API data

With your parameters in place, it is time to write out our query. Open a new blank query and strap in!

Start With a Function

Because we are pulling data for a single date at a time, we will need to create a function. We have done this before, but this time we are going to embed it within an existing query:

let
  GraphDataFunction = (ExportDate as date) =>
    let
      GraphData = Web.Contents("https://graph.microsoft.com/v1.0/reports/", 
        [RelativePath="getTeamsUserActivityUserDetail(date=" & Date.ToText(ExportDate, "yyyy-MM-dd") & ")",
        Headers=[Accept="application/json",Authorization="Bearer " & #"Get-AuthToken"()]])
    in
      GraphData
in
  GraphDataFunction

We need to call a few things out. First and foremost, you will note we are using the RelativePath option. This is because our dates will be updated with each call of the function, so we must use this option to avoid errors.

Next, you will notice the Authorization header is using our Get-AuthToken function we created in part 2 of this series. This is key to ensure we use our token to complete the authentication process.

Checking the Lakehouse for Existing Data

We will need to ensure we only query Graph data that is not already in the Lakehouse. Because of how finicky the Graph API can be, we cannot guarantee that data will be available the next day. Sometimes it is a few days behind. We also do not want duplicate data added to the Lakehouse.

To prevent this from happening, we will query the Lakehouse and the tables we created. By using the parameters we created, we will call our Lakehouse and pull our historical data so we can transform it:

let
  GraphDataFunction = (ExportDate as date) =>
    let
      GraphData = Web.Contents("https://graph.microsoft.com/v1.0/reports/", 
        [RelativePath="getTeamsUserActivityUserDetail(date=" & Date.ToText(ExportDate, "yyyy-MM-dd") & ")",
        Headers=[Accept="application/json",Authorization="Bearer " & #"Get-AuthToken"()]])
    in
      GraphData, 
  LakehouseConnection = Lakehouse.Contents(null){[workspaceId = WorkspaceId]}[Data]{[lakehouseId = LakehouseId]}[Data],
  HistoricalExports = LakehouseConnection{[Id = "teams_user_activity", ItemKind = "Table"]}[Data],
  ReportRefreshDates = Table.SelectColumns(HistoricalExports, {"Report_Refresh_Date"}),
  FilterRefreshDates = Table.SelectRows(ReportRefreshDates, each [Report_Refresh_Date] <> null and [Report_Refresh_Date] <> ""),
  HistoricalDatesExported = Table.Distinct(FilterRefreshDates, {"Report_Refresh_Date"})
in
  HistoricalDatesExported

These five lines of code connect to our teams_user_activity table, remove unnecessary columns, remove duplicates, and format the result into a usable column of dates. It will be empty for right now, but more will be added as we move forward.

Create a Range of Dates and Merge with Historical Data

Next, we will pull data from the last few days. To make life easier on us, we will limit the pull to five days to speed up our development process. We just need to use today’s date as a base and create a range going back five days that we can use:

let
  GraphDataFunction = (ExportDate as date) =>
    let
      GraphData = Web.Contents("https://graph.microsoft.com/v1.0/reports/", 
        [RelativePath="getTeamsUserActivityUserDetail(date=" & Date.ToText(ExportDate, "yyyy-MM-dd") & ")",
        Headers=[Accept="application/json",Authorization="Bearer " & #"Get-AuthToken"()]])
    in
      GraphData, 
  LakehouseConnection = Lakehouse.Contents(null){[workspaceId = WorkspaceId]}[Data]{[lakehouseId = LakehouseId]}[Data],
  HistoricalExports = LakehouseConnection{[Id = "teams_user_activity", ItemKind = "Table"]}[Data],
  ReportRefreshDates = Table.SelectColumns(HistoricalExports, {"Report_Refresh_Date"}),
  FilterRefreshDates = Table.SelectRows(ReportRefreshDates, each [Report_Refresh_Date] <> null and [Report_Refresh_Date] <> ""),
  HistoricalDatesExported = Table.Distinct(FilterRefreshDates, {"Report_Refresh_Date"}),
  TodaysDate = Date.AddDays(Date.From(DateTime.LocalNow()), -5),
  ListDatesForExport = List.Dates(TodaysDate, 5, #duration(1, 0, 0, 0)),
  TableOfDates = Table.FromList(ListDatesForExport, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  RenameDateColumn = Table.RenameColumns(TableOfDates, {{"Column1", "ExportDates"}}),
  SetProperDataType = Table.TransformColumnTypes(RenameDateColumn, {{"ExportDates", type date}}),
  MergeHistoricalData = Table.NestedJoin(SetProperDataType, {"ExportDates"}, HistoricalDatesExported, {"Report_Refresh_Date"}, "HistoricalDatesExported", JoinKind.LeftOuter),
  ExpandHistoricalDates = Table.ExpandTableColumn(MergeHistoricalData, "HistoricalDatesExported", {"Report_Refresh_Date"}, {"Report_Refresh_Date"}),
  IdentifyMissingData = Table.SelectRows(ExpandHistoricalDates, each ( [Report_Refresh_Date] = null))
in
  IdentifyMissingData

Once we have a list of dates in place, we will merge it to our historical data from the HistoricalDatesExported step. From there, expand the values and filter out the null values so we only have dates where we are missing data left.
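The missing-date logic above is just an anti-join of a date range against the dates already in the Lakehouse. As a sketch for illustration (plain Python, not part of the Dataflow – the function name is my own), it mirrors the M steps:

```python
from datetime import date, timedelta

def dates_missing_from_lakehouse(already_exported, day_count, today=None):
    """Return the dates in the last `day_count` days that are not already
    in the Lakehouse (mirrors the left-outer join plus null filter in M)."""
    today = today or date.today()
    start = today - timedelta(days=day_count)
    # Candidate dates, equivalent to List.Dates(TodaysDate, 5, ...)
    candidates = [start + timedelta(days=i) for i in range(day_count)]
    exported = set(already_exported)
    # Keep only dates with no match, like filtering Report_Refresh_Date = null
    return [d for d in candidates if d not in exported]

# Example: two of the last five days were already exported.
exported = {date(2024, 5, 1), date(2024, 5, 2)}
missing = dates_missing_from_lakehouse(exported, 5, today=date(2024, 5, 6))
# missing contains May 3, 4, and 5
```

Only the dates returned here would be passed to the Graph function, which is what keeps refreshes from re-pulling data we already have.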

Execute the Microsoft Graph API Function

Reaching all the way back to the first step, GraphDataFunction, we are going to pass our column of missing dates through the function and pull our data:

let
  GraphDataFunction = (ExportDate as date) =>
    let
      GraphData = Web.Contents("https://graph.microsoft.com/v1.0/reports/", 
        [RelativePath="getTeamsUserActivityUserDetail(date=" & Date.ToText(ExportDate, "yyyy-MM-dd") & ")",
        Headers=[Accept="application/json",Authorization="Bearer " & #"Get-AuthToken"()]])
    in
      GraphData, 
  LakehouseConnection = Lakehouse.Contents(null){[workspaceId = WorkspaceId]}[Data]{[lakehouseId = LakehouseId]}[Data],
  HistoricalExports = LakehouseConnection{[Id = "teams_user_activity", ItemKind = "Table"]}[Data],
  ReportRefreshDates = Table.SelectColumns(HistoricalExports, {"Report_Refresh_Date"}),
  FilterRefreshDates = Table.SelectRows(ReportRefreshDates, each [Report_Refresh_Date] <> null and [Report_Refresh_Date] <> ""),
  HistoricalDatesExported = Table.Distinct(FilterRefreshDates, {"Report_Refresh_Date"}),
  TodaysDate = Date.AddDays(Date.From(DateTime.LocalNow()), -5),
  ListDatesForExport = List.Dates(TodaysDate, 5, #duration(1, 0, 0, 0)),
  TableOfDates = Table.FromList(ListDatesForExport, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  RenameDateColumn = Table.RenameColumns(TableOfDates, {{"Column1", "ExportDates"}}),
  SetProperDataType = Table.TransformColumnTypes(RenameDateColumn, {{"ExportDates", type date}}),
  MergeHistoricalData = Table.NestedJoin(SetProperDataType, {"ExportDates"}, HistoricalDatesExported, {"Report_Refresh_Date"}, "HistoricalDatesExported", JoinKind.LeftOuter),
  ExpandHistoricalDates = Table.ExpandTableColumn(MergeHistoricalData, "HistoricalDatesExported", {"Report_Refresh_Date"}, {"Report_Refresh_Date"}),
  IdentifyMissingData = Table.SelectRows(ExpandHistoricalDates, each ( [Report_Refresh_Date] = null)),
  ExecuteGraphFunction = Table.AddColumn(IdentifyMissingData, "Invoked custom function", each GraphDataFunction([ExportDates])),
  FilterMissingData = Table.SelectRows(ExecuteGraphFunction, each [Attributes]?[Hidden]? <> true),
  MergeExportedFiles = Table.AddColumn(FilterMissingData, "Transform file", each #"Transform file"([Invoked custom function], 33)),
  RemoveNonDataColumns = Table.SelectColumns(MergeExportedFiles, {"Transform file"}),
  ExpandResults = Table.ExpandTableColumn(RemoveNonDataColumns, "Transform file", {"Report Refresh Date", " User Id", " User Principal Name", " Last Activity Date", " Is Deleted", " Deleted Date", " Assigned Products", " Team Chat Message Count", " Private Chat Message Count", " Call Count", " Meeting Count", " Meetings Organized Count", " Meetings Attended Count", " Ad Hoc Meetings Organized Count", " Ad Hoc Meetings Attended Count", " Scheduled One-time Meetings Organized Count", " Scheduled One-time Meetings Attended Count", " Scheduled Recurring Meetings Organized Count", " Scheduled Recurring Meetings Attended Count", " Audio Duration In Seconds", " Video Duration In Seconds", " Screen Share Duration In Seconds", " Has Other Action", " Urgent Messages", " Post Messages", " Tenant Display Name", " Shared Channel Tenant Display Names", " Reply Messages", " Is Licensed", " Report Period"}),
  FilterMissingResults = Table.SelectRows(ExpandResults, each [Report Refresh Date] <> null)
in
  FilterMissingResults

After we execute the function, we will combine our files and expand the results. To do this, we are using a custom function titled Transform file to combine our files seamlessly:

(Parameter as binary, ColumnCount as number) => let
  Source = Csv.Document(Parameter, [Delimiter = ",", Columns = ColumnCount, Encoding = 65001, QuoteStyle = QuoteStyle.None]),
  #"Promoted headers" = Table.PromoteHeaders(Source, [PromoteAllScalars = true])
in
  #"Promoted headers"
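For reference, here is the same promote-headers idea in plain Python using the standard csv module (a sketch for illustration only – the Dataflow itself uses the M function above):

```python
import csv
import io

def parse_report_csv(raw_text):
    """Mimic Csv.Document plus Table.PromoteHeaders: the first row
    becomes the column names, the remaining rows become records."""
    rows = list(csv.reader(io.StringIO(raw_text)))
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]

sample = "Report Refresh Date,User Id\n2024-05-01,abc\n2024-05-01,def\n"
records = parse_report_csv(sample)
print(records[0]["User Id"])  # abc
```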

The result should be something that looks like this:

Screenshot of Microsoft Graph API report data.
Screenshot of the Microsoft Graph API report data for Teams

Wow – that was a whole lot for a single endpoint. Only four more applications left! Frustrated that you will have to do this again? Well, instead of being frustrated, let’s make it a little easier.

Turning Microsoft Graph API Reports into a Flexible Function

We have worked so hard to get the first query in place. But if we think through things a little further, we can simplify the process for future endpoints. Instead of going through this entire process over and over again, we can convert this to a function and pass data to make it easier for us to use.

Prepare Your Endpoint Data

To simplify our process, we will start by creating a table we can reference. This will hold the application name, Microsoft Graph API endpoint, Fabric Lakehouse table, the number of columns returned by the endpoint, and the column names. Create a blank query called Endpoints and add the following code:

let
  Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3VXbbtswDP0Vwc/pQ9cvCOIMDZCtRt0FGIIiECzOIWpTgURny75+lOPc5aB+3YttkkeUeXRILZfJtNZYJaMEwnvVeHArXTBukXfiLYFbwLhz/ZB4Crxf8vgkj1fYWMfqFX458GuVaoaRCjCVOaQCN7pS33UtzhT9ptK7zpp5lUIFDGZ0+OgWz7Vnddiw8+VARk1sQzySrQrALZxMfQx9A2CkUk0c6JDwyj0jBifVnSJj77EksTNnTVOwD/nagjJwaE3yPlomLwSpkx2lWivY8Bkj6gCLc/U4iKth7CwQfgvgxampwQD9itWRn3xHxbVrrZ24WjpIV9UuFp3+iUY/xVibIrNILFX7YGyCEWPtBI3z9uU/4W2Bvt0i0+UwLt9A115I4PCOMdgCAhtRAp/uN+nM9DAZ46uf3Ugl4bfUZK1Z2s/7s6plp60sisYmwttNM1+YXo6r1IR/b9rbqzGzzImz7jbq2RZ3Vl4DrhPkxRpMUwWNEDww1nAn2T1wf2KZZo1zoc7PZI6gb4puDFqVNk4zWhKxyvAsLBkfNGigJ5QXDoD2Go4jnrX8Ga9FKkEVlkQ3rgQ6HqFAMuvPzTcgLfHLud91iRw+EVQxzL4LpH1OmUR4cyyAfFDebYssRFBqSqWARexbsVbQWrF2+anruqdXBg7py8JaSM7764o7gZNIu3/+BL7AxC6yOX4MuaTe/wE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Application = _t, Lakehouse_Table = _t, Graph_Endpoint = _t, Column_Count = _t, Column_Mapping = _t]),
  #"Changed column type" = Table.TransformColumnTypes(Source, {{"Column_Count", Int64.Type}, {"Application", type text}, {"Lakehouse_Table", type text}, {"Graph_Endpoint", type text}, {"Column_Mapping", type text}})
in
  #"Changed column type"

You can choose to pull this data from somewhere else if it is easier for you, but this table is easy to maintain in this scenario.

Convert our Query to a Function

Our next step is to take the first query we created in this article and make it a function. We will start by adding the parameter for an application at the top and filter the Endpoints table with the value passed in the parameter:

(Application as text) =>
let
  EndpointTable = Endpoints,
  EndpointData = Table.SelectRows(EndpointTable, each [Application] = Application),
  GraphDataFunction = (ExportDate as date) =>
    let
      GraphData = Web.Contents("https://graph.microsoft.com/v1.0/reports/", 
        [RelativePath=EndpointData{0}[Graph_Endpoint] & "(date=" & Date.ToText(ExportDate, "yyyy-MM-dd") & ")",
        Headers=[Accept="application/json",Authorization="Bearer " & #"Get-AuthToken"()]])
    in
      GraphData, 

  //Endpoint = "teams_user_activity",
  LakehouseConnection = Lakehouse.Contents(null){[workspaceId = WorkspaceId]}[Data]{[lakehouseId = LakehouseId]}[Data],
  HistoricalExports = LakehouseConnection{[Id = EndpointData{0}[Lakehouse_Table], ItemKind = "Table"]}[Data],
  ReportRefreshDates = Table.SelectColumns(HistoricalExports, {"Report_Refresh_Date"}),
  FilterRefreshDates = Table.SelectRows(ReportRefreshDates, each [Report_Refresh_Date] <> null and [Report_Refresh_Date] <> ""),
  HistoricalDatesExported = Table.Distinct(FilterRefreshDates, {"Report_Refresh_Date"}),
  TodaysDate = Date.AddDays(Date.From(DateTime.LocalNow()), -HistoricalDayCount),
  ListDatesForExport = List.Dates(TodaysDate, HistoricalDayCount, #duration(1, 0, 0, 0)),
  TableOfDates = Table.FromList(ListDatesForExport, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  RenameDateColumn = Table.RenameColumns(TableOfDates, {{"Column1", "ExportDates"}}),
  SetProperDataType = Table.TransformColumnTypes(RenameDateColumn, {{"ExportDates", type date}}),
  MergeHistoricalData = Table.NestedJoin(SetProperDataType, {"ExportDates"}, HistoricalDatesExported, {"Report_Refresh_Date"}, "HistoricalDatesExported", JoinKind.LeftOuter),
  ExpandHistoricalDates = Table.ExpandTableColumn(MergeHistoricalData, "HistoricalDatesExported", {"Report_Refresh_Date"}, {"Report_Refresh_Date"}),
  IdentifyMissingData = Table.SelectRows(ExpandHistoricalDates, each ( [Report_Refresh_Date] = null)),
  ExecuteGraphFunction = Table.AddColumn(IdentifyMissingData, "Invoked custom function", each GraphDataFunction([ExportDates])),
  FilterMissingData = Table.SelectRows(ExecuteGraphFunction, each [Attributes]?[Hidden]? <> true),
  MergeExportedFiles = Table.AddColumn(FilterMissingData, "Transform file", each #"Transform file"([Invoked custom function], EndpointData{0}[Column_Count])),
  RemoveNonDataColumns = Table.SelectColumns(MergeExportedFiles, {"Transform file"}),
  ExpandResults = Table.ExpandTableColumn(RemoveNonDataColumns, "Transform file", Text.Split(EndpointData{0}[Column_Mapping], ", ")),
  FilterMissingResults = Table.SelectRows(ExpandResults, each [Report Refresh Date] <> null)
in
  FilterMissingResults

From there, we will call the data from the EndpointData step and inject it into our query with the following items:

  • In the GraphDataFunction step, place a dynamic endpoint name
  • In the HistoricalExports step, place a dynamic Lakehouse table name
  • In the MergeExportedFiles step, place a dynamic column count in the Transform file function
  • In the ExpandResults step, place dynamic column mappings and wrap it with a Text.Split() function
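To see why this table-driven approach pays off, here is a small Python sketch of how one row of an endpoint table can drive the request path the same way the M function does. The dictionary contents are illustrative (the column counts and the OneDrive row are assumptions, not values from the Dataflow):

```python
from datetime import date

# Hypothetical mirror of the Endpoints table: one row per application.
ENDPOINTS = {
    "Teams": {
        "lakehouse_table": "teams_user_activity",
        "graph_endpoint": "getTeamsUserActivityUserDetail",
        "column_count": 33,  # illustrative value
    },
    "OneDrive": {
        "lakehouse_table": "onedrive_user_activity",
        "graph_endpoint": "getOneDriveActivityUserDetail",
        "column_count": 13,  # illustrative value
    },
}

def build_relative_path(application, export_date):
    """Build the Graph report path like the M function: the endpoint
    name plus a (date=yyyy-MM-dd) suffix."""
    endpoint = ENDPOINTS[application]["graph_endpoint"]
    return f"{endpoint}(date={export_date.isoformat()})"

print(build_relative_path("Teams", date(2024, 5, 1)))
# getTeamsUserActivityUserDetail(date=2024-05-01)
```

Adding a new application then becomes a new row of configuration rather than a new copy of the whole query.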

Now that this is completed, we can call the function by typing in the application name and clicking invoke:

Screenshot of us invoking our Microsoft Graph API reports function
Invoking our Microsoft Graph API reports function.

With this in place, we can invoke our function for OneDrive, SharePoint, Email, and Viva Engage with a few clicks!

Get Current Users from Entra ID

For our last step, we just need to pull our users out of Entra ID into our Dataflow. To make this happen, I like to use a function to make it easier. I created a function titled Get-Users and invoked it with this code:

() =>
let
  GetUserDetail = (Params) =>
  let    
    Path = if Text.Length(Params) > 0  then Params
      else "$select=displayName,givenName,jobTitle,mail,officeLocation,surname,userPrincipalName,id,accountEnabled,userType,companyName,department,employeeType&$expand=manager",  
    Source = Json.Document(Web.Contents("https://graph.microsoft.com/v1.0/users",[RelativePath = "?" & Path, Headers=[Authorization="Bearer " & #"Get-AuthToken"()]])),  
    NextList = @Source[value], result = try @NextList & @GetUserDetail(Text.AfterDelimiter(Source[#"@odata.nextLink"], "?")) otherwise @NextList    
  in
    result,
  UserDetail = GetUserDetail("")
in
  UserDetail

You will notice the function inside of the function. This is because we will need to iterate through the entire user list to receive all users. Because the API limits us to batches of 1,000, we will need to work through this a few times to get the results we desire.
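The recursive pattern in Get-Users is cursor pagination: fetch a page, collect its value array, and follow @odata.nextLink until the API stops returning one. A minimal Python sketch of that loop, with a stubbed fetch function standing in for the real Graph call:

```python
def collect_all_pages(fetch_page):
    """Follow @odata.nextLink until the API stops returning one.
    `fetch_page` takes a query string ("" for the first call) and
    returns a decoded JSON page, shaped like the Graph /users response."""
    results, query = [], ""
    while True:
        page = fetch_page(query)
        results.extend(page["value"])
        next_link = page.get("@odata.nextLink")
        if next_link is None:
            return results
        # Keep only the query string, as Text.AfterDelimiter(..., "?") does.
        query = next_link.split("?", 1)[1]

# Stubbed two-page response for illustration.
def fake_fetch(query):
    if query == "":
        return {"value": [1, 2], "@odata.nextLink": "https://example/users?$skiptoken=abc"}
    return {"value": [3]}

print(collect_all_pages(fake_fetch))  # [1, 2, 3]
```

The M version does the same thing with self-reference (`@GetUserDetail`) and a `try ... otherwise` that ends the recursion when `@odata.nextLink` is absent.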

HELP! THIS IS TOO MUCH!

Is your head spinning? Mine too. In fact, this was a lot of work even to get coded. And if I ever want to reuse this, I do not want to go through this level of effort again. I am probably going to be like that math teacher you had where you learned how to do everything the hard way to pass an exam, and then in the next chapter they teach you an easier way to solve the problem. I hated it when that happened to me, but I am still going to do it to you.

To streamline this, I created a Power Query template for this project. You can download the template and import it into your new Dataflow:

Screenshot of the Microsoft Graph API Power Query template being imported into a dataflow.
Importing a Power Query template for our Microsoft Graph API reports

Once imported, you will need to update the following parameters:

  • TenantId
  • AppId
  • ClientSecret
  • WorkspaceId
  • LakehouseId

This should streamline the process considerably if you are just getting started!

From Dataflow to Lakehouse

We now just need to map our data into our Lakehouse. This will take a little diligence, but it should move along pretty quickly. We just need to match our queries and columns from our Dataflows into tables within our Lakehouse.

Adding Teams Data to the Lakehouse

Starting with our Teams user activity data, we will go to the bottom right-hand corner of the screen and click the + icon to add it to our Lakehouse. Once you have a connection to your Lakehouse, we will use an existing table and navigate to the table where your data is stored:

Screenshot of placing Microsoft Graph API data into our Lakehouse
Adding Teams data into our Lakehouse

Once we have selected our table, we need to map our columns. We also need to update the setting that will allow us to append our data:

Mapping our Microsoft Graph API report fields into our Fabric Lakehouse
Mapping columns and appending our data

Now we just need to save our settings. Once we publish and refresh our dataflow, our data will be added into the Lakehouse. We just need to repeat these steps for OneDrive, SharePoint, Email, and Viva Engage.

Adding Graph Users to the Lakehouse

If you remember from my last article, we created all of the tables for our Microsoft Graph API report endpoints. However, we did not need to create anything for our users. Why is that?!

The answer is simple – the report data is a moving target, so we pre-created those tables to optimize the refresh experience and eliminate duplicate dates. With our users, it is a different story, as we will replace the data on each refresh. So, just like we did before, we will go ahead and add our Lakehouse as a data destination. However, this time we will create a new table. Once in place, we can make adjustments to our mappings and settings:

Screenshot of our Entra ID users mapping to the Fabric Lakehouse
Mapping Entra ID users to the Lakehouse

Once completed, we can click save and move to publishing our Dataflow.

One More Thing Before We Publish!

The last thing we need to do is adjust our HistoricalDayCount parameter from 5 days to 27 days. This will ensure that all of the historical data available is captured in the refresh and loaded into the Lakehouse:

Screenshot of the HistoricalDayCount parameter being updated to 27 days.
Update our HistoricalDayCount to 27 days

Once completed, we can click publish. It will take a few minutes to validate the Dataflow. After that, we just need to set up a daily refresh and our data will start to populate into the Lakehouse!

Next Steps and Conclusion

Alright, that was a ton of information in a single article. It was hard to split up, but I wanted to keep it together in a single article as these pieces were all interconnected. It is a lot, but it is complete. Now that you have your data, you can create your model, measures, and start building out your reports.

In other news, I have been able to get the endpoint for Copilot. However, there are some data quality challenges that I want to address before doing a full write-up. In the meantime, this will get you started with understanding how complicated this level of reporting can be.

Did you make it to the end? Were you successful? If so, tell me about it in the comments below!

Microsoft Graph API Reporting in Fabric – Part 2

In my last post, I went over the why and covered the building blocks for connecting to the Microsoft Graph API for reporting. With this in place, it is time to prepare the landing zone for our data in Microsoft Fabric! To achieve this, we will need to set up our Fabric environment so we can start to build out our solution.

Before we get started, if you do not have Microsoft Fabric, do not fear! You can start a Microsoft Fabric trial for 60 days for free. When you sign into Power BI, you can click on your user avatar, and start your trial. If this is not available to you, it might be because your Fabric Administrator has turned off the trial feature. You might need to submit a request to re-enable this feature.

Creating Your Lakehouse

To get things started, we need to create a new workspace that has a Fabric capacity assigned to it. Just like you would before with Power BI, create a new workspace inside of Microsoft Fabric and ensure it is tied to your capacity.

Once you have your workspace in place, it is time to build your Lakehouse. If you go to the new button in the upper left hand corner of the workspace window, a screen will appear where you can select a Lakehouse:

Screenshot of the screen where you can select a Lakehouse to deploy.
Deploying a Lakehouse in Microsoft Fabric

From there, you can give your Lakehouse a name and click create. You will be redirected to your new Lakehouse. Once created, we will need to extract the Workspace ID and Lakehouse ID from the URL. The easiest way to capture these is to copy the URL in the address bar to the notepad where you kept your Service Principal data in part 1 of this series. Once you capture this information, you can close your Lakehouse:

https://app.powerbi.com/groups/<WorkspaceId>/lakehouses/<LakehouseId>?experience=power-bi
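If you would rather not pick the IDs out by eye, this Python sketch (a helper of my own, not part of the solution) shows where they sit in the URL path:

```python
from urllib.parse import urlparse

def extract_fabric_ids(url):
    """Pull WorkspaceId and LakehouseId out of a Lakehouse URL of the
    form .../groups/<WorkspaceId>/lakehouses/<LakehouseId>?..."""
    parts = urlparse(url).path.strip("/").split("/")
    workspace_id = parts[parts.index("groups") + 1]
    lakehouse_id = parts[parts.index("lakehouses") + 1]
    return workspace_id, lakehouse_id

url = "https://app.powerbi.com/groups/1111-2222/lakehouses/3333-4444?experience=power-bi"
print(extract_fabric_ids(url))  # ('1111-2222', '3333-4444')
```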

Preparing Your Lakehouse

Now that you have a Lakehouse in place, it is time to prepare the tables. While there is an opportunity to create these tables directly out of our Dataflow, we will want to create a few ahead of time. This is simply to reduce the amount of effort on your side and make the process a little more seamless for you.

While I am going to give you a Notebook you can upload, we will pause and create one from scratch so you understand the process. When you click new item in your workspace, we will scroll down to prepare data and select Notebook:

Screenshot of adding a Notebook to your Fabric Workspace.
Adding a Notebook to your Fabric Workspace

We will add our Lakehouse to the Notebook on the left hand side. Once in place, we can add our code using PySpark into a cell:

Screenshot of our Notebook in Microsoft Fabric that has been connected to a Lakehouse and PySpark code to create a table.
Preparing our Notebook to build tables in our Lakehouse

Using documentation from the Microsoft Graph API, I was able to identify the endpoints and their associated fields. This allowed me to create a block of code that would create our teams_user_activity table:

#Build out teams_user_activity table in the Lakehouse

from pyspark.sql.types import StructType, StructField, IntegerType, StringType, DateType, BooleanType
schema = StructType([
    StructField("Report_Refresh_Date", DateType(), True),
    StructField("User_Id", StringType(), True),
    StructField("User_Principal_Name", StringType(), True),
    StructField("Last_Activity_Date", DateType(), True),
    StructField("Is_Deleted", BooleanType(), True),
    StructField("Deleted_Date", StringType(), True),
    StructField("Assigned_Products", StringType(), True),
    StructField("Team_Chat_Message_Count", IntegerType(), True),
    StructField("Private_Chat_Message_Count", IntegerType(), True),
    StructField("Call_Count", IntegerType(), True),
    StructField("Meeting_Count", IntegerType(), True),
    StructField("Meetings_Organized_Count", IntegerType(), True),
    StructField("Meetings_Attended_Count", IntegerType(), True),
    StructField("Ad_Hoc_Meetings_Organized_Count", IntegerType(), True),
    StructField("Ad_Hoc_Meetings_Attended_Count", IntegerType(), True),
    StructField("Scheduled_One-time_Meetings_Organized_Count", IntegerType(), True),
    StructField("Scheduled_One-time_Meetings_Attended_Count", IntegerType(), True),
    StructField("Scheduled_Recurring_Meetings_Organized_Count", IntegerType(), True),
    StructField("Scheduled_Recurring_Meetings_Attended_Count", IntegerType(), True),
    StructField("Audio_Duration_In_Seconds", IntegerType(), True),
    StructField("Video_Duration_In_Seconds", IntegerType(), True),
    StructField("Screen_Share_Duration_In_Seconds", IntegerType(), True),
    StructField("Has_Other_Action", StringType(), True),
    StructField("Urgent_Messages", IntegerType(), True),
    StructField("Post_Messages", IntegerType(), True),
    StructField("Tenant_Display_Name", StringType(), True),
    StructField("Shared_Channel_Tenant_Display_Names", StringType(), True),
    StructField("Reply_Messages", IntegerType(), True),
    StructField("Is_Licensed", StringType(), True),
    StructField("Report_Period", IntegerType(), True)
])
df = spark.createDataFrame([], schema)
df.write.format("delta").saveAsTable("teams_user_activity")

In our scenario, we will need five tables for Teams, OneDrive, SharePoint, Email, and Viva Engage. To simplify the process for you, I have created a notebook you can upload into Fabric and run. You just need to connect your Lakehouse and select run all. To perform this task, you must switch to the Data Engineering context of Fabric and select Import on your workspace:

Screenshot of the process to import a notebook into your Fabric Workspace
Importing a Notebook into your Microsoft Fabric Workspace

Once you have run the script successfully, you can go to your SQL Endpoint for the Lakehouse and verify the tables have been created and that they are empty with the following query:

SELECT * FROM teams_user_activity ORDER BY Report_Refresh_Date DESC;

SELECT * FROM onedrive_user_activity ORDER BY Report_Refresh_Date DESC;

SELECT * FROM sharepoint_user_activity ORDER BY Report_Refresh_Date DESC;

SELECT * FROM email_user_activity ORDER BY Report_Refresh_Date DESC;

SELECT * FROM viva_engage_user_activity ORDER BY Report_Refresh_Date DESC

Preparing Your Dataflow

After creating your Lakehouse and preparing your tables, you can create a Dataflow in the same workspace. This will query the data and push it to the right tables within our Lakehouse. However, you need to be careful to select a Dataflow Gen2 as opposed to a Dataflow:

Screenshot of the screen where you can select a Dataflow Gen2 to deploy.

Adding a Dataflow Gen2 to your Fabric workspace

This is critical for us to ensure our data makes it into the Lakehouse later on, so be careful with your selection! Once you select it, you can rename your Dataflow in the upper left hand corner:

Screenshot of renaming your Dataflow in Microsoft Fabric.
Renaming your Dataflow

With your Lakehouse and Dataflow created, it is time to start getting some data.

Connecting to the Microsoft Graph API

If you remember from my last article, we will be using application permissions with our app registration. As a result, we must acquire an OAuth2 token to facilitate the authentication process. Luckily, we made it easy for you to deploy this in your solution.

Adding Parameters in Power Query

I covered the basics of Parameters in Power Query in an older article. Instead of going through the details, I will just highlight what we need to have created.

To facilitate our authentication, we need to create three parameters that will be used to hold data from our app registration. The three parameters we need are as follows:

Parameter Name | Parameter Type | Parameter Value
TenantId | Text | Directory (tenant) ID
AppId | Text | Application (client) ID
ClientSecret | Text | Client secret value
Parameters for our authentication process

Remember, capitalization counts! If you copy/paste the parameters above, you will not have an issue with the next step. It seems easy, but take great care with this process!

Creating an Authentication Function

Once you have your parameters set, you need to create a new blank query. Making this connection happen requires you to build out the query in code. However, I have made that easier for you. Just copy the code below and paste it into the advanced editor of your new query:

() =>
  let
    ContentBody = "Content-Type=application/x-www-form-urlencoded&
        scope=https://graph.microsoft.com/.default&
        grant_type=client_credentials&
        client_id=" & AppId & "&
        client_secret=" & ClientSecret, 
    Source = Json.Document(
      Web.Contents(
        "https://login.microsoftonline.com/" & TenantId & "/oauth2/v2.0/token", 
        [
          Content = Text.ToBinary(ContentBody), 
          Headers = [
            Accept = "application/json", 
            ContentType = "application/x-www-form-urlencoded"
            ]
        ]
      )
    )[access_token]
  in
    Source

Once we have pasted the code into the query, click next. After that, rename the query to Get-AuthToken. Again, capitalization is important so take care with this step!
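Under the hood, Get-AuthToken is the OAuth2 client credentials grant: a POST to the tenant's /oauth2/v2.0/token endpoint with a form-encoded body. As a sanity check on what that body should contain, this Python sketch builds the same fields with proper URL encoding (the M version concatenates the string by hand):

```python
from urllib.parse import urlencode

def build_token_request_body(app_id, client_secret):
    """Form body for the client credentials grant against
    https://login.microsoftonline.com/<TenantId>/oauth2/v2.0/token."""
    return urlencode({
        "scope": "https://graph.microsoft.com/.default",
        "grant_type": "client_credentials",
        "client_id": app_id,
        "client_secret": client_secret,
    })

body = build_token_request_body("my-app-id", "my-secret")
print("grant_type=client_credentials" in body)  # True
```

The JSON response from the token endpoint carries the access_token field, which is exactly what the M function extracts and returns.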

Now that we have our authentication function in place, it is time to take the next step towards building out our Microsoft Graph API functions to pull user information and Teams user activity.

Conclusion and Next Steps

There is a lot of setup involved in this process. Part of the reason is because of the complexity of the pull we need to perform, but also to avoid running into duplicates and reduce the chance for throttling with the endpoints. I know this seems like a lot of work, but I promise it will pay off in the end!

In my next post, we will finally grab our Graph API data and load it into our Lakehouse! Stay tuned to see how it happens!

Microsoft Graph API Reporting in Fabric – Part 1

The Microsoft Graph API library has been around since November 2015 and continues to expand. One of its key endpoint collections centers around utilization reports for the Microsoft 365 ecosystem. These reports can be leveraged to drive better overall adoption of the Microsoft 365 platform and to understand trends in use.

Leveraging the Microsoft Graph API, the Microsoft 365 Utilization Metrics app has been the easiest solution to deploy for this challenge. With a few clicks in AppSource, you can deploy the app within Power BI, update some details on the semantic model, and hit refresh to understand utilization. And while it does show data for the past year, customers frequently want more out of their data.

People forget about the Microsoft Graph API until new technology finds its way into the ecosystem. Enter stage left: Copilot for Microsoft 365, the newest belle at the ball! With new technology coming at a premium price point, companies want to understand utilization to ensure value. At $30 per user per month, Copilot suddenly has people wanting to confirm users are actually leveraging the tool. If not, they want to reclaim the license and assign it to someone else.

This gave me the idea to revisit this challenge and see how we could leverage Fabric to make it a reality. While the Microsoft 365 Roadmap has this rolling out soon, a little preparation will allow you to deliver this data when it is finally available.

What are the Microsoft Graph API Reports?

The Microsoft Graph API reports are a collection of endpoints that provide utilization data for users. Aggregated by action, these reports provide visibility into how Microsoft 365 is being used. While you can pull data for 7-, 30-, 90-, and 180-day periods, the data is aggregated by user. This poses a challenge in getting data into a meaningful format if you are looking to track trends over time.

While those periods work, there is another option. Most endpoints allow you to specify an individual date for your query. However, you can only collect data for dates within the past 30 days. This is a huge restriction and can limit your ability to trend data over a longer period of time.
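For example, the Teams user activity report endpoint accepts either style of query. The date shown below is illustrative:

// Aggregated over a trailing 30-day period
GET https://graph.microsoft.com/v1.0/reports/getTeamsUserActivityUserDetail(period='D30')

// A single day, which must fall within the last 30 days
GET https://graph.microsoft.com/v1.0/reports/getTeamsUserActivityUserDetail(date=2024-05-01)

This is why a warehousing strategy matters: the daily data ages out after 30 days, so we need to capture and store it before it disappears.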

In the past, if you did not have a data warehousing strategy in place, you were severely limited by what Power BI could do on its own. With the advent of Microsoft Fabric, however, we can simplify this process and enable you to track this data with minimal effort!

Preparing Access to Microsoft Graph API Reports

To get ready for the data warehousing process, you must deploy an app registration inside of Entra ID. To get started, sign into Entra ID, navigate to Applications, select App Registration, and start a new registration. Name your new app and keep everything else the same:

Screenshot of an app being registered in Entra ID
Registering an app via Entra ID

Once you click register, copy the Application (Client) ID and Tenant ID from the overview page and save them for later. Next, go to API permissions and click add a permission. Select Microsoft Graph and then application permissions. From there, you will need to add the following permissions:

User.Read.All: Provides dimensional data around users within your organization from Entra ID. While User.Read is added to the app by default, it will not work in this situation.
Reports.Read.All: Provides access to all reporting endpoints for Microsoft 365. This includes Microsoft Teams, SharePoint, OneDrive, Exchange, and Viva Engage.
Details about API permissions for your app

Once these permissions are assigned, grant admin consent for your organization. The permissions you added should have green checkmarks next to them:

Screenshot of API permissions granted for your application
Admin consent granted for API permissions

Lastly, you need to generate a client secret. Go to Certificates & Secrets and add a new client secret. Once created, make sure you copy the secret's value and save it. You cannot retrieve it again later, so if you lose it you will need to create a new one.

Turn off Report Masking

As a security feature, the Microsoft Graph API masks user identities in this data by default. This helps protect the anonymity of user activity in the tenant. However, if we want to track Copilot for Microsoft 365 utilization, the data cannot be anonymous. For some of you, this will not apply. For others, this process could require some approvals, so asking early will ensure you can get the report mask turned off ahead of time.

To turn off report masking, go to the Microsoft 365 Admin Center and expand Settings on the left side of the screen. From there, select Org settings and then select Reports. A panel will appear on the right side of the screen. Uncheck the option titled "Display concealed user, group, and site names in all reports":

Turning off the report masking in the Microsoft 365 admin center
Turn off the report masking in the Microsoft 365 Admin Center

It is important to note that this feature is either on or off for everyone in the organization for all reports. There are no granular controls for this feature. Therefore, make sure you understand the implications of turning off this setting and communicate it to fellow Microsoft 365 administrators who might be using these reporting features whether by API endpoint or in their respective administrator portal.

What else do I need to know?

This project will require you to understand how to use REST APIs in Dataflows. If you have not read my series on leveraging REST APIs in Power BI, this would be a good place to start. We will need to build a number of connections leveraging this method.

Next, we will have to navigate an authentication process that will require us to use the POST method. I have also written an article on navigating this challenge. This will be a little more complicated, but the code found in this article will be helpful for understanding how we will build a function to drive authentication for these endpoints.

Lastly, you will find yourself writing some Power Query M code. If you are new to this, the code will be provided to assist you. However, you will want to learn a little more about the language so you understand the method behind the madness.

Conclusion and Next Steps

This setup only takes a few minutes, but it will go a long way in preparing our environment. As mentioned above, if you are not familiar with using REST APIs in Power Query, please take some time to learn about them. I promise you that you will thank me later. This can get very complicated, so understanding the process ahead of time goes a long way!

This will be enough to get things started for right now. In my next article, we will explore setting up our Fabric environment to query this data and prepare it for the data warehouse.

Simulation Integration with Power Apps

Simulating scenarios with Power BI is a common use case that organizations need to get the most out of their data. Being able to manipulate elements of the data to simulate an outcome brings deeper insights and accelerates decision making. Everyone says “what if”, but Power BI paired with Power Apps can help you build the simulation and put it into action.

About two years ago, I wrote an article on how to simulate scenarios with numeric parameters. This solution still applies and works well for most scenarios. However, there is a drawback: if you have a lot of inputs, you could end up with a painful user experience. The constant recalculation of your measures can bog down the report and eat away at your capacity.

While not a solution for every scenario, we can use Power Apps to achieve the same outcome. In our scenario for this article, we want to drive profitability by manipulating a profit and loss statement. We want to adjust inputs such as revenue, expenses, headcount, and taxes. Because of the numerous inputs that drive a profit and loss statement, this is a perfect example for Power Apps.

Preparing the Report Elements

To start, we are using an existing report that contains a profit and loss statement. We have a matrix visual that shows the lines of the statement and their associated values. Notice that each line is labeled with a section letter and line number, which makes it easy to identify each element:

Screenshot of a report canvas in Power BI Desktop with a profit and loss statement.
Profit and loss statement within Power BI Desktop

To set up our simulation, we need measures to aggregate the data and bring it into the app. You will note that I created a Simulation App Measures query and added a measure for each line item. This will make it easy to bring the data into Power Apps.
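As a sketch, each line-item measure can be a simple filtered aggregation named after its section letter and line number. The table and column names below are illustrative, since your model will differ:

A1 =
CALCULATE (
    SUM ( ProfitAndLoss[Amount] ),
    ProfitAndLoss[LineItem] = "A1"
)

Naming the measures after the line labels keeps them easy to reference once they land in Power Apps.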

Lastly, you will note that I left space on the right-hand side of the canvas to make it easy to add our application. While we are adding a Power App, it behaves like any other visualization and requires the appropriate space. While it is perfectly fine to place the app on its own page, we want to keep it close to our profit and loss statement for comparison.

Preparing the Power App

While it seems simple, we need to take a few steps to integrate Power Apps into our report. The first step is to select the Power App visual in the visualization pane. Once you add the Power App visual, it provides instructions for the next few steps:

Screenshot of adding a Power App visual to Power BI Desktop
Adding the Power App visual to our report canvas.

Once we have the visual in place, we need to add fields from our data pane to the visual. This data is the base for supporting your simulation. Once the data has been selected, we can select which Power Apps environment we want to use. Next, if we had an existing app, we could select it here. However, we have not created the app yet, so we are going to create one from scratch. When you do this, you will be warned that a new window is opening so you can edit the app in Power Apps Studio.

Creating Your Simulation App

Now, there are a lot of components that go into a Power App. Because of those complexities, I am going to direct you to Microsoft Learn for the full details. However, I will cover a few basics to help you get started.

Basic App Settings

There are a number of settings you will need to put in place, such as the name of the app, the app icon, and the orientation of the application. If you go to the three dots at the end of the menu and select settings, you can set all of these items. You can also set other items such as the background color or images:

Screenshot of Power Apps Studio and how to get to the app settings.
Adjust settings for your app within Power Apps Studio

You will also note that there is an option for version history. If you are new to Power Apps, this is a great tool in the event you make a mistake. You can easily roll back your version if you break something and need to go back. Just remember to save versions periodically with comments so you know where you are in your versioning.

Using Data for Simulation

If you are familiar with Power Apps, you know you can bring data from various sources into a single application. However, if you go to the data view, you will find it empty. You need data to build your simulation, so where is it? There is an item in your tree view called "PowerBIIntegration" which holds your data in a tabular format:

Screenshot of Power Apps Studio showing the PowerBIIntegration data source for simulation.
The PowerBIIntegration provides the simulation data for our app.

Since we put only measures into this app and no dimensions, the table only has a single row. However, we still need to aggregate our data. To do this, you will need to write a little code in Power Apps to display the desired value in a text box. For example, to show the total gross revenue (the measure labeled A1 in our profit and loss statement), you would use this code:

Text(Sum([@PowerBIIntegration].Data, A1), "$#,0")

This code aggregates the A1 column from [@PowerBIIntegration].Data and formats it as currency for display. You do not have to go this far with the formatting, but I really like to make sure it is correct. It requires a little extra work, but it goes a long way.
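To make this a true simulation, you would pair these values with input controls and fold their values into the calculation. A minimal sketch in Power Fx, assuming a slider control named RevenueSlider that adjusts revenue by a percentage (the control name is illustrative):

// Display gross revenue adjusted by the slider's percentage value
Text(
    Sum([@PowerBIIntegration].Data, A1) * (1 + RevenueSlider.Value / 100),
    "$#,0"
)

Because the recalculation happens inside the app rather than in your DAX measures, sliding the control does not trigger the constant measure recalculation we were trying to avoid.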

Testing, Saving, and Publishing the Simulation App

Once you have something ready, you can click the play button on the menu bar to test your application and ensure it works. You can also hold down the ALT key to test elements directly on the app canvas. Once you have it tested, it is time to publish.

Screenshot of Power Apps Studio highlighting the play button, save button, and publish button.
Test, save, and publish your reports from Power Apps Studio

Once it is ready, click the save button to ensure your changes are captured. Next, click the publish button to push the changes live. This makes the app viewable inside Power BI and available for your report consumers to use. Depending on the environment settings, you might need to share the application with others, but that can easily be done by using the same Entra ID or Active Directory security group you leveraged to provide access to the report in the Power BI Service. When you are finished, the report canvas will have the app integrated right on the page:

Screenshot of the simulation Power App inside of Power BI Desktop.
The final product – Simulation app inside of the Power BI Report

Common Questions with Power Apps

Every time I bring up this solution, I receive a lot of questions. They are often variations of the same three topics, so I wanted to address them here, as I am guessing you have the same questions as well.

Question 1 – Licensing and Cost

We all know Microsoft does not give much away for free. However, with how this app is designed, there is no additional impact to your licensing cost. Power Apps is using seeded capabilities from your base Microsoft 365 licensing and therefore does not cost you anything extra. While licensing changes often, I recommend you check out Michael Heath’s blog post on Power Platform licensing if you are new to it.

Question 2 – App Use Outside of Power BI

A lot of people ask about being able to leverage the app outside of Power BI. The answer to this is a hard no. We must keep in mind that the app is leveraging data from your Power BI report and filtering based upon the report context. Therefore, the app is not usable outside of the report. However, you could easily re-deploy it in another report, assuming you leverage the same measure names to load the data into the app.

Question 3 – Power Apps and Writeback

The last question I receive is if Power Apps provides capabilities for writeback to data sources. The short answer is yes, but depending on your data source additional licensing might be required. There are tools out there that allow you to perform write back with less complexity, but Power Apps absolutely works for this. It is up to you if it is worth the effort and cost to make it happen, but it is absolutely possible.

Conclusion

People forget that Power BI is actually part of the Power Platform suite, and with that, the opportunities to integrate these apps together. This example shows how you can integrate Power Apps to extend your report's capabilities with little effort. If you are not an expert, do not be afraid to use Copilot for Power Apps to help you out as well!

Have you ever integrated a Power App into a Power BI report? What was your use case? Did you have any challenges in making it happen? Tell me in the comments below!
