Data with Dom

Making sense of challenging data topics one step at a time.

Microsoft Graph API Reporting in Fabric – Part 3

Welcome back to my series on the Microsoft Graph API. We have spent a lot of time laying the groundwork for our data. In this article, it is all about getting our data into our Lakehouse so we can start to trend our data.

As a quick reminder, the purpose of us having to go through this exercise is because we must retain data on our own. The Microsoft Graph API will only furnish detailed data over the past thirty days. Any further back and it is lost. Therefore, if we want to trend data over three months, we must warehouse it.

In this article, we will be finalizing our queries and load them into our Lakehouse. To get there, we are going to go through some deep levels of M Query. However, hang in there until the end, I should be able to make it easier for you!

Getting Teams User Activity Detail

To get started, we are going to have to write out some code. While you can just copy/paste the code into this article, I think it is worth walking through together to help you understand the method behind the madness. It seems complicated – and it really is. But understanding the rationale of what was done will help you with your own projects. To help us on this journey, you will need to create three more parameters to simplify the process:

Parameter NameParameter TypeParameter Value
LakehouseIdTextValue of Lakehouse ID from URL
WorkspaceIdTextValue of Workspace ID from URL
HistoricalDayCountDecimal NumberList of values (5, 15, 27) with the default set to 5
Parameters required for capturing Microsoft Graph API data

With your parameters in place, it is time to write out our query. Open a new blank query and strap in!

Start With a Function

Because we are pulling data for a single date at a time, we will need to create a function. We have done this before, but this time we are going to embed it within an existing query:

let
  GraphDataFunction = (ExportDate as date) =>
    let
      GraphData = Web.Contents("https://graph.microsoft.com/v1.0/reports/", 
        [RelativePath="getTeamsUserActivityUserDetail(date=" & Date.ToText(ExportDate, "yyyy-MM-dd") & ")",
        Headers=[Accept="application/json",Authorization="Bearer " & #"Get-AuthToken"()]])
    in
      GraphData
in
  GraphDataFunction

We need to call a few things out. First and foremost, you will note we are using the RelativePath option. This is because our dates will be updated with each all of the function, therefore we must use this option to avoid errors.

Next, you will notice the Authorization header is using our Get-AuthToken function we created in part 2 of this series. This is key to ensure we use our token to complete the authentication process.

Checking the Lakehouse for Existing Data

We will need to ensure we only query Graph data that is not already in the Lakehouse. Because of how finicky the Graph API can be, we cannot guarantee data to be available the next day. Sometimes it is a few days behind. We also do not want duplicate data to be added to the Lakehouse.

To prevent this from happening, we will query the Lakehouse and the tables we created. By using the parameters we created, we will call our Lakehouse and pull our historical data so we can transform it:

let
  GraphDataFunction = (ExportDate as date) =>
    let
      GraphData = Web.Contents("https://graph.microsoft.com/v1.0/reports/", 
        [RelativePath="getTeamsUserActivityUserDetail(date=" & Date.ToText(ExportDate, "yyyy-MM-dd") & ")",
        Headers=[Accept="application/json",Authorization="Bearer " & #"Get-AuthToken"()]])
    in
      GraphData, 
  LakehouseConnection = Lakehouse.Contents(null){[workspaceId = WorkspaceId]}[Data]{[lakehouseId = LakehouseId]}[Data],
  HistoricalExports = LakehouseConnection{[Id = "teams_user_activity", ItemKind = "Table"]}[Data],
  ReportRefreshDates = Table.SelectColumns(HistoricalExports, {"Report_Refresh_Date"}),
  FilterRefreshDates = Table.SelectRows(ReportRefreshDates, each [Report_Refresh_Date] <> null and [Report_Refresh_Date] <> ""),
  HistoricalDatesExported = Table.Distinct(FilterRefreshDates, {"Report_Refresh_Date"})
in
  HistoricalDatesExported

These five lines of code will allow us to connect to our teams_user_activity table, remove unnecessary columns, remove duplicates, and format into a usable column. This will be empty for right now, but more will be added as we move forward.

Create a Range of Dates and Merge with Historical Data

Next, we will pull data from the last few days. To make life easier on us, we will limit the pull to five days to speed up our development process. We just need to use today’s date as a base and create a range going back five days that we can use:

let
  GraphDataFunction = (ExportDate as date) =>
    let
      GraphData = Web.Contents("https://graph.microsoft.com/v1.0/reports/", 
        [RelativePath="getTeamsUserActivityUserDetail(date=" & Date.ToText(ExportDate, "yyyy-MM-dd") & ")",
        Headers=[Accept="application/json",Authorization="Bearer " & #"Get-AuthToken"()]])
    in
      GraphData, 
  LakehouseConnection = Lakehouse.Contents(null){[workspaceId = WorkspaceId]}[Data]{[lakehouseId = LakehouseId]}[Data],
  HistoricalExports = LakehouseConnection{[Id = "teams_user_activity", ItemKind = "Table"]}[Data],
  ReportRefreshDates = Table.SelectColumns(HistoricalExports, {"Report_Refresh_Date"}),
  FilterRefreshDates = Table.SelectRows(ReportRefreshDates, each [Report_Refresh_Date] <> null and [Report_Refresh_Date] <> ""),
  HistoricalDatesExported = Table.Distinct(FilterRefreshDates, {"Report_Refresh_Date"}),
  TodaysDate = Date.AddDays(Date.From(DateTime.LocalNow()), -5),
  ListDatesForExport = List.Dates(TodaysDate, 5, #duration(1, 0, 0, 0)),
  TableOfDates = Table.FromList(ListDatesForExport, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  RenameDateColumn = Table.RenameColumns(TableOfDates, {{"Column1", "ExportDates"}}),
  SetProperDataType = Table.TransformColumnTypes(RenameDateColumn, {{"ExportDates", type date}}),
  MergeHistoricalData = Table.NestedJoin(SetProperDataType, {"ExportDates"}, HistoricalDatesExported, {"Report_Refresh_Date"}, "HistoricalDatesExported", JoinKind.LeftOuter),
  ExpandHistoricalDates = Table.ExpandTableColumn(MergeHistoricalData, "HistoricalDatesExported", {"Report_Refresh_Date"}, {"Report_Refresh_Date"}),
  IdentifyMissingData = Table.SelectRows(ExpandHistoricalDates, each ( [Report_Refresh_Date] = null))
in
  IdentifyMissingData

Once we have a list of dates in place, we will merge it to our historical data from the HistoricalDatesExported step. From there, expand the values and filter out the null values so we only have dates where we are missing data left.

Execute the Microsoft Graph API Function

Reaching all the way back to the first step, GraphDataFunction, we are going to pass our list of dates through to get our data. We will pass our column and pull our data through:

let
  GraphDataFunction = (ExportDate as date) =>
    let
      GraphData = Web.Contents("https://graph.microsoft.com/v1.0/reports/", 
        [RelativePath="getTeamsUserActivityUserDetail(date=" & Date.ToText(ExportDate, "yyyy-MM-dd") & ")",
        Headers=[Accept="application/json",Authorization="Bearer " & #"Get-AuthToken"()]])
    in
      GraphData, 
  LakehouseConnection = Lakehouse.Contents(null){[workspaceId = WorkspaceId]}[Data]{[lakehouseId = LakehouseId]}[Data],
  HistoricalExports = LakehouseConnection{[Id = "teams_user_activity", ItemKind = "Table"]}[Data],
  ReportRefreshDates = Table.SelectColumns(HistoricalExports, {"Report_Refresh_Date"}),
  FilterRefreshDates = Table.SelectRows(ReportRefreshDates, each [Report_Refresh_Date] <> null and [Report_Refresh_Date] <> ""),
  HistoricalDatesExported = Table.Distinct(FilterRefreshDates, {"Report_Refresh_Date"}),
  TodaysDate = Date.AddDays(Date.From(DateTime.LocalNow()), -5),
  ListDatesForExport = List.Dates(TodaysDate, 5, #duration(1, 0, 0, 0)),
  TableOfDates = Table.FromList(ListDatesForExport, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  RenameDateColumn = Table.RenameColumns(TableOfDates, {{"Column1", "ExportDates"}}),
  SetProperDataType = Table.TransformColumnTypes(RenameDateColumn, {{"ExportDates", type date}}),
  MergeHistoricalData = Table.NestedJoin(SetProperDataType, {"ExportDates"}, HistoricalDatesExported, {"Report_Refresh_Date"}, "HistoricalDatesExported", JoinKind.LeftOuter),
  ExpandHistoricalDates = Table.ExpandTableColumn(MergeHistoricalData, "HistoricalDatesExported", {"Report_Refresh_Date"}, {"Report_Refresh_Date"}),
  IdentifyMissingData = Table.SelectRows(ExpandHistoricalDates, each ( [Report_Refresh_Date] = null)),
  ExecuteGraphFunction = Table.AddColumn(IdentifyMissingData, "Invoked custom function", each GraphDataFunction([ExportDates])),
  FilterMissingData = Table.SelectRows(ExecuteGraphFunction, each [Attributes]?[Hidden]? <> true),
  MergeExportedFiles = Table.AddColumn(FilterMissingData, "Transform file", each #"Transform file"([Invoked custom function], 33)),
  RemoveNonDataColumns = Table.SelectColumns(MergeExportedFiles, {"Transform file"}),
  ExpandResults = Table.ExpandTableColumn(RemoveNonDataColumns, "Transform file", {"Report Refresh Date", " User Id", " User Principal Name", " Last Activity Date", " Is Deleted", " Deleted Date", " Assigned Products", " Team Chat Message Count", " Private Chat Message Count", " Call Count", " Meeting Count", " Meetings Organized Count", " Meetings Attended Count", " Ad Hoc Meetings Organized Count", " Ad Hoc Meetings Attended Count", " Scheduled One-time Meetings Organized Count", " Scheduled One-time Meetings Attended Count", " Scheduled Recurring Meetings Organized Count", " Scheduled Recurring Meetings Attended Count", " Audio Duration In Seconds", " Video Duration In Seconds", " Screen Share Duration In Seconds", " Has Other Action", " Urgent Messages", " Post Messages", " Tenant Display Name", " Shared Channel Tenant Display Names", " Reply Messages", " Is Licensed", " Report Period"}),
  FilterMissingResults = Table.SelectRows(ExpandResults, each [Report Refresh Date] <> null)
in
  FilterMissingResults

After we execute the function, we will combine our files and expand the results. To do this, we are using a custom function titled Transform file to combine our files seamlessly:

(Parameter as binary, ColumnCount as number) => let
  Source = Csv.Document(Parameter, [Delimiter = ",", Columns = ColumnCount, Encoding = 65001, QuoteStyle = QuoteStyle.None]),
  #"Promoted headers" = Table.PromoteHeaders(Source, [PromoteAllScalars = true])
in
  #"Promoted headers"

The result should be something that looks like this:

Screenshot of Microsoft Graph API report data.
Screenshot of the Microsoft Graph API report data for Teams

Wow – that was a whole lot for a single endpoint. Only four more applications left! Frustrated that you will have to do this again? Well, instead of being frustrated, let’s make it a little easier.

Turning Microsoft Graph API Reports into a Flexible Function

We have worked so hard to get the first query in place. But if we think through things a little further, we can simplify the process for future endpoints. Instead of going through this entire process over and over again, we can convert this to a function and pass data to make it easier for us to use.

Prepare Your Endpoint Data

To simplify our process, we will start by creating a table we can reference. This will highlight the application, Microsoft Graph API endpoint, Fabric Lakehouse table, columns pulled from the endpoint, and the column names. Create a blank query called Endpoints and add the following code:

let
  Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3VXbbtswDP0Vwc/pQ9cvCOIMDZCtRt0FGIIiECzOIWpTgURny75+lOPc5aB+3YttkkeUeXRILZfJtNZYJaMEwnvVeHArXTBukXfiLYFbwLhz/ZB4Crxf8vgkj1fYWMfqFX458GuVaoaRCjCVOaQCN7pS33UtzhT9ptK7zpp5lUIFDGZ0+OgWz7Vnddiw8+VARk1sQzySrQrALZxMfQx9A2CkUk0c6JDwyj0jBifVnSJj77EksTNnTVOwD/nagjJwaE3yPlomLwSpkx2lWivY8Bkj6gCLc/U4iKth7CwQfgvgxampwQD9itWRn3xHxbVrrZ24WjpIV9UuFp3+iUY/xVibIrNILFX7YGyCEWPtBI3z9uU/4W2Bvt0i0+UwLt9A115I4PCOMdgCAhtRAp/uN+nM9DAZ46uf3Ugl4bfUZK1Z2s/7s6plp60sisYmwttNM1+YXo6r1IR/b9rbqzGzzImz7jbq2RZ3Vl4DrhPkxRpMUwWNEDww1nAn2T1wf2KZZo1zoc7PZI6gb4puDFqVNk4zWhKxyvAsLBkfNGigJ5QXDoD2Go4jnrX8Ga9FKkEVlkQ3rgQ6HqFAMuvPzTcgLfHLud91iRw+EVQxzL4LpH1OmUR4cyyAfFDebYssRFBqSqWARexbsVbQWrF2+anruqdXBg7py8JaSM7764o7gZNIu3/+BL7AxC6yOX4MuaTe/wE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Application = _t, Lakehouse_Table = _t, Graph_Endpoint = _t, Column_Count = _t, Column_Mapping = _t]),
  #"Changed column type" = Table.TransformColumnTypes(Source, {{"Column_Count", Int64.Type}, {"Application", type text}, {"Lakehouse_Table", type text}, {"Graph_Endpoint", type text}, {"Column_Mapping", type text}})
in
  #"Changed column type"

You can choose to pull this data from somewhere else if it is easier for you, but this table is easy to maintain in this scenario.

Convert our Query to a Function

Our next step is to take the first query we created in this article and make it a function. We will start by adding the parameter for an application at the top and filter the Endpoints table with the value passed in the parameter:

(Application as text) =>
let
  EndpointTable = Endpoints,
  EndpointData = Table.SelectRows(EndpointTable, each [Application] = Application),
  GraphDataFunction = (ExportDate as date) =>
    let
      GraphData = Web.Contents("https://graph.microsoft.com/v1.0/reports/", 
        [RelativePath=EndpointData{0}[Graph_Endpoint] & "(date=" & Date.ToText(ExportDate, "yyyy-MM-dd") & ")",
        Headers=[Accept="application/json",Authorization="Bearer " & #"Get-AuthToken"()]])
    in
      GraphData, 

  //Endpoint = "teams_user_activity",
  LakehouseConnection = Lakehouse.Contents(null){[workspaceId = WorkspaceId]}[Data]{[lakehouseId = LakehouseId]}[Data],
  HistoricalExports = LakehouseConnection{[Id = EndpointData{0}[Lakehouse_Table], ItemKind = "Table"]}[Data],
  ReportRefreshDates = Table.SelectColumns(HistoricalExports, {"Report_Refresh_Date"}),
  FilterRefreshDates = Table.SelectRows(ReportRefreshDates, each [Report_Refresh_Date] <> null and [Report_Refresh_Date] <> ""),
  HistoricalDatesExported = Table.Distinct(FilterRefreshDates, {"Report_Refresh_Date"}),
  TodaysDate = Date.AddDays(Date.From(DateTime.LocalNow()), -HistoricalDayCount),
  ListDatesForExport = List.Dates(TodaysDate, HistoricalDayCount, #duration(1, 0, 0, 0)),
  TableOfDates = Table.FromList(ListDatesForExport, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  RenameDateColumn = Table.RenameColumns(TableOfDates, {{"Column1", "ExportDates"}}),
  SetProperDataType = Table.TransformColumnTypes(RenameDateColumn, {{"ExportDates", type date}}),
  MergeHistoricalData = Table.NestedJoin(SetProperDataType, {"ExportDates"}, HistoricalDatesExported, {"Report_Refresh_Date"}, "HistoricalDatesExported", JoinKind.LeftOuter),
  ExpandHistoricalDates = Table.ExpandTableColumn(MergeHistoricalData, "HistoricalDatesExported", {"Report_Refresh_Date"}, {"Report_Refresh_Date"}),
  IdentifyMissingData = Table.SelectRows(ExpandHistoricalDates, each ( [Report_Refresh_Date] = null)),
  ExecuteGraphFunction = Table.AddColumn(IdentifyMissingData, "Invoked custom function", each GraphDataFunction([ExportDates])),
  FilterMissingData = Table.SelectRows(ExecuteGraphFunction, each [Attributes]?[Hidden]? <> true),
  MergeExportedFiles = Table.AddColumn(FilterMissingData, "Transform file", each #"Transform file"([Invoked custom function], EndpointData{0}[Column_Count])),
  RemoveNonDataColumns = Table.SelectColumns(MergeExportedFiles, {"Transform file"}),
  ExpandResults = Table.ExpandTableColumn(RemoveNonDataColumns, "Transform file", Text.Split(EndpointData{0}[Column_Mapping], ", ")),
  FilterMissingResults = Table.SelectRows(ExpandResults, each [Report Refresh Date] <> null)
in
  FilterMissingResults

From there, we will call the data from the EndpointData step and inject it into our query with the following items:

  • In the GraphDataFunction step, place a dynamic endpoint name
  • In the HistoricalExports step, place a dynamic Lakehouse table name
  • In the MergeExportedFiles step, place a dynamic column count in the Transform file function
  • In the ExpandResults step, place dynamic column mappings and wrap it with a Text.Split() function

Now that is completed, we can call the function by typing in the application name and clicking invoke:

Screenshot of us invoking our Microsoft Graph API reports function
Invoking our Microsoft Graph API reports function.

With this in place, we can invoke our function for OneDrive, SharePoint, Email, and Viva Engage with a few clicks!

Get Current Users from Entra ID

For our last step, we just need to pull our users out of Entra ID into our Dataflow. To make this happen, I like to use a function to make it easier. I created a function titled Get-Users and invoked it with this code:

() =>
let
  GetUserDetail = (Params) =>
  let    
    Path = if Text.Length(Params) > 0  then Params
      else "$select=displayName,givenName,jobTitle,mail,officeLocation,surname,userPrincipalName,id,accountEnabled,userType,companyName,department,employeeType&$expand=manager",  
    Source = Json.Document(Web.Contents("https://graph.microsoft.com/v1.0/users",[RelativePath = "?" & Path, Headers=[Authorization="Bearer " & #"Get-AuthToken"()]])),  
    NextList = @Source[value], result = try @NextList & @GetUserDetail(Text.AfterDelimiter(Source[#"@odata.nextLink"], "?")) otherwise @NextList    
  in
    result,
  UserDetail = GetUserDetail("")
in
  UserDetail

You will notice the function inside of the function. This is because we will need to iterate through the entire user list to receive all users. Because the API limits us to batches of 1,000, we will need to work through this a few times to get the results we desire.

HELP! THIS IS TOO MUCH!

Is your head spinning? Mine too. In fact, this was a lot to even get this coded. And if I ever want to reuse this, I do not want to go through any of this level of effort again. I am probably going to be like that math teacher you had where you learned how to do everything the hard way to pass an exam and then the next chapter they teach you a way to complete a problem easier. I hated it when that happened to be, but I am still going to do it to you.

To streamline this, I created a Power Query template for this project. You can download the template and import it into your new Dataflow:

Screenshot of the Microsoft Graph API Power Query template being imported into a dataflow.
Importing a Power Query template for our Microsoft Graph API reports

Once imported, you will need to update the following parameters:

  • TenantId
  • AppId
  • ClientSecret
  • WorkspaceId
  • LakehouseId

This should streamline the process considerably if you are just getting started!

From Dataflow to Lakehouse

We now just need to map our data into our Lakehouse. This will take a little diligence, but it should move along pretty quickly. We just need to match our queries and columns from our Dataflows into tables within our Lakehouse.

Adding Teams Data to the Lakehouse

Starting with our Teams user activity data, we will go to the bottom right-hand corner of the screen and click the + icon to add it to our Lakehouse. Once you have a connection to your Lakehouse, we will use an existing table and navigate to the table where your data is stored:

Screenshot of placing Microsoft Graph API data into our Lakehouse
Adding Teams data into our Lakehouse

Once we have selected our table, we need to map our columns. We also need to update the setting that will allow us to append our data:

Mapping our Microsoft Graph API report fields into our Fabric Lakehouse
Mapping columns and appending our data

Now we just need to save our settings. Once we publish and refresh our dataflow, our data will be added into the Lakehouse. We just need to repeat these steps for OneDrive, SharePoint, Email, and Viva Engage.

Adding Graph Users to the Lakehouse

If you remember from my last article, we created all of the table for our Microsoft Graph API report endpoints. However, we did not need to create anything for our users. Why is that?!

The answer is simple – with our approach, we will optimize the refresh experience and eliminate data duplication for these dates. This is because it is a moving target and hard to hit. However, with our users, it is a different story as we will replace our data each time. So, just like we did before, we will go ahead and add our Lakehouse as a data destination. However, this time we will create a new table. Once in place, we can make adjustments to our mappings and settings:

Screenshot of our Entra ID users mapping to the Fabric Lakehouse
Mapping Entra ID users to the Lakehouse

Once completed, we can click save and move to publishing our Dataflow.

One More Thing Before We Publish!

Last thing we need to do is adjust our HistoricalDayCounter from 5 days to 27 days. This will ensure that all the historical data available is captured in the refresh and will be loaded into the Lakehouse:

Screenshot of the HistoricalDayCounter parameter being updated to 30 days.
Update our HistoricalDayCounter to 27 days

Once completed, we can click publish. It will take a few minutes to validate the Dataflow. After that, we just need to setup a daily refresh and our data will start to populate into the Lakehouse!

Next Steps and Conclusion

Alright, that was a ton of information in a single article. It was hard to split up, but I wanted to keep it together in a single article as these pieces were all interconnected. It is a lot, but it is complete. Now that you have your data, you can create your model, measures, and start building out your reports.

In other news, I have been able to get the endpoint for Copilot. However, there are some data quality challenges that I want to address before doing a full write-up. In the meantime, this will get you started with understanding how complicated this level of reporting can be.

Did you make it to the end? Were you successful? If so, tell me about it in the comments below!

Microsoft Graph API Reporting in Fabric – Part 2

In my last post, I went over they why and covered the building blocks for connecting to the Microsoft Graph API for reporting. With this in place, it is time to prepare the landing zone for our data in Microsoft Fabric! To achieve this, we will need to setup our Fabric environment so we can start to build out our solution.

Before we get started, if you do not have Microsoft Fabric, do not fear! You can start a Microsoft Fabric trial for 60 days for free. When you sign into Power BI, you can click on your user avatar, and start your trial. If this is not available to you, it might be because your Fabric Administrator has turned off the trial feature. You might need to submit a request to re-enable this feature.

Creating Your Lakehouse

To get things started, we need to create a new workspace that has a Fabric capacity assigned to it. Just like you would before with Power BI, create a new workspace inside of Microsoft Fabric and ensure it is tied to your capacity.

Once you have your workspace in place, it is time to build your Lakehouse. If you go to the new button in the upper left hand corner of the workspace window, a screen will appear where you can select a Lakehouse:

Screenshot of the screen where you can select a Lakehouse to deploy.
Deploying a Lakehouse in Microsoft Fabric

From there, you can give your Lakehouse a name and click create. You will be redirected to your new Lakehouse. Once created, we will need to extract the Workspace ID and Lakehouse ID from the URL. The easiest way to capture this is to just copy the URL in the address bar to a notepad where you kept your Service Principal data in part 1 of this series. Once you capture this information, you can go can close your Lakehouse:

https://app.powerbi.com/groups/<WorkspaceId>/lakehouses/<LakehouseId>?experience=power-bi

Preparing Your Lakehouse

Now that you have a Lakehouse in place, it is time to prepare the tables. While there is an opportunity to create these tables directly out of our Dataflow, we will want to create a few ahead of time. This is simply to reduce the amount of effort on your side. The goal is to make it easy, but the process can be complex. This should make the process a little more seamless for you as a result.

While I am going to give you a Notebook you can upload, we will pause and create one from scratch so you understand the process. When you click new item in your workspace, we will scroll down to prepare data and select Notebook:

Screenshot of adding a Notebook to your Fabric Workspace.
Adding a Notebook to your Fabric Workspace

We will add our Lakehouse to the Notebook on the left hand side. Once in place, we can add our code using PySpark into a cell:

Screenshot of our Notebook in Microsoft Fabric that has been connected to a Lakehouse and PySpark code to create a table.
Preparing our Notebook to build tables in our Lakehouse

Using documentation from the Microsoft Graph API, I was able to identify the endpoints and their associated fields. This allowed me to create a block of code that would create our teams_user_activity table:

#Build out teams_user_activity table in the Lakehouse

from pyspark.sql.types import StructType, StructField, IntegerType, StringType, DateType, BooleanType
schema = StructType([
    StructField("Report_Refresh_Date", DateType(), True),
    StructField("User_Id", StringType(), True),
    StructField("User_Principal_Name", StringType(), True),
    StructField("Last_Activity_Date", DateType(), True),
    StructField("Is_Deleted", BooleanType(), True),
    StructField("Deleted_Date", StringType(), True),
    StructField("Assigned_Products", StringType(), True),
    StructField("Team_Chat_Message_Count", IntegerType(), True),
    StructField("Private_Chat_Message_Count", IntegerType(), True),
    StructField("Call_Count", IntegerType(), True),
    StructField("Meeting_Count", IntegerType(), True),
    StructField("Meetings_Organized_Count", IntegerType(), True),
    StructField("Meetings_Attended_Count", IntegerType(), True),
    StructField("Ad_Hoc_Meetings_Organized_Count", IntegerType(), True),
    StructField("Ad_Hoc_Meetings_Attended_Count", IntegerType(), True),
    StructField("Scheduled_One-time_Meetings_Organized_Count", IntegerType(), True),
    StructField("Scheduled_One-time_Meetings_Attended_Count", IntegerType(), True),
    StructField("Scheduled_Recurring_Meetings_Organized_Count", IntegerType(), True),
    StructField("Scheduled_Recurring_Meetings_Attended_Count", IntegerType(), True),
    StructField("Audio_Duration_In_Seconds", IntegerType(), True),
    StructField("Video_Duration_In_Seconds", IntegerType(), True),
    StructField("Screen_Share_Duration_In_Seconds", IntegerType(), True),
    StructField("Has_Other_Action", StringType(), True),
    StructField("Urgent_Messages", IntegerType(), True),
    StructField("Post_Messages", IntegerType(), True),
    StructField("Tenant_Display_Name", StringType(), True),
    StructField("Shared_Channel_Tenant_Display_Names", StringType(), True),
    StructField("Reply_Messages", IntegerType(), True),
    StructField("Is_Licensed", StringType(), True),
    StructField("Report_Period", IntegerType(), True)
])
df = spark.createDataFrame([], schema)
df.write.format("delta").saveAsTable("teams_user_activity")

In our scenario, we will need five tables for Teams, OneDrive, SharePoint, Email, and Viva Engage. To simplify the process for you, I have created an notebook you can upload into Fabric and run. You just need to connect your Lakehouse and select run all. To perform this task, you must switch to the Data Engineering context of Fabric and select Import on your workspace:

Screenshot of the process to import a notebook into your Fabric Workspace
Importing a Notebook into your Microsoft Fabric Workspace

Once you have run the script successfully, you can go to your SQL Endpoint for the Lakehouse and verify the tables have been created and that they are empty with the following query:

SELECT * FROM teams_user_activity ORDER BY Report_Refresh_Date DESC;

SELECT * FROM onedrive_user_activity ORDER BY Report_Refresh_Date DESC;

SELECT * FROM sharepoint_user_activity ORDER BY Report_Refresh_Date DESC;

SELECT * FROM email_user_activity ORDER BY Report_Refresh_Date DESC;

SELECT * FROM viva_engage_user_activity ORDER BY Report_Refresh_Date DESC

Preparing Your Dataflow

After creating your Lakehouse and preparing your tables, you can create a Dataflow in the same workspace. This will query the data and push it to the right tables within our Lakehouse. However, you need to be careful to select a Dataflow Gen2 as opposed to a Dataflow:

Screenshot of the screen where you can select a Dataflow Gen2 to deploy.

Adding a Dataflow Gen2 to your Fabric workspace

This is critical for us to ensure our data makes it into the Lakehouse later on, so be careful with your selection! Once you select it, you can rename your Dataflow in the upper left hand corner:

Screenshot of renaming your Dataflow in Microsoft Fabric.
Renaming your Dataflow

With your Lakehouse and Dataflow created, it is time to start getting some data.

Connecting to the Microsoft Graph API

If you remember from my last article, we will be using the application permissions with our app registration. As a result, we must acquire a OAuth2 token to facilitate the authentication process. Luckily, we made it easy for you to deploy this in your solution.

Adding Parameters in Power Query

I covered the basics of Parameters in Power Query in an older article. Instead of going through the details, I will just highlight what we need to have created.

To facilitate our authentication, we need to create three parameters that will be used to hold data from our app registration. The three parameters we need are as follows:

Parameter NameParameter TypeParameter Value
TenantIdTextDirectory (tenant) ID
AppIdTextApplication (client) ID
ClientSecretTextClient secret value
Parameters for our authentication process

Remember, capitalization counts! If you copy/paste the parameters above, you will not have an issue with the next step. It seems easy, but take great care with this process!

Creating an Authentication Function

Once you have your parameters set, you need to create a new blank query. To create what we need to make this connection happen, it requires you to build out the query in code. However, I have made that easier for you. Just copy the code below and paste it into the advanced editor with your new query:

() =>
  let
    ContentBody = "Content-Type=application/x-www-form-urlencoded&
        scope=https://graph.microsoft.com/.default&
        grant_type=client_credentials&
        client_id=" & AppId & "&
        client_secret=" & ClientSecret, 
    Source = Json.Document(
      Web.Contents(
        "https://login.microsoftonline.com/" & TenantId & "/oauth2/v2.0/token", 
        [
          Content = Text.ToBinary(ContentBody), 
          Headers = [
            Accept = "application/json", 
            ContentType = "application/x-www-form-urlencoded"
            ]
        ]
      )
    )[access_token]
  in
    Source

Once we have pasted the code into the query, click next. After that, rename the query to Get-AuthToken. Again, capitalization is important so take care with this step!

Now that we have our authentication function in place, it is time to take the next step towards building out our Microsoft Graph API functions to pull user information and Teams user activity.

Conclusion and Next Steps

There is a lot of setup involved in this process. Part of the reason is because of the complexity of the pull we need to perform, but also to avoid running into duplicates and reduce the chance for throttling with the endpoints. I know this seems like a lot of work, but I promise it will pay off in the end!

In my next post, we will finally grab our Graph API data and load it into our Lakehouse! Stay tuned to see how it happens!

Microsoft Graph API Reporting in Fabric – Part 1

The Microsoft Graph API library has been around since November 2015 and continues to expand. One of the key endpoints that has centered around utilization reports of the Microsoft 365 ecosystem. These reports can be leveraged to drive better overall adoption of the Microsoft 365 platform and understand trends in use.

Leveraging the Microsoft Graph API, the Microsoft 365 Utilization Metrics app has been the easiest deployed solution to work through this challenge. With a few clicks in AppSource, you can deploy this app within Power BI, update some details on the sematic model, and hit refresh to understand utilization. And while it does show data for the past year, customers frequently want more out of their data.

People forget about the Microsoft Graph API until new technology finds its way into the ecosystem. Enter stage left, Copilot for Microsoft 365, the newest belle at the ball! With new technology coming at a premium price point, companies want to understand utilization to ensure value. The $30 per user per month cost of Copilot suddenly has people wanting to confirm people are leveraging the tool. If not, they want to reclaim the license and get it to someone else.

This gave me the idea to revisit this challenge and see how we could leverage Fabric to make it a reality. While the Microsoft 365 Roadmap has this rolling out soon, a little preparation will allow you to deliver this data when it is finally available.

What are the Microsoft Graph API Reports?

The Microsoft Graph API reports are a collection of endpoints that provide utilization data for users. Aggregated by action, these reports provide visibility to how Microsoft 365 is leveraged. While you can pull data for users in 7, 30, 90, and 180 periods, the data is aggregated by user. This proves a challenge in getting data in a meaningful format if you are looking to track trends over time.

While those periods work, there is another option. Most endpoints allow you to specify an individual date for your query. However, you can only collect that data for the past 30 days. This is a huge restriction and can limit your ability to trend data over a longer period of time.

In the past, if you did not have a data warehousing strategy in place, you were severely limited to what Power BI could do. However, leveraging Microsoft Fabric, this will get you in the right direction pretty quickly. However, with the advent of Microsoft Fabric, we will be able to simplify this process and enable you to track this data with minimal effort!

Preparing Access to Microsoft Graph API Reports

To get ready for the data warehousing process, you must deploy an app registration inside of Entra ID. To get started, sign into Entra ID, navigate to Applications, select App Registration, and start a new registration. Name your new app and keep everything else the same:

Screenshot of an app being registered in Entra ID
Registering an app via Entra ID

Once you click register, copy the Application (Client) ID and Tenant ID from the overview page and save it for later. Next, go to API permissions and click add permission. You will select Microsoft Graph and then application permissions. From there, you will need to add the following permissions:

Permission NamePurpose
User.Read.AllProvide dimensional data around users within your organization from Entra ID. While User.Read has already been added to the app, it will not work in this situation.
Reports.Read.AllProvides access to all reporting endpoints for Microsoft 365. This includes Microsoft Teams, SharePoint, OneDrive, Exchange, and Viva Engage
Details about API permissions for your app

Once these permissions are assigned, grant admin consent for your organization. The permissions you added should have green checkmarks next to them:

Screenshot of API permissions granted for your application
Admin consent granted for API permissions

Lastly, you need to generate a client secret. Go to Certificates & Secrets and add a new Client Secret. Once created, make sure you copy the value for your secret and save it. You cannot get it again and will need to re-create it when you are finished.

Turn off Report Masking

As a security feature, the Microsoft Graph API has user masking available for this data. This helps protect the anonymity of user activity in the tenant. However, if we want to track Copilot for Microsoft 365 utilization, we cannot be anonymous. For some of you, this will not apply. However, this process could require some approvals. Asking early will ensure you can the report mask turned off ahead of time.

To turn off report masking, go to the Microsoft 365 Admin Center and expand settings on the left side of the screen. From there, select org settings and then select reports. A panel will appear on the right side of the screen. Uncheck the option titled display concealed user, group, and site names in all reports:

Turning off the report masking in the Microsoft 365 admin center
Turn off the report masking in the Microsoft 365 Admin Center

It is important to note that this feature is either on or off for everyone in the organization for all reports. There are no granular controls for this feature. Therefore, make sure you understand the implications of turning off this setting and communicate it to fellow Microsoft 365 administrators who might be using these reporting features whether by API endpoint or in their respective administrator portal.

What else do I need to know?

This project will require you to understand how to use REST APIs in Dataflows. If you have not read my series on leveraging REST APIs in Power BI, this would be a good place to start. We will need to build a number of connections leveraging this method.

Next, we will have to navigate an authentication process that will require us to use the POST method. I have also written an article on navigating this challenge. This will be a little more complicated, but the code found in this article will be helpful for understanding how we will build a function to drive authentication for these endpoints.

Lasty, you will find yourself writing some M Query code. If you are new to this, the code will be provided to assist you. However, you will want to learn a little more about the language so you understand the method behind the madness.

Conclusion and Next Steps

This setup only takes a few minutes, but it will go a long way in preparing our environment. As mentioned above, if you are not familiar with using REST APIs in Power Query, please take some time to learn about them. I promise you that you will thank me later. This can get very complicated, so understanding the process ahead of time goes a long way!

This will be enough to get thing started for right now. In my next article, we will explore setting up our Fabric environment to query this data and prepare it for the data warehouse.

Simulation Integration with Power Apps

Simulating scenarios with Power BI is a common use case that organizations need to get the most out of their data. Being able to manipulate elements of the data to simulate an outcome brings deeper insights and accelerates decision making. Everyone says “what if”, but Power BI paired with Power Apps can help you build the simulation and put it into action.

About two years ago, I wrote an article on how to simulate scenarios with numeric parameters. This solution still applies and works well for most scenarios. However, there is a drawback to this solution. If you have a lot of inputs, you could end up with a painful user experience. Because of the constant recalculation of your measures, it could bog down your experience and eat away at your capacity.

While not a solution for every scenario, we can use Power Apps to achieve the same outcome. In our scenario for this article, we want to drive profitability by manipulating a profit and loss statement. We want to adjust inputs such as revenue, expenses, headcount, and taxes. Because of the numerous inputs that drive a profit and loss statement, this is a perfect example for Power Apps.

Preparing the Report Elements

To start, we are using an existing report that contains a profit and loss statement. We have a matrix visual that shows the lines of the statement and their associated values. If you notice, each line is labeled with a section letter and line number. This makes it easy to identify each element:

Screenshot of a report canvas in Power BI Desktop with a profit and loss statement.
Profit and loss statement within Power BI Desktop

To setup our simulation, we need to have measures to aggregate the data and bring it into the app. You will note that I created a Simulation App Measures query and added a measure for each line item. This will make it easy to bring the data into Power Apps.

Lastly, you will note that I left space on the right hand of the canvas to make it easy to add our application. While we are adding a Power App, it is like any other visualization and requires the appropriate space. While it is perfectly fine to place the app on it’s own page, we want to keep it close to our profit and loss statement for comparison.

Preparing the Power App

While it seems simple, we need to take a few steps to integrate Power Apps into our report. The first step is that we need to select the Power App visual in the visualization pane. Once you add the Power App visual, it provides the instructions for the next few steps:

Screenshot of adding a Power App visual to Power BI Desktop
Adding the Power App visual to our report canvas.

Once we have the visual in place, we need to add fields from our data pane to the visual. This data is the base for supporting your simulation. Once the data has been selected, we can select which Power Apps environment we want to use. Next, if we have an existing app, we can select it here. However, we have not created the app yet so we are going to create one from scratch. When you do this, you will be warned that a new window is being opened so you can edit the app with Power Apps Studio.

Creating Your Simulation App

Now, there are a lot of components that go into a Power App. Because of the complexities that go into creating a Power App, I am going to direct you to Microsoft Learn to get all of these details. However, I am going to go over a few basics to help you get started.

Basic App Settings

There are a number of settings you will need to get in place. A simple example is the name of the app, icon for the app, and the orientation of the application. If you go to the three apps at the end of the menu and select settings, you can set all of these items. You can also set other items such as background color or images:

Screenshot of Power Apps Studio and how to get to the app settings.
Adjust settings for your app within Power Apps Studio

You will also note that this is an option for version history. If you are new to Power Apps, this is a great tool to assist you in the event you make a mistake. You can easily roll back your version if you break something and need to go back. Just remember to save versions periodically with comments so you know where you are at in your versioning.

Using Data for Simulation

If you are familiar with Power Apps, you know you can bring data from various sources into a single application. However, if you go to the data view, you will find it empty. But you need data to build your simulation, so where is it at? There is an item on your tree view called “PowerBIIntegration” which holds your data in a tabular format:

Screenshot of Power Apps Studio showing the PowerBIIntegration data source for simulation.
The PowerBIIntegration provides the simulation data for our app.

Since we put only measures in this app and no dimensions with it, the table only has a single row in it. However, we will still need to provide the aggregation of our data. To do this, you will need to write a little code in Power Apps to achieve the desired value to appear in a text box. For example, if you want to aggregate the total gross revenue, you will need this code to achieve the outcome:

Text(Sum([@PowerBIIntegration].Data, A1), "$#,0")

This code aggregates the data from [@PowerBIIntegration] and formats it properly for display. You do not have to go as crazy, but I really like to make sure the formatting is correct. It requires a little extra work, but it goes a long way.

Testing, Saving, and Publishing the Simulation App

Once you have something ready, you can click the play button on the menu bar to test your application. This will allow you to test the functionality of your application and ensure it works. You also can hold down your ALT key to test elements on the app canvas. Once you have it tested, it is time to publish.

Screenshot of Power Apps Studio highlighting the play button, save button, and publish button.
Test, save, and publish your reports from Power Apps Studio

Once it is ready, click the save button to ensure your changes are captured. Next, you can click the publish button and this will push the changes live. This will allow you to view the app inside of Power BI and available for your report consumers to use. Depending on the environment settings, you might need to share the application with others, but that can easily be performed with using the same Entra or Active Directory security group you leveraged to provide access to the report in the Power BI Service. However, when you are finished, the report canvas will now have the app integrated right on the page:

Screenshot of the simulation Power App inside of Power BI Desktop.
The final product – Simulation app inside of the Power BI Report

Common Questions with Power Apps

Every time I bring up this solution, I receive a lot of questions. However, they are often variations of the same three topics that are in this section. I wanted to address them as I am guessing you have the same questions as well.

Question 1 – Licensing and Cost

We all know Microsoft does not give much out for free. However, with how this app is designed, there is no additional impact to the cost of your licensing. Power Apps is using seeded capabilities from your base Microsoft 365 licensing and therefore does not cost you anything extra. While licensing changes often, I recommend you check out Michael Heath’s blog post on Power Platform licensing if you are new to it.

Question 2 – App Use Outside of Power BI

A lot of people ask about being able to leverage the app outside of Power BI. The answer on this is a hard no. We must keep in mind that the app is leveraging data from your Power BI report and filtering based upon the report context. Therefore, this app is not usable outside of the report. However, you could easily re-deploy this app in another report assuming you leverage the same measure names to load the data into the app.

Question 3 – Power Apps and Writeback

The last question I receive is if Power Apps provides capabilities for writeback to data sources. The short answer is yes, but depending on your data source additional licensing might be required. There are tools out there that allow you to perform write back with less complexity, but Power Apps absolutely works for this. It is up to you if it is worth the effort and cost to make it happen, but it is absolutely possible.

Conclusion

People forget that Power BI is actually part of the Power Platform suite. Sadly, many forget that there are opportunities to integrate these apps together. However, this is an example of being able to simplify integration of Power Apps to extend capabilities with little effort. If you are not an expert, do not be afraid to use Copilot for Power Apps to help you out as well!

Have you ever integrated a Power App into a Power BI report? What was your use case? Did you have any challenges in making it happen? Tell me in the comments below!

How do I connect to REST APIs with Power BI? (Part 5)

In preparing for the Global Power BI and Fabric Summit this year, I got a little nostalgic. I went back to my presentation and early blog posts on connecting to REST APIs. Two years, several blog posts, and a book on Power Query later, I felt it was time to add a fifth part to this series. I realized that I left out one additional step I could take to optimize my query a little more to allow me to squeeze the most out of my API call limits.

I will be picking up where we left off with part 4 of how to connect to data with REST APIs. However, if you are new to my blog, you can start from part 1 and get caught up. If you recall, I am limited to 100 API calls a day with my scenario. If I am watching more than 100 stocks, I might need to consider moving to the next subscription level. However, I have an option to squeeze even more out and potentially return 1,000 stock symbols. This is a bonus as I do not need to increase my subscription!

What Did Dom Miss in Part 4?

To note what I missed, I need to go back to part 2 on REST APIs. You must Read the Fine Manual! In reading the documentation, I realized that I could batch multiple stock symbols at once. Instead of making individual calls for each symbol, I can group ten of them in a single call.

Screenshot of the Yahoo Finance REST APIs documentation
Guess I need to take my own advice?

With this knowledge in hand, I set out to take my query one step further and get more out of my subscription. It is not as straight forward as it may seem, but with a little manipulation we can make it work.

Prepare my List

We already have a spreadsheet of stock symbols we have brought in. We need to do start by removing all of the columns except the stock symbol. Just right click on the symbol column and remove other columns. Once that is complete, insert an index column that starts from 1. Your query should look like this:

Screenshot of the symbol column with the index column next to it.
Make sure your index starts from 1!

Next, we need to add a batch column. We will need to identify every 10th row and assign it an index number. We will need to create a custom column for this. Using a modulo division function, we will identify every tenth row. If it is the tenth row, the remainder will be zero. If the modulo division equals zero, we will divide the index by 10. Otherwise, we will return a null value. The formula looks like this:

if 
  Number.Mod([Index], 10) = 0 
  then Number.IntegerDivide([Index], 10) 
  else null

Next, we just need to use the fill up feature to make sure the rows are assigned to a batch. The last few rows will still be null. This is not an issue as it will still be its own batch. The result looks like this:

Screenshot of our stock symbol list with the associated batch number.
Our stock symbols associated with a batch

With our batch in place, we can remove the index column. Next, we need to group our symbols together.

Use Group By to Bring Symbols Together

Going back to our documentation, each symbol must be separated by a comma. We will use the group by function to bring ten rows at a time together into a single column. We will select our batch column and select the group by function in the transform ribbon. Using the sum aggregation, we are going to summarize our symbol column. Naturally, this will result in an error, but we will fix that next.

Screenshot of the group by function in Power Query
Setting up the group by function

Next, we will need to update the M code in the advanced editor. We need to replace the List.Sum() function with the Text.Combine() function. Our code for this step will look like this:

Table.Group(#"Removed Columns", {"Batch"}, 
  {{"Symbols", each Text.Combine([Symbol], ","), 
  type nullable text}})

The result an updated query with our stock symbols combined:

Screenshot of the combined stock symbols
Stock symbols combined in a single cell by batch

Now that the hard part is over, we just need to apply our function.

Functions and REST APIs

We already created a function in part 3 of my REST APIs series. We just need to apply it to our new query. Go to the add column ribbon and insert a column from function. Using the symbols column for the parameter, we can query all of the symbols in the cell.

Invoking our custom function that use REST APIs
Invoking the Get-StockInfo function

All that is left to do is expand the results so they can be leveraged in the model:

Screenshot of our results from the REST APIs
Stock data from our endpoint

And just like that, we turned twenty five API calls to three. Not too shabby!

Conclusion

It is extra work, but it is worth while if you can save a few dollars and minutes with your refresh. It may seem excessive, but it might make the difference. Besides, I could not resist a post where I tell people to Read the Fine Manual as I know my friend Mike will get a good laugh out of it again!

Have you ever used the group by function in Power BI? Or ever extended like this? Did I miss another way to optimize this query? If so, tell me in the comments below!

FabCon Takeaways

Did you attend FabCon during the last week of March in Las Vegas, NV? If you were not able to attend, it was packed with a lot of great knowledge, exciting announcements, and wonderful data professionals to bounce ideas off of all week. I was lucky enough to attend with my colleague and friend, Michael Heath. At every break when we would reunite, we were so excited to share what we had just learned.

Picture of Michael Heath and Dominick Raimato at FabCon.
Michael and I connecting in person at FabCon!

I know not everyone got to attend this event. And even for those of us who were there, it was a lot of information to take in over three days. I wanted to summarize some of my top takeaways. I mean, it is difficult to distill an entire conference into a single blog post, but I wanted to at least highlight some of the key moments that made me go aha!

FabCon Takeaway #1 – Out of the Box

While I heard this many times before, I finally was able to wrap my head around it during the conference. Fabric is an all inclusive platform to simplify the management of data for all forms of projects. Whether it is warehousing, reporting, or data science, the platform is ready to use.

Like Buying a Car

The best analogy I heard all week was the concept of buying a car. Before Fabric, you needed to pick out each part individually. You needed to identify your storage medium, ETL tool, virtual network components, and data science tools. This is the equivalent of going to the auto parts store and picking each component out one by one and then assembling it to build a car. While some of you might choose to do this, it is a lot of work. You need to not only have the skills to do the data work, but the experience to setup your Azure environment.

Instead, Fabric allows you to buy the car as it is off the lot. When you setup your workspace, you can add components without the hassle of configuration. It just works which is a really nice feeling. Even better, because you are using a capacity license, it is easier to estimate costs. I know there are calculators out there for Azure, but you always forget something when you are attempting to build a budget. Fabric eliminates that challenge with a clear pricing structure. This is a huge step forward for self-service data science outside of the traditional IT function.

Microsoft has been pushing their “5×5” approach with Fabric. The concept is 5 seconds to sign up and 5 minutes to your first wow. While I appreciate this approach, I think there is a little work to go in this space. If you have a guide to help you, I believe this to be true. However, if you are new to Fabric, I do not find it to be quite intuitive for that. Long story short, make sure you do a little learning before you try it on your own.

What About Data Engineers?

Does this mean they are out of a job? Not at all! Data engineers will still be essential. For me, I think it redirects their efforts into more valuable workloads. Instead of fighting with Azure, they are focused on getting value out of the platform. It also opens the door to data engineers and scientists with deep domain knowledge in the business. That is a game changer!

Does this mean our existing data infrastructure is obsolete? Also no! These tools are still essential. If anything, this might allow less important workloads be shifted to Fabric reducing administrative burden. That allows you to focus on the mission critical workloads within your existing infrastructure. While Microsoft would like you to move your workloads to Fabric, not everything has to go.

FabCon Takeaway #2 – CI/CD Is Close

If unfamiliar, CI/CD stands for Continuous Integration and Continuous Deployment. To handle this, you need to have a mechanism to manage your source code for components of Fabric. While it is not 100% there yet, the future is quite bright. While some might not like how this is done, this is a huge step forward.

The integration with Azure DevOps is slick. The ability to create a feature branch and merge it back to the main branch is slick. With a few clicks, you can create a new workspace, replicate your components from the main branch, and replicate it to Git. When you are finished with your feature, you can easily merge it back in. But for me, the nicest part is that you can retire your workspace as part of the merge which is so nice.

I also love the fact you can choose your deployment path. If you would prefer to use Deployment Pipelines in Fabric, you can do so. However, if you rather have Git repos for Dev, Test, and Prod, you can do it too. This allows you the flexibility to manage and control your deployments as you see fit.

Stay tuned for an article on this in the near future!

FabCon Takeaway #3 – Integration Points

For me, the biggest question I kept asking was how well could I integrate these components into other solutions. Creating a machine learning model is fantastic, but useless if you cannot integrate it with other applications. This is huge!

I need to investigate how this works further, but the promise was pretty clear. The idea is that your data scientists can perform their analysis and train a model with Microsoft Fabric. Once completed, they can leverage it via endpoints and provide predictions. I have done this with Azure ML in the past, but the process was a tad clunky. I am hoping this experience gets better with Fabric.

Another teaser was the ability to build your own custom Copilot chat bots within Fabric. This is not using Copilot to create a model, but rather training a Copilot to use your data. Eventually, this data could find its way into other tools. To be honest, this has been somewhat happening already today with the Q&A feature in Power BI. However, if Microsoft can streamline this into a single Copilot experience, this would be huge. Being able to query your data without having to go Fabric would go a long way.

Anything Else from FabCon?

Yes, there is a ton more to talk about. But these were the stand outs for myself. What is hard to put into a post are the conversations with other professionals were outstanding. Meeting some of the conquering heroes of the Power BI world is always fun, but eating lunch with different people was where the biggest insights were gained. Learning about everyone’s unique scenarios allows you to refocus your vision of the platform. I always recommend people attend conferences for that reason alone. That is where you gain the most interesting insights!

Also, since everyone asks this question, Michael and I were able to enjoy some of the fun around us. It was a busy week at the conference and work did not slow down in our absence. While we did not get to any shows, we did get to unwind with other professionals and even meet one of the best Elvis impersonators I have seen in my lifetime.

Michael and I posing with The King

Conclusion

This coming year is going to be exciting when it comes to Microsoft Fabric. Between what was announced at FabCon and what is coming in the near future, the future is bright! I look forward to sharing more insights over the coming months.

Did you make it to FabCon? What were your key insights? Anything I missed in my summary? Tell me in the comments below!

Get Ready for the DP-600 Exam

I took the Implementing Analytics Solutions Using Microsoft Fabric (DP-600) recently and it was a unique experience for me. This was the first time I took a beta exam from Microsoft. It was also the first time I took an exam with the new rules around being able to search Microsoft Learn. I felt it was a different experience, but a successful one as I did pass.

It would be unethical to discuss specific questions and details around the exam. However, there are a few things you can do to make sure you are prepared for this exam. A little preparation can go a long way!

Content Breakdown for the Exam

This exam assesses your knowledge with Microsoft Fabric. As a user of Power BI, you might find elements of this exam easier. Because Power BI is a big part of Fabric, you will find elements of it in there. The breakdown of the exam is as follows:

  • Plan, implement, and manage a solution for data analytics (10–15%)
  • Prepare and serve data (40–45%)
  • Implement and manage semantic models (20–25%)
  • Explore and analyze data (20–25%)

With this knowledge in hand, it is easier to start to understand what you need to focus on when it comes to the prep for the exam. Clearly, prepare and serve data is a hefty portion of the exam. And if you have not touched Fabric, you will not perform well with this exam. But all is not lost as you will need some of that knowledge to be successful!

Studying for the DP-600

Ideally, everyone would attend a training for the DP-600. However, that is not always realistic. My work schedule has been chaotic, so even if I wanted to attend a class, how would I do it?!

To start, Microsoft has a few learning paths that can get you on the right track. They are designed to walk you through the core components of Microsoft Fabric. They are available on the DP-600 page, but I have them summarized here:

In addition, Microsoft has created some learning paths to help you out. It might be best to start with the DP-600 Study Guide from Microsoft. This will help you wrap your head around the core components of the exam. It also gives you some links to additional learning paths to help you out.

However, you will want to get some hands on experience with Microsoft Fabric. You can fire up a free trial if you have an active Microsoft 365 environment in place. To get started, sign into Power BI, go to your account in the upper right hand corner, and start the trial:

Screenshot of account view where you can initiate a trial of Microsoft Fabric.

You will have sixty days to explore Microsoft Fabric at no cost. Make sure you are ready to commit to your learning to ensure you do not run out of time before you take the exam!

Practice Makes Perfect!

I am always prompted to add a practice exam when I sign up for an exam. But Microsoft has a free exam you can take to get a baseline of your knowledge. Take a look at this screenshot from the DP-600 exam page:

Screenshot of the scheduling pane for the DP-600 exam

This practice exam runs you through fifty multiple choice questions. It does not mimic the actual exam as there are no case studies. In addition, the exam does have some questions that do not use multiple choice for the answers. It is not an accurate representation of the exam, but it does a good job of giving you a baseline of your knowledge.

If you have not taken many certification exams, you might want to invest in the paid practice exam. In general, most exams are broken into three sections. The majority of the exam is a collection of questions that you can come back and review before moving on to the next section. However, once you move on, you cannot go back. The same is true for the lab section. Then, there is a section where you cannot go back to update previous answers. That causes some anxiety for some people, so taking a paid practice exam can help prepare you for that experience.

Regardless of the route you choose, I love that you can check the answer on the spot. When you do it, they provide a link to the documentation on Microsoft Learn that explains why the answer is correct. Sometimes you are just guessing and get lucky, but this will help you confirm why you chose the right answer.

Anything Else to Consider?

The resources above helped me pass the exam. However, you might need to know a little more than what is outlined above. Being multi-lingual will help you be successful with this exam. You need to know your DAX, M (both code and GUI), T-SQL, and PySpark syntax and best practices. You might not be an expert in all of them, but a strong command of the basics is critical.

Microsoft Learn was helpful for me as there were a few items that I got stuck on. In fact, my wife commented on the fact that I spent more time on this exam than any others I have taken in the past. Having Microsoft Learn at my disposal helped a lot, but it was not a replacement for my existing knowledge. While it will assist you, relying on it will result in you running out of time.

Conclusion

I felt like this exam was one of the better exams I have taken in the recent past. I feel like so many certification exams are about trying to trick you. The broad amount of content in this exam allowed it to be both challenging, yet meaningful. It may seem weird, but I really enjoyed this exam! If you are looking to continue your analytics journey, you will want to add this exam to your list!

Have you taken the DP-600 yet? What did you think about it? Did you find it meaningful? Tell me in the comments below!

Drill Through for Deeper Insights with Power BI

It is a pet peeve of mine and it drive me nuts. I hate it when someone tries to fit as many visuals on the canvas as humanly possible. It is cluttered, ugly, and hurts the performance of your report. But how do we get around this issue? The answer is simple! Add a drill through to your report!

I will be honest – I was confused by the initial setup like you might be. The setup is a little intimidating at first, but once you understand how to configure the drill through, it is easy to implement. Like what I said at the end of my article about bookmarks in Power BI, you will probably get things wrong the first time. However, with some guidance, I hope you will be able to implement them with ease!

Preparing Your Report

Generally, this starts with a basic report page with a goal. Right now, I have a Sales Data report page that has a lot of information. However, I want to drill down on a manufacturer and learn more about the products being sold. If you look at our current Sales Data tab, you will note that there is some product information, but not the context I want:

Screenshot of the Sales Data report page with a line chart comparing year over year of sales data, revenue by state on a map, a table of the top 25 products and their manufacturer by revenue, and a bar chart of quantity sold by manufacturer.
Current state of the Sales Data report

Our first challenge is to create a meaningful drilldown page that shows information by manufacturer. In my scenario, I wanted to see the top products by both revenue and quantity. I also wanted to see product segments and have the ability to filter by category. The result is a drilldown page that looks like this:

Screenshot of the manufacturer drilldown that shows two tables that return the top 20 products by revenue and quantity respectively.  There is a donut chat showing distribution of revenue by segment and a tile slicer for product category.
Manufacturer drilldown report page

You will note that I did not place a slicer for manufacturer on the canvas. This is because the drill through will handle the filtering for us. We will tackle that next.

Add Drill Through Logic

With our detailed report created, we need to enable the drill through functionality. On the report page, we must add the field we want to use for the drill through. Since we want to look at the manufacturer, we will add that field to our visualization pane under drill through:

Adding the manufacturer field from the data panel to the drill through section for the visualization panel.
Adding the drill through field

One thing you might not have noticed right away is the back arrow that was added to the page in the upper left hand corner. This allows you to return to the original page when you drill down on your data. By default, this is set to just go back to the last page. However, you might want to hard code a page navigation. This will ensure you return to the correct page and is a requirement if you allow drill through on multiple levels:

Screenshot of the back button that is added to the page along with the configuration of the button.
Configuring the back button added to the report canvas

You will note that with the drill through, there were two additional settings available to us. The first setting is on by default which is to keep all filters. This simply maintains the filtering from the page where the drill through occurred. So if it is filtered on a specific year, it will be maintained.

The cross-report drill through allows you to make this report page available on other reports. When published to the Power BI service, other reports in the workspace can drill through with this page and see the detail. This allows you to reuse this page on multiple reports which is very convenient.

Testing the Drill Through

With everything in place, it is time to test our drill through. Going back to the Sales Data page, we can right click on one of the bars in our quantity by revenue visual. From there, we can see the drill through menu option with our new report page:

Screenshot of the drill through menu on our quantity by manufacturer visualization.
Testing our drill through settings

If you have the ability to select your new page, then you did it right! Go ahead and click on it and you will see the new detail page filtered on the manufacturer:

Screenshot of the drill through report page filtered on a manufacturer.
Drill through report page filtered for Van Arsdel

It take a little work to configure, but the results are beautiful. It does a lot to help clean up your canvas and allow you to get more detail on a specific dimension. With a clean look like this, it makes it easier to consume your report with the ability to dig further for details.

Anything Else to Consider?

What if you want to use other fields for the drill through on the same report? You can easily facilitate that by adding them to the drill through field. This will give you the ability to use the same page for multiple dimensions to reduce effort.

In addition, consider the detail of your page when designing it. The ability to reuse it over and over again will reduce a lot of repeated effort. If you leverage the cross report feature, it could make your drill through page an extremely helpful feature to any report you create.

Conclusion

This is one of those unsung heroes of the Power BI Report. It is hidden in plain sight and not always easy to create. But with a little effort, it can really transform your report. It is worth the time as it will take the experience of your reports to the next level!

So, have you used drill throughs on your reports? What are your use cases? Did I miss anything? If so, tell me in the comments below!

Refresh from Files Without a Power BI Gateway?!

I cannot begin to tell you how many people I meet who think they need a Power BI Gateway to refresh their dataset using a file as a source. Yes, if you have selected a file data source, you will need to use a Power BI Gateway. However, there is a way around this challenge. It takes a little extra work to prepare your dataset, but you can make it work.

Before everyone starts moaning about using files as a data source, sometimes this is unavoidable. I worked with a third-party business travel platform that did not provide an endpoint for me to connect directly to the data. As a result, I needed to export data on a weekly basis to feed my dashboard. Since then, that platform has resolved this issue.

I also have a current scenario where I do not have access to the APIs for an ERP solution. Because of roles and permissions, I am not allowed to access this data. A creative solution is to export the data through a scheduled job that sends a CSV file to my email. From there, I can use Power Automate to position the data so it can be accessed by the Power BI Service without using a gateway. Is it ugly? No doubt. Somewhat obnoxious of a process? You betcha! But does it work? Absolutely!

What is the Power BI Gateway?

The Power BI Gateway might be new to you. If you are not an administrator of your Power BI environment, you probably do not know about it. The concept is simple – the gateway brokers access to your data sources that reside on premises within your network. Instead of opening up firewall ports or creating crazy firewall rules, you can deploy a gateway to broker the connection.

There are two types of gateways that can be used. The enterprise gateway provides an enterprise-wide connection for your data sources with the controls to manage access to them. Generally, you install the gateway on a server that has access to your data sources. Once it is in place, you configure your data connections within the Power BI Service by adding credentials and delegating access. The biggest benefit besides the controls, you can provide redundancy to your gateway.

The other type of gateway is the personal gateway. This is deployed by an individual user. The data source must be accessible from the device from the personal gateway, but controls of the data source are essentially inherited by the user. This is a quick way to get a gateway in place, but causes a risk because it is tied to an individual user with no oversight. In addition, your device must remain on for the refresh to be successful which does not always work out.

Regardless of the type, the Power BI Gateway allows you to connect to any data source that resides on premises. While files work with gateways, they are a bit of a burden to manage. Unlike other sources such as SQL Servers, they have a tendency to move around or adjust. If we can manage this without the gateway, it will be easier for end users to manage without having to involve an administrator in the process.

Bypassing the Power BI Gateway

In short, the easiest way to bypass using a Power BI Gateway for your files is to use SharePoint Online. Since it is part of Microsoft 365 and already in the cloud, you do not need a gateway to connect to your file. Start by making sure your file is in SharePoint and open it in the desktop app from the site. From there, go to the file ribbon, select info, and click on copy path.

Once you have a file path, create a new source in Power Query and select the web connector. Paste your URL into the field, but remove the ?web=1 at the end of the path. Click OK and use your Organizational Account to access your file. From there, you will be prompted to go through and access the data based on the file structure. I talk about the web connector with REST APIs, but the concept is the same if you have never used this connector before.

Screenshot of the Web Connector configured to connect to a file inside of SharePoint Online without the Power BI Gateway.
Configuring the Web Connector for your file in SharePoint

Now, this process works well for Excel Workbooks and CSV files as they can be opened with Microsoft Excel. But what about JSON, XML, or Parquet files? You can get around this challenge by simply adding any Microsoft Office file into the directory to get the file path and replace the file name with the correct one. This will save you the effort of building out the URL.

But I Already Have Done So Much!

So often, I have people tell me that they have spent hours transforming their query already and fear they need to start from scratch. But believe it or not, it is easy to resolve. Let’s start with this block of code in your advanced editor:

let
  Source = Excel.Workbook("c:\data\taxi_zone_lookup.xlsx", 
    null, true), 
  ExcelData = Source{[Item = "TaxiZoneTable", Kind = "Table"]}[Data]
in
  ExcelData

We can see from the code block above, we are accessing the file from a folder in the root of our hard drive. Once we have in inside of SharePoint, we can replace the “c:\data\taxi_zone_lookup.xlsx” with a web connector:

let  
  Source = Excel.Workbook(
    Web.Contents(   "https://<tenant>.sharepoint.com/sites/NYCTaxi/Shared%20Documents/Taxi%20Data/taxi_zone_lookup.xlsx"
    ), 
    null, 
    true
  ), 
  ExcelData = Source{[Item = "TaxiZoneTable", Kind = "Table"]}[Data]
in
  ExcelData

You will have to excuse the formatting in this example, but you can see it was a simple swap out in the code. With the change in place, we did not need to make any additional changes to the downstream code. While this code block is small, we could have a lot of transformations in place which would require considerable rework.

Anything Else to Consider?

A big benefit to moving to the cloud location is that you can easily share your Power BI Desktop file for development work without having to update the file path. You will need to share the file with users who will be working directly with the source, but that is easy to manage.

In addition, you can easily add files to your data source without having to get an administrator involved. Regardless of the type of Power BI Gateway you want to use, you will likely need an administrator to help you. If you are using the enterprise gateway, you will need a platform administrator to configure the connection. If you want to use a personal gateway, you might need someone with administrator rights to install the software on your machine. Regardless of the type, you will likely need some help.

Conclusion

I wish more people talked about managing file refreshes this way. This is a great option, yet Microsoft tends to keep quiet about it. In fact, you can leverage this method to provide a refreshable report from a form you created. Patrick from Guy in a Cube created a video on how to achieve this. Hard to believe how a few small changes could make your life so much easier!

So have you ever created a refreshable interface with Power BI like this? Do you have a use case for this? Tell me about it in the comments below!

Artificial Intelligence is Available Inside of Power Query?!

Unless you have lived under a rock in the past year, you know that artificial intelligence is a popular topic. Whether it is Copilot, ChatGPT, Bard, or Open AI, artificial intelligence is here to stay. It is time to embrace the utilization of artificial intelligence and learn how to embrace it for our benefit.

But surely, the use of artificial intelligence inside of Power Query is brand new, right? Wrong! I remember seeing it available five years ago and it is a powerful tool to integrate with your models! I am not going to be able to go into all of the details of artificial intelligence in this article, but rather help you understand how to integrate it within your existing queries.

What Is Artificial Intelligence?

In the fast few years, Artificial Intelligence (AI) has become a force in society. It permeates the world we live in and has become an integral part of how things are done. We see examples of it in our lives through things like chat bots, self-driving cars, and even social media. While some may find it intimidating, the reality of AI is that it has the power to make our lives easier.

Artificial Intelligence is used to describe technologies that perform tasks that once required human intelligence to complete. When properly deployed, AI models can recognize patterns, solve problems, make decisions, and understand natural language. These models do not organically exist. They are created through extensive training activities to ensure the results you receive are valid. The work we perform with Power Query may result in the creation of a model while others will leverage one.

While generative artificial intelligence (GenAI) is all the rage right now, we will not be using it for our queries. Instead, we will be integrating models that have been trained to provide a response. You might be the creator of these models while others are creating them for you. The result is the opportunity to bring back a predictive response based upon your query.

How to Get Started?

While Power Query is available in both Microsoft Excel and Power BI, this functionality is only found in Power BI. While you can leverage similar AI models in Microsoft Excel, it is done outside of Power Query. This is an important item to note if you prefer using Power Query inside of Excel.

Second, you must have some form of a premium workspace to enable this functionality. This can be in the form of a Premium Per User, Premium Capacity, or Azure Power BI Embed environment. If you already have one of these environment at your disposal, you are set. If not, check out my article on licensing strategies around Power BI. I recommend spinning up an Azure Power BI Embed SKU or activate a Premium Per User trial if you want to try these features at a minimal cost.

What if you do not have a Power BI environment available to you? Unfortunately, you will not be able to take advantage of this functionality inside of Power Query. I recommend you work with your organization to spin up a trial to explore this functionality.

Pre-Built Models in Power Query

Power Query allows us to leverage pre-built models. Leveraging Azure Cognitive Services, we can perform text and image analysis on our queries. We can do this from the add column ribbon inside of Power Query:

Screenshot of the add column ribbon inside of Power Query with an arrow pointing to the text analytics function.
Activating the Text Analytics function

If you are doing this for the first time, you will be prompted to allow the analysis of your data. Once you have done that, you can select which model you wish to use:

Screenshot of the Text Analytics screen with options to select the detect language, extract key phrases, and score sentiment models.  Next, you can select the column to analyze and language associated with the text.
Selecting a Text Analytics model for your analysis

As you can see, you can select multiple pre-built models to perform your analysis. With text based analysis, you should consider both the text and the language being used. This helps ensure the proper analysis is being performed. By default, it will use your tenant’s default language. If your column is using the same language consistently in the query, you can hard code it in this screen. However, if you have multiple languages, you might need to to use the detect language model to bolster your analysis.

You can also identify tags for images you ingest with Power Query. This is not easily performed, but is ideal for small images. Because of the practicality of this, you will likely not use the vision model. However, you can try it out for yourself:

Screenshot of the tag images model in the vision module from Azure Cognitive Services.
Configuring the tag images model

These models can provide quick analysis and provide meaningful results with minimal effort. I find that combining these results with other dimensions it allows me to sort through results and find the most meaningful responses in surveys.

Custom Artificial Intelligence Models

If you have created a custom model using Azure Machine Learning, you can leverage it within Power Query. Leveraging a public dataset around diabetes, I created a custom model with Azure Machine Learning. With an endpoint created, I can access it within Power Query with a few clicks:

Screenshot of the home ribbon in Power Query with an arrow pointing to the Azure Machine Learning function.
Activating the Azure Machine Learning function

Once I have selected Azure Machine Learning, I can select which endpoint I wish to use. I have already deployed my model and selected it below:

Screenshot of an Azure Machine Learning endpoint being configured for Power Query.
Configuring my predictive model in Power Query

As you can see, there are multiple inputs I must identify for my endpoint to return a predictive response. I can identify the different columns in my query and return a result. We will need to identify values for each field to allow this to work.

It is important to consider that Azure Machine Learning uses a consumption pricing model. Every time you refresh your query, it is seeking new predictions. Keep this in mind when building out your queries as you could rack up some extra costs without realizing it!

Any Other Options?

If you are a user of R or Python, you can leverage custom algorithms and models inside of Power Query as well. You need to deploy software on your device to make this work and can be prohibitive if you do not have admin access on your machine. I do go into detail on these tools in Chapter 11 of my book, but decided not to discuss them in this post. Primarily, I wanted this to be about out of the box and easy implementation.

Conclusion

It is amazing how artificial intelligence has infiltrated Power Query. For many, they have no clue it was there or what is needed to get it activated. With a little effort, you can really transform your queries with a few clicks. I recommend you try it out to see how you can harness the value of artificial intelligence in your queries!

So did you know about this feature in Power Query? Have you used it before? Are you going to try it out? Tell me about it in the comments below!

Page 1 of 4

Powered by WordPress & Theme by Anders Norén