Grouping your queries

Grouping queries by their content can make our Power Query environment easier to go through and debug.
If you want to know how, keep reading.

When managing data on Power Query Editor we load it from our sources and modify it as we need to please our clients, but sometimes we get errors in the process or need someone to give us some advice, which is a lot easier if our queries are organized in a way that they are easier to go through. I will explain how I recommend organizing your queries to have a better working environment that allows you to find things faster.

Before we start I want to share the link to the page where I got my sample excel files to make the examples in the article:

What is a group?

A group is what Power Query Editor offers you to have your different queries stored in an organized way, you could say that they are similar to a folder. When you first upload your data and click over transform data, the Power Query Editor will start with just the query containing the source of data you have just uploaded.

As it is shown in the picture, there are no groups of any kind, just your query with the data you got from your source. This seems completely fine now, but when you have over 5, 10, 20, or more queries and parameters it becomes messy:

Even with the representative names that I tried to give to my queries, my environment looks unorganized. By grouping those queries, it’s more intuitive where to look for concrete data, either for further development or debugging:

 As you can see, every sheet is grouped depending on whether it has been modified or not, which source path it came from, and so on.

You could freely choose more or less grouping as you find it easier to locate yourself in the environment while working but creating at least two or three groups is definitely going to help your work look more organized.

Also, you can hide the queries you are not interested in working with by clicking over the black arrow, which makes the environment more compact even if you have a million queries (which is very extreme, but you get the idea).

How to create, delete and ungroup groups

To create a group, you have to right-click over the grey space in your Query tab, then press “new group”, after doing so, fill up the information of its name (the description is optional but can help you remember what the data in the folder was for):

After creating your first group, it will be empty, and all the queries will be stored in an automatically generated query. To get the queries you want into your group, just click over them and drag them to your group. After doing so, you will obtain something like: 

This automatically generated group is a feature of Power BI, and you cannot do anything about it. You are able to rename, ungroup and delete every group created by yourself, but none of the above actions can be performed towards the automatically generated group called “Other Queries”.

When working with any other group, to delete it will erase the folder and all the queries it contained, so be very careful about the groups you decide to delete. To ungroup on the other hand will make the folder disappear, but the queries will not disappear with it.

Apart from that, you can also create new groups inside a group if you find it fit.

To either delete, ungroup, rename, add a new group or query, and many other things, you should only right-click on the group you want to perform over as shown in the next image:

As you can see, there are multiple actions you can perform, which are intuitive enough as to not have to explain what they are for.

The most confusing one may be “Properties…” since it will only show you a pop up with the name and description the group was given when created.

Which groups should you create

Of course, this is up to you and the standards of the organizations you are working with, but I will give you the pattern I recommend.

In my opinion, you should always keep your sources just as they came in and make all the changes to them into a query duplicated or referenced to it. The reason for it is that if you notice you are missing data or your query is giving an error, you can check whether it was you who accidentally deleted a necessary column or row, or on the other hand, such information wasn’t included on the query from the beginning.

Therefore, I recommend having at least two groups, one named Sources to store the queries that are just as they were loaded, and a second one named Transformations where you can find the queries that have steps applied to them.

Apart from those two, sometimes you will need a third group to store parameters, this is so because parameters can be referred to either your sources or the transformations in your queries, so if you don’t create an extra group for them and have them mixed with the queries they relate to, it can end up getting messy, which we want to avoid at all costs.

This is the base structure that I’d recommend to everyone, leaving the other Queries table either empty or filled with queries created by yourself (meaning they do have steps but do not come from any source in your Sources group).

Leave a Reply

Your email address will not be published. Required fields are marked *