Categories
PowerBI

Composite keys in power BI

Data comes in different structures, which means we cannot treat all data equally. Is our job to identify its main traits so that we interact with it as efficiently as possible.

Data must be treated in a clear, organized way that makes it easy to create relationships between different data tables, or directly to make the visuals that use our data to show information.

In order to give a good structure to our tables, we must identify its primary key. Doing so will allow us a way to clearly refer to every row of data in our table, making it easy to create relationships with other tables using our table’s primary key as a foreign key in the new table to connect them.

This can be obvious in cases that have clearly identified data (for example a list of people with its identifier, even if two people have the same name, surname, and income, each one of them will have its own unique ID so you can tell they are not the same person). But in some other cases, it may get a bit tricky.

I have created fake data to create an example of this tricky situation[1]. Specifically, a cinema data of the expected income per week and film on one table and the track of the tickets sold in one concrete month. Please notice that the data may not be realistic, but that is not our current concern. The power BI data would look like this:

Expected income table
Month sessions track table

We’d probably have more tables (for example one that relates the movie ID and session with the price of the ticket), but we only need this two for the current explanation.

Imagine that with this data, the client asks for a way to see whether the expected income for each movie is being obtained in each week, you will need to use both columns (movie ID and week) to identify the expected income on one table, and you’d need to do the same to count how many tickets were sold in that week for each film (with which you could, later on, relate the tickets to their price). It is easy to notice that we’ll be working with composite keys.

Identify primary and foreign keys

In our “Expected Income” table the primary key would be made of the movie ID and the Week since there’s only one expected income per movie each week.

In the “Month sessions track” table the ticket ID would be the primary key since it is unique for each row of data, but we cannot use it to create the connection we are looking for. Knowing this we look for a key that we can use to create a connection with the “expected income” table.

The first thing that is clear is that we are looking for a one-to-many relationship, where the table with the income has only one row for each pair of Movie ID and Week but the Month sales track will have multiple instances for each pair.  To make it more visual, I’ve made a representative schema with just one pair (of course both tables have more fields as shown with the previous images, but just so you get the idea):   

In some working environments, this kind of relationship is detected by the app or web, and you are given the option to create the relationship between these two tables making use of this composite primary key, but power BI is not one of these environments.

Power BI procedure

For creating a relationship

In order to properly work with this table’s primary key in power BI, we must create a new column combining all the columns that compose the primary key.  This can be done by adding a custom column with the concatenate operator or by adding a column from examples.

If you choose the custom column option, the expression for the new column should look like the following.

There are two crucial details to notice here:

  • You must transform the number into text in order to use the concatenate operator, if you try to use it directly with a number Power BI will complain with errors.
  • The “- “does not add any information but will act as a delimiter for any human who works with this data to easily understand where one field begins and the other ends. I used – but you can use the one you prefer.

Then you should go to the other table and apply the same procedure.

Once done, you have a column that acts as a primary key for the “Expected Income” table and the same column acting as a foreign key in “Month sessions track “. This allows you to tell Power BI to create the one-to-many relationship that we know exist between both tables. To do so close and apply the changes you’ve made in power query editor and get into the power BI’s model environment, add a relationship connecting the new fabricated columns and you are done.

Your new relationship should look like this:

For merging the queries

Depending on your situation you may want to merge both queries into one instead of creating a relationship between them. If this is your case, then power BI does allow you to merge two queries by as many fields as you want.

To do so the first thing is clicking over the merge queries option on power query editor (I will merge them as a new query, but you can choose to do a normal merge if it suits you).

After clicking a helper window will appear, once there you select the tables you want to merge and then select as many columns of them as you need by holding “ctrl” while you click over them. Once the fields are selected, choose the kind of join that better suits your objective and click OK.

After clicking OK your new merged query will be ready for you to modify or visualize as it pleases you.

Merged query
Merged Query after expanding Month sessions track table

[1] To do so I used https://www.mockaroo.com/

Leave a Reply

Your email address will not be published.