POWER BI – Tutorial Notes
Data Shaping Techniques:
Renaming Tables:
• Context: Imported tables often have default or non-descriptive names.
Renaming them makes the model easier to read and maintain.
• Steps:
• Right-click on the table name in the Power BI Model view.
• Select “Rename.”
• Enter a new, meaningful name.
Promoting the First Row:
• Context: In some cases, the first row of a table contains the column headers
instead of actual data. Promoting this row gives the columns their real names and
keeps the header text out of the data.
• Steps:
• In the Power Query Editor, open the Home tab (or the Transform tab).
• Select “Use First Row as Headers.”
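To make the effect of promoting headers concrete, here is a minimal sketch in plain Python, run outside Power BI. The sample rows and the `promote_headers` helper are hypothetical and only illustrate the logic; Power BI performs this step itself in the Power Query Editor.

```python
# Hypothetical sample: the real column names were imported as the first data row.
raw_rows = [
    ["Product", "Region", "Sales"],   # header text sitting in row 1
    ["Laptop", "East", "1200"],
    ["Phone", "West", "800"],
]


def promote_headers(rows):
    """Use the first row as column names and return the other rows as records."""
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


records = promote_headers(raw_rows)
for record in records:
    print(record)
```

After promotion the header row is no longer counted as data, so two records remain rather than three.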
Duplicating and Replacing Values:
• Context: You might need to duplicate a column and replace its values based on
specific criteria.
• Steps:
• Right-click on the column you want to duplicate.
• Select “Duplicate Column.”
• Rename the new column.
• Use “Replace Values” (or a conditional column) to replace values in the new column.
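The duplicate-then-replace pattern can be sketched in plain Python as follows. The table, the column names, and the replacement rules are all assumptions made up for illustration; the point is that the original column stays intact while the copy receives the new values.

```python
# Hypothetical sample table with abbreviated region codes.
sales = [
    {"Product": "Laptop", "Region": "E"},
    {"Product": "Phone", "Region": "W"},
    {"Product": "Tablet", "Region": "N"},
]

# Hypothetical replacement rules: abbreviation -> full name.
region_names = {"E": "East", "W": "West"}


def duplicate_and_replace(rows, source, new_name, replacements):
    """Copy `source` into `new_name`, replacing values found in `replacements`."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay untouched
        value = row[source]
        # Values without a rule pass through unchanged.
        new_row[new_name] = replacements.get(value, value)
        result.append(new_row)
    return result


shaped = duplicate_and_replace(sales, "Region", "Region Name", region_names)
for row in shaped:
    print(row)
```

Working on a copy of the column means the original values remain available if a replacement rule turns out to be wrong.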
Removing Columns:
• Context: Unnecessary columns can clutter your data and make analysis more
complex. Removing them streamlines your dataset.
• Steps:
• Right-click on the column you want to remove.
• Select “Remove.”
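As a rough illustration of what removing columns does to each row, the following plain-Python sketch drops named columns from a small, made-up table. The `remove_columns` helper and the column names are hypothetical.

```python
# Hypothetical sample table with a column that is not needed for analysis.
inventory = [
    {"Product": "Laptop", "Sales": 1200, "InternalNote": "check"},
    {"Product": "Phone", "Sales": 800, "InternalNote": ""},
]


def remove_columns(rows, columns_to_remove):
    """Return new rows without the listed columns; other columns are kept as-is."""
    drop = set(columns_to_remove)
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


trimmed = remove_columns(inventory, ["InternalNote"])
print(trimmed)
```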
Filtering, Sorting, and Applying Changes:
• Context: These operations are fundamental for data shaping, allowing you to focus
on relevant data and apply transformations.
• Steps:
• Filtering:
• Use the filter arrow in the column header to select specific values.
• Apply filters based on conditions or expressions.
• Sorting:
• Click the arrow in the column header and choose “Sort Ascending” or “Sort Descending.”
• Use custom sorting criteria if needed.
• Applying Changes:
• After making changes, click “Close & Apply” to load the shaped data into the model.
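The filtering and sorting steps above can be pictured with a short plain-Python sketch. The `orders` table and the filter condition (East region, sales of at least 400) are assumptions chosen only to show a condition-based filter followed by a descending sort.

```python
# Hypothetical sample table.
orders = [
    {"Product": "Laptop", "Region": "East", "Sales": 1200},
    {"Product": "Phone", "Region": "West", "Sales": 800},
    {"Product": "Tablet", "Region": "East", "Sales": 450},
    {"Product": "Monitor", "Region": "East", "Sales": 300},
]

# Filter: keep East-region rows with sales of at least 400.
filtered = [
    row for row in orders
    if row["Region"] == "East" and row["Sales"] >= 400
]

# Sort: order the remaining rows by Sales, largest first.
ranked = sorted(filtered, key=lambda row: row["Sales"], reverse=True)

for row in ranked:
    print(row["Product"], row["Sales"])
```

Neither step changes the source table: `orders` still holds all four rows, much as the source data in Power BI is untouched by steps applied in the Power Query Editor.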
Enhancing Data Structure
Join Types:
• Inner Join: Returns rows that have matching values in both tables.
• Left Outer Join: Returns all rows from the left table, even if there are no matches in
the right table.
• Right Outer Join: Returns all rows from the right table, even if there are no matches
in the left table.
• Full Outer Join: Returns all rows from both tables, pairing rows where a match
exists and leaving the other table's columns empty where it does not.
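The four join types differ only in what happens to unmatched rows. The sketch below is a plain-Python illustration, not how Power BI implements joins; the `join` function, its `how` parameter, and both sample tables are hypothetical.

```python
# Hypothetical sample tables sharing a CustomerID column.
orders_tbl = [
    {"CustomerID": 1, "Amount": 250},
    {"CustomerID": 2, "Amount": 100},
    {"CustomerID": 4, "Amount": 75},   # no matching customer
]
customers_tbl = [
    {"CustomerID": 1, "Name": "Ana"},
    {"CustomerID": 2, "Name": "Ben"},
    {"CustomerID": 3, "Name": "Cy"},   # no matching order
]


def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`; `how` is inner, left, right, or full."""
    if how not in {"inner", "left", "right", "full"}:
        raise ValueError(f"unknown join type: {how}")
    left_cols = {col for row in left for col in row}
    right_cols = {col for row in right for col in row}
    result = []
    matched_right = set()  # indexes of right rows that found a partner
    for l_row in left:
        hits = [i for i, r_row in enumerate(right) if r_row[key] == l_row[key]]
        matched_right.update(hits)
        for i in hits:
            result.append({**l_row, **right[i]})
        if not hits and how in {"left", "full"}:
            # Unmatched left row: right-hand columns are filled with None.
            result.append({**dict.fromkeys(right_cols), **l_row})
    if how in {"right", "full"}:
        for i, r_row in enumerate(right):
            if i not in matched_right:
                # Unmatched right row: left-hand columns are filled with None.
                result.append({**dict.fromkeys(left_cols), **r_row})
    return result


for kind in ("inner", "left", "right", "full"):
    print(kind, len(join(orders_tbl, customers_tbl, "CustomerID", kind)))
```

With this sample data the inner join keeps only customers 1 and 2, the left and right joins each add one unmatched row, and the full outer join keeps both unmatched rows.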
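Combining Queries:
• Append: Stacks the rows of two or more tables into a single table. Columns are
matched by name.
• Prepend: Power Query has no separate prepend command; to place one table's rows
first, list that table first when appending.
• Merge: Joins two tables side by side on one or more matching columns, using one of
the join types above.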
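Appending stacks rows rather than matching them, and the order in which the tables are supplied decides which rows come first. The plain-Python sketch below assumes two hypothetical monthly tables, one of which has an extra column; the `append_tables` helper is illustrative only.

```python
# Hypothetical monthly tables; February has an extra Channel column.
jan = [{"Product": "Laptop", "Sales": 1200}]
feb = [{"Product": "Phone", "Sales": 800, "Channel": "Online"}]


def append_tables(*tables):
    """Stack rows from all tables; columns missing from a row become None."""
    columns = []
    for table in tables:
        for row in table:
            for col in row:
                if col not in columns:
                    columns.append(col)  # keep first-seen column order
    return [
        {col: row.get(col) for col in columns}
        for table in tables
        for row in table
    ]


jan_first = append_tables(jan, feb)   # January rows on top
feb_first = append_tables(feb, jan)   # same rows, February on top ("prepend")

print(jan_first)
print(feb_first)
```

Swapping the argument order is all that separates appending from prepending in this sketch.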
Key Considerations:
• Cardinality: Understand the relationship between tables (one-to-one, one-to-many,
many-to-many) to determine the appropriate join type.
• Data Integrity: Ensure that the join conditions are accurate to avoid data
inconsistencies.
• Performance: Consider the size of the tables and the complexity of the join
conditions when optimizing performance.
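One way to check cardinality before joining is to test whether the key column is unique on each side. The following plain-Python sketch uses hypothetical tables and helper names; it classifies the relationship purely from key uniqueness.

```python
from collections import Counter

# Hypothetical tables: each customer appears once, but may have several orders.
customer_rows = [{"CustomerID": 1}, {"CustomerID": 2}, {"CustomerID": 3}]
order_rows = [{"CustomerID": 1}, {"CustomerID": 1}, {"CustomerID": 2}]


def key_is_unique(rows, key):
    """True when no key value occurs more than once."""
    counts = Counter(row[key] for row in rows)
    return all(n == 1 for n in counts.values())


def cardinality(left, right, key):
    """Classify the left-to-right relationship from key uniqueness on each side."""
    left_unique = key_is_unique(left, key)
    right_unique = key_is_unique(right, key)
    if left_unique and right_unique:
        return "one-to-one"
    if left_unique:
        return "one-to-many"
    if right_unique:
        return "many-to-one"
    return "many-to-many"


print(cardinality(customer_rows, order_rows, "CustomerID"))
```

A many-to-many result is a warning sign: joining on such a key multiplies rows, which can inflate totals.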
Data Profiling Recap
Data Profiling is the process of examining data to understand its characteristics, quality,
and suitability for analysis. It involves:
• Data Quality Assessment: Identifying and addressing data issues such as missing
values, inconsistencies, and outliers.
• Data Type Identification: Determining the appropriate data types for columns (e.g.,
text, numeric, date).
• Data Distribution Analysis: Understanding the distribution of values within columns
(e.g., normal, skewed).
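The three profiling activities above can be approximated with a small column summary. This is a plain-Python sketch with made-up data and a hypothetical `profile_column` helper, not a reproduction of Power BI's built-in profiling tools.

```python
import statistics

# Hypothetical sample column with one missing value and one suspicious outlier.
rows = [
    {"Sales": 100},
    {"Sales": 120},
    {"Sales": None},
    {"Sales": 110},
    {"Sales": 5000},
]


def profile_column(rows, column):
    """Summarize quality, data types, and distribution for one column."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None and v != ""]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),               # data quality
        "distinct": len(set(present)),
        "types": sorted({type(v).__name__ for v in present}),  # data types
    }
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric:
        # Distribution: a mean far from the median hints at skew or outliers.
        summary["min"] = min(numeric)
        summary["max"] = max(numeric)
        summary["mean"] = statistics.mean(numeric)
        summary["median"] = statistics.median(numeric)
    return summary


profile = profile_column(rows, "Sales")
for name, value in profile.items():
    print(name, value)
```

In this sample the mean sits far above the median, which points to the single large value as a likely outlier worth checking.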