I recently did some advertising on LinkedIn and wanted to analyse the demographic that I was hitting. The Campaign Manager has an export button, so I used this to download a Demographics CSV. The data was typical CSV, what some would refer to as unstructured. This analyses the data into eight segments, there are eight segment tables in the CSV data, my first thoughts were how to split and combine these segments into one structured dataset for further analysis in PowerBI.
Let’s have a look at the table:
That’s the Segment that I have highlighted, the data underneath the segment we’ll call ‘Category’. Note the word ‘Segment’, which we will come back to and use to identify this row of data.
The tables are separated by a blank row, followed by the next segment header:
These are the segments that we can extract. There’s an empty ‘County’ Segment as well, that gets ignored because it’s empty.
Let’s talk Power Query, here’s the finished query. I know this is what you came here for:
let
Source = Csv.Document(File.Contents("C:\Users\mike\Downloads\demographics_report.csv"),[Delimiter=" ",
Columns=9, Encoding=1200, QuoteStyle=QuoteStyle.None]),
#"Removed Top Rows" = Table.Skip(Source,5),
#"Removed Columns" = Table.RemoveColumns(#"Removed Top Rows",{"Column3", "Column5", "Column6",
"Column7", "Column8", "Column9"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column1", "Category"}
,{"Column2", "Impressions"},
{"Column4", "Clicks"}}),
#"Added Segment" = Table.AddColumn(#"Renamed Columns", "Segment",
each if Text.Contains([Category],"Segment") then [Category] else null),
#"Filled Down" = Table.FillDown(#"Added Segment",{"Segment"}),
ReplaceWord = Table.ReplaceValue(#"Filled Down"," Segment","",Replacer.ReplaceText,{"Segment"}),
FilterHeaders = Table.SelectRows(ReplaceWord, each not Text.Contains([Category], "Segment")),
FilterTrailers = Table.SelectRows(FilterHeaders, each ([Category] <> ""))
in
FilterTrailers
This is fairly routine and was all done from the Ribbon, at the #"Added Segment" step, I use Add Column –>Custom Column, typing in this formula:
if Text.Contains([Category],"Segment") then [Category] else null)
That null is very important for the next step, our data now looks like this:
That last column is our segment column, for the next step we select the Segment Column and go to the ribbon for Transform->Fill, which gives us:
The remaining steps, remove the word ‘Segment’ from the segment column and get rid of headers and trailers, so that our final product is like this:
This basically works, I will need to refine the Add Column so check a little more carefully to make sure it’s got a header. The code above is working code that you can cut and paste into your own solution. Feedback and comments are always welcome, you can do that on Twitter or LinkedIn.