Mike Allen's Blog

Doing Smarter Things With IT

Power Query Primes in Excel

I’m going to show you how to calculate primes in Excel using Power Query and the Sieve of Eratosthenes. This is Power Query without tables, and highlights how to use parameters, functions, loops and recursions in Power Query, all fun stuff; and to make it even more fun we’re doing it in Excel. Download the spreadsheet to have a play Power Query Primes.

Open the spreadsheet and select Data>Queries & Connections. You will see the interesting stuff on the right, look at PrimeLimit, which you can edit this parameter which is the number up to which we will calculate primes. Then have a look at fxSieveGenerate, this is a function that gives us a list of numbers for the sieve as defined in the algorithm using List.Generate

Here’s the code for fxSieveGenerate:

     Source = (pStart as number, pLimit as number) as list =>
            xList = List.Generate(() => pStart * 2 , each _ < pLimit, each _ + pStart)

It’s a very simple function with two parameters, the start number and the limit. Now we are ready for the main act, the PrimeList query. This uses some useful PowerQuery techniques, looping is the main feature, of course:

     //PrimeLimit = 1000,
     SieveLimit = Number.Sqrt(PrimeLimit),
     Source = fxSieveGenerate(1, PrimeLimit),
     SieveNumbers = (parameter as record) as record =>
          Iterator = parameter[pIterator] + 1,
          Sieve = fxSieveGenerate(parameter[pPrimes]{Iterator},PrimeLimit),
          NewPrimes = List.RemoveMatchingItems(parameter[pPrimes],Sieve),
          AllPrimes =
          if NewPrimes{Iterator+1} <= SieveLimit then

     PrimeList = SieveNumbers([pPrimes=Source,pIterator=-1])

First up note the comment using //, always handy I used it here because PrimeLimit started as a variable, then I made it a parameter and commented the line out. Then I generate the starting number set, so I pass one(1) and the function doubles it and increments it by 1 for our starter list. I called this Source, which is kind of traditional Power Query naming convention. Next, we declare a function called SieveNumbers, which is highlighted. It’s worth pointing out that lists are accessed by the iterator and zero(0) is the first element. The function has a record as a parameter and returns a record when it is finished. The record is not clearly visible, but we give it two fields the list and the iterator. The function takes the list generates a new list using fxSieveGenerate, uses List.RemoveMatchingItems to remove those numbers and then recursively calls itself using @SieveNumbers until it reaches the sieve limit. The real work is then done by calling the function with a record to start it off, note the –1 in the iterator so that it starts at 0 in the function:
PrimeList = SieveNumbers([pPrimes=Source,pIterator=-1])

The list is then returned to the Excel sheet.

Note that this is Power Query in Excel and no tables were used or harmed in this exercise. I wrote this as an experiment on recursion and looping in Power Query/ Any feedback greatly appreciated, find me on Twitter




EMA Calculations with Power Query

This is my first attempt attempt at EMA calculation in Power Query, this is a work in progress. My intention is to use this work as a basis for technical indicators used in the stock market such as Bollinger Bands.

I have taken my EMA Example from https://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:moving_averages which I find very useful on the subject. I was also informed by Reza Rad at https://radacad.com/list-accumulate-hidden-gem-of-power-query-list-functions-in-power-bi which showed me how to use List.Accumulate, then I just needed to find an application for that knowledge and here it is.

The data that I have used is from the stockcharts.com EMA example, this helped me validate my solution, I started from their downloadable data. I’ve put some comments in the code, the magic is in the List.Accumulate, of course. The input data is in an Excel Workbook, in a table called “DataTable”, with two columns “Date” and “Close”

     Source = Excel.Workbook(File.Contents("emadata.xlsx"), null, true),
     DataTable_Table = Source{[Item="DataTable",Kind="Table"]}[Data],
     #"Changed Type1" = Table.TransformColumnTypes(DataTable_Table,{{"Date", type date}, {"Close", type number}}),
     EMA_Factor = .1818 ,
     Average = 10 ,
     #"Changed Type" = Table.TransformColumnTypes(#"Changed Type1",{{"Close", type number}}),
     // Add an Index to out close price table
     #"Add Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1),
     //  Create a list of the closing prices
     ClosePriceList = #"Add Index"[Close],
     // Calculate SMA , the Simple Moving Average
     #"Add SMA" = Table.AddColumn(#"Add Index", "SMA", each if [Index] >= Average then List.Sum(List.Range(ClosePriceList,[Index]-Average,Average))/Average else ""),
     // Get the first SMA value to be the first EMA
     StartEMA = Table.Range(#"Add SMA",Average-1,1)[SMA]{0},
     // Calculate the Expotential Moving Average
     #"Add EMA" = Table.AddColumn(#"Add SMA", "EMA", each if [Index] >= Average then List.Accumulate(List.Range(ClosePriceList,Average,[Index]-Average),StartEMA,(state,current)=>((current - state) * EMA_Factor) + state) else "")
     #"Add EMA"

Here’s a picture of the final output, any feedback is greatly appreciated, thank you for reading:


Aging Debt using Power Query Lists

Power BI problems that require accessing previous rows of data can be difficult in Power BI, one way of resolving this is to translate columns into lists and then process the lists. This is an example that performs debtors aging, where we have one payments and five aging buckets that we wish to allocate those payments into. Please refer to my previous post http://mikeallen.com.au/post/Creating-an-Aging-Calendar-in-Power-Query as to how the aging buckets are created and assume that transactions have been grouped similar buckets numbered B1 thru B5.

Here is some sample data:


Here are the payments:


The method I’m going to use to age involves comparing the payment to the cumulative debt on the account going from B5, the oldest to B1 the newest,  so I need to get the data to look like this:


When I have this data I then apply this formula to each line to calculate the aging value:

if [Payment] >= [Cumulative] then 0
                     else if [Payment] < [Cumulative] and [Payment] > ([Cumulative] - [Actual])
                             then [Cumulative] - [Payment] else [Actual])

This handles the three cases of a fully paid, partially paid and unpaid buckets in one calculation, which I think is pretty neat!

Here is the Power Query that does all this, this is a Beta, I’m still refining it. The data is held in an Excel workbook in the format as shown above:

     Source = Excel.Workbook(File.Contents("BucketAging.xlsx"), null, true),
     Buckets_Sheet = Source{[Item="Buckets",Kind="Sheet"]}[Data],
     #"Promoted Headers" = Table.PromoteHeaders(Buckets_Sheet, [PromoteAllScalars=true]),
     #"Removed Other Columns" = Table.SelectColumns(#"Promoted Headers",{"Account", "Type", "Value"}),
     GroupRowsByAccount = Table.Group(#"Removed Other Columns", {"Account"}, {{"DataTable", each _, type table [Account=number, Type=text, Value=number]}}),
     MergePayments = Table.NestedJoin(GroupRowsByAccount, {"Account"}, Payments, {"Account"}, "Payments", JoinKind.LeftOuter),
     ExpandPaymentValue = Table.ExpandTableColumn(MergePayments, "Payments", {"Value"}, {"Value"}),
     RenamedPaymentColumn = Table.RenameColumns(ExpandPaymentValue,{{"Value", "Payment"}}),
     AddBucketNumbers = Table.AddColumn(RenamedPaymentColumn, "Bucket", each List.Numbers(1,5)),
     ExpandBuckets = Table.ExpandListColumn(AddBucketNumbers, "Bucket"),
     AddAgingList = Table.AddColumn(ExpandBuckets, "AgingList", each [DataTable][Value]),
     AddCumulativeValue = Table.AddColumn(AddAgingList, "Cumulative", each List.Sum(List.FirstN([AgingList],[Bucket]))),
     AddActualValue = Table.AddColumn(AddCumulativeValue, "Actual", each [AgingList]{[Bucket]-1}),
     AddAgingCalc = Table.AddColumn(AddActualValue, "Aging", each if [Payment] >= [Cumulative] then 0
                     else if [Payment] < [Cumulative] and [Payment] > ([Cumulative] - [Actual])
                             then [Cumulative] - [Payment] else [Actual]),
     AddBucketList = Table.AddColumn(AddAgingCalc, "AgingBuckets", each [DataTable][Type]),
     AddAgingBucket = Table.AddColumn(AddBucketList, "AgingBucket", each [AgingBuckets]{[Bucket]-1})

That’s all expanded out, so remove the columns that you don’t want.

I’ve tried to make the code self-documenting, it is conceptually fairly simple, you group the data down to a row for each account, expand it out for each bucket (in reverse order), then convert the values into a list and perform arithmetic on the list values while extracting the values back out again. Have a play, Lists in Power Query make it possible to do computation across rows of data.

This technique has been tested on several thousand accounts and performs well, any feedback is appreciated, this information is provided with no guarantee whatsoever Smile.