Mike Allen's Blog

Doing Smarter Things With IT

Power Query Primes in Excel

I’m going to show you how to calculate primes in Excel using Power Query and the Sieve of Eratosthenes. This is Power Query without tables, and highlights how to use parameters, functions, loops and recursions in Power Query, all fun stuff; and to make it even more fun we’re doing it in Excel. Download the spreadsheet to have a play Power Query Primes.

Open the spreadsheet and select Data>Queries & Connections. You will see the interesting stuff on the right, look at PrimeLimit, which you can edit this parameter which is the number up to which we will calculate primes. Then have a look at fxSieveGenerate, this is a function that gives us a list of numbers for the sieve as defined in the algorithm using List.Generate

Here’s the code for fxSieveGenerate:

     Source = (pStart as number, pLimit as number) as list =>
            xList = List.Generate(() => pStart * 2 , each _ < pLimit, each _ + pStart)

It’s a very simple function with two parameters, the start number and the limit. Now we are ready for the main act, the PrimeList query. This uses some useful PowerQuery techniques, looping is the main feature, of course:

     //PrimeLimit = 1000,
     SieveLimit = Number.Sqrt(PrimeLimit),
     Source = fxSieveGenerate(1, PrimeLimit),
     SieveNumbers = (parameter as record) as record =>
          Iterator = parameter[pIterator] + 1,
          Sieve = fxSieveGenerate(parameter[pPrimes]{Iterator},PrimeLimit),
          NewPrimes = List.RemoveMatchingItems(parameter[pPrimes],Sieve),
          AllPrimes =
          if NewPrimes{Iterator+1} <= SieveLimit then

     PrimeList = SieveNumbers([pPrimes=Source,pIterator=-1])

First up note the comment using //, always handy I used it here because PrimeLimit started as a variable, then I made it a parameter and commented the line out. Then I generate the starting number set, so I pass one(1) and the function doubles it and increments it by 1 for our starter list. I called this Source, which is kind of traditional Power Query naming convention. Next, we declare a function called SieveNumbers, which is highlighted. It’s worth pointing out that lists are accessed by the iterator and zero(0) is the first element. The function has a record as a parameter and returns a record when it is finished. The record is not clearly visible, but we give it two fields the list and the iterator. The function takes the list generates a new list using fxSieveGenerate, uses List.RemoveMatchingItems to remove those numbers and then recursively calls itself using @SieveNumbers until it reaches the sieve limit. The real work is then done by calling the function with a record to start it off, note the –1 in the iterator so that it starts at 0 in the function:
PrimeList = SieveNumbers([pPrimes=Source,pIterator=-1])

The list is then returned to the Excel sheet.

Note that this is Power Query in Excel and no tables were used or harmed in this exercise. I wrote this as an experiment on recursion and looping in Power Query/ Any feedback greatly appreciated, find me on Twitter