Rob Farley's back with two comments in regard to my prime number methodology:

Ward - how do you justify having the 'initial fill of sieve' in your tweak of Hugo's? If you pull that stuff out (as per my first post), it takes much longer. Rob (new SQL MVP, by the way!)

Like... just populate both with '2', and see how they go. My first one (without the looping) seems to do better in that scenario, but I prefer the algorithm of the second one.

First things first: congratulations on your MVP designation, Rob!

Now..  on to your question.

I justify my initial fill of the sieve as follows:

  • the job here is to build a list of the prime numbers below 1M as quickly as possible.
  • in order to perform this task, I used two "expertises":
    • knowledge of T-SQL/SQL Server
    • knowledge of mathematics
      • initially what my sons' teachers call "basic math facts"
      • subsequently augmented by introduction of the Sieve of Eratosthenes
  • in the "real world" we'd never run this code more than once; we'd render the data in a table and subsequently query it

Given that we approach our task with all of our accumulated knowledge, I think it's perfectly reasonable to attempt to code around the low hanging fruit -- if I can remove all of the non-prime multiples of 2 with a single filter (WHERE (Id % 2  <> 0 OR Id = 2)), and that filter is faster than the looping code, why wouldn't I take this approach?

Furthermore, once I've taken that approach, why wouldn't I continue to take it until the onset of diminishing returns?  In this case, on my laptop, that meant going up to 31.

I interpreted Denis' original challenge to be focused on pure speed of T-SQL execution and the validity of the generated list, so I believe that the approach I've taken is justifiable.

I once worked with a tester who opened a particular bug report narrative with, "When I change line 355 of the procedure from this to that..".  After I stopped laughing, I closed the bug as "By Design" with a notation that I couldn't reasonably be expected to control the behavior of code that had been overtly edited by someone else; that scenario is outside the rules of engagement for our coding environment.

Similarly, under the rules of engagement for this challenge, since nothing was explicitly excluded, any and all T-SQL coding techniques and mathematical algorithms should be available for use.  We should examine all of the arrows in our quiver and choose the most performant alternative.

So, under those rules, I feel justified with my approach.

I hope this answers your questions, Rob, and that you're at peace with my approach..  if not, we can open a "single algorithm" wing in our T-SQL Prime Number Hall of Fame.

     -wp