Using SQL Server to generate ranges and group data into them
One of the reporting requests I see from time to time is to have output grouped into ranges, or bands if you prefer, e.g:
From | To | Quantity | Value |
0 | 250 | 3 | 125.52 |
250 | 500 | 8 | 3,121.42 |
500 | 750 | 1 | 582.25 |
750 | 1000 | 6 | 4,877 |
So, take all the sales that happened (with whatever criteria may apply to choose them, be it date range, sales person, etc,..) and then group them to show the number of sales where the value was between 0.00 and 250.00, between 250.01 and 500.00, and so on. I'm sure there are multiple ways to achieve this, but the most recent one I've employed is another recursive Common Table Expression.
More often than not, the people who want the report would prefer that the output contain rows with zeroes for Quantity and Value for ranges that don't have any data, this eliminates a hard-coded "simple" (albeit gargantuan) CASE statement in the GROUP BY.
Using this CTE:
DECLARE @start int = 0 DECLARE @end int = 10000 DECLARE @increment INT = 250 ;WITH RANGES AS ( SELECT @start [From], @start + @increment [To] UNION ALL SELECT [To], [To] + @increment FROM [RANGES] WHERE [To] < @end )
SQL quickly generates (and with a minimum of code - which helps with clarity!) [RANGES] which takes the overall range 0...10000 and splits it into chunks of 250.
Using an old backup of the Orchard database for my site, here's a query that uses the [RANGES] common table expression to give the count of items that fall into each band, based on their Id property. It's a pretty meaningless query, but it does give a reasonably clear view of using the CTE in context:
SELECT R.[From], R.[To], COUNT(CIR.Id) [RecordCount] FROM [RANGES] R LEFT JOIN [www.robertwray.co.uk].dbo.[Orchard_Framework_ContentItemRecord] CIR ON CIR.[Id] >= R.[From] AND CIR.[Id] < R.[To] GROUP BY R.[From], R.[To]
It's important that if you're performing a COUNT of records you count based on the table being joined in, otherwise you'll see 1 in the RecordCount column for every range that doesn't have any data - there is one row, overall, as there's the row for the range itself.