Using Common Table Expressions to explode recipes - Part 1 - The CTE
A lot of the examples that you see for using Common Table Expressions (CTEs) in SQL Server, particularly recursive ones use examples that are around user hierarchies - showing ways to find all the people that directly and indirectly report into one user, for example.
In order to have a play around with recursive CTEs, I decided to use a data structure that describes recipes and the ingredients within them. This is a throw-back to a job I had many years ago which involved working on a cloud-based stock control system for the hospitality industry. One of the things that companies in this area do is keep track of the individual ingredients in each menu item, but the complexity occurs because you can have one menu item that's composed of more than one "recipe" and you can also have more than one menu item that consumes the same "recipe", take for example:
- Lasagne main - uses the "Lasagne" recipe, perhaps as a fraction of 0.25 if the recipe is made for four portions
- Kids Lasagne - uses the "Lasagne" recipe, perhaps as a fraction of 0.125 of the recipe as it uses half an adult portion
A recipe, inside a recipe, inside a recipe
This is the point at which CTEs come in handy. The "Lasagne main" recipe could contain:
- The "Lasagne" recipe, which is made from:
- The "Lasagne Bolognese" recipe
- The "White Sauce" recipe
- Several distinct ingredients (the grated cheese that's sprinkled on the top)
Now the Lasagne Bolognese and White Sauce each contain individual ingredients as well, and if we were actually modelling the "Lasgane main" and "Kids Lasagne" each of these could contain the accompaniments like chips, coleslaw, side-salad that are included. This is where being able to recurse through the recipe hierarchy comes in handy so you can determine the total of ingredients for a given recipe, and from that the recipe cost.
In order to model this, we need some data structures, these are:
- An ingredient: This is something that's used to make one of our recipes, for example Plain Flour
- A recipe: This would be "Lasagne", "Lasagne main", "Kids Lasagne", "Lasagne Bolognese", "White Sauce", or in other words anything that's made up of one or more ingredients
- An association between an ingredient and a recipe: This is a way of saying that the "White Sauce" recipe contains the "Plain Flour" ingredient
- An association between a recipe as an ingredient and a recipe: This is a way of saying that the "Lasagne" recipe contains the "White Sauce" recipe as an ingredient
It's the last association that introduces the hierarchy of ingredients that necessitates performing a recursive query. If we only ever recorded recipes in a fully de-normalised fashio then it wouldn't be necessary. It would also make adjusting our menus that much more complex - say that we started buying in the white sauce pre-made, we'd have to go and remove all the ingredients for the white sauce from every single recipe that uses it, ouch!
The table structure for recipes
There's a lot of scripts to setup this data (much more so than a simple "User Hierarchy" example, so I've dropped the script for populating data into the tables down at the bottom of this post. The structure of tthe tables used to describe all the recipes looks like this (all based on my recipe for Lasagne):
CREATE TABLE Recipe ( [RecipeId] INT NOT NULL IDENTITY(1, 1), [Description] NVARCHAR(100) NOT NULL ) CREATE TABLE [Recipe_Recipe_Ingredient] ( [RecipeRecipeIngredientId] INT NOT NULL IDENTITY(1, 1), [ParentRecipeId] INT NOT NULL, [RecipeId] INT NOT NULL, [Quantity] DECIMAL(10, 3) NOT NULL ) GO CREATE TABLE [Recipe_Ingredient_Ingredient] ( [RecipeRecipeIngredientId] INT NOT NULL IDENTITY(1, 1), [ParentRecipeId] INT NOT NULL, [IngredientId] INT NOT NULL, [Quantity] DECIMAL(10, 3) NOT NULL ) GO CREATE TABLE [Ingredient] ( [IngredientId] INT NOT NULL IDENTITY(1, 1), [Description] NVARCHAR(100) NOT NULL, [UnitOfMeasure] NVARCHAR(20) NOT NULL )
It's a fairly un-interesting schema in and of itself with the two key tables being Recipe and Ingredient along with the glue tables Recipe_Recipe_Ingredient and Recipe_Ingredient_Ingredient used to describe how ingredients and receipes sit inside recipes. That's to say that both an Ingredient and a Recipe can be ingredients of a Recipe as I've described previously.
Exploding the Recipes
Unfortunately what comes next is quite a long bit of SQL but I'm going to try and break it down into its constituent parts, starting with the CTE for un-packing ingredients of recipes that are recipes themselves. This is:
;WITH RecipeItems AS ( SELECT ParentRecipeId, RecipeId, Quantity FROM [Recipe_Recipe_Ingredient] RRI UNION ALL SELECT RI.ParentRecipeId, RRI.RecipeId, CAST(RI.Quantity * RRI.Quantity AS DECIMAL(10, 3)) FROM [Recipe_Recipe_Ingredient] RRI INNER JOIN RecipeItems RI ON RRI.ParentRecipeId = RI.RecipeId )
SELECT * FROM RecipeItems
This is the core of the query (with a SELECT statement to see what it outputs) which gives the result:
If you compare this to the data that we insert (right down at the bottom of the post) you'll notice that there are 8 rows, but we only add 5 recipes. That's because this CTE recurses (the second part of the UNION ALL where the CTE joins back to itself) to progressively retrieve each child recipe until there are none left to retrieve - using more exciting terminology, to explode!. This is how we end up with the three rows for ParentRecipeId 5 "Lasagne (Kids Portion)" even though the underlying data only has one recipe ingredient (entry in Recipe_Recipe_Ingredient) for this, saying it contains 0.125 * RecipeId 1, also known as "Lasagne". The other two rows (for Recipe Id 2 and 3) are for Lasagne Bolognese and White Sauce respectively.
As you can see from the results above the rows for ParentRecipeId 1 "Lasagne" also show entries for RecipeId 2 and 3 (and as does Recipe 4, which is the "Lasagne (Portion)" which is the adult portion) which shows that the recipe has been exploded correctly. Here's the execution plan that SQL Server has generated:
As you can see, it's pretty clean without anything horrid introduced like (us) looping round the data, building up a temporary store and iteratively adding to it. In fact, just to prove that this execution plan isn't as a result of only having a recipe hierarchy that's two levels deep, let's add another recipe to the mix. I've added a new recipe with a RecipeId of 6 and made it an ingredient of Recipe 3 ("White Sauce") which is itself contained in recipe 1 ("Lasagne") which is itself contained in recipe 4 ("Lasagne (Portion)") which means that as recipe ingredients go, it now goes three levels down. Here's the execution plan from querying the CTE:
Yup, exactly the same (I did contemplate using the same screenshot but thought that'd be a bit lazy/cheeky!). Just for good measure, here's the resulting data showing that the new recipe (now an ingredient in White Sauce) has worked its way into the data in every recipe that contains White Sauce:
Now I've covered how to use the CTE recursively to find all the recipes that sit inside each of the other recipes (it's recipes all the way down!) what's left is to show how to use the CTE to do that. That's coming up in the next post which will show how this works, purely for completeness as it's the less interesting bit now the CTE's out of the way!
This script can be used to populate the Recipe, Ingredient, Recipe_Recipe_Ingredient and Recipe_Ingredient_Ingredient tables:
SET IDENTITY_INSERT [Recipe] ON GO INSERT INTO [Recipe] ( [RecipeId], [Description] ) VALUES (1, 'Lasagne'), (2, 'Lasagne Bolognese'), (3, 'White Sauce'), (4, 'Lasagne (Portion)'), (5, 'Lasagne (Kids Portion)') SET IDENTITY_INSERT [Recipe] OFF GO SET IDENTITY_INSERT [Ingredient] ON GO INSERT INTO [Ingredient] ( [IngredientId], [Description], [UnitOfMeasure] ) VALUES (101, 'Beef Mince', 'KG'), (102, 'Large White Onion', 'EACH'), (103, 'Olive Oil', 'LITRE'), (104, 'Garlic Clove', 'EACH'), (105, 'Red Wine', 'LITRE'), (106, 'OXO Cube', 'EACH'), (107, 'Tomato Puree', 'GRAM'), (108, 'Chopped Tomatoes', 'KG'), (109, 'Basil', 'GRAM'), (110, 'Milk', 'LITRE'), (111, 'Butter', 'GRAM'), (112, 'Plain Flour', 'GRAM'), (113, 'Bay Leaf', 'EACH'), (114, 'Black Peppercorn', 'EACH'), (115, 'Blade Mace', 'EACH'), (116, 'Mozarella, Grated', 'KG'), (117, 'Fresh Parmesan', 'GRAM'), (118, 'Shropshire Red, Grated', 'KG'), (119, 'Lasagne Sheet', 'EACH') SET IDENTITY_INSERT [Ingredient] OFF GO -- White Sauce INSERT INTO dbo.[Recipe_Ingredient_Ingredient] ( ParentRecipeId, IngredientId, Quantity ) VALUES (3, 110, 0.568), -- Milk (3, 111, 0.050), -- Butter (3, 112, 0.035), -- Flour (3, 113, 1.000), -- Bay Leaf (3, 114, 8.000), -- Black Peppercorn (3, 115, 1.000), -- Blade Mace, (3, 102, 0.500) -- HALF Large White Onion -- Lasagne Bolognese INSERT INTO dbo.[Recipe_Ingredient_Ingredient] ( ParentRecipeId, IngredientId, Quantity ) VALUES (2, 101, 0.500), -- Beef Mince (2, 102, 1.000), -- Large White Onion (2, 103, 0.020), -- Olive Oil (2, 104, 4.000), -- Garlic Clove (2, 105, 0.300), -- Red Wine (2, 106, 2.000), -- OXO Cube (2, 107, 0.010), -- Tomato Puree (2, 108, 0.390), -- Chopped Tomatoes (2, 109, 0.020) -- Basil -- Lasagne INSERT INTO dbo.[Recipe_Ingredient_Ingredient] ( ParentRecipeId, IngredientId, Quantity ) VALUES (1, 116, 0.130), -- Mozarella (1, 117, 0.030), -- Parmesan (1, 118, 0.040), -- Shropshire Red (1, 119, 9.00) -- Lasagne Sheets INSERT INTO dbo.[Recipe_Recipe_Ingredient] ( ParentRecipeId, RecipeId, Quantity ) VALUES (1, 2, 1.000), -- Lasagne Bolognese (1, 3, 1.000) -- White Sauce -- Lasagne (Portion) INSERT INTO dbo.[Recipe_Recipe_Ingredient] ( ParentRecipeId, RecipeId, Quantity ) VALUES (4, 1, 0.250), -- Lasagne (Portion) (5, 1, 0.125) -- Lasagne (Kids Portion)
My skillset has matured somewhat since then, which you'll probably see from the posts here. You can read a bit more about me on the about page of the site, or check out some of the other posts on my areas of interest.