A quick and dirty query to find titles of Orchard blog posts that match criteria in SQL
I noticed earlier today that the number of "structured data" items Googles Webmaster Tools was reporting it had found on my website had dropped. Given that I'm continually adding new recipes (albeit it takes a while for Google to process them, but I'm in no rush!), this surprised me a little. There's a graph that shows how the number of structured data items has changed over time in the tools:
Yup, at some point a couple of weeks ago, Google lost one of my recipes (or more - as there are newer ones present in the list that's shown when you click through):
The Chorizo, Chilli, Tomato & Orzo Soup recipe is from September 27th, so definitely has been added and pre-dates the drop in number of items found.
Because I've only started adding recipes using schema.org structured data relatively recently, and have barely started updating recipes that pre-date this, there's a small enough number of recipes that it wasn't too onerous to sit and compare a list of recipes that are using structured data against what's showing in Google Webmaster Tools. Time for a quick and dirty query, wherein I've even used the lazy "DISTINCT" to avoid having to try and work out exactly how the underlying schema hangs together:
SELECT Title FROM ( SELECT DISTINCT (CAST(CIVR.[Data] AS XML)).value('(/Data/TitlePart/@Title)[1]', 'nvarchar(200)') [Title] FROM dbo.Common_BodyPartRecord BPR JOIN dbo.Orchard_Framework_ContentItemRecord CIR ON BPR.ContentItemRecord_id = CIR.Id JOIN dbo.Orchard_Framework_ContentItemVersionRecord CIVR ON CIVR.ContentItemRecord_id = CIR.Id WHERE BPR.[Text] LIKE '%itemProp%' AND CIVR.Latest = 1 ) SQ ORDER BY SQ.Title
Orchard stores content in an XML structure, hence the user of the SQL XML .value() method to extract the title of each item. I'm treating anything that contains the text 'itemProp' as a candidate to check as this is a bit of the HTML structure that I use for recipes in structured data format. After a quick comparison between the output and the list, appaerntly my Potato Soup recipe isn't showing up in Google Webmaster Tools structured data, which is odd as it pre-dates the other soup recipe from September 27th! Google provide a structured data testing tool, and that's showing that all is ok with the recipe, so just something odd going on with Google, I guess!
By making reference back to an article from Sébastien Ros's blog that I link-dumped previously, that describes some of Orchard's database structure, I've tidied up the query so that it removes duplicates without having to slap a DISTINCT in to just make them go away:
SELECT (CAST(CIVR.[Data] AS XML)).value('(/Data/TitlePart/@Title)[1]', 'nvarchar(200)') [Title], * FROM dbo.Orchard_Framework_ContentItemRecord CIR INNER JOIN dbo.Orchard_Framework_ContentTypeRecord CTR ON ctr.Id = cir.ContentType_id INNER JOIN dbo.Orchard_Framework_ContentItemVersionRecord CIVR ON civr.ContentItemRecord_id = cir.Id WHERE CIVR.Published = 1 AND CTR.Name = 'BlogPost' AND CIVR.Data LIKE '%itemProp%' ORDER BY (CAST(CIVR.[Data] AS XML)).value('(/Data/TitlePart/@Title)[1]', 'nvarchar(200)')