A quick and dirty query to find titles of Orchard blog posts that match criteria in SQL

I noticed earlier today that the number of "structured data" items Googles Webmaster Tools was reporting it had found on my website had dropped. Given that I'm continually adding new recipes (albeit it takes a while for Google to process them, but I'm in no rush!), this surprised me a little. There's a graph that shows how the number of structured data items has changed over time in the tools:

Yup, at some point a couple of weeks ago, Google lost one of my recipes (or more - as there are newer ones present in the list that's shown when you click through):

The Chorizo, Chilli, Tomato & Orzo Soup recipe is from September 27th, so definitely has been added and pre-dates the drop in number of items found.

Because I've only started adding recipes using schema.org structured data relatively recently, and have barely started updating recipes that pre-date this, there's a small enough number of recipes that it wasn't too onerous to sit and compare a list of recipes that are using structured data against what's showing in Google Webmaster Tools. Time for a quick and dirty query, wherein I've even used the lazy "DISTINCT" to avoid having to try and work out exactly how the underlying schema hangs together:

SELECT	Title
FROM	(
		SELECT  DISTINCT (CAST(CIVR.[Data] AS XML)).value('(/Data/TitlePart/@Title)[1]', 'nvarchar(200)') [Title]
		FROM	dbo.Common_BodyPartRecord  BPR
		JOIN	dbo.Orchard_Framework_ContentItemRecord CIR
			ON BPR.ContentItemRecord_id = CIR.Id
		JOIN	dbo.Orchard_Framework_ContentItemVersionRecord CIVR
			ON CIVR.ContentItemRecord_id = CIR.Id
		WHERE	BPR.[Text] LIKE '%itemProp%'
		AND	CIVR.Latest = 1
	) SQ
ORDER
BY	SQ.Title

Orchard stores content in an XML structure, hence the user of the SQL XML .value() method to extract the title of each item. I'm treating anything that contains the text 'itemProp' as a candidate to check as this is a bit of the HTML structure that I use for recipes in structured data format. After a quick comparison between the output and the list, appaerntly my Potato Soup recipe isn't showing up in Google Webmaster Tools structured data, which is odd as it pre-dates the other soup recipe from September 27th! Google provide a structured data testing tool, and that's showing that all is ok with the recipe, so just something odd going on with Google, I guess!

By making reference back to an article from Sébastien Ros's blog that I link-dumped previously, that describes some of Orchard's database structure, I've tidied up the query so that it removes duplicates without having to slap a DISTINCT in to just make them go away:

SELECT	(CAST(CIVR.[Data] AS XML)).value('(/Data/TitlePart/@Title)[1]', 'nvarchar(200)') [Title],
	* 
FROM	dbo.Orchard_Framework_ContentItemRecord CIR
INNER 
JOIN	dbo.Orchard_Framework_ContentTypeRecord CTR
	ON ctr.Id = cir.ContentType_id
INNER 
JOIN	dbo.Orchard_Framework_ContentItemVersionRecord  CIVR
	ON civr.ContentItemRecord_id = cir.Id
WHERE	CIVR.Published = 1
AND	CTR.Name = 'BlogPost'
AND	CIVR.Data LIKE '%itemProp%'
ORDER
BY	(CAST(CIVR.[Data] AS XML)).value('(/Data/TitlePart/@Title)[1]', 'nvarchar(200)')

About Rob

I've been interested in computing since the day my Dad purchased his first business PC (an Amstrad PC 1640 for anyone interested) which introduced me to MS-DOS batch programming and BASIC.

My skillset has matured somewhat since then, which you'll probably see from the posts here. You can read a bit more about me on the about page of the site, or check out some of the other posts on my areas of interest.

No Comments

Add a Comment