Using the C#7 Tuple language feature to eliminate an anonymous type
It might be a bit "old news" now, given that C# 7.0 was released in March 2017 (according to Wikipedia, so it must be true!), but after I answered a question on Stack Overflow earlier today I was reminded that Tuples were made a first-class citizen of C#. Before them, the options available for returning more than one value from a method were (quoting from the Microsoft .NET Blog post announcing what's new in C# 7.0):
- Out parameters: Use is clunky (even with the improvements described above), and they don’t work with async methods.
- System.Tuple<...> return types: Verbose to use and require an allocation of a tuple object.
- Custom-built transport type for every method: A lot of code overhead for a type whose purpose is just to temporarily group a few values.
- Anonymous types returned through a dynamic return type: High performance overhead and no static type checking.
None of these are awful, but they're certainly far from optimal.
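To make the comparison concrete, here is a minimal sketch of the first two options next to a C# 7 tuple. The class and method names (PreTupleOptions, MostCommonWithOut and so on) are hypothetical and not from the original question, and the code assumes a non-empty input string.

```csharp
using System;
using System.Linq;

public static class PreTupleOptions
{
    // Option 1: out parameters. Each extra value needs its own parameter,
    // and out parameters are not allowed on async methods.
    // Assumes text is non-empty (First() throws on an empty sequence).
    public static void MostCommonWithOut(string text, out char letter, out int count)
    {
        var top = text.GroupBy(c => c)
                      .OrderByDescending(g => g.Count())
                      .First();
        letter = top.Key;
        count = top.Count();
    }

    // Option 2: System.Tuple<...>. A heap-allocated class whose members
    // can only be read as Item1 and Item2.
    public static Tuple<char, int> MostCommonWithTuple(string text)
    {
        MostCommonWithOut(text, out char letter, out int count);
        return Tuple.Create(letter, count);
    }

    // C# 7 tuple for comparison: named elements and no custom type.
    public static (char Letter, int Count) MostCommon(string text)
    {
        MostCommonWithOut(text, out char letter, out int count);
        return (letter, count);
    }
}
```

A caller of the System.Tuple version has to read result.Item1 and result.Item2, whereas the C# 7 version lets them write result.Letter and result.Count.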
The question I answered was from someone who had a problem with a string they were attempting to decompose into a sorted list of unique characters and the count of instances of each character, ordered alphabetically. They'd got the decomposition and counting sorted, but weren't aware of the LINQ orderby that was one way of solving the last part of their problem.
The solution I gave them looked like this:
var frequency = from f in "trreill".ToList()
                group f by f into letterfrequency
                orderby letterfrequency.Key
                select new
                {
                    Letter = letterfrequency.Key,
                    Frequency = letterfrequency.Count()
                };

foreach (var f in frequency)
{
    Console.WriteLine($"{f.Letter}{f.Frequency}");
}
One of the comments against my answer was from Eric Lippert, saying:
Though this is good, it could be improved in two small ways. (1) remove the unnecessary ToList, and (2) in C# 7, select into a tuple rather than an anonymous type.
Good point! I'm not going to mention the comment regarding .ToList() any further here; that was a redundant bit of code that I copied wholesale from the original question without thinking.
Swapping in Tuples, and what else that gives you
The simplest change here, removing the anonymous type and replacing it with a Tuple, is just a matter of changing the select part of the LINQ statement to:
select ( Letter: letterfrequency.Key, Frequency: letterfrequency.Count() );
Aside: On the PC I was using at the time, this caused the C# compiler to throw its toys out of the pram, claiming it didn't have appropriate support for using tuples. I'm guessing that's because the scratch project I dropped it into was pointed at an older version of the .NET Framework, but installing the System.ValueTuple NuGet package quickly solved that.
Not much change in the code there, I'd guess you're thinking, but the underlying type used to build the Tuple here is System.ValueTuple<>, which is a struct, whereas an anonymous type gives us a full-blown class instance. There's the possibility of reduced memory usage there, which might be appealing if you're processing large amounts of data or working in a memory-constrained environment.
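The struct-versus-class difference can be checked at runtime. The following is a small sketch rather than anything from the original answer; TupleVersusAnonymous and IsValueType are names made up for illustration.

```csharp
using System;

public static class TupleVersusAnonymous
{
    // Hypothetical helper: true when the runtime type of the argument is a struct.
    public static bool IsValueType(object value) => value.GetType().IsValueType;

    public static void Main()
    {
        var tuple = (Letter: 'r', Frequency: 2);
        var anonymous = new { Letter = 'r', Frequency = 2 };

        Console.WriteLine(IsValueType(tuple));      // prints "True": System.ValueTuple<char, int> is a struct
        Console.WriteLine(IsValueType(anonymous));  // prints "False": anonymous types are classes

        // The element names are a compile-time convenience;
        // the underlying fields are still Item1 and Item2.
        Console.WriteLine(tuple.Item1 == tuple.Letter);  // prints "True"
    }
}
```

Whether the struct actually saves memory in a given program depends on how the values are used (boxing them, as the helper above does, allocates anyway), so treat this as a way to see the type difference rather than as a benchmark.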
The bigger benefit over anonymous types is that you can return the values from a function. If you want to re-use your LINQ statement (because you wouldn't copy and paste the code, right?), you no longer have to upgrade to a concrete type (or "Custom-built transport type", as it's referred to above). Instead, it becomes possible to have this:
public IEnumerable<(char Letter, int Frequency)> GetLetterFrequency(string text)
{
    var frequency = from letter in text
                    group letter by letter into letterfrequency
                    orderby letterfrequency.Key
                    select
                    (
                        Letter: letterfrequency.Key,
                        Frequency: letterfrequency.Count()
                    );

    return frequency;
}
This method can now be called from anywhere, which just isn't readily possible with anonymous types in a strongly-typed way. And here it is being called, replacing the inline LINQ:
var value = "trreill";
var frequencyCalculator = new LetterFrequency();
var letterFrequencies = frequencyCalculator.GetLetterFrequency(value);

foreach (var f in letterFrequencies)
{
    Console.WriteLine($"{f.Letter}{f.Frequency}");
}
There's another change that can be made here. If the code in your foreach were longer and did more with the data, the repeated use of f. would simply add noise. The variable declaration can be deconstructed (Visual Studio suggests this), promoting each of the values in the Tuple to what look like individual variables in the code:
foreach (var (Letter, Frequency) in letterFrequencies)
{
    Console.WriteLine($"{Letter}{Frequency}");
}
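Deconstruction isn't limited to foreach. As a rough sketch (DeconstructionSketch, Describe and TotalLetters are hypothetical names, not part of the original answer), the same syntax works in an ordinary declaration, and a discard (_) can stand in for any element you don't need:

```csharp
using System;
using System.Collections.Generic;

public static class DeconstructionSketch
{
    public static string Describe((char Letter, int Frequency) entry)
    {
        // Deconstruct the tuple into two new locals with inferred types.
        var (letter, frequency) = entry;
        return $"{letter}{frequency}";
    }

    public static int TotalLetters(IEnumerable<(char Letter, int Frequency)> entries)
    {
        var total = 0;

        // The discard ignores the letter; only the count is needed here.
        foreach (var (_, frequency) in entries)
        {
            total += frequency;
        }

        return total;
    }
}
```

The locals here are camelCase because, once deconstructed, they are ordinary local variables; the PascalCase names in the foreach above work just as well.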