Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here.
I have had the pleasure to program in a variety of programming languages throughout the years including the trifecta of C++, Java, and C#. It's often interesting how these three languages are so similar and yet have such key differences as well. Each of them has features that I miss when I work with the others. But, if I had to pick one standout feature that I love in C# (and long for when working with the other languages) it would be LINQ.
There is a wealth of algorithmic and framework components in LINQ that make development more rapid and code easier to maintain. Extension methods are a ubiquitous feature of LINQ, even if you don't know you are using them. And even apart from LINQ, they are a very interesting feature of .NET in and of themselves and worth a more detailed discussion.
Extension Methods
An extension method, in brief, lets you "extend" a type through some syntactic sugar. Now, I say “extend” in quotes because it really does nothing to the other type at all; you simply define a static method in a static class and identify the type you are "extending" as the first parameter by preceding it with the this keyword in the parameter list.
For example, you could say:
    // class has to be static
    public static class IntExtensions
    {
        // method has to be static, first parameter is tagged with the 'this'
        // keyword, the type of the parameter is the type that is extended.
        public static int Half(this int source)
        {
            return source / 2;
        }
    }
Now, even with the this keyword, this is really just a static method like any other and thus you could invoke it like:
    // Invoking as if it were an ordinary, static method
    var two = IntExtensions.Half(4);
But the magic of extension methods' syntactic sugar is that you can invoke them as if they were first-class methods of the type they "extend":
    // Extension methods behave as if they were really members of
    // the type they extend, thus we can call Half() on any integer.
    var two = 4.Half();
Once again, this is really just syntactic sugar. The instance of the type being “extended” is passed as the first parameter to the static method in the generated IL for you. Thus, the shorter, cleaner syntax compiles, in the end, to the same call as the more traditional syntax in the previous example. But this syntactic sugar not only shortens simple calls like the one above; it also allows chaining of multiple methods (extension and first-class both) in a very fluent way.
Let’s expand our integer extensions with a few more ideas:
    using System;

    public static class IntExtensions
    {
        public static int Half(this int source)
        {
            return source / 2;
        }

        public static int Cube(this int source)
        {
            return (int)Math.Pow(source, 3);
        }

        public static int Square(this int source)
        {
            return (int)Math.Pow(source, 2);
        }
    }
Now, what if you had the above methods and wanted to take 13, Cube() it, then take Half(), then Square() the result using these methods? If you wanted to do this using traditional static method syntax, you'd have to write:
    // The repetition of the type name and nesting gets confusing...
    var ans = IntExtensions.Square(IntExtensions.Half(IntExtensions.Cube(13)));
Ugh, that's a mess! But with extension method syntactical sugar, you get a much cleaner and easier to read result:
    // Much better, says take 13, cube it, halve it, square it.
    var ans = 13.Cube().Half().Square();
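To check the arithmetic in that chain, here is a small self-contained sketch. It swaps Math.Pow for plain integer multiplication (a substitution made here to keep everything in integer arithmetic, not the code shown above), and ChainDemo is a hypothetical helper class that exists only for illustration.

```csharp
using System;

public static class IntExtensions
{
    // Integer division: any remainder is dropped.
    public static int Half(this int source) { return source / 2; }

    // Plain multiplication instead of Math.Pow (assumption: values stay within int range).
    public static int Cube(this int source) { return source * source * source; }

    public static int Square(this int source) { return source * source; }
}

// Hypothetical helper, only here to put the two call styles side by side.
public static class ChainDemo
{
    // Traditional nested static-call form.
    public static int Nested(int value)
    {
        return IntExtensions.Square(IntExtensions.Half(IntExtensions.Cube(value)));
    }

    // Fluent extension form; the compiler rewrites it into the nested calls above.
    public static int Fluent(int value)
    {
        return value.Cube().Half().Square();
    }
}
```

For 13, both forms step through 2197, then 1098 (integer division drops the remainder), then 1205604.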
So, we see there is a lot of power here to extend types in a very fluent way. But I've only hinted at one of the things that make extension methods so very powerful. I said they could be used to "extend" any type. I don’t mean just struct or class or primitives, I mean interfaces as well.
Extending Interfaces
The great thing about interfaces is that they can be used to specify a public contract without regard to implementation details needed to satisfy that contract. Traditionally, this means that interfaces provide no method bodies (C# 8 later added default interface members, but the contract itself still says nothing about implementation). But, many times when we are defining a complete interface, it is possible to define functionality without needing to know the implementation of the interface at all!
Consider, for example, Enumerable.Count(). This is an extension method in System.Linq that will give you the count of any IEnumerable<T> sequence. It doesn’t care how that interface is implemented (though it has a performance shortcut for ICollection<T> implementations). All it needs is a way to get the next item and to know when there are no more, both of which IEnumerable<T> provides through its enumerator; thus you can provide this functionality using only the interface of IEnumerable<T> itself. The sequence can be a HashSet<T>, List<T>, T[], or any other sequence of items, and you can always get a Count().
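To make that concrete, here is a rough sketch of how a counting extension could be written against nothing but IEnumerable<T>. This is not the actual Enumerable.Count() source; the class name CountSketch and the method name CountItems are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class CountSketch
{
    // Hypothetical stand-in for Count(); NOT the real LINQ implementation.
    public static int CountItems<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");

        // Shortcut: an ICollection<T> already knows its own size.
        if (source is ICollection<T> collection) return collection.Count;

        // Otherwise walk the sequence using only what IEnumerable<T> promises:
        // an enumerator that can move to the next item until none are left.
        int count = 0;
        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext()) count++;
        }
        return count;
    }
}
```

A List<T> or an array would take the shortcut, while something like a Stack<T> (which does not implement ICollection<T>) would be counted by walking it.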
As another example, let's create our own extension method to chop an IEnumerable<T> sequence into slices of a given size. That is, if we had an array of size 32, and we wanted to divide it into slices of size 13, we should get back as output three sequences of size 13, 13, and 6.
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // some extension methods for IEnumerable<T>
    public static class EnumerableExtensions
    {
        // first argument is the source, second is the max size of each slice
        public static IEnumerable<IEnumerable<T>> Slice<T>(this IEnumerable<T> source, int size)
        {
            // can't slice null sequence
            if (source == null) throw new ArgumentNullException("source");
            if (size < 1) throw new ArgumentOutOfRangeException("size", "The size must be positive.");

            // force into a list to take advantage of direct indexing. Could also force into an
            // array, use LINQ grouping, do a Skip()/Take(), etc...
            var sourceList = source.ToList();
            int current = 0;

            // while there are still items to "slice" off, keep going
            while (current < sourceList.Count)
            {
                // return a sub-slice using an iterator for deferred execution
                yield return sourceList.GetRange(current, Math.Min(size, sourceList.Count - current));
                current += size;
            }
        }
    }
Notice that everything we are using on source is available publicly for any IEnumerable<T>, since it is either a public interface method declared on IEnumerable<T> or another extension method provided in System.Linq.
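One subtlety about Slice() as written: because it contains yield return, the compiler turns the whole method into an iterator, so the argument checks do not run until someone starts enumerating the result. If you would rather fail at the call site, a common pattern is to split the method in two. The sketch below is one way to do that; the names EagerEnumerableExtensions, SliceEager, and SliceIterator are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EagerEnumerableExtensions
{
    // Public wrapper: not an iterator, so these checks run at call time.
    public static IEnumerable<IEnumerable<T>> SliceEager<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (size < 1) throw new ArgumentOutOfRangeException("size", "The size must be positive.");

        return SliceIterator(source, size);
    }

    // Private iterator: same slicing logic as Slice(), still deferred.
    private static IEnumerable<IEnumerable<T>> SliceIterator<T>(IEnumerable<T> source, int size)
    {
        var sourceList = source.ToList();

        for (int current = 0; current < sourceList.Count; current += size)
        {
            yield return sourceList.GetRange(current, Math.Min(size, sourceList.Count - current));
        }
    }
}
```

With this split, passing a null source throws as soon as SliceEager() is called, while the slicing itself is still deferred until the result is enumerated.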
So now, we could use this method to process any sequence in slices! For example, what if we had an array of 1000 items, and wanted to process them in parallel in lots of 10?
    // requires: using System; using System.Linq; using System.Threading.Tasks;
    int[] items = Enumerable.Range(1, 1000).ToArray();

    // Process each slice of 10 items in parallel!
    Parallel.ForEach(items.Slice(10), s =>
    {
        foreach (var item in s)
        {
            Console.WriteLine(item);
        }
    });
Now you can! And with the fluent interface extension methods provide, you could easily chain the extension methods in a very easy-to-read way. For example, what if you wanted to process the cube of all the numbers from 1 to 1000 in groups of 10? We can chain in LINQ’s Select() extension method and our Cube() int extension to get:
    // Simply says select a sequence of the cube of each item,
    // then slice the sequence into lots of size 10
    Parallel.ForEach(items.Select(i => i.Cube()).Slice(10), s =>
    {
        ...
    });
You may have also noticed that we made our extension method check for a null source parameter. This is generally considered good form. It is possible to call an extension method on a null reference, but many people think allowing this is inappropriate because the same call on a first-class method would throw a NullReferenceException. That said, you can get a lot of power from allowing a null first argument in an extension method, so I leave it up to you.
My main piece of advice would be that if your first argument will allow null, the name should state it. For example, you could write:
    public static class ArrayExtensions
    {
        // returns length if not null, otherwise zero.
        public static int NullSafeLength<T>(this T[] source)
        {
            return source != null ? source.Length : 0;
        }
    }
And this would allow you to collapse this:
    if (myArray != null && myArray.Length > 0)
    {
        ....
    }
To this:
    if (myArray.NullSafeLength() > 0)
    {
        ...
    }
I personally don’t have a problem with extension methods like this allowing null because the name makes it obvious that this is safe. There are those who disagree and think it should never be allowed, though, so I leave it up to you and your team to decide what style you prefer.
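If the null behavior still seems surprising, it may help to see the two call styles side by side. In the sketch below, NullDemo and its Describe() method are hypothetical helpers added only for illustration; NullSafeLength() is the same extension as above.

```csharp
using System;

public static class ArrayExtensions
{
    // returns length if not null, otherwise zero.
    public static int NullSafeLength<T>(this T[] source)
    {
        return source != null ? source.Length : 0;
    }
}

// Hypothetical helper that reports what each call style does on a null array.
public static class NullDemo
{
    public static string Describe()
    {
        int[] myArray = null;

        // Fine: this compiles to ArrayExtensions.NullSafeLength(myArray),
        // an ordinary static call that happens to receive null.
        int viaExtension = myArray.NullSafeLength();

        try
        {
            // Instance member access dereferences the null reference.
            _ = myArray.Length;
            return "no exception";
        }
        catch (NullReferenceException)
        {
            return "extension returned " + viaExtension + ", property threw";
        }
    }
}
```

Calling NullDemo.Describe() returns "extension returned 0, property threw", which is exactly the difference the naming convention is meant to advertise.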
To Extend? Or Not To Extend?
Well, we've seen the power of extension methods, but as with all good things, this power can be abused. All I'm trying to say is, just because you can make many things extension methods doesn't mean that everything should be an extension method.
For example, what would you say about this example:
    public static class IntExtensions
    {
        // converts an int number of seconds to milliseconds for use in
        // Thread.Sleep() and other timeout methods...
        public static int Seconds(this int source)
        {
            return source * 1000;
        }
    }
I ran across this gem in some source online. I'm sure the well-meaning individual was hoping to use this to make code like this easier to read:
    Thread.Sleep(30.Seconds());
Which hey! That looks great and fluid! The problem is, not every integer represents time, and not every usage of time implies milliseconds. Thus this extension method has a very localized purpose and doesn’t really apply to all integers as a whole.
Consider, what if someone took this well-meaning method and did this:
    // Whoops. I really meant 50 seconds!!!
    var timeout = TimeSpan.FromSeconds(50.Seconds());
They'd then have 50,000 seconds of wait time. The method above is tuned to return milliseconds from seconds, which neither the name nor the return type implies! An int can represent anything: days, hours, minutes, puppies, extra lives, etc. Again, consider someone calling employee.Age.Seconds() thinking it would convert an age to seconds, only to discover the employee is now 1,000 times older.
Thus, you should always be careful when you create new extension methods that the problem being solved fits the type being extended, and that the result makes sense across that type's whole domain of values.
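If you still want the fluent look, one possible alternative (a sketch, not what the code I found did) is to let the return type carry the unit by returning a TimeSpan instead of an int. The class name TimeSpanIntExtensions is hypothetical. This only addresses the ambiguity about units; it doesn't change the fact that not every int is a duration.

```csharp
using System;

public static class TimeSpanIntExtensions
{
    // Hypothetical alternative: the result is a TimeSpan, so the unit
    // travels with the value rather than hiding inside a bare int.
    public static TimeSpan Seconds(this int source)
    {
        return TimeSpan.FromSeconds(source);
    }
}
```

With this version, Thread.Sleep(30.Seconds()) still reads the same because Thread.Sleep has an overload that accepts a TimeSpan, while the accidental TimeSpan.FromSeconds(50.Seconds()) no longer compiles because a TimeSpan does not convert implicitly to a number.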
Give a Hoot, Don't Pollute
As a final note on extension methods, one should always be aware not to let their dirty laundry air out in public. Just as it's a best practice to put types in namespaces to avoid collisions, I would put extension methods in their own namespaces as well so that users have to explicitly include the namespace to use them.
Why? Because it gives users a choice. Look at LINQ: if you want to use one of LINQ's extension methods, you must add a using System.Linq directive to make them available (syntactically). That means that if users don't want to be bothered with them, they simply don’t import the namespace and they won’t see them.
This also means that you won't pollute your IntelliSense needlessly. When you create extension methods in the global namespace (or some other high-traffic one), it can get annoying if every time you press the ‘.’ on an instance it shows every extension method under the sun. You can, for example, create extension methods off of object, which means they would apply to every type. While there are occasionally uses for extensions on object, be aware that such an extension method will show up in every IntelliSense member list when you press the ‘.’ key. Having these extensions in their own namespace, again, will prevent this unless someone really wants them to be visible.
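As a rough sketch of that opt-in arrangement, the extension below lives in its own namespace and only becomes visible where a using directive asks for it. The namespace names Acme.Extensions and Acme.App, and the IsNull() and Check() members, are all made up for illustration.

```csharp
namespace Acme.Extensions   // hypothetical namespace reserved for extension methods
{
    public static class ObjectExtensions
    {
        // Extends object, so it would show up on every type wherever it is imported.
        public static bool IsNull(this object source)
        {
            return source == null;
        }
    }
}

namespace Acme.App   // hypothetical consuming code
{
    // Opt in explicitly; without this directive, IsNull() is not found here
    // and does not appear in member lists for code in this namespace.
    using Acme.Extensions;

    public static class Consumer
    {
        public static bool Check(object value)
        {
            return value.IsNull();
        }
    }
}
```

Code elsewhere that never imports Acme.Extensions simply never sees IsNull(), which is the same opt-in experience System.Linq gives you.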
Summary
Extension methods are a very powerful feature of .NET. They allow you to attach functionality to "extend" a type (even an interface) and call that method as if it were a first-class method. This can enable very powerful and fluent interfaces, which can help make code easier to use and maintain, if treated with respect and used properly.