Why there is no ForEach method on the IEnumerable interfaces

by Mark Zhou 19. January 2010 21:54

I have been asked many times recently why Microsoft doesn't provide a "ForEach" extension method on the "IEnumerable<T>" interface, and whether it ever will. The answer is: "No, and never".

Here are some strong reasons not to add this extension method to IEnumerable. In particular, concerns in the following categories could lead the CLR team to turn down this request.

  • Design pattern confusion
  • Consistency
  • Runtime behavior
  • Performance

Design Pattern Confusion

As we all know, the C# built-in keyword "foreach" is syntactic sugar. When compiling source code that contains foreach statements, the compiler inspects the type of the object being enumerated to determine whether it can be enumerated, most commonly because it implements the IEnumerable interface. If so, the compiler transforms the foreach statement into calls to the appropriate enumeration members (GetEnumerator(), MoveNext(), and Current) to form the loop. That is to say, the foreach keyword is built on the IEnumerable interface. Additionally, IEnumerable<T> is the generic form of IEnumerable, which uses the generic type system to avoid casts and boxing at run time. Therefore IEnumerable<T> is also a foundation of the foreach keyword. Now imagine adding an extension method named ForEach to these interfaces: doesn't that look strange?
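To make the transformation concrete, here is a minimal sketch of a foreach loop next to roughly what the compiler generates for it. The class and method names are made up for illustration, and the exact expansion depends on the static type of the collection and on the compiler version.

```csharp
using System.Collections.Generic;

// Hypothetical names, for illustration only.
public static class ForeachExpansion
{
    // What the programmer writes.
    public static int SumWithForeach(IEnumerable<int> source)
    {
        int sum = 0;
        foreach (int item in source)
        {
            sum += item;
        }
        return sum;
    }

    // Roughly what the compiler produces for the loop above.
    public static int SumExpanded(IEnumerable<int> source)
    {
        int sum = 0;
        IEnumerator<int> enumerator = source.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int item = enumerator.Current;
                sum += item;
            }
        }
        finally
        {
            // IEnumerator<T> is IDisposable, so it is disposed after the loop.
            enumerator.Dispose();
        }
        return sum;
    }
}
```

Both methods return the same value for the same input; the second only spells out the calls that the keyword hides.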

Extension methods are designed for cases where you need to extend an existing type with what look like new instance methods. Extension methods are also syntactic sugar in C#: they are translated into static method calls at compile time. It would be very confusing to have a ForEach extension method on the very IEnumerable interfaces that make the foreach keyword work. What do you think?
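As a sketch of that translation, the hypothetical ForEach extension below (it is not part of the framework, and all names here are my own) is called once with instance-style syntax and once as the plain static call that the compiler actually emits.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical extension method, written only for illustration.
public static class EnumerableExtensions
{
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
        {
            action(item);
        }
    }
}

public static class ExtensionCallDemo
{
    public static List<int> CollectBothWays()
    {
        var seen = new List<int>();
        IEnumerable<int> numbers = new[] { 1, 2, 3 };

        // Instance-style syntax...
        numbers.ForEach(n => seen.Add(n));

        // ...is compiled as this ordinary static method call.
        EnumerableExtensions.ForEach(numbers, n => seen.Add(n));

        return seen;
    }
}
```

Note that the body of this ForEach still has to use the foreach keyword, which is exactly the circularity described above.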

This discussion resembles the chicken-and-egg argument: how do you know which came first on earth, the chicken or the egg?

Consistency

If IEnumerable had a ForEach method, when should source code use the foreach keyword and when should it use IEnumerable.ForEach? The results may be the same, but the problem is still obvious. I remember a similar discussion about whether to use FCL types (e.g. System.String) or the language's built-in types (e.g. string in C#). This kind of argument never ends. Standardizing the usage of such things is really very difficult.

Runtime Behavior

You may think there is almost no difference between calling a ForEach extension method on IEnumerable and using the foreach keyword. Unfortunately, that is incorrect. ForEach might have a signature like this: void ForEach<T>(this IEnumerable<T> source, Action<T> action). Because ForEach takes a parameter of type Action<T>, callers will normally pass a lambda expression, and a lambda expression can be converted either to a delegate or to an expression tree depending on the context. Thus, when a ForEach extension method is defined on IEnumerable<T> and the call is applied to IQueryable<T> instances or LINQ to SQL objects, the call may be translated into a provider-specific statement or even a database command. When ForEach() is called on objects that have a valid LINQ provider, the method call is not performed directly; instead, the meaning of the call is translated into statements recognized by the LINQ provider attached to that type.

As a result, you may get incorrect output when calling ForEach() on objects at different levels of the type hierarchy. If you define ForEach on the base type of all LINQ-enabled objects (which is IEnumerable<T>), its behavior may differ from that of the same method on the derived types. For more information, see this blog post: http://ppetrov.wordpress.com/2009/01/22/foreach-method-on-ienumerable/
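The delegate-versus-expression-tree distinction can be sketched with the standard Where operator, which exists in both forms. This is only an illustration under my own naming; it uses the in-memory provider behind AsQueryable() rather than a real database provider.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical names, for illustration only.
public static class LambdaConversionDemo
{
    public static string Describe()
    {
        // The lambda converted to a delegate: compiled code that can be invoked.
        Func<int, bool> asDelegate = n => n > 2;

        // The same lambda text converted to an expression tree: data that a
        // LINQ provider can inspect and translate (for example, into SQL).
        Expression<Func<int, bool>> asTree = n => n > 2;

        int[] data = { 1, 2, 3, 4 };

        // Enumerable.Where receives the delegate and runs it in memory.
        int inMemory = data.Where(asDelegate).Count();

        // Queryable.Where receives the expression tree; the in-memory provider
        // behind AsQueryable() compiles and runs it.
        int viaProvider = data.AsQueryable().Where(asTree).Count();

        return inMemory + "," + viaProvider + "," + asTree.Body.NodeType;
    }
}
```

Which overload is chosen depends only on the static type of the receiver, so the same lambda text can end up as executable code in one place and as a translatable tree in another.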

Performance

When you use the built-in foreach keyword, the C# compiler optimizes foreach statements over the following types when it compiles the source code into IL.

  • String: foreach (char item in myString) uses myString.Length and the indexer directly instead of calling myString.GetEnumerator().
  • Array: foreach (var item in myArray) uses myArray.Length and the indexer directly instead of calling myArray.GetEnumerator().
  • LINQ-enabled objects: foreach is the point at which a deferred LINQ query is actually executed.
  • IList<T>: foreach uses the Count property or the GetEnumerator() method, depending on the actual underlying type (unconfirmed; see Update 2 below).

If you extend IEnumerable with a ForEach method, all of the compiler optimizations above are lost.
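For the array and string cases, the optimized form is roughly an indexed loop. The sketch below uses hypothetical names and shows the general shape only; the exact IL depends on the compiler version.

```csharp
// Hypothetical names, for illustration only.
public static class LoweredLoops
{
    // foreach over an array...
    public static int SumForeach(int[] values)
    {
        int sum = 0;
        foreach (int v in values)
        {
            sum += v;
        }
        return sum;
    }

    // ...is compiled roughly like this indexed loop, with no enumerator object.
    public static int SumIndexed(int[] values)
    {
        int sum = 0;
        for (int i = 0; i < values.Length; i++)
        {
            sum += values[i];
        }
        return sum;
    }

    // The same shape applies to strings: Length plus the indexer.
    public static int CountUpper(string text)
    {
        int count = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsUpper(text[i]))
            {
                count++;
            }
        }
        return count;
    }
}
```

A ForEach extension method that only sees IEnumerable<T> has to go through GetEnumerator() and invoke a delegate per element, so it cannot take this shortcut.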

Conclusion

Use the “foreach” keyword whenever possible. The built-in foreach keyword brings its own runtime behavior and compiler optimizations to enumeration. Do not extend IEnumerable with a ForEach extension method: it is dangerous, confusing, and not a good design.

Update:

The answer from the C# team is here: http://blogs.msdn.com/ericlippert/archive/2009/05/18/foreach-vs-foreach.aspx. Thanks to Dixin Yan for providing this link.

Update 2:

Dixin inspected the IL generated for a standard foreach loop over a List<T> object. The IL calls the GetEnumerator() method instead of reading the Count property of List<T>. So far, we have not found evidence for the last optimization (IList<T>) in the list above.
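One way to observe this without reading IL is a small experiment like the sketch below (the names are my own). Modifying a List<T> inside foreach throws InvalidOperationException from the enumerator, which a loop driven purely by Count and the indexer would not do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names, for illustration only.
public static class ListEnumerationCheck
{
    // Returns true if modifying the list inside foreach is detected.
    public static bool ModificationIsDetected()
    {
        var list = new List<int> { 1, 2, 3 };
        try
        {
            foreach (int item in list)
            {
                list.Add(item); // invalidates the active enumerator
            }
        }
        catch (InvalidOperationException)
        {
            // Thrown by the enumerator's MoveNext on the next iteration.
            return true;
        }
        return false;
    }
}
```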

Comments (2) -

Dixin (People's Republic of China)
1/20/2010 12:08:28 PM

The answer from the C# compiler team is about side effects:

blogs.msdn.com/.../foreach-vs-foreach.aspx

mazhou (United States)
1/22/2010 3:49:47 AM

Dixin provided a very useful link for this article; I have added it to the conclusion part. Thanks a lot.
