Yield and iterator methods
Date posted: 05/04/2017
C# has several dozen keywords. One of them, yield, is rarely used and often misunderstood. Used properly, though, it can help you optimize your code. Let's take a look at how it works.
What does yield do?
Let's see what yield does through an example. We are going to create a method that returns a Fibonacci sequence. You have probably met Fibonacci numbers back in your first for loop classes: a Fibonacci sequence is a series of numbers in which each number is the sum of the two preceding ones. This may not be the most exciting use of yield, but it should make things easier to grasp. So let's write the following code.
string fibonacciSequence = "";
foreach (int f in FibonacciEnum(10))
    fibonacciSequence += f + " ";
The following method returns a Fibonacci sequence.
private IEnumerable<int> FibonacciEnum(int n)
{
    List<int> fibonacciEnum = new List<int>();
    int a = 0; int b = 1; int sum = 0;
    for (int i = 0; i < n; i++)
    {
        fibonacciEnum.Add(a);
        sum = a + b;
        a = b;
        b = sum;
    }
    return fibonacciEnum;
}
So, we just created a method that returns an IEnumerable of integers by adding every number to a list. We call the method, get that IEnumerable and then loop through it and create the following string.
0 1 1 2 3 5 8 13 21 34
Ok, let's implement the same method using yield and see what it looks like.
private IEnumerable<int> Fibonacci(int n)
{
    int a = 0; int b = 1; int sum = 0;
    for (int i = 0; i < n; i++)
    {
        yield return a;
        sum = a + b;
        a = b;
        b = sum;
    }
}
This is what implementing yield looks like. You can tell both methods look alike. It's as if we removed the list object and used yield return to hand over one integer at a time on every iteration. What's even more confusing is that the resulting string is the same.
0 1 1 2 3 5 8 13 21 34
So, is there any reason to use yield at all?
The answer is yes. Using yield is actually totally different from using a typical method. The difference is easiest to see if you place a breakpoint inside the Fibonacci method. You will notice that the method is not executed at all when it is called. Instead, it starts executing when it is needed, in other words, when the foreach loop in the caller begins. At that moment the Fibonacci method starts executing but stops as soon as it reaches yield return. When it does, control returns to the outer foreach loop and gets back to the method the next time an IEnumerable element is requested. The following picture will make things clearer.
As you can see, yield allows us to move in and out of our method, but the method is not reinitialized. Its variables keep the values they had during the last iteration. It's like the execution is paused, saved and then resumed from the save point later on. Repetitions go on until we stop requesting elements or there are no more elements to return.
Yield can be used in combination with return in a method or a get accessor that returns an IEnumerable or IEnumerator. Using yield turns the method into an iterator. In other words, the method is turned into a state machine which calculates each element of the IEnumerable only when needed. The compiler does all the hard work of that transformation. All you have to do is use yield.
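If you do not have a debugger at hand, the same pause-and-resume behaviour can be observed with a simple log. The following is a minimal sketch; the YieldTrace class and its Log list are names made up for this illustration and are not part of the original example.

```csharp
using System;
using System.Collections.Generic;

public static class YieldTrace
{
    // Records every step so the order of execution can be inspected afterwards.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Fibonacci(int n)
    {
        Log.Add("iterator: started");
        int a = 0, b = 1;
        for (int i = 0; i < n; i++)
        {
            Log.Add("iterator: yielding " + a);
            yield return a; // execution pauses here until the next element is requested
            int sum = a + b;
            a = b;
            b = sum;
        }
        Log.Add("iterator: finished");
    }

    public static void Main()
    {
        IEnumerable<int> sequence = Fibonacci(3);
        Log.Add("caller: method called"); // no iterator code has run at this point

        foreach (int f in sequence)
            Log.Add("caller: received " + f);

        foreach (string line in Log)
            Console.WriteLine(line);
    }
}
```

The printed log starts with "caller: method called", and only then shows "iterator: started". After that the iterator and caller lines alternate, one yielded value at a time, until "iterator: finished".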
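To get a feel for the work yield saves us, here is roughly what the Fibonacci iterator could look like if we wrote it by hand. This is a simplified sketch and not the code the compiler actually generates; FibonacciSequence and its nested Enumerator are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written stand-in for what `yield return` spares us from writing.
public sealed class FibonacciSequence : IEnumerable<int>
{
    private readonly int _count;

    public FibonacciSequence(int count) { _count = count; }

    public IEnumerator<int> GetEnumerator() { return new Enumerator(_count); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    private sealed class Enumerator : IEnumerator<int>
    {
        private readonly int _count;
        // The former local variables become fields so they survive between calls.
        private int _a = 0, _b = 1, _i = 0;

        public Enumerator(int count) { _count = count; }

        public int Current { get; private set; }
        object IEnumerator.Current { get { return Current; } }

        // Each call produces at most one element, then returns to the caller.
        public bool MoveNext()
        {
            if (_i >= _count) return false; // no more elements
            Current = _a;                   // the value `yield return a` would hand out
            int sum = _a + _b;
            _a = _b;
            _b = sum;
            _i++;
            return true;
        }

        public void Reset() { throw new NotSupportedException(); }
        public void Dispose() { }
    }
}

public static class StateMachineDemo
{
    public static void Main()
    {
        // Prints: 0 1 1 2 3 5 8 13 21 34
        Console.WriteLine(string.Join(" ", new FibonacciSequence(10)));
    }
}
```

The point of the comparison is the amount of ceremony: the local variables have to become fields, and the loop has to be reshaped into a MoveNext method that remembers where it left off.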
Before moving on, we should mention yield break, which lets us end the iteration ourselves. Suppose we do not want to keep computing Fibonacci numbers past a certain point. We can insert yield break, like this.
private IEnumerable<int> Fibonacci(int n)
{
    int a = 0; int b = 1; int sum = 0;
    for (int i = 0; i < n; i++)
    {
        if (i < 9)
        {
            yield return a;
            sum = a + b;
            a = b;
            b = sum;
        }
        else
            yield break;
    }
}
This ends the iteration once nine numbers have been returned, however large n is. Called with n = 10, it gives
0 1 1 2 3 5 8 13 21
Using yield
Let's move on to another example. I'm not saying that Fibonacci numbers are boring but, well, I personally have never had to use them since my first year at school. So let's use sights that people visit on trips instead.
Here's a Sight class
public class Sight
{
    public int Id;
    public int CityId;
    public string Title;
    public Sight(int id, int cityId, string title)
    {
        Id = id;
        CityId = cityId;
        Title = title;
    }
}
Here's a list so we can have something to work with
public List<Sight> sightsList = new List<Sight>()
{
    new Sight(1, 1, "Eye Of London"),
    new Sight(2, 1, "Westminster Abbey"),
    new Sight(3, 2, "Louvre"),
    new Sight(4, 1, "Tower Bridge"),
    new Sight(5, 2, "Notre-Dame")
};
And finally, here's the method that returns the sights located in the city with a given Id, or returns all sights in case cityId is 0.
private IEnumerable<Sight> GetSightsByCategoryId(List<Sight> sights, int cityId)
{
    foreach (Sight sight in sights)
    {
        if (sight.CityId == cityId || cityId == 0)
            yield return sight;
    }
}
Let's call GetSightsByCategoryId the same way we called the Fibonacci method.
string sightSeeing = "";
foreach (Sight sight in GetSightsByCategoryId(sightsList, 1))
    sightSeeing += sight.Title + " | ";
You can probably guess how things work, but let's walk through it together. At first GetSightsByCategoryId will return Eye Of London, then Westminster Abbey. On the next request it will skip Louvre and move on to Tower Bridge. After that, it will check the last remaining element, Notre-Dame, looking for more sights, but since that one does not match either, it will return nothing more.
The result will look like this.
Eye Of London | Westminster Abbey | Tower Bridge |
The same pattern would be followed if instead we had used
string sightSeeing = String.Join(" | ", GetSightsByCategoryId(sightsList, 1).Select(x => x.Title));
Needless to say, yield does not have to appear only in methods that contain loops. For example, the following method is totally valid.
private IEnumerable<Sight> GetSights()
{
    yield return new Sight(1, 1, "Eye Of London");
    yield return new Sight(2, 1, "Westminster Abbey");
}
Now let's add some LINQ to our recipe.
string sightSeeing = GetSightsByCategoryId(sightsList, 1).First().Title;
This will enter GetSightsByCategoryId only once, since there is no need to get more elements. It will return
Eye Of London
string sightSeeing = GetSightsByCategoryId(sightsList, 0).Last().Title;
This will loop through all elements of sightsList till it gets to the last one. In contrast to the Fibonacci numbers there actually is no point in searching every single element since we know that the last element of the list has nothing to do with the previous ones. So this might not be a perfect case to use yield. Anyway, the result would be
Notre-Dame
string sightSeeing = GetSightsByCategoryId(sightsList, 1).First().Title + " | " + GetSightsByCategoryId(sightsList, 1).Last().Title;
You have probably guessed that the result is
Eye Of London | Tower Bridge
I know, it's not much different from the previous one and, well, if you understand how yield works it's always the same pattern. What's more interesting is this. How many iterations do you think took place in order for the previous statement to be completed? If you guessed the same as the number of the list elements, you would be wrong. First() stops after the first element, but Last() then starts over and walks through all five, so six elements are visited in total. The thing is that a method containing yield is always computed from the beginning whenever it is called.
If for example we had
string sightSeeing = GetSightsByCategoryId(sightsList, 1).Last().Title + " | " + GetSightsByCategoryId(sightsList, 1).Last().Title;
this would require twice as many iterations as there are elements in the list, since each call walks the whole list from the start.
If you are familiar with data contexts, this is probably not the first time you've heard of such things. LINQ actually relies on the same deferred execution: standard query operators such as Where and Select produce their results only when they are enumerated. You may have already noticed that GetSightsByCategoryId(sightsList, 1) returns the same elements as sightsList.Where(x => x.CityId == 1). As a result, yield carries along the pros and cons of LINQ's deferred execution.
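The iteration counts are easy to check with a counter. The sketch below reduces the sights to their CityId values to stay self-contained; the IterationCounter class, the FilterByCity method and the Visited field are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IterationCounter
{
    // How many list elements the iterator has examined so far.
    public static int Visited;

    // Same filter as GetSightsByCategoryId, applied to bare city ids.
    public static IEnumerable<int> FilterByCity(List<int> cityIds, int cityId)
    {
        foreach (int id in cityIds)
        {
            Visited++;
            if (id == cityId || cityId == 0)
                yield return id;
        }
    }

    public static void Main()
    {
        // The CityId values of sightsList, in order.
        var cityIds = new List<int> { 1, 1, 2, 1, 2 };

        Visited = 0;
        FilterByCity(cityIds, 1).First();
        FilterByCity(cityIds, 1).Last();
        Console.WriteLine(Visited); // 6: First() stops after 1 element, Last() needs all 5

        Visited = 0;
        FilterByCity(cityIds, 1).Last();
        FilterByCity(cityIds, 1).Last();
        Console.WriteLine(Visited); // 10: two full passes over the list
    }
}
```

Each call creates a fresh iterator, which is why the second call never benefits from the work done by the first.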
Why should I use yield?
Keep in mind that yield only makes sense in methods that return a sequence, that is, an IEnumerable or IEnumerator. So we may need it only when we want to return a collection of objects.
Imagine you want to process a lot of elements contained in a sequence (that sequence could be anything, for example a database table or a group of files) but you want to process them one at a time. There is no point in wasting time and memory on loading all the elements first and processing them later. Instead you can get the first one, do what you want with it and then move on to the next one. And so on. You can use yield to get lazy loading.
Or suppose you have thousands of database records but you only need a few of them. Why would you want to fetch them all in the first place? Use yield instead.
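Here is a minimal sketch of that one-at-a-time processing. A StringReader stands in for the large file or database table, and the LazyLines class and its ReadLines method are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LazyLines
{
    // Yields one line at a time instead of loading the whole text into a list.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
            yield return line;
    }

    public static void Main()
    {
        // The StringReader plays the role of a large data source.
        using (var reader = new StringReader("first\nsecond\nthird"))
        {
            foreach (string line in ReadLines(reader))
            {
                Console.WriteLine(line.ToUpperInvariant());
                if (line == "second")
                    break; // "third" is never read from the source
            }
        }
    }
}
```

Only the lines the loop actually asks for are read. Once the loop breaks, the rest of the source is left untouched.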
Or maybe you want to get a sequence like the Fibonacci we talked about earlier. There surely is no end to that sequence, so you may want to give yield a try and get the sequence little by little.
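An endless sequence could look like the sketch below, where the caller, not the method, decides how many numbers to take. The EndlessFibonacci class name is made up for this example, and long is used instead of int only to postpone overflow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EndlessFibonacci
{
    // No upper bound: the loop never ends on its own.
    // Note that long eventually overflows if the caller keeps asking for more.
    public static IEnumerable<long> Fibonacci()
    {
        long a = 0, b = 1;
        while (true)
        {
            yield return a;
            long sum = a + b;
            a = b;
            b = sum;
        }
    }

    public static void Main()
    {
        // Prints: 0 1 1 2 3 5 8 13 21 34
        Console.WriteLine(string.Join(" ", Fibonacci().Take(10)));

        // Prints: 1597 (the first Fibonacci number above 1000)
        Console.WriteLine(Fibonacci().First(x => x > 1000));
    }
}
```

Building the same thing with a list would be impossible, since the list would never be complete enough to return.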
Accessing one item at a time can make things less complicated if something goes wrong. Only the items currently in use need to be examined instead of the whole list.
Finally, think of a source that changes continuously. If we load everything into a list up front, some of its objects may already be out of date by the time we actually use them.
So, yield can be of help in certain situations but offers nothing in others. For example, if we really do want to loop through all the elements of a sequence, there is little point in using yield. The same goes if we want to use the returned elements more than once. In that case the data we ask for will be computed from the beginning every time we need it. Stepping through an iterator has an overhead of its own, and enumerating it repeatedly can take more time than creating a list once. So, whenever you think using yield would do good, remember to test your code afterwards so it doesn't backfire. Anyway, keep in mind that if there is no actual reason to use yield, then don't use it.
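When the results are needed more than once, one common way out is to store them in a list with ToList(). The sketch below counts how many times the iterator body starts; the Materialize class, the Squares method and the Runs counter are hypothetical names introduced just for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Materialize
{
    // How many times the iterator body has been started.
    public static int Runs;

    public static IEnumerable<int> Squares(int n)
    {
        Runs++;
        for (int i = 1; i <= n; i++)
            yield return i * i;
    }

    public static void Main()
    {
        Runs = 0;
        IEnumerable<int> lazy = Squares(4);
        int lazyTotal = lazy.Sum() + lazy.Max(); // two enumerations, so the body runs twice
        Console.WriteLine(lazyTotal + " after " + Runs + " runs"); // 46 after 2 runs

        Runs = 0;
        List<int> cached = Squares(4).ToList();  // one enumeration, results kept in memory
        int cachedTotal = cached.Sum() + cached.Max();
        Console.WriteLine(cachedTotal + " after " + Runs + " runs"); // 46 after 1 runs
    }
}
```

The trade-off is the usual one: the list costs memory up front, while the iterator costs time on every pass.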
Conclusion
Yield can be used to create an iterator out of a simple method. This lets us access the method's elements one at a time, whenever we need them, instead of getting them all at once and processing them afterwards. This way, yield can help us when we want lazy loading, and it can be helpful in other cases as well. However, we should be careful, because using yield is not always as good an idea as it may seem.