
In this lesson, I will discuss how .NET Garbage Collection works and how it can impact your C# programs.  This is especially important for ASP.NET programmers: if you deploy an ASP.NET web application without understanding how Garbage Collection affects it, you may end up using too much memory or too many other resources on the web server, possibly resulting in application downtime due to server hangs or resource deadlocks.  After working with .NET for many years, I have noticed that a fair number of .NET programmers do not understand Garbage Collection at all, and that can lead to buggy and unstable code.  Even programmers who are familiar with the concept of Garbage Collection are sometimes not knowledgeable about what Garbage Collection does NOT do, which is just as important as what it does.

This lesson is also related to a recent article that I wrote called the Scope of C# variables.  Go back and read that first if you haven’t already done so.  You need to understand how C# recognizes the scope of variables in order to understand how Garbage Collection works.  Also, this discussion of Garbage Collection is applicable to Reference variable types, NOT Value types.  If you are not sure what the difference is, read a recent article called the Difference between Value variables and Reference variables.

Let me start by introducing the concept of Garbage Collection.  Garbage Collection is a process that the .NET runtime executes every so often in order to reclaim memory that your C# program was using before but is no longer using.  In short, Garbage Collection is how .NET cleans up old objects.  By performing this cleanup, the .NET runtime frees memory so that future objects can be created successfully without “Out of memory” errors.  If Garbage Collection did NOT occur, your application would probably run out of memory, especially if there are a lot of users.  Let’s explore this in more detail.

In order to understand Garbage Collection you first need to understand how .NET allocates memory in your application.  When you create an instance of an object in C# (using the new operator), the .NET runtime engine will dynamically allocate memory for that object and then return a reference to the newly allocated memory.  A reference is like a pointer to that new object in memory, and normally you would store the reference in a variable.  Here is an example where I have created two reference variables called Number1 and Number2.


ExampleClass Number1 = new ExampleClass();
ExampleClass Number2 = new ExampleClass();

In the above example the .NET runtime will allocate memory for two new objects of type ExampleClass and return a reference to each of them.  The references are stored in the variables above using the assignment operator.  Here is a graphical representation of what the .NET memory would look like after the instantiation (use of the new operator).

[Figure: Memory allocation for reference types]

Now let’s say that these variables were declared inside a method like a button click event handler on an ASP.NET web form.


protected void btnReferenceVariables_Click(object sender, EventArgs e)
{

  ExampleClass Number1 = new ExampleClass();
  ExampleClass Number2 = new ExampleClass();

  Number1.Number = 11;
  Number2.Number = 22;

  txtClassNumber1Before.Text = string.Format("Number1 = {0}", Number1.Number);
  txtClassNumber2Before.Text = string.Format("Number2 = {0}", Number2.Number);

}

In this case, since the variables have a scope that is local to the method btnReferenceVariables_Click, after the method completes, they are no longer available to the application; they go out of scope.  So what happens to the memory for those two objects?  Well, it is still there, but the references (pointers) to it no longer exist.  Here is what the memory looks like after the method completes.

[Figure: Memory allocation for reference types after the method completes]

Notice that the memory is still allocated, but there are no more references to it.  Is this a problem (like a memory leak)?  Well, it would be a problem if Garbage Collection didn’t exist in .NET.  In fact, languages like C++ have no built-in Garbage Collection, so this scenario is common there when programmers forget to deallocate memory that they allocated.  With .NET however, what will happen is that at some point in the future (perhaps when memory is low) the .NET runtime engine will run the Garbage Collector.  The Garbage Collector looks for blocks of memory, like these two blocks, that no longer have any references to them.  The Collector then removes the unused objects from memory, and this frees up more space for future object creation in your application.
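The behavior described above can be observed with a small console sketch.  It is only an illustration under some assumptions: ExampleClass is a stand-in with a single Number property, CollectionDemo and CreateAndForget are hypothetical names, and GC.Collect() is called purely to make the effect visible (normally you let the runtime decide when to collect).  A WeakReference lets us watch an object without keeping it alive.

```csharp
using System;
using System.Runtime.CompilerServices;

// Assumed stand-in for the ExampleClass used in this lesson.
public class ExampleClass
{
    public int Number { get; set; }
}

public static class CollectionDemo
{
    // Creates an object whose only strong reference is a local variable.
    // The returned WeakReference tracks the object but does not keep it alive.
    // NoInlining keeps the local variable confined to this method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateAndForget()
    {
        ExampleClass number1 = new ExampleClass();
        number1.Number = 11;
        return new WeakReference(number1);
    }

    public static void Main()
    {
        WeakReference weak = CreateAndForget();

        // Forcing a collection is for demonstration only.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // Typically prints False, because nothing references the object anymore.
        // The exact timing is up to the runtime, so treat this as an observation,
        // not a guarantee.
        Console.WriteLine("Object still in memory: " + weak.IsAlive);
    }
}
```

The key point is that the object becomes eligible for collection once the last reference is gone, but it is only removed when a collection actually runs.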

Let’s review that again quickly.  The Garbage Collector will run at some point, typically when memory is getting low, and look for all unreferenced objects in memory.  The Collector will claim back the memory used by those old objects.  This is important to understand for a .NET programmer because it gives you an idea of how much memory your program is using at any given time, based on how much object instantiation you are doing.  If you instantiate a lot of objects (new operator), you will be using a lot of memory.  That is not necessarily a bad thing, but if you overdo it, at least you will have an idea of what you can fine-tune if memory frequently runs low while your application runs.  You can either decrease the number of new objects that you create in your code, or you can add more memory to the machine.  There are other options of course, such as using object pooling, but that is another discussion altogether.  Normally the Garbage Collector will work fine to clean up your objects, even if you do a lot of instantiation.  I use objects a lot in my code and I have never had any problems.
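If you want a rough feel for how much memory a burst of instantiation costs, GC.GetTotalMemory reports the approximate number of bytes currently allocated on the managed heap.  The sketch below is an assumption-laden illustration: MemoryEstimateDemo and Allocate are hypothetical names, and byte arrays stand in for whatever objects your application creates.

```csharp
using System;

public static class MemoryEstimateDemo
{
    // Allocates 'count' blocks of 'size' bytes each and returns them,
    // so the caller decides how long they stay referenced.
    public static byte[][] Allocate(int count, int size)
    {
        byte[][] blocks = new byte[count][];
        for (int i = 0; i < count; i++)
        {
            blocks[i] = new byte[size];
        }
        return blocks;
    }

    public static void Main()
    {
        // Passing true asks for a collection first, to get a cleaner baseline.
        long before = GC.GetTotalMemory(true);

        // 1,000 blocks of 10 KB each: roughly 10 MB of live objects.
        byte[][] blocks = Allocate(1000, 10 * 1024);

        long after = GC.GetTotalMemory(false);
        Console.WriteLine("Approximate growth in bytes: " + (after - before));

        // Keep the blocks referenced until after the measurement.
        GC.KeepAlive(blocks);
    }
}
```

The numbers are approximate and vary between runs, but they are usually good enough to spot code that allocates far more than expected.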

The important thing I want you to remember is that when you create an object using the new operator and your reference to that object goes out of scope, the memory doesn’t free up right away.  The scope of the reference variable is very important.  If a reference variable does NOT go out of scope, the reference to its object will still exist and the Garbage Collector will ignore that object.  Here comes an example. 

Now, let’s change the earlier example code just a little bit to see what would happen if I saved a reference to one of the objects in a Session variable.  The Session is a special ASP.NET object that stores information (state) for a user across requests until the session times out or is explicitly abandoned (for example, when the user logs off).


protected void btnReferenceVariables_Click(object sender, EventArgs e)
{

  ExampleClass Number1 = new ExampleClass();
  ExampleClass Number2 = new ExampleClass();

  Session["Number1"] = Number1;

  Number1.Number = 11;
  Number2.Number = 22;

  txtClassNumber1Before.Text = string.Format("Number1 = {0}", Number1.Number);
  txtClassNumber2Before.Text = string.Format("Number2 = {0}", Number2.Number);

}

Notice that after the objects are instantiated, I am saving a reference to the object that Number1 references in a Session variable called “Number1”.  This means that after the btnReferenceVariables_Click method completes, the object that was created in the first new operation above is NOT eligible for Garbage Collection.  Let’s see why by looking at this graphic.

[Figure: Memory allocation after the method completes, with the Session still in scope]

Notice that after the method completes, there is still a reference to the first object that was created.  This is because although the local variables Number1 and Number2 are out of scope, the Session is STILL in scope.  Therefore, the object that Number1 was pointing to is NOT eligible to be Garbage Collected.  Only when the Session dies will that object’s memory get cleaned up.  Actually, even if the Session hasn’t ended, there are a couple of other scenarios that would cause that object to be eligible for Garbage Collection.  One way is to set the Session variable “Number1” to null.  The second way is to assign a different object to the Session variable “Number1”.  In both of those cases, there wouldn’t be any more references left to the original object that the local variable Number1 referred to in the method, so it would then be eligible for Garbage Collection.
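The same idea can be tried outside of ASP.NET.  In the sketch below, a static dictionary named FakeSession is a hypothetical stand-in for the real Session object (it is not the ASP.NET API), ExampleClass is an assumed one-property class, and GC.Collect() is called only for demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Assumed stand-in for the ExampleClass used in this lesson.
public class ExampleClass
{
    public int Number { get; set; }
}

public static class SessionDemo
{
    // Hypothetical stand-in for the ASP.NET Session: a long-lived store
    // that outlives the method that writes to it.
    public static readonly Dictionary<string, object> FakeSession =
        new Dictionary<string, object>();

    // Mirrors the click handler: two objects are created, one is saved.
    // Weak references are returned so the caller can observe both objects
    // without keeping them alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference[] ClickHandler()
    {
        ExampleClass number1 = new ExampleClass { Number = 11 };
        ExampleClass number2 = new ExampleClass { Number = 22 };
        FakeSession["Number1"] = number1;
        return new[] { new WeakReference(number1), new WeakReference(number2) };
    }

    public static void Main()
    {
        WeakReference[] weak = ClickHandler();

        GC.Collect();
        // The first object is still referenced by FakeSession, so it survives.
        Console.WriteLine("Number1 object alive: " + weak[0].IsAlive);
        // The second object has no references left; typically already collected.
        Console.WriteLine("Number2 object alive: " + weak[1].IsAlive);

        // Removing the last reference makes the first object eligible too.
        FakeSession["Number1"] = null;
        GC.Collect();
        Console.WriteLine("Number1 object alive after null: " + weak[0].IsAlive);
    }
}
```

Only the first line of output is certain (True); the other two are typically False, but the exact moment of collection is up to the runtime.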

Now that I have covered what the Garbage Collector does, let me talk a little bit about what it DOESN’T do.  This is where I have seen many programmers create buggy code because they are either unaware of the following concepts or don’t understand them well.  Although the Garbage Collector will clean up unreferenced memory, it will not clean up unused resources that haven’t been closed or shut down properly.  Two very common examples that come to mind are file handling and database connections.

When you open a file to read or write it, the operating system creates what is called a file handle.  The file handle needs to be closed properly in order to release the file back to the operating system.  If a file handle is not released back gracefully, problems such as incomplete files or locked files can result.  The way to prevent this is to ensure that you are properly closing the file that you open in your C# code.  Most file related classes in C# have a Close() method that you can call in order to gracefully release the file and make sure that any unwritten buffered data is flushed out into the file.  An example of such a class is the StreamReader class.
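Here is a minimal sketch of closing a file explicitly.  FileCloseDemo and ReadFirstLine are hypothetical names, and the temporary file exists only so that the example is self-contained.  The try/finally block makes sure Close() runs even if reading throws an exception.

```csharp
using System;
using System.IO;

public static class FileCloseDemo
{
    // Reads the first line of a file and always closes the reader,
    // even when ReadLine throws.
    public static string ReadFirstLine(string path)
    {
        StreamReader reader = null;
        try
        {
            reader = new StreamReader(path);
            return reader.ReadLine();
        }
        finally
        {
            if (reader != null)
            {
                reader.Close(); // releases the file handle back to the operating system
            }
        }
    }

    public static void Main()
    {
        // Create a small temporary file so the example can run on its own.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "first line\nsecond line");

        Console.WriteLine(ReadFirstLine(path)); // prints "first line"

        // Deleting works because the handle was closed above.
        File.Delete(path);
    }
}
```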

In a similar fashion, when you open a database connection, the database connection manager grabs an available connection from the pool and gives your program a reference to that connection.  The connection needs to be closed properly in order to release it back into the connection pool so that another thread can use it.  If a database connection is not released gracefully, the application can run out of available connections, causing errors that could crash the entire application (especially if it is highly database driven).  The way to prevent this is to ensure that you properly close opened database connections immediately after you are finished using them.  The .NET database related classes have a Close() method that you can call.  An example of such a class is the SqlConnection class.  Make sure that you read the specific documentation for whatever class you intend to use and look at multiple examples that show how to properly close the database resources.

There is an alternative to manually closing a resource.  C# contains a special statement called the “using” statement.  “using” is a keyword that can help you clean up resources automatically so that you don’t have to call any Close() method manually.  It works with any class that implements the IDisposable interface, which covers most resource classes.  I recommend it because the compiler wraps the block in the equivalent of a try/finally and calls the Dispose() method of your resource class when the block ends, even if an exception is thrown.  For classes such as StreamReader and SqlConnection, Dispose() performs the same cleanup as Close(), so the resource is released gracefully.
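The sketch below shows the “using” statement in action.  TrackedResource is a hypothetical class written only to make the Dispose() call visible; UsingDemo, UseAndFail and ReadFirstLine are likewise illustrative names.

```csharp
using System;
using System.IO;

// Hypothetical resource that records whether Dispose() was called.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Throws inside a using block to show that Dispose() still runs.
    public static TrackedResource UseAndFail()
    {
        TrackedResource resource = new TrackedResource();
        try
        {
            using (resource)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed on purpose: we only care that Dispose() was called.
        }
        return resource;
    }

    // The same pattern with a real resource class: no explicit Close() needed.
    public static string ReadFirstLine(string path)
    {
        using (StreamReader reader = new StreamReader(path))
        {
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        Console.WriteLine(UseAndFail().IsDisposed); // prints "True"
    }
}
```

Compared with a hand-written try/finally, the “using” form is shorter and harder to get wrong, which is why it is usually the better default.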

There are other resources (not just files and database connections) that the Garbage Collector will not automatically clean up for you.  These are called unmanaged resources in the .NET documentation.  It is not possible to list them all here; instead, you should be aware of how the Garbage Collector works and read the documentation for whatever resources you are using in your application to ensure that you are properly closing them if they are unmanaged.

I hope you enjoyed this lesson and please post questions as I’m sure you will have some.

11 Comments

  1. Thank you so much for your lessons. The explanations are straightforward. I had trouble with the concepts, but they are so much clearer than last week.

    Comment by Wesley — April 24, 2009 @ 2:01 pm

  2. I really appreciate this tutorial. In fact, it’s an eye-opener for me, and I commend your efforts in putting all this material together in a succinct manner. I am now more confident in my C# skills.
    Keep up the good work.

    Comment by ola — May 19, 2009 @ 8:46 pm

  3. An awesome article. Thank you very much for the information you have provided. Please keep writing such articles.

    Comment by Vidya — May 27, 2009 @ 12:16 am

  4. awesome… you are one hell of a teacher, keep up the good work. thanks a lot for all your help, and to think you’re doing it for free… why not set up a donation link so well-wishers like myself could chip in a bit to compensate you for the time and effort spent creating this very, very wonderful and educational tutorial))

    Comment by Sif — August 12, 2009 @ 4:41 pm

  5. I am having a problem that I think is related to these concepts, but I am unsure of how to handle or correct the situation. I have something like the following…

    StringBuilder output = new StringBuilder();
    IList myIDs = GetMyIDs(); // Returns ~10,000 ids
    foreach (int id in myIDs)
    {
        MyObject myObj = GetMyObject(id); // Returns a new, very large object (it holds a lot of XML data in a property, for example)
        output.Append(myObj.Render()); // Just appends some text to the StringBuilder
    }

    I am getting System.OutOfMemoryException errors.

    Based on your tutorial, each of the 10,000 objects instantiated will hold a place in memory until they go out of scope. I think that this means the scope only ends after the loop completes processing. So, is there a way to release these resources earlier than the end of the loop?

    Thanks!

    Kevin

    Comment by Kevin — August 22, 2009 @ 11:20 am

  6. Hi Kevin. Thanks for posting your question.

    While it is true that the objects you are instantiating in the loop go out of scope
    (each one at the end of the iteration that created it, since myObj is declared inside
    the loop body), that does NOT mean that the memory allocated and used for those
    objects will be released back to the ASP.NET worker process at that time. That release
    of the memory will only occur during the next Garbage Collection execution. So rather
    than trying to clean up the memory, which you cannot control, you have to examine your
    algorithm and data types to find the best solution to resolve the memory problem. In
    other words, you have to change the way that you designed the code. That is how you can
    solve this memory usage problem.

    There could be many reasons that you are getting the Out of Memory error. It could also
    be a combination of things. Here are some things to consider.

    1) Are you creating 10,000 objects for every user that hits your web application? That
    appears to be a lot of memory usage if the size of MyObject is large, as you say in
    your comment. A worst-case estimate of how much available ASP.NET RAM you need for that
    block of code to run successfully, assuming none of the objects has been collected yet,
    is given by this equation: (Est. size of MyObject) * (# of objects created in your
    loop) * (Max # of concurrent users). For example, if the estimated size of MyObject is
    10K bytes, you are creating 10,000 MyObjects in the loop and your application has max
    50 concurrent users, you would need about 10K * 10,000 * 50 = 5,000,000K bytes (roughly
    5 GB) of memory available. If your object is large, it is probably much more than the
    10K in my example.

    Is there a need to loop and create 10,000 objects for every user that hits the
    application? Why do you need so many? Can you reduce the loop iteration count to much
    less? This of course requires analysis of your overall algorithm and maybe even business
    requirements.

    2) How are you storing the data in the MyObject class? What kind of data types are you
    using? Are those the most efficient data types with regards to memory? You can e-mail
    me the code for that class and I can look at it if you want.

    3) I noticed you are using the StringBuilder class. Although StringBuilder is generally
    much better for performance than String concatenation, you can try this to see if
    less memory is consumed. Try to give the StringBuilder a capacity when you instantiate
    it. How big do you think the string will be once you have appended everything to it? If
    you think the string will typically grow to about 5 million characters, then you can
    instantiate the StringBuilder like this:
    StringBuilder mybuilder = new StringBuilder(5242880);

    The capacity is measured in characters, not bytes, so this tells the .NET runtime to
    reserve room for 5,242,880 characters (roughly 10 MB, because each character takes two
    bytes) right away. As long as the final string fits within that capacity, no further
    allocation is required when you append data to the string using the Append method. What
    this does is ensure that you are not allocating new blocks of memory each time in the
    loop for the StringBuilder. It does not affect the memory needed for the MyObject class
    instantiations.

    Yours,
    Ted Kolovos

    Comment by ted — August 22, 2009 @ 1:42 pm

  7. Ted, thanks for the feedback. I have a few answers to your questions…

    1) Are you creating 10,000 objects for every user that hits your web application?

    No, this is an admin application where a few users are occasionally performing work on ~10000 records. For example, generating PDF documents for records, or exporting to a CSV file, or performing some sort of mass update.

    2) How are you storing the data in the MyObject class? What kind of data types are you
    using?

    Actually, the MyObject class is typically a class representing an entity. For example, a Student record let’s say. The student record is for the most part pretty basic, with simple short string properties. But, it does have an IDictionary property that has approximately 300 items. This is a custom field -> value dict that allows us to store arbitrarily defined data for the student. That is the big item for these objects. And, because this is actually stored as XML in the DB, I deserialize it into the dictionary when it is loaded from the DB. I perform this deserialization using an XmlReader and serialize using an XmlWriter. Certainly, this could be causing an issue, but I thought using the stream approach would be better than loading an XML document and looping through it.

    3) I noticed you are using the StringBuilder class.

    Yes, and I also think this might be an issue. In the example I presented, I am basically storing a CSV record for each record in the string builder, then after the loop completes, I create and write the contents to a CSV file. I have already started modifying this approach to stream to the file during the processing, rather than creating the x-large SB object.

    I have also seen some others discuss issues where you do not set an initial size for the SB, and the SB has to dynamically allocate more memory.

    Thanks!

    Kevin

    Comment by Kevin — August 24, 2009 @ 1:49 pm

  8. Hmmmm …. The ins and outs of garbage collection and memory management are one of those things I’ve always meant to get around to studying and never have (I work in a small IT shop and therefore do it all - my knowledge is broad but not deep).

    But - just call Close() and Dispose() as soon as you can on EVERY .NET object that offers these methods, and sleep easy at night.

    Comment by Charles Bevitt — September 21, 2009 @ 1:51 pm

  9. What is better for performance and how it should be done, creating variable in loop or before loop as in examples :

    Ex. 1
    for (int i = 0; i

    Comment by Aramis1986 — December 28, 2009 @ 7:02 am

  10. Ex. 1

    for (int i =0; i

    Comment by Aramis1986 — December 28, 2009 @ 7:04 am

  11. I dont know why it cuts my code.

    Comment by Aramis1986 — December 28, 2009 @ 7:05 am
