
March 20th, 2009

Exploring C# Boxing

C# Boxing Explained

Boxing in C# has little to do with Saturday night television but quite a bit more to do with that part-time job at the warehouse you had as a student. It is an important concept in C# that is related to how the compiler and runtime handle different kinds of variables in memory. Knowing how the various types are handled allows you to avoid unexpected side effects in your code.

This article explains what boxing is, how it works and how it can negatively affect your code if you don’t pay attention to it. We also look at how generics can be used to improve your code’s efficiency. And we try to answer the ultimate question: Is everything in C# an object?

Everything is an object in C# .. but not all objects are created equally

In C#, every type derives from System.Object, so all values share the common object methods:

string a = "hello world";
int b = 34;
Console.WriteLine("{0} {1}", a.ToString(), b.ToString());

At first glance all objects look the same in C#. But an important distinction is made between value types and reference types. Value types are the built-in simple types (int, long, short, bool, double and so on) plus any struct or enum you define. Reference types are classes, including string, as well as delegates, interfaces and arrays.

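  • Value types (int, struct, ...) are stored directly where the variable lives. For a local variable that is usually the stack; when the value is a field of a reference type, such as a class, it lives on the heap inside that object.
  • Reference types (classes, ...) are accessed through a reference (a pointer) to their actual location on the heap.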
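
A minimal sketch of that distinction, using only the base class library. The names PointStruct, PointClass and TypeKindsDemo are made up for this illustration and do not come from any library.

```csharp
using System;

// Hypothetical types, named only for this illustration.
struct PointStruct { public int X; }
class PointClass { public int X; }

static class TypeKindsDemo
{
    // Copying a struct copies the value; the two variables are independent.
    public static int CopyStruct()
    {
        PointStruct a = new PointStruct { X = 1 };
        PointStruct b = a;   // full copy of the value
        b.X = 2;             // does not touch a
        return a.X;          // still 1
    }

    // Copying a class variable copies the reference; both see the same object.
    public static int CopyClass()
    {
        PointClass a = new PointClass { X = 1 };
        PointClass b = a;    // copy of the reference only
        b.X = 2;             // changes the single shared object
        return a.X;          // now 2
    }

    public static void Main()
    {
        // Both kinds of type ultimately derive from System.Object.
        Console.WriteLine(typeof(int).IsValueType);      // True
        Console.WriteLine(typeof(string).IsValueType);   // False
        Console.WriteLine(typeof(int).BaseType);         // System.ValueType
        Console.WriteLine(typeof(ValueType).BaseType);   // System.Object
        Console.WriteLine(CopyStruct());                 // 1
        Console.WriteLine(CopyClass());                  // 2
    }
}
```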

We simplify things a little for this article: the stack keeps track of what is executing in our application. As we enter and exit methods, items are added to or removed from the top. The heap stores the application’s longer-lived data, and since objects can be added to and removed from it in any order, it needs to be garbage collected.

Why the difference? Allocating space on the stack and releasing it again is much cheaper for the runtime than allocating on the heap, so keeping simple types close at hand makes the code more efficient.

Assigning variables

The graphic shows four possible scenarios:

  1. Boxing: When you assign an integer to an object (object b = a), a new managed memory block is created on the heap and the value is copied into it. For this the runtime has to allocate memory while the program is running.
  2. By Value: If you copy the integer to another integer (int c = a), the value is simply copied to another memory slot on the stack (whose layout is already known when the method is compiled). This is the fastest assignment, as all type checks can be done at compile time.
  3. Unboxing: When you cast an object to a value type (d = (int)b), the result is stored on the stack. This is also an expensive operation, as at runtime the object is first checked to see if the cast is valid and the value then has to be copied out of the heap. (If the source is null, a NullReferenceException is thrown; if it refers to an incompatible object, for example a boxed long, an InvalidCastException is thrown.)
  4. By Reference: Classes are reference types, so if you assign an existing object to another variable (object e = b) no new instance is created; the reference on the stack simply points to the existing object. A direct result of this is that a change made to the object through either b or e is visible through both, as they both point to the same location in memory.
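
The four scenarios can be sketched in a few lines. This is only an illustration that reuses the variable names a to e from the list; the class name AssignmentDemo and the Describe helper are made up.

```csharp
using System;

static class AssignmentDemo
{
    // Runs the four assignments and reports what each variable ends up holding.
    public static string Describe()
    {
        int a = 42;

        object b = a;     // 1. Boxing: the value is copied into a new heap object
        int c = a;        // 2. By value: plain copy, no heap allocation
        int d = (int)b;   // 3. Unboxing: type check, then copy out of the box
        object e = b;     // 4. By reference: e and b refer to the same box

        a = 7;            // changing a affects neither the box nor the copies

        bool sameBox = object.ReferenceEquals(b, e);
        return string.Format("{0} {1} {2} {3}", b, c, d, sameBox);
    }

    public static void Main()
    {
        Console.WriteLine(Describe());   // 42 42 42 True
    }
}
```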

Boxing simply means putting a value type in a wrapper (making it a full-blown object), and unboxing means taking that wrapped object and converting it back to the value type. To do the boxing, managed memory needs to be allocated on the heap, references need to be updated, and the contents of the value type have to be copied.
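
Two consequences of this are worth seeing in code: every boxing operation produces its own object, and unboxing only succeeds when the box holds exactly the requested type. The sketch below is an illustration with made-up names (UnboxingRulesDemo, UnboxToLong); a null reference fails with a NullReferenceException and a box of the wrong type with an InvalidCastException.

```csharp
using System;

static class UnboxingRulesDemo
{
    // Each boxing operation allocates its own object holding a copy of the value.
    public static bool BoxesAreDistinct()
    {
        int value = 5;
        object first = value;
        object second = value;
        // Different objects on the heap, but equal contents.
        return !object.ReferenceEquals(first, second) && first.Equals(second);
    }

    // Returns the unboxed value as text, or the name of the exception thrown.
    public static string UnboxToLong(object boxed)
    {
        try
        {
            return ((long)boxed).ToString();   // valid only if boxed holds a long
        }
        catch (InvalidCastException) { return "InvalidCastException"; }
        catch (NullReferenceException) { return "NullReferenceException"; }
    }

    public static void Main()
    {
        Console.WriteLine(BoxesAreDistinct());   // True
        Console.WriteLine(UnboxToLong(5L));      // 5
        Console.WriteLine(UnboxToLong(5));       // InvalidCastException (a boxed int is not a long)
        Console.WriteLine(UnboxToLong(null));    // NullReferenceException

        object boxedInt = 5;
        Console.WriteLine((long)(int)boxedInt);  // 5: unbox to int first, then convert
    }
}
```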

Value types are copied, reference types just refer to the original object

The following code example shows how a simple change in code can have very different results.

Example A

    using System;

    class MainClass
    {
        struct Demo
        {
            public int x;
            public Demo(int x)
            {
                this.x = x;
            }
        }

        public static void Main(string[] args)
        {
            Demo p = new Demo(10);
            object box = p;   // boxing: copies p into a new object
            p.x = 20;         // changes p only
            Console.Write(((Demo)box).x);
        }
    }

Result: 10

Example B

    using System;

    class MainClass
    {
        class Demo
        {
            public int x;
            public Demo(int x)
            {
                this.x = x;
            }
        }

        public static void Main(string[] args)
        {
            Demo p = new Demo(10);
            object reference = p;   // no boxing: copies the reference
            p.x = 20;               // changes the one shared object
            Console.Write(((Demo)reference).x);
        }
    }

Result: 20

In Example A, Demo is a struct (and thus a value type), so boxing it makes a copy. When we modify the original, the boxed copy does not change.

In Example B, we have created Demo as a class. In this situation we can assign it to an object variable without boxing; only the reference is copied. So when we update the object through p, the change is also visible through reference, as both variables point to the same location in memory.

Boxing and unboxing value types slows things down

Often you can’t know in advance what type of variable a method will have to handle, so you fall back on object, the lowest common denominator in .NET. In the following example we use the ArrayList class to store a set of integers. The ArrayList can store any type of variable, but to be able to do this it accepts and returns everything as object.

        using System.Collections;

        class MainClass
        {
            public static void Main(string[] args)
            {
                long total = 0; // long: the sum of 10 million ints overflows an int
                ArrayList myList = new ArrayList();
                for (int Lp = 0; Lp < 10000000; Lp++)
                    myList.Add(Lp); // Box: Integer to an object
                foreach (object item in myList)
                    total += (int)item; // Unbox: Object to integer
            }
        }

When we add an integer to the ArrayList it is boxed into an object, and when we retrieve it, it is unboxed back into an integer.

Note that this is only an issue if we are trying to store value types in the ArrayList. If we were storing reference types (e.g. instances of a class), no boxing is done, as the item is already an object on the heap. In this situation a simple reference is stored. On retrieval no unboxing is necessary either: only a reference is returned, which we cast back to its real type.
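
A small sketch of the difference, with made-up names (ArrayListReferenceDemo and its two helpers): a string put into an ArrayList comes back as the very same instance, whereas every Add of an int creates a fresh box.

```csharp
using System;
using System.Collections;

static class ArrayListReferenceDemo
{
    // Reference type: the list stores the reference, so the same object comes back.
    public static bool SameInstanceComesBack()
    {
        string original = new string('x', 3);
        ArrayList list = new ArrayList();
        list.Add(original);                   // no boxing
        string retrieved = (string)list[0];   // a cast, but nothing is copied
        return object.ReferenceEquals(original, retrieved);
    }

    // Value type: each Add boxes the int into a new object.
    public static bool EachAddCreatesNewBox()
    {
        int number = 34;
        ArrayList list = new ArrayList();
        list.Add(number);
        list.Add(number);
        return !object.ReferenceEquals(list[0], list[1]);
    }

    public static void Main()
    {
        Console.WriteLine(SameInstanceComesBack());   // True
        Console.WriteLine(EachAddCreatesNewBox());    // True
    }
}
```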

Boxing in Action

When we look at our code in a disassembler the boxing operation is clearly visible in the output stream:

; myList.Add(Lp); // Box: Integer to an object
IL_000f: ldloc.1
IL_0010: ldloc.2
IL_0011: box [mscorlib]System.Int32
IL_0016: callvirt instance int32 class [mscorlib]System.Collections.ArrayList::Add(object)

And when we convert the object back to an integer the reverse happens:

; total += (int)item; // Unbox: Object to integer
IL_0042: unbox [mscorlib]System.Int32

Generics remove the need to box and unbox value types

Generics and generic collections, introduced in .NET 2.0, remove the need to box and unbox variables in many common situations. The following example is functionally equivalent to the earlier boxing example. We use a generic collection here, but force its element type to be “object”:

        using System.Collections.Generic;

        class MainClass
        {
            public static void Main(string[] args)
            {
                long total = 0; // long: the sum of 10 million ints overflows an int
                List<object> myList = new List<object>();
                for (int Lp = 0; Lp < 10000000; Lp++)
                    myList.Add((object)Lp); // We need a boxing operation
                foreach (object item in myList)
                    total += (int)item; // Unbox back to an int
            }
        }

The above example of course makes little sense because we are inefficient on purpose. I included it as a baseline: with a generic collection of the right element type, as in the next version, the boxing disappears altogether.

        using System.Collections.Generic;

        class MainClass
        {
            public static void Main(string[] args)
            {
                long total = 0; // long: the sum of 10 million ints overflows an int
                List<int> myList = new List<int>();
                for (int Lp = 0; Lp < 10000000; Lp++)
                    myList.Add(Lp); // No need to box things
                foreach (int item in myList)
                    total += item; // No need to unbox
            }
        }

Generics are similar in spirit to templates in C++: you specify a type argument and the specialized code is generated for you (in .NET this happens in the runtime when the code is JIT-compiled, rather than in the C# compiler). In our example we specify that our list is of type int, so we get a List class that stores integers directly. There is no longer a need to convert our integer type to an object first. If you would like to know more about how generics work have a look at another post I wrote on the subject: Generics in C#.
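
To make this a bit more concrete, here is a minimal sketch of a generic type of our own. Cell<T> and GenericsDemo are hypothetical names invented for this example; the point is only that the field really has type T, so a Cell<int> holds an int and not an object.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical generic container, for illustration only.
class Cell<T>
{
    private readonly T _value;
    public Cell(T value) { _value = value; }
    public T Value { get { return _value; } }
}

static class GenericsDemo
{
    public static int ReadInt()
    {
        Cell<int> cell = new Cell<int>(34);   // the field is a real int
        return cell.Value;                    // no cast, no unboxing
    }

    public static void Main()
    {
        Console.WriteLine(ReadInt());                                    // 34
        Console.WriteLine(new Cell<string>("hello").Value);              // hello
        // Each closed generic type is a distinct type at run time.
        Console.WriteLine(typeof(Cell<int>) == typeof(Cell<string>));    // False
        Console.WriteLine(typeof(List<int>).GetGenericArguments()[0]);   // System.Int32
    }
}
```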

So how much of a difference does boxing make?

Boxing is slower, but how big is the difference and do you need to care? To put things into perspective I timed the three code examples above, using .NET’s Stopwatch class to measure the code’s performance. First of all, the .NET runtime is pretty fast, and I found that I needed to set the loop iterations at some 10 million before I was able to get a consistent result across runs.

  • Boxing/Unboxing ArrayList example: 2580 ms
  • Boxing/Unboxing List<object> example: 2050 ms
  • Non-boxing List<int> example: 825 ms

The non-boxing List<int> example is about 2.5 times faster than List<object> and about 3 times faster than ArrayList, which is quite considerable.
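
If you want to repeat the measurement, a harness along these lines should do. It is a rough sketch rather than the exact code I used: the names BoxingBenchmark, SumWithArrayList, SumWithGenericList and TimeMs are made up, and the numbers you get will depend on your machine, runtime version and build settings.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

static class BoxingBenchmark
{
    public static long SumWithArrayList(int count)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < count; i++)
            list.Add(i);                  // boxes every int
        long total = 0;
        foreach (object item in list)
            total += (int)item;           // unboxes every int
        return total;
    }

    public static long SumWithGenericList(int count)
    {
        List<int> list = new List<int>();
        for (int i = 0; i < count; i++)
            list.Add(i);                  // stored as a plain int
        long total = 0;
        foreach (int item in list)
            total += item;
        return total;
    }

    // Runs the given work once and returns the elapsed wall-clock time.
    public static long TimeMs(Func<int, long> work, int count)
    {
        Stopwatch watch = Stopwatch.StartNew();
        work(count);
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int count = 10000000;

        // Warm-up run so JIT compilation is not included in the timings.
        SumWithArrayList(1000);
        SumWithGenericList(1000);

        Console.WriteLine("ArrayList: {0} ms", TimeMs(SumWithArrayList, count));
        Console.WriteLine("List<int>: {0} ms", TimeMs(SumWithGenericList, count));
    }
}
```

Run it a few times in a release build and compare the ratio between the two lines rather than the absolute numbers.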

To finish up…

Based on this, should you rewrite all your code to use generics? Probably not, unless you have several tight loops that could use your attention. Note that this only applies to value types (structs, ints). For reference types (i.e. all classes) this is less of a problem, as they are stored as a reference anyway.

It is however a good idea to implement generics in any code you write from here on. Not just because of the potential gains in speed but also because generics provide the compiler with much more information.

  • Generics reduce the number of coding errors, as you have compile-time checking of types.
  • Generics are more readable: you don’t need to cast all over the place, and it’s always obvious what type a collection holds.
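
The first point is easy to demonstrate. In the sketch below (TypeSafetyDemo and its helpers are made-up names), the ArrayList happily accepts a string among the integers and only fails when the item is read back, while the same mistake with List<int> does not even compile.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class TypeSafetyDemo
{
    // With ArrayList the mistake compiles and only shows up at run time.
    public static bool ArrayListFailsAtRuntime()
    {
        ArrayList list = new ArrayList();
        list.Add(1);
        list.Add("two");   // accepted: everything is an object
        try
        {
            int total = 0;
            foreach (object item in list)
                total += (int)item;   // throws on the string
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // With List<int> the compiler rejects the wrong type up front.
    public static int GenericListSum()
    {
        List<int> list = new List<int>();
        list.Add(1);
        // list.Add("two");   // would not compile: a string is not an int
        list.Add(2);
        int total = 0;
        foreach (int item in list)
            total += item;    // no cast needed
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(ArrayListFailsAtRuntime());   // True
        Console.WriteLine(GenericListSum());            // 3
    }
}
```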

Also, if possible, avoid passing value types as parameters to methods that force a conversion to object.
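
One way to do that in your own methods is to make them generic instead of taking object. The sketch below uses made-up names (ParameterDemo, DescribeBoxed, Describe). It assumes the value type overrides ToString, as int does; in that case the call inside the generic version should not need a box either.

```csharp
using System;

static class ParameterDemo
{
    // Takes object: a value type argument is boxed at every call site.
    public static string DescribeBoxed(object value)
    {
        return value.GetType().Name + ":" + value.ToString();
    }

    // Generic version: the argument is passed as a T, so the call does not box it.
    public static string Describe<T>(T value) where T : struct
    {
        return typeof(T).Name + ":" + value.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(DescribeBoxed(34));   // Int32:34 (34 is boxed first)
        Console.WriteLine(Describe(34));        // Int32:34 (T is inferred as int)
    }
}
```

Both methods return the same text; the difference is only in what happens to the argument on the way in.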

Image credit: Wall of boxed by celesteh


6 Responses to “Exploring C# Boxing”

  1. Mike Borozdin Says:

    Hi Martijn,

    Really well explained!

  2. Luke R Says:

    Great article, thanks. Interesting reading :-)

  3. Marcin Rybacki Says:

    Very nice article, I learned a few things from it…. I especially liked the comparison with those two examples

    It is good to know such snags or at least to be aware of it… so in case they crop up, the bug wouldn’t make your jaw hanging….

    But the other thing is that after a few years of working as a developer I have never had such issues to trace… so my conclusion is that these kinds of snags are more likely to appear… in a job interview than in real life :-)

  4. Martijn Says:

    @Marcin

    Thanks for the comment ! Just hoping that I am not creating food for job interviews questions with this post ;-)

  5. Jordan Says:

    Marcin,
    While I don’t use C# at work or necessarily on a day-to-day basis, I remember Jon Skeet saying (either in his excellent C# in Depth book or on an interview on .NET Rocks!) that the occurrence of casting issues in boxing and unboxing were relatively rare, so it sounds like your experience is quite possibly typical.

    Still, I think having strong type-checking is a great added feature, even if it isn’t that common of an issue.

    Martijn, this was a very good post. I was directed here from DotNetKicks, and was a little confused about why anyone was still talking about boxing, but your comparison to generics and the table showing the time difference really drove the point home: use generics!

  6. John Says:

    Martijn,

    Do you know how to unbox a reference type if you know the type as a variable? I am using reflection to try and write classes that can create their own SQL statements. On a class I can reflect each member and determine its type, and get its Get function. But the get function Invoke returns an object. How do I unbox it?

    Type t = list[i].Type; // this variable is correct!
    MethodInfo mi = list[i].GetFunction; // this is also correct
    var x = (t) mi.Invoke(my_valid_object, null);

    The compiler gacks on the ‘(t)’ symbol. This also won’t work:

    Type t = list[i].Type;
    var x = Activator.CreateInstance(t);

    Activator.CreateInstance returns an object, I know it is of type ‘t’ but I cannot cast it that way!

    Any ideas?

