System.OutOfMemoryException: ‘Exception of type ‘System.OutOfMemoryException’ was thrown’

You’ve just created a Console app in the latest Visual Studio, and wrote some C# code that allocates some non-negligible quantity of memory, say 6 GB. The machine you’re developing has a decent amount of RAM – 16GB – and it’s running 64-bit Windows 10.

You hit F5, but are struck to find the debugger breaking into your code almost immediately and showing:

Figure 1 – Visual Studio breaking into an exception

What’s going on here ? You’re not running some other memory-consuming app. 6 GB surely should have been in reach for your code. The question that this post will set out to answer thus becomes simply: “Why do I get a System.OutOfMemoryException when running my recently created C# app on an idle, 64-bit Windows machine with lots of RAM ?“.

TL;DR (small scroll bar => therefore I deduct a lot of text => I’ve got no time for that, and need the answer now): The default build settings for Visual Studio limit your app’s virtual address space to 4 GB. Go into your project’s Properties, go to Build, and choose Platform target as x64. Build your solution again and you’re done.

Not so fast ! Tell me more about what goes on under the hood: Buckle up, since we’re going for a ride. First we’ll look at a simple example of code that consumes a lot of memory fast, then uncover interesting facts about our problem, hit a “Wait, what ?” moment, learn the fundamentals of virtual memory, find the root cause of our problem then finish with a series of Q&A.

The Sample Code

Let’s replicate the issue you’ve encountered first – the out-of-memory thing. We’ll pick a simple method of allocating lots of memory – creating several large int arrays. Let’s make each array contain 10 million int values. As for how many of these arrays should be: our target for now is to replicate the initial scenario that started this blog post – that is consuming 6 GB of memory – so we should choose the number of arrays accordingly.

What we need to know is how much an int takes in memory. As it turns out, an int will always take 4 bytes of memory. Thus, an array of 10 million int elements would take 40 million bytes of memory. This will actually be the same on either a 32-bit platform or a 64-bit one. If we divide the 6 GB (6.442.450.944 bytes) to 40 million bytes, we’ll get roughly 162. This should be in theory the number of 40 mil arrays required to fill 6 GB of memory.

Now that the numbers are clear, let’s write the code:

using System;

namespace LeakMemory
{
    class Program
    {
        static void Main(string[] args)
        {
            const int BlockSIZE = 10000000;  // 10 million
            const int NoOfBlocks = 162;
            int[][] intArray = new int[NoOfBlocks][];

            Console.WriteLine("Press a key to start");
            Console.ReadLine();

            try
            {
                for (int k = 0; k < NoOfBlocks; k++)
                {
                    // Generate a large array of ints. This will end up on the heap
                    intArray[k] = new int[BlockSIZE];
                    Console.WriteLine("Allocated (but not touched) for array {0}: {1}", k, BlockSIZE);
                    // Sleep for 100 ms
                    System.Threading.Thread.Sleep(100);
                }
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }

            Console.WriteLine("done");
            Console.ReadLine();

            // Prevent the GC from destroying the objects created, by
            //  keeping a reference to them
            Console.WriteLine(intArray.Length);
        }
    }
}

Aside from allocating the arrays themselves, most of the code is fluff, and deals with writing output messages, waiting for a key to be pressed to get to the next section or delaying allocating the subsequent array. However, this all will come in handy when we’ll analyze the memory usage in detail. We’re also catching any exception that might come up, and write it on the screen directly.

Something ain’t right

Let’s hit F5 and see how the sample code performs:

Not only that it doesn’t complete successfully, but the code doesn’t even make it till the 100th 10-million int array. The exception thrown is our familiar System.OutOfMemoryException. Visual Studio’s built-in profiling dashboard (top right) shows the memory used by our process going close to 3 GB – just as the exception hits.

Can I Get Some Light Over Here ?

Ok, we need to understand what goes on. Luckily, Visual Studio has a built-in memory profiler we can use right away. This will run the code once more, and allow us to take a snapshot after the exception is thrown, so that we understand where the usage goes:

Figure 3 – Memory profiling using Visual Studio

Oddly enough, this time the code can successfully allocate 67 arrays (the code fails just after displaying error for the 0-based array no 66). When we first ran the code, it could only allocate 66 arrays.

Drilling down into the objects for which memory is allocated, we see the same number of arrays successfully allocated (67) as in the console output. Each array takes roughly 40 million bytes, as expected. But why only allocate barely close to 3 GB – to be more precise 2.6 GB, as the profiler shows above -, when it was supposed to go up to 6 GB ?

Anyone Got A Bigger Light ?

Clearly we need a tool that can shed more light on the problem, and allow us to see the memory usage in better detail. Enter VMMap, which is “a process virtual and physical memory analysis utility. It shows a breakdown of a process’s committed virtual memory types as well as the amount of physical memory (working set) assigned by the operating system to those types“. The “committed” and “virtual memory” parts might sound scary for now, but nonetheless, the tool seems to tell where the memory goes from the operating system’s point of view, which should show significantly more than Visual Studio’s integrated memory profiler. Let’s give it a spin:

Figure 4 – VMMap showing detailed memory usage

The Timeline… button allows going to a particular point in time from the process’ life to see the exact memory usage then. The resulting window – Timeline (Committed) – shows (1) a gradual upward trend, then (2) an approximately constant usage, followed by (3) a sudden drop to zero. You also briefly see the time of the selection being changed to a point where the memory usage line is pretty flat (within (2) described above, which happens after the exception was thrown, but before all the lines start dropping as part of (3)). Ignore the yellow/green/purple colors mixed in the chart for a second, and also note that when the time of the selection is changed, all the values in the main window change as well.

Back in the main window, it’s a lot like a christmas tree, with multiple colors and lots of columns with funny names, but let’s leave that aside for a moment, and only focus on the Size column in the top left. Actually, let’s take a closer look at only 2 values there, the first one – Total – which represents the total memory consumed, and last one – Free – representing the total free memory. Here they are highlighted:

Hmm, the total size in the figure represents about 3.7 GB. That’s significantly larger than the 2.6 GB value we’ve got from Visual Studio’s Memory Profiler.

But look at the value for free space – that’s almost 300 MB of memory. This should have been more than enough for allocating 7 more of our 10 million int arrays with no problem.

How about we sum the 2 values – the total size and the free space ? The number is exactly 4 GB. Intriguing. This seems to suggest that the total memory our process gets is exactly 4 GB.

Wait, What ?

VMMap has a concise, to-the-point help. If you lookup what Total WS means, it says “The amount of physical memory assigned to the type or region“. So the value at the intersection of the Total WS column and the Total row will tell us exactly how much physical memory the process is taking at its peak usage, right after the out-of-memory exception is thrown:

The value is… 12,356 KB. In other words about 12 MB. So VMMap is telling us that our program, which was supposed to allocate 6 GB of memory (but fails somewhere midway by throwing an exception) is only capable of allocating 12 MB of RAM ? But that’s not even the size of one array of 10 million int, and we know for sure we’ve allocated not one, but 67 of them successfully ! What kind of sorcery is this ?

A Trip Down Memory Lane

Before moving on, you need to know that there’s a very special book, called “Windows Internals“, that analyses in depth how Windows works under the hood. It’s been around since 1992, back when Windows NT roamed the Earth. The current 7th edition handles Windows 10 and Windows Server 2016. The chapter describing memory management alone is 182 pages long. Extensive references to the contents found there will be made next.

Back to our issue at hand, we have to start with some basic facts about how programs access memory. Specifically, in our small C# example, the resulting process is never allocating chunks of physical memory directly. Windows itself doesn’t hand out to the process any address for physical RAM. Instead, the concept of virtual memory is used.

Let’s see how “Windows Internals” defines this notion:

Windows implements a virtual memory system based on a flat (linear) address space that provides each process with the illusion of having its own large, private address space. Virtual memory provides a logical view of memory that might not correspond to its physical layout.
Windows Internals 7th Edition – Chapter 1 “Concepts and tools”

Let’s visualize this:

Figure 7 – Virtual Memory assigned to a process

So our process gets handed out a range of “fake”, virtual addresses. Windows works together with the CPU to translate – or map – these virtual addresses to the place where they actually point – either the physical RAM or the disk.

In figure 7, the green chunks are in use by the process, and point to a “backing” medium (RAM or disk), while the orange chunks are free.

Note something of interest: contiguous virtual memory chunks can point to non-contiguous chunks in physical memory. These chunks are called pages, and they are usually 4 KB in size.

Guilty As Charged

Let’s keep in mind the goal we’ve set out in the beginning of this post – we want to find out why we’ve got an out-of-memory exception. Remember that we know from VMMap that the total virtual memory size allocated to our process is 4 GB.

In other words, our process gets handed by Windows 4 GB of memory space, cut into pages, each 4 KB long. Initially all those pages will be “orange” – free, with no data written to them. Once we start allocating our int arrays, some of the pages will start turning “green”.

Note that there’s a sort of dual reality going on. From the process’ point of view, it’s writing and allocating the int arrays in either the “orange” boxes or “green” boxes that haven’t yet filled up; it knows nothing about where such a box is really stored in the back. The reality however, which Windows knows too well, is that there’s no data stored in either the “green” or “orange” boxes in figure 7, only simple mappings that lead to the data itself – stored in RAM or on the disk.

Since there’s really no compression at play here, there won’t really be a way to fit those 6 GB of data into just 4 GB. Eventually we’ll exhaust even the last available free page. You can’t just place 6 eggs into an egg carton that can only accommodate 4. We just have to accept that the exception raised is a natural thing, given the circumstances.

So The World Is A Small Place ?

“Are you saying that every process out there only gets access to 4 GB of memory ?(!)” I rightfully hear you asking.

Let’s take a look at the default configuration options used by Visual Studio for a C# console app:

Figure 8 – Visual Studio’s default build settings

Note the highlighted values. To simplify for the sake of our discussion, this combo (Any CPU as Platform target plus Prefer 32-bit) will get us 2 things:

Visual Studio will compile the source code to an .exe file that will be run as a 32-bit process when started, regardless if the underlying system is 32-bit or 64-bit Windows.
The Large Address Aware flag will be set in the resulting .exe file, which essentially tells Windows that it can successfully handle more than 2 GB of virtual address space.

These 2 points combine on a 64-bit Windows so that the process is granted via the Wow64 mechanism its maximum allocable space given its 32-bit constraint – that is of 2^32 bytes, or exactly 4 GB.

If the code is compiled specifically for 64-bit systems – eg by simply unticking the Prefer 32-bit option back in figure 8, suddenly the process – when run on a 64-bit machine – will get access to 128 TB of virtual address space.

An important point to remember: the values presented above for a 64-bit system, namely 4 GB (for a 32-bit process that is large address aware) and 128 TB (for a 64-bit process) respectively are the maximum addressable virtual address space ranges currently for a Windows 10 box. A system can have only 2 GB of physical memory, yet it doesn’t change the fact that it will be able to address 4 GB of address space; how that address space is distributed when actually needed – eg say 700 MB in physical RAM, while the rest on disk – is up to the underlying operating system. Conversely however, having 6 GB (or 7/10/20/50 GB) won’t help a 32-bit large address aware process get more than 4 GB of virtual address space.

So 1 mystery down, 2 more to go…

Bits and Pieces

Remember those 300+ MB of free space in Figure 5 back when the out-of-memory exception was thrown ? Why is the exception raised when there’s still some space remaining ?

Let’s look first at how .NET actually reserves memory for an array. As this older Microsoft article puts it: “The contents of an array are stored in contiguous memory“.

But where in memory are these arrays actually placed ? Every object ends up in one of 2 places – the stack or the heap. We just need to figure out which. Luckily, “C# in Depth” (Third Edition) by Jon Skeet has the answer, all within a couple of pages:

Array types are reference types, even if the element type is a value type (so int[] is still a reference type, even though int is a value type)
C# in Depth (Third Edition), Jon Skeet

[…]an instance of a reference type is always created on the heap.
C# in Depth (Third Edition), Jon Skeet

The thing is that there are 2 types of heaps that a process can allocate: unmanaged and managed. Which kind is used by ours ? “Writing High-Performance .NET Code” (2nd Edition) by Ben Watson has the answer:

The CLR allocates all managed .NET objects on the managed heap, also called the GC heap, because the objects on it are subject to garbage collection.
“Writing High-Performance .NET Code” (2nd Edition), Ben Watson

If the words “managed heap” look familiar, it’s because VMMap has a dedicated category just for it in the memory types it’s showing.

Now let’s look at what happens in the last seconds of our process’ lifetime, shortly before the exception is thrown. We’ll use the “Address Space Fragmentation” window, which displays the various types of memory in use and their distribution within the process’ address space. Ignore the colors in the “Address Space Fragmentation” window to the right for now, but keep an eye out for the free space. We’ll also do one more thing: sort the free space blocks in descending order.

Figure 9 – Address space fragmentation in action

We can see the free space gradually filling up. The allocations are all contiguous, just like the theory quoted before said they would be. So we don’t see, for example, the free space around the “violet” data being filled, since there’s no large enough “gap” to accommodate it. Yet in the end we’re still left with 2 free blocks, each in excess of 40 mil bytes, which should accept at least 2 more int arrays. You can see each of them highlighted, and their space clearly indicated on the fragmentation window towards the end of the animation.

The thing is that so far we’ve made an assumption – that each array will occupy the space required to hold the actual data, that is 4 bytes (/int) x 10 million (int objects) = 40 mil bytes. But let’s see how each block actually looks like in the virtual address space. We’ll go a to a point in time midway – when we know sufficient data has been already allocated – and only filter for the “Managed Heap” category, and sort the blocks by size in descending order:

Figure 10 – Tracking blocks across the fragmentation view

It turns out that each block is 49,152 KB in size – or 48 MB -, and is composed of 2 sub-blocks: one of 39,068 KB and another of 10,084 KB. The first value – 39,068 KB – is really close to our expected 40.000.000 bytes, with only 5,632 bytes to spare, which suggests this is were our int elements are stored. The second value seems to indicate some sort of overhead. Note several such sub-blocks being highlighted in the fragmentation view. Note that for each 48 MB block, both of the sub-blocks contained are contiguous.

What this means is that there has to be a free space “gap” big enough to accomodate 49,152 KB in order to successfully allocate another array of int elements. But if you look back at Figure 9, you’ll see that we’ve just run out of luck – the largest free space block only has 41,408 KB. The system no longer has contiguous free memory space to use for one more subsequent allocation, and – despite having several hundred MB of free space made up from small “pieces” – throws an out-of-memory exception.

So it wasn’t the fact that we’ve exhausted our 4 GB virtual memory limit that threw the out-of-memory exception, but the inability to find a large enough block of free space.

This leaves one with one more question to answer.

Your Reservation Is Now Confirmed

Remember the ludicrously low number of actual used physical memory of 12,356 KB back in Figure 6 ? How come it’s so low ?

We briefly touched on this issue in the last paragraph of So the World Is A Small Place ? by saying that some of the virtual address space can be backed up by physical memory, or can be paged out to disk.

There are 4 kinds of memory pages:

Pages in a process virtual address space are either free, reserved, committed, or shareable.
Windows Internals 7th Edition – Chapter 5 “Memory Management”

When we’re allocating each int array, what’s happening under the hood (through .NET and the underlying operating system) is that memory for that array is committed. Committing in this context involves the OS performing the following:

Setting aside virtual address space within our process large enough for it to be able to address the area being allocated
Providing a guarantee for the memory requested

For (1) this is relatively straightforward – the virtual address space is marked accordingly in structures called process VADs (virtual address descriptors). For (2), the OS needs to ensure that the memory requested is readily available to the process whenever it will need it in the future.

Note that neither of the two conditions demands providing the details of all the memory locations upfront. Giving out a guarantee that – say 12,000 memory pages – will be readily available when requested is very different than finding a particular spot for each of those 12,000 individual pages in a backing medium – be it physical RAM or one of the paging files on disk. The latter is a lot of work.

And the OS takes the easy way out – it just guarantees that the memory will be available when needed. It will do this by ensuring the commit limit – which is the sum of the size of RAM plus the current size of the paging files – is enough to honor all the requests for virtual address space that the OS has agreed to so far.

So if 3 processes commit memory – the first one 400 MB, the second 200 MB and the third 300 MB – the system must ensure that somewhere either in RAM or in the paging files there is enough space to hold at least 900 MB, that can be used to store the data if those processes might be accessing the data in the future.

The OS is literally being lazy. And this is actually the name of the trick employed: lazy-evaluation technique. More from “Windows Internals“:

For example, a page of private committed memory does not actually occupy either a physical page of RAM or the equivalent page file space until it’s been referenced at least once.

Why is this so ? Because:

When a thread commits a large region of virtual memory […], the memory manager could immediately construct the page tables required to access the entire range of allocated memory. But what if some of that range is never accessed? Creating page tables for the entire range would be a wasted effort.

And if you think back to our code, it’s simply allocating int arrays, it doesn’t write to any of the elements. We never asked to store any values in the arrays, so the OS was lazy enough to not go about building the structures – called PTEs (Page Table Entries) that would have linked the virtual address space within our process to physical pages that were to be stored in RAM.

But what does the term working set stand for back in Figure 6 ?

A subset of virtual pages resident in physical memory is called a working set.

Yet we never got to the point where we demanded the actual virtual pages, therefore the system never built the PTE structures that would have linked those virtual pages to physical ones in the RAM, which resulted in our process having a close-to-nothing working set, as can be clearly seen in Figure 6.

Is It All Lies ?

But what if we were to actually “touch” the data that we’re allocating ? According to what we’ve seen above, this would have to trigger the creation of virtual pages mapped to RAM. Writing a value to every int element in the arrays we’re spawning should do the trick.

However there’s one shortcut we can take. Remember that an int element takes 4 bytes, and that a page is 4 KB in size – or 4096 bytes. We also know that the array will be allocated as contiguous memory. Therefore, we don’t really need to touch every single element of the array, but only every 1024th element. This is just enough to demand for a page to be created and brought within the working set. So let’s slightly modify the for block that’s allocating the arrays in our code:

            for (int k = 0; k < NoOfBlocks; k++)
            {
                // Generate a large array of ints. This will end up on the heap
                intArray[k] = new int[BlockSIZE];
                //Console.WriteLine("Allocated (but not touched) for array {0}: {1} bytes", k, BlockSIZE);
                for(int i=0;i<BlockSIZE;i+=1024)
                {
                    intArray[k][i] = 0;
                }
                Console.WriteLine("Allocated (and touched) for array {0}: {1} bytes", k, BlockSIZE);
                // Sleep for 100 ms
                System.Threading.Thread.Sleep(100);
            }

Let’s see the result after running this code:

Figure 11 – Touching the committed memory brings it in the WS

The values are almost identical this time, meaning pages were created and our data currently sits in the physical RAM.

Q & A

Q: You mentioned back in one of sections that the pages are usually 4 KB in size. What’s the instance they have a different size, and what are those sizes ?
A: There are small (4 KB), large (2 MB) and – as of Windows 10 version 1607 x64 – huge pages (1 GB). For more details look in the “Large and small pages” section close to the beginning of chapter 5 in “Windows Internals, Part 1” (7th Edition).

Q: Why use this virtual memory concept in the first place ? It just seems to insert an unneeded level of indirection. Why not just write to RAM physical addresses directly ?
A: Microsoft itself lists 3 arguments going for the notion of virtual memory here. It also has some nice diagrams, and it’s concise for what it’s communicating across.

Q: You mentioned that on a 64-bit Windows, 64-bit compiler generated code will result in a process that can address up to 128 TB of virtual address space. However if I compute 2^64 I get a lot more than 128 TB. How come ?
A: A quote from Windows Internals:

Sixty-four bits of address space is 2 to the 64th power, or 16 EB (where 1 EB equals 1,024 PB, or 1,048,576 TB), but current 64-bit hardware limits this to smaller values.

Q: But AWE could be used from Pentium Pro times to allocate 64 GB of RAM.
A: Remember that the virtual address space is limited to 4 GB for a large-address aware, 32-bit process running on 64-bit Windows. A *lot* of physical memory could be mapped using the (by comparison, relatively small) virtual address space. In effect, the virtual address space is used as a “window” into the large physical memory.

Q: What if allocating larger int blocks, from 10 mil to say 12 mil elements each. Would the overhead be increased proportionally ?
A: No. There are certain block sizes that seem to be used by the Large Object Heap. When allocating 12 mil elements, the overall size of the block is still 49,152 KB, with a “band” of only 2,272 KB of reserved memory. When allocating 13 mil elements, the overall size of the block goes up to 65,536 KB, with 14,748 KB of reserved space for each:

Q: What’s causing the overhead seen in the question above, as well as within the article ?
A: At this time (4/21/2019) I don’t have the answer. I do believe the low-fragmentation heap, which .NET is using under the hood for its heap implementation, holds the secret to this.

Q: Does the contiguous data found within each virtual page map to correspondingly contiguous data within the physical memory pages ? Or to rephrase, are various consecutive virtual space addresses within the same virtual page pointing to spread-out locations within a physical page, or even multiple physical pages ?
A: They are always contiguous. Refer to “Windows Internals, Part 1” (7th Edition) to chapter 5, where it’s detailed how in the process of address translation the CPU copies the last 12 bits in every virtual address to reference the offset in a physical page. This means the order is the same within both the virtual page as well as the physical one. Note how RamMap shows the correspondence of physical-to-virtual addresses on a 4 KB boundary, or exactly the size of a regular page.

Q: In all the animation and figures I’m seeing a yellow chunk of space, alongside the green one for “Managed Heap”. This yellow one is labeled “Private Data”, and it’s quite large in size. What’s up with that ?
A: There’s a bug in the current version of VMMap, whereby the 32-bit version – needed to analyze the 32-bit executable for the int allocator code – incorrectly classifies virtual addresses pointing to .NET allocated data above 2 GB as private data, instead of managed heap. You’ll also see that the working set for all int arrays classified as such appears to be nothing – when in reality this is not the case. I’m currently (4/21/2019) in contact with Mark Russinovich (the author of VMMap) to see how this can be fixed. The bug however doesn’t exist in the 64-bit version of VMMap, and all the allocations will correctly show up as ‘Managed Heap’.

Q: I’d like to understand more about the PTE structures. Where can I find more information ?
A: Look inside chapter 5 (“Memory Management“) within “Windows Internals, Part 1” (7th Edition). There’s an “Address Translation” section that goes into all the details, complete with diagrams.

Q: Your article is hard to follow and I can’t really understand much. Can you recommend some sources that do a better job than you at explaining these concepts ?
A: Eric Lippert has a very good blog post here. There’s also a very nice presentation by Mark Russinovich here which handles a lot of topics about memory (including a 2nd presentation, also 1+ hours long). Though both sources are quite dated, being several years old, the concepts are very much current.

Q: Where can I find more info about the Platform Target setting in Visual Studio ?
A: The previous post on this very blog describes that in detail. You can start reading from this section.

Q: I’ve tried duplicating your VMMap experiment, but sometimes I’m seeing that the largest free block available is in excess of 100 KB. This is more than double the size of an int array, which should take around 49 KB (39KB + 10KB reserve), so there should’ve been space for at least one subsequent allocation. What’s going on ?
A: I don’t have a thorough answer for this right now (4/21/2019). I’ve noticed this myself. My only suspicion is that something extra goes on behind the scenes, aside the simple allocation for the int array, such as the .NET allocation mechanism going after some extra blocks of memory.

Q: I heard an int takes a double amount of space on a 64-bit system. You’re stating in this article that it’s 4 bytes on either 32-bit/64-bit. You’re wrong !
A: Don’t confuse an IntPtr – whose size is 4 bytes on a 32-bit platform and 8 bytes on a 64-bit one – which represents a pointer, to an int value. The pointer contains that int variable’s address, but what’s found at that address is the int value itself.

Mihai-Albert.com

Drilling Down to a Comfortable Level of Understanding