Microsoft Enterprise Platforms Support: Windows Server Core Team
While working on Windows 8/Windows Server 2012, I have been tied up with other content creation projects and as such have not had time to write for AskCore. However, I’ve had a number of blog ideas rattling around in my head. Today I finally found some time to get one of them committed to paper...ah...Word document.
One of the most involved classic cases that I’ve been called upon to provide support for is the ‘where is my hard drive space’ case.
The call usually starts like this…
“I have a 250GB hard drive. When I look at the properties of the drive, it tells me that I have 75GB free and that 175GB are in use. However, when I add all my files together, I only see 150GB in use. That is a difference of 25GB. Where did my space go?”
This is not an easy question to answer. I’ve spent years improving my understanding of how files are stored in NTFS and while I can explain it, I end up having to teach concepts that customers really don’t need to know. My mentor, Dennis Middleton, even wrote a couple of blogs on the subject. But they have the same problem. You have to learn quite a bit about NTFS to really understand why these numbers never seem to work out as expected.
So in this blog entry, I will attempt to explain what is happening without getting too deep into the internals of NTFS. Some concepts I will have to explain, but I’ll try to keep the details light.
First of all, and I can’t stress this enough, it is NOT broken. NTFS is storing data according to its design. The problem is that when you add up all the files on your disk, you aren’t seeing what you think you are seeing. In fact, unless you are deep in file system forensics, you don’t really know what you are seeing at all.
There are two methods of looking at the space on your disk. I call them The Right Way and The Wrong Way.
The Right Way is to open the properties page on your volume.
Rule of thumb: The pie does not lie.
This is quick and slick... and always correct. Under the covers we are querying a special metafile called $BITMAP to get these values. This metafile is like a database of the clusters on your volume. It doesn’t know anything about what’s stored in a particular cluster; it just knows whether the cluster is in use or not. This is also why the pie chart draws so quickly: we are only reading one file to get this information.
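If you want the same kind of volume-level numbers from a script, here is a minimal Python sketch. It assumes that the standard library call shutil.disk_usage, which asks the operating system for volume totals rather than adding up files, is a fair stand-in for the properties page. The helper names volume_usage and to_gb are made up for this illustration.

```python
import os
import shutil


def volume_usage(path):
    """Return (total, used, free) in bytes for the volume that holds `path`."""
    usage = shutil.disk_usage(path)  # volume-level totals; no files are walked
    return usage.total, usage.used, usage.free


def to_gb(n_bytes):
    """Convert bytes to gigabytes (1 GB taken as 1024**3 bytes here)."""
    return n_bytes / 1024**3


if __name__ == "__main__":
    root = os.path.abspath(os.sep)  # root of the current drive or file system
    total, used, free = volume_usage(root)
    print(f"total {to_gb(total):.1f} GB, used {to_gb(used):.1f} GB, free {to_gb(free):.1f} GB")
```

Like the pie chart, this returns almost instantly because nothing is being enumerated.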
The wrong way is to use Windows Explorer, navigate to your root directory, and get the properties on all your files. To the casual observer, and honestly, to most people, this SEEMS like it should be the same thing. But it is not. It’s not even close.
Comparing these two is really comparing apples and oranges. The Right Way will tell you how much free space/used space you have. The Wrong Way, however, tells you this and only this…
The space used by the unnamed data streams for the files that the current user has access to and are not hidden.
So any files that are hidden, or that the user doesn’t have access to, are not included when adding up the files using The Wrong Way. And before you start thinking to yourself, “I’m the administrator on this computer and I have access to everything”... yes you are, but no you don’t.
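To make The Wrong Way concrete, here is a rough Python sketch of what adding up files amounts to. It is an illustration, not what Windows Explorer actually runs: the function name visible_bytes is made up, and unlike Explorer this sketch does count hidden files that it is allowed to read. What it shares with Explorer is the important part: it only sees the size of each file's unnamed data stream, and anything it cannot open is silently left out of the total.

```python
import os


def visible_bytes(root):
    """Sum the sizes of the files under `root` that this user can see.

    Returns (total_bytes, skipped), where `skipped` counts the directories
    and files that could not be read and are therefore missing from the sum.
    """
    total = 0
    skipped = 0

    def on_error(err):
        # os.walk calls this when it cannot list a directory (access denied, etc.)
        nonlocal skipped
        skipped += 1

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                # st_size is the size of the file's main (unnamed) data stream
                total += os.lstat(full).st_size
            except OSError:
                skipped += 1
    return total, skipped
```

On a real volume, a non-zero skipped count is a hint that the total is an undercount, and metafiles never show up here at all.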
What you don’t see can really be broken down into two categories: normal files and metafiles.
Normal files are just that, normal files. But in this case you don’t have permissions to look at them. The best example of this is the ‘System Volume Information’ directory that is just off the root directory. By default, you don’t have permissions to view what’s in there. Since this is where VSS stores its snapshots, there can be some big files in there. And for those of you running Windows Server 2012, this is also where the Chunk Store is kept for volumes using file system deduplication.
Also, I’ve seen applications that keep some of their logs hidden away from the user. They can be accessed via the application, but you can’t see them normally in Windows Explorer due to permissions.
In cases like this, you can take ownership of the files/directories and change the permissions. However, doing so might cause problems for the system. So if you want to see the sizes of files that you do not have permissions for, use the method discussed in the next few sections.
Metafiles are files hidden away by Windows that are used for specific functions. Earlier I mentioned the $BITMAP metafile. Also, I authored a blog entry a while ago that talked about some of the other common metafiles.
Unlike normal files, where you can take ownership of them and change the permissions, metafiles are hidden and Windows does not want you to see them. This is for your own good. Interacting with metafiles directly just causes trouble.
Since metafiles can take up space, and we can’t look at them with Windows Explorer, I worked out a relatively quick way to look at these files. Some metafiles can be queried using various means (CHKDSK, FSUTIL, etc) but this method will work for all metafiles.
I do want to take a moment to point out that if a metafile is taking up space, that doesn’t necessarily mean that there is a problem. Some metafiles will take up a large amount of space and it is completely normal that they do. Don’t get focused on preventing it from happening or putting a stop to it.
To look at metafiles, we need a sector editor that understands NTFS. Microsoft has a sector editor (Disk Probe) but it doesn’t really have the functionality needed for this type of work. There are a number of sector editors out there. If you don’t have a favorite, I recommend you do some research to find one that suits your needs. For the purposes of screenshots, I will be using WinHex. It’s not an endorsement. I just have to use something.
Once the volume is loaded up, you can see all the files in the root directory.
A number of metafiles are actually in the root directory. The $BITMAP file is highlighted and we can see that it is approximately 29.1MB in size. So just by itself, this is nearly 30MB of used space that wouldn’t show up if you used The Wrong Way. However, this isn’t really that much, and this particular metafile doesn’t normally get bigger unless the volume is extended.
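The reason this metafile only grows when the volume is extended is simple arithmetic: it holds one bit per cluster, so its size depends on the number of clusters, not on how full the volume is. Here is a small Python sketch of that arithmetic. The 4 KB cluster size is an assumption (check your own volume), and the function names are made up for this example.

```python
def bitmap_bytes(volume_bytes, cluster_bytes=4096):
    """Approximate size of a one-bit-per-cluster bitmap for a volume."""
    clusters = volume_bytes // cluster_bytes
    return (clusters + 7) // 8  # 8 clusters are tracked per byte, rounded up


def used_bytes_from_bitmap(bitmap, cluster_bytes=4096):
    """Count the set bits (clusters in use) and convert them to bytes."""
    used_clusters = sum(bin(byte).count("1") for byte in bitmap)
    return used_clusters * cluster_bytes


# A 250GB volume with 4 KB clusters needs a bitmap of roughly 8MB.
print(bitmap_bytes(250 * 1024**3))

# Two bitmap bytes with 12 bits set describe 12 clusters in use.
print(used_bytes_from_bitmap(bytes([0b00001111, 0xFF])))
```

The second function is the whole idea behind the pie chart: count the bits that are set, multiply by the cluster size, and you have used space without ever looking at a single file.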
Actually most metafiles are quite small and can be ignored. So let’s look at files that can grow to considerable size.
$LOGFILE: This file can get big if its maximum size is set beyond the default.
$MFT: The more files you have and the more fragmentation you have, the more entries will be created in the Master File Table. This will cause the $MFT metafile to grow.
$SECURE: This is where security descriptors are stored. The more complex your NTFS security is on this volume, the larger this file can grow. It is currently listed at 0 bytes. I’ll explain this shortly.
Before we continue, it is important to note that the sizes listed here are not always the entire size of the file. This size just refers to the file’s primary (unnamed) data stream.
Some files will show a “….” on the icon. This means that the file has some other attribute in it that is taking up some space as well.
Using $SECURE as an example, if you drill down on that file, you can see the three additional attributes that it contains, $SDH, $SDS, and $SII.
You don’t have to know what they are for. You just need to know that they take up space. If I added all the sizes together, I’d get the approximate amount of space that the file really uses. In this case that works out to about 2MB. So not that big in this case.
Now let’s venture outside of the root directory. Earlier, when we listed out the root directory, at the top of the list, we could see $EXTEND.
This is a metafile, but it is also a directory. As such, we can drill down to see what’s in it.
We can see a few metafiles in here, including another directory. The metafile that I want to draw your attention to is $USNJRNL. This is the metafile used for the USN Journal (aka change journal or NTFS journal).
Without getting too deep into how this file works, it is used by various applications to track changes. As such, this file can actually get pretty big. In our screenshot we can see that it’s listed as 0 bytes; however, it does have additional attributes in it.
When we drill down into it, you can see two additional attributes, $J and $MAX. The $J is normally the big one. While my test volume doesn’t have a large change journal, I have worked a number of cases where this has grown to several gigabytes in size. In fact my home computer has a change journal of about 12GB.
NOTE: $J is actually sparse. So the amount of space it takes up will vary from what’s listed. But discussing sparse files is outside the scope of this article.
And while we have been using this method to look at metafiles, this works for normal files as well…even if you do not have permissions for the file. It won’t allow you to read the contents of the file, but that’s not what this is about. This is about viewing file sizes.
Notice that in this case, I didn’t have any one place where a great deal of space was hiding. Sometimes that’s the way it works. A bunch of small spaces, used up here and there, adding up to a big difference.
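There is one other place where used space can hide. It’s very rare, and I only include it in the spirit of being complete. We learned that The Wrong Way only shows us the space used by the unnamed data streams for the files that the current user has access to and that are not hidden.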
So what about other streams? Without getting too deep into a discussion about streams, let me cobble up a simplified view. Picture this: you open a Word document and type out some text. Then you save that file. The text and all its formatting are saved in a structure called the ‘unnamed data stream’. But files can have other streams. They are called alternate data streams. When you use The Wrong Way, it doesn’t take into account space used in an alternate data stream.
Most of the time alternate data streams are tiny. However, I have been involved in a couple of cases where an application stored gigabytes of data in alternate data streams. So if you are tracking down space and you just can’t find it anywhere else, look at the alternate data streams on your files.
Microsoft provides a command line tool that will look at your alternate data streams and let you know what files have them.
Alternate data streams are not necessarily bad. In fact some Microsoft programs use alternate data streams to tag and classify files. But again, those streams are very small. Use the STREAM.EXE tool to look for streams that are large, or for a large number of files that have alternate data streams.
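If you would rather script the check for a single file, the Win32 functions FindFirstStreamW and FindNextStreamW report each stream's name and size. The Python sketch below calls them through ctypes. Treat it as an illustration and not as a replacement for the tool above: list_streams only works on Windows against an NTFS volume, it treats any failure as "no streams", and the helper names (list_streams, is_alternate, alternate_bytes) are made up for this example.

```python
import ctypes
import sys

MAX_PATH = 260
UNNAMED_STREAM = "::$DATA"  # how Windows reports the unnamed data stream


class WIN32_FIND_STREAM_DATA(ctypes.Structure):
    # Layout of the Win32 structure that FindFirstStreamW fills in.
    _fields_ = [
        ("StreamSize", ctypes.c_longlong),
        ("cStreamName", ctypes.c_wchar * (MAX_PATH + 36)),
    ]


def is_alternate(stream_name):
    """True for anything other than the unnamed data stream."""
    return stream_name != UNNAMED_STREAM


def alternate_bytes(streams):
    """Add up the sizes of the alternate streams in [(name, size), ...]."""
    return sum(size for name, size in streams if is_alternate(name))


def list_streams(path):
    """Return [(stream_name, size_in_bytes), ...] for one file. Windows only."""
    if sys.platform != "win32":
        raise OSError("stream enumeration needs Windows and an NTFS volume")
    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    data_ptr = ctypes.POINTER(WIN32_FIND_STREAM_DATA)
    k32.FindFirstStreamW.argtypes = [ctypes.c_wchar_p, ctypes.c_int,
                                     data_ptr, ctypes.c_uint32]
    k32.FindFirstStreamW.restype = ctypes.c_void_p
    k32.FindNextStreamW.argtypes = [ctypes.c_void_p, data_ptr]
    k32.FindNextStreamW.restype = ctypes.c_int
    k32.FindClose.argtypes = [ctypes.c_void_p]
    k32.FindClose.restype = ctypes.c_int

    data = WIN32_FIND_STREAM_DATA()
    # Second argument 0 is FindStreamInfoStandard; the last one is reserved.
    handle = k32.FindFirstStreamW(path, 0, ctypes.byref(data), 0)
    if handle is None or handle == ctypes.c_void_p(-1).value:
        return []  # no streams, or the call failed (this sketch does not tell them apart)
    streams = []
    try:
        while True:
            streams.append((data.cStreamName, data.StreamSize))
            if not k32.FindNextStreamW(handle, ctypes.byref(data)):
                break
    finally:
        k32.FindClose(handle)
    return streams
```

For example, alternate_bytes(list_streams(path)) gives the number of bytes for that file that The Wrong Way never counts. A small tag such as the Zone.Identifier stream on downloaded files is nothing to worry about; you are looking for streams that are unexpectedly large.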
Also, there is a specific issue that can cause the WinSxS directory to use up additional disk space. So I want to make note of it here.
2795190 How to address disk space issues that are caused by a large Windows component store (WinSxS) directory
So that brings us to a close. There are a number of places on your volume that you can’t see and can’t directly interact with. This is why using The Wrong Way to look at your disk space is simply a bad idea. Trust the pie chart. And remember, just because a file or metafile is using space, doesn’t mean that there is a problem.
Thanks for reading,
Robert Mitchell Senior Support Escalation Engineer
So, does WMI use the bad way or good way?
You misspelled "gigabyte" in the alternate data streams section. "Gigabye" is cute, an eggcorn as the NLP folks would say. The bytes of data do indeed "go bye", or hidden, as described.
I just read about another vagary of NTFS from that old, yet new thing, here blogs.msdn.com/.../10109789.aspx The time stamp for last access time is no longer saved after Win XP, it seems. I didn't know. I remember using that a lot, for "file system forensics" as you referred to it.
This is one of my favorite posts, ever, on TechNet. It is so useful! Thank you!
Also, a 750GB is never 750GB. As soon as you use it in Win 7 [for example, on a laptop], 100 MB is held in reserve. Even then 750GB is unformatted.
Tony, which WMI command?
EllieK, a misspelling! I'll report that and get it fixed. Thanks for pointing that out to me. :) I'm also a fan of Raymond Chen. You might want to pick up his book if you haven't already.
Ed, while you are correct, those sizes are total disk sizes, while I'm only referring to file system sizes. But yes, there is a bit of lost space for things like reserved partition, slack space between the MBR and the boot sector (unprotected space), and also just to 'rounding'.
Thanks for your comments :)
A wonderful article. It’s going to be very helpful for the engineers who need to answer customers’ queries about disk space usage, especially since a tool like “WinHex” can solve the mystery of lost space for the customer. Thanks a lot for this.