Am I being presumptuous when I say that sometimes Windows is an extremely confusing operating system?
I’m not referring to the graphical user interface. Anyone can point and click a mouse; I’m talking about the little things that just don’t seem to make any sense.
One of those little things revolves around the quirkiness of file sizes.
If you right-click any file or folder in Windows and go to Properties, you’ll see two seemingly contradictory attributes:
- Size
- Size on Disk
Sometimes these values match but other times they’re different.
For example, I have a folder called Web, which is where I store all my websites. Windows is showing that the Size is 3.82 GB but the Size on disk is 3.87 GB.
So which one is it?
Microsoft’s Raymond Chen has a great technical explanation of the size disparity; however, it might be too abstruse for average users. So I’ll give you the Vonnie translation.
Let’s say you created and saved an empty text file called “Nothingingness.txt”.
Even though there’s nothing in the file and the Size and Size on Disk properties show zero bytes, it still consumes space because Windows has to allocate negligible room for metadata such as:
- File location
- Creation Date
- The name of the file itself
To understand why there is sometimes a difference between the file size and the file size on disk we need to understand a little bit about the “lay-of-the-land” as it applies to hard drives.
Traditional spinning disk hard drives are segmented into concentric circles called tracks. If you could visualize this it would look something like a bulls-eye from a dartboard.
Cutting the dartboard into “pie” slices divides each track into sectors. It’s basically a way to section the hard drive so you can do something meaningful with the data on it.
A collection of sectors is called a cluster, and this is the tiniest amount of disk space that can be used to hold a file. Clusters typically range from 512 bytes to 32KB in size, but the important thing to note is that the space a file takes up on disk is always rounded up to the nearest whole number of clusters.
This means if your hard drive is segmented into 512-byte clusters, then any non-empty file smaller than 512 bytes will still take up the full 512 bytes on the disk.
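Here is a minimal Python sketch of that rounding rule. The function name `size_on_disk` and its parameters are my own invention for illustration; this is not how Windows computes the value internally, just the arithmetic described above.

```python
def size_on_disk(size_bytes: int, cluster_bytes: int = 4096) -> int:
    """Round a file size up to a whole number of clusters (illustrative only)."""
    if size_bytes < 0 or cluster_bytes <= 0:
        raise ValueError("size must be >= 0 and cluster size must be > 0")
    # Ceiling division: how many clusters are needed to hold the data
    clusters = -(-size_bytes // cluster_bytes)
    return clusters * cluster_bytes


print(size_on_disk(100, 512))  # 512  (a 100-byte file fills one 512-byte cluster)
print(size_on_disk(513, 512))  # 1024 (one byte over, so a second cluster is needed)
print(size_on_disk(0, 512))    # 0    (an empty file needs no data clusters)
```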
To illustrate this point I created a new file and copied all 26 characters of the alphabet 20 times. Since this is a plain text file with no formatting, spaces or anything like that, the actual size of the file is 520 bytes.
That’s because each ASCII character is exactly 1 byte so… 26 characters in the alphabet multiplied by 20 duplicates equals exactly 520.
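If you want to check that byte count yourself, here is a quick Python sketch. It only assumes the file holds the lowercase alphabet repeated 20 times and nothing else.

```python
import string

# The 26 lowercase letters repeated 20 times, with no spaces or newlines
text = string.ascii_lowercase * 20

# Encoding as ASCII uses exactly one byte per character
data = text.encode("ascii")

print(len(text))  # 520 characters
print(len(data))  # 520 bytes
```

Non-ASCII characters are a different story: in an encoding like UTF-8 they can take more than one byte each, so the character count and byte count stop matching.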
But notice that the Size on Disk shows 4,096 bytes. Where the heck did that number come from?
The 4,096 is my cluster size, also called the allocation unit size.
This is how you can find your cluster size:
In Windows 8 you can press Windows Key + X and then A to open the Command Prompt and type:
(Windows 7 and Vista users can just click the Windows icon in the lower left corner of the screen and then type the same command.)
When the diagnostic finishes, scroll down and look for the little line that reads:
bytes in each allocation unit
That is your cluster size in bytes.
In my example, you can see my cluster size is 4KB. I just took 4,096 bytes and divided by 1,024 (the number of bytes per kilobyte) to get the size in kilobytes.
The computer is just rounding up to the next multiple of the cluster size: 4KB.
Let’s keep doubling the bytes of text in the file to see what that does to the size on disk.
520 * 2 = 1040
Alright let’s keep going.
1040 * 2 = 2080
Now we’ve almost closed the gap. I’m going to double the alphabet one more time.
2080 * 2 = 4160.
Now since this is larger than the cluster size of 4,096, Windows rounds up to the next multiple of the cluster size: 8,192 bytes.
Notice how my Size on Disk value now shows the next multiple of the cluster size. This is what happens to every file and folder on your computer and it’s why the Size and Size on Disk values sometimes don’t match.
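The whole doubling experiment can be replayed in a few lines of Python. This is only a sketch of the arithmetic, assuming a 4,096-byte cluster like mine; your own cluster size may differ.

```python
import math

CLUSTER = 4096  # assumption: my cluster size in bytes

size = 520  # the original alphabet file
rows = []
for _ in range(4):
    # Round up to a whole number of clusters
    on_disk = math.ceil(size / CLUSTER) * CLUSTER
    rows.append((size, on_disk))
    print(f"Size: {size:>5} bytes   Size on disk: {on_disk:>5} bytes")
    size *= 2

# Size:   520 bytes   Size on disk:  4096 bytes
# Size:  1040 bytes   Size on disk:  4096 bytes
# Size:  2080 bytes   Size on disk:  4096 bytes
# Size:  4160 bytes   Size on disk:  8192 bytes
```

The size on disk stays flat at one cluster until the file crosses 4,096 bytes, then jumps straight to two.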
So what’s the real size?
The actual size of the file in bytes is the Size value. In other words, my alphabet file example holds exactly 4,160 bytes of data, and that is how much someone would have to copy or download to get it.
Think of the other value as the amount of space you’ll reclaim by deleting the file.
In other words, if I delete alphabet-160-times.txt I’ll get back 8,192 bytes of hard drive space. This is because the smallest unit of space the file system can hand out is the cluster, and my file is consuming two clusters of 4,096 bytes each.
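To put a number on the unused space inside that last cluster, here is a small hedged sketch. The helper name `slack_bytes` is something I made up for this example, and it again assumes a 4,096-byte cluster.

```python
def slack_bytes(size_bytes: int, cluster_bytes: int = 4096) -> int:
    """Bytes allocated to a file but not used by its data (illustrative only)."""
    remainder = size_bytes % cluster_bytes
    # A file that ends exactly on a cluster boundary wastes nothing
    return 0 if remainder == 0 else cluster_bytes - remainder


size = 4160                    # alphabet-160-times.txt
wasted = slack_bytes(size)     # unused room in the second cluster
reclaimed = size + wasted      # what comes back when the file is deleted

print(wasted)     # 4032
print(reclaimed)  # 8192
```

So only 64 bytes of my file spill into the second cluster, and the other 4,032 bytes of that cluster sit empty.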
I hope that makes sense.
This whole size vs size on disk thing is pretty confusing but it really all comes down to clusters. Think of the cluster like a cubbyhole. It doesn’t matter if you put something small like a shoe in the cubbyhole or a pillow that fills the entire hole. The cubbyhole is still a fixed size. I hate analogies when it comes to this sort of thing but that’s what’s happening on your computer.
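The same cubbyhole math adds up across a whole folder, which is where gaps like the one on my Web folder come from. Below is a rough Python sketch that walks a folder and estimates both totals. The function name `estimate_folder`, the demo file names and the 4,096-byte cluster are all assumptions for illustration, and the result is only an estimate: things like compression can make the real Size on disk that Windows reports differ.

```python
import os
import tempfile

CLUSTER_BYTES = 4096  # assumption: swap in your own allocation unit size


def estimate_folder(path, cluster_bytes=CLUSTER_BYTES):
    """Return (total size, estimated size on disk) for every file under path."""
    total_size = 0
    total_on_disk = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            total_size += size
            # Each file is rounded up to whole clusters on its own
            total_on_disk += -(-size // cluster_bytes) * cluster_bytes
    return total_size, total_on_disk


# Demo on a throwaway folder holding two alphabet files (hypothetical names)
with tempfile.TemporaryDirectory() as folder:
    for name, repeats in (("alphabet-20-times.txt", 20),
                          ("alphabet-160-times.txt", 160)):
        with open(os.path.join(folder, name), "wb") as f:
            f.write(b"abcdefghijklmnopqrstuvwxyz" * repeats)
    print(estimate_folder(folder))  # (4680, 12288)
```

Because every file gets rounded up separately, a folder full of small files can show a much bigger gap than one large file would.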
Welcome to the wonderful world of bytes and clusters!
Sounds like a bad fruit smoothie restaurant… oh well – I hope my article helped.