Disk Fragmentation is not Hyper-V's Enemy

Fragmentation is the most crippling problem in computing, wouldn’t you agree? I mean, that’s what the strange guy downtown paints on his walking billboard, so it must be true, right? And fragmentation is at least five or six or a hundred times worse for a VHDX file, isn’t it? All the experts are saying so, according to my psychic.

But, when I think about it, my psychic also told me that I’d end up rich with a full head of hair. And, I watched that downtown guy lose a bet to a fire hydrant. Maybe those two aren’t the best authorities on the subject. Likewise, most of the people that go on and on about fragmentation can’t demonstrate anything concrete that would qualify them as storage experts. In fact, they sound a lot like that guy that saw your employee badge in the restaurant line and ruined your lunch break by trying to impress you with all of his anecdotes proving that he “knows something about computers” in the hopes that you’d put in a good word for him with your HR department (and that they have a more generous attitude than his previous employers on the definition of “reasonable hygiene practices”).

To help prevent you from ever sounding like that guy, we’re going to take a solid look at the “problem” of fragmentation.

Where Did All of this Talk About Fragmentation Originate?

Before I get very far into this, let me point out that all of this jabber about fragmentation is utter nonsense. Most people that are afraid of it don’t know any better. The people that are trying to scare you with it either don’t know what they’re talking about or are trying to sell you something. If you’re about to go to the comments section with some story about that one time that a system was running slowly but you set everything to rights with a defrag, save it. I once bounced a quarter across a twelve-foot oak table, off a guy’s forehead, and into a shot glass. Our anecdotes are equally meaningless, but at least mine is interesting and I can produce witnesses.

The point is, the “problem” of fragmentation is mostly a myth. Like most myths, it does have some roots in truth. To understand the myth, you must know its origins.

These Aren’t Your Uncle’s Hard Disks

In the dark ages of computing, hard disks were much different from the devices that you know and love today. I’m young enough that I missed the very early years, but the first one owned by my family consumed the entire top of a desktop computer chassis. I was initially thrilled when my father presented me with my very own PC as a high school graduation present. I quickly discovered that it was a ploy to keep me at home a little longer because it would be quite some time before I could afford an apartment large enough to hold its hard drive. You might be thinking, “So what, they were physically bigger. I have a dozen magazines claiming that size doesn’t matter!” Well, those articles weren’t written about computer hard drives, were they? In hard drives, physical characteristics matter.

Old Drives Were Physically Larger

The first issue is diameter. Or, more truthfully, radius. You see, there’s a little arm inside that hard drive whose job it is to move back and forth from the inside edge to the outside edge of the platter and back, picking up and putting down bits along the way. That requires time. The further the distance, the more time required. Even if we pretend that actuator motors haven’t improved at all, less time is required to travel a shorter distance. I don’t know actual measurements, but it’s a fair guess that those old disks had over a 2.5-inch radius, whereas modern 3.5″ disks are closer to a 1.5″ radius and 2.5″ disks something around a 1″ radius. It doesn’t sound like much until you compare them by percentage differences. Modern enterprise-class hard disks have less than half the maximum read/write head travel distance of those old units.

[Figure: track distance]

It’s not just the radius. The hard disk that I had wasn’t only wide, it was also tall. That’s because it had more platters in it than modern drives. That’s important because, whereas each platter has its own set of read/write heads, a single motor controls all of the arms. Each additional platter increases the likelihood that the read/write head arm will need to move a meaningful distance to find data between any two read/write operations. That adds time.

Old Drives Were Physically Slower

After size, there’s rotational speed. The read/write heads follow a line from the center of the platter out to the edge of the platter, but that’s their only range of motion. If a head isn’t above the data that it wants, then it must hang around and wait for that data to show up. Today, we think of 5,400 RPM drives as “slow”. That drive of mine was moping along at a meager 3,600 RPM. That meant even more time was required to get/set data.
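To put rough numbers on that waiting around, here is a minimal sketch of average rotational latency. It assumes the usual rule of thumb that, on average, the head waits half of one revolution for the data to come around; the function name is mine, not anything from a drive vendor.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for data to rotate under the head, in milliseconds.

    Assumes the average wait is half of one full revolution.
    """
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


for rpm in (3_600, 5_400, 7_200, 10_000, 15_000):
    print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

Under that assumption, a 3,600 RPM drive averages a little over 8 ms of waiting per access, while a 15,000 RPM drive averages 2 ms. Seek time and transfer time come on top of that, so treat this as one slice of the picture.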

There were other factors that impacted speed as well, although none quite so strongly as rotational speed improvements. The point is, physical characteristics in old drives meant that they pushed and pulled data much more slowly than modern drives.

Old Drives Were Dumb

Up until the mid-2000s, every drive in (almost) every desktop computer used a PATA IDE or EIDE interface (distinction is not important for this discussion). A hard drive’s interface is the bit that sits between the connecting cable bits and the spinning disk/flying head bits. It’s the electronic brain that figures out where to put data and where to go get data. IDE brains are dumb (another word for “cheap”). They operate on a FIFO (first-in first-out) basis. This is an acronym that everyone knows but almost no one takes a moment to think about. For hard drives, it means that each command is processed in exactly the order in which it was received. Let’s say that it gets the following:

Read data from track 1
Write data to track 68,022
Read data from track 2

An IDE drive will perform those operations in exactly that order, even though it doesn’t make any sense. If you ever wondered why SCSI drives were so much more expensive than IDE drives, that was part of the reason. SCSI drives were a lot smarter. They would receive a list of demands from the host computer, plot the optimal course to satisfy those requests, and execute them in a logical fashion.
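If you'd like to see what that costs, here is a toy sketch that counts how many tracks the head crosses for the three commands above. It assumes the head starts at track 0 and that "plotting the optimal course" can be approximated by a simple ascending sweep. Real SCSI and NCQ firmware is far more elaborate (and has to respect ordering between dependent reads and writes), so take this as an illustration only.

```python
def head_travel(start_track: int, requests: list[int]) -> int:
    """Total number of tracks crossed when serving requests in the given order."""
    position = start_track
    travelled = 0
    for track in requests:
        travelled += abs(track - position)
        position = track
    return travelled


queue = [1, 68_022, 2]  # the three commands from the example, by track number

fifo_cost = head_travel(0, queue)               # strict arrival order
reordered_cost = head_travel(0, sorted(queue))  # assumed reordering: one ascending sweep

print(f"FIFO order:      {fifo_cost} tracks")
print(f"Reordered sweep: {reordered_cost} tracks")
```

In this toy model, the FIFO order makes the head cross roughly twice as many tracks as the reordered sweep, because it runs out to track 68,022 and then all the way back.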

In the mid-2000s, we started getting new technology. AHCI and SATA emerged from the primordial acronym soup as Promethean saviors, bringing NCQ (native command queuing) to the lowly IDE interface. For the first time, IDE drives began to behave like SCSI drives. … OK, that’s overselling NCQ. A lot. It did help, but not as much as it might have because…

Operating Systems Take More Responsibility

It wasn’t just hard drives that operated in FIFO. Operating systems started it. They had good excuses, though. Hard drives were slow, but so were all of the other components. A child could conceive of better access techniques than FIFO, but even PhDs struggled against the CPU and memory requirements to implement them. Time changed all of that. Those other components gained remarkable speed improvements while hard disks lagged behind. Before “NCQ” was even coined, operating systems learned to optimize requests before sending them to the IDE’s FIFO buffers. That’s one of the ways that modern operating systems manage disk access better than those that existed at the dawn of defragmentation, but it’s certainly not alone.

This Isn’t Your Big Brother’s File System

The venerated FAT file system did its duty and did it well. But, the nature of disk storage changed dramatically, which is why we’ve mostly stopped using FAT. Now we have NTFS, and even that is becoming stale. Two things that it does a bit better than FAT are metadata placement and file allocation. Linux admins will be quick to point out that virtually all of their file systems are markedly better at preventing fragmentation than NTFS. However, most of the tribal knowledge around fragmentation on the Windows platform sources from the FAT days, and NTFS is certainly better than FAT.

Some of Us Keep Up with Technology

It was while I owned that gigantic, slow hard drive that the fear of fragmentation wormed its way into my mind. I saw some very convincing charts and graphs and read a very good spiel and I deeply absorbed every single word and took the entire message to heart. That was also the same period of my life in which I declined free front-row tickets to Collective Soul to avoid rescheduling a first date with a girl with whom I knew I had no future. It’s safe to say that my judgment was not sound during those days.

Over the years, I became a bit wiser. I looked back and realized some of the mistakes that I’d made. In this particular case, I slowly came to understand that everything that convinced me to defragment was marketing material from a company that sold defragmentation software. I also forced myself to admit that I never could detect any post-defragmentation performance improvements. I had allowed the propaganda to sucker me into climbing onto a bandwagon carrying a lot of other suckers, and we reinforced each other’s delusions.

That said, we were mostly talking about single-drive systems in personal computers. That transitions right into the real problem with the fragmentation discussion.

Server Systems are not Desktop Systems

I was fortunate enough that my career did not shift directly from desktop support into server support. I worked through a gradual transition period. I also enjoyed the convenience of working with top-tier server administrators. I learned quickly, and thoroughly, that desktop systems and server systems are radically different.

Usage Patterns

You rely on your desktop or laptop computer for multiple tasks. You operate e-mail, web browsing, word processing, spreadsheet, instant messaging, and music software on a daily basis. If you’re a gamer, you’ve got that as well. Most of these applications use small amounts of data frequently and haphazardly; some use large amounts of data, also frequently and haphazardly. The ratio of write operations to read operations is very high, with writes commonly outnumbering reads.

Servers are different. Well-architected servers in an organization with sufficient budget will run only one application or application suite. If they use much data, they’ll rely on a database. In almost all cases, server systems perform substantially more read operations than write operations.

The end result is that server systems almost universally have more predictable disk I/O demands and noticeably higher cache hits than desktop systems. Under equal fragmentation levels, they’ll fare better.

Storage Hardware

Whether or not you’d say that server-class systems contain “better” hardware than desktop systems is a matter of perspective. Server systems usually provide minimal video capabilities and their CPUs have gigantic caches but are otherwise unremarkable. That only makes sense; playing the newest Resident Evil at highest settings with a smooth frame rate requires substantially more resources than a domain controller for 5,000 users. Despite what many lay people have come to believe, server systems typically don’t work very hard. We build them for reliability, not speed.

Where servers have an edge is storage. SCSI has a solid record as the premier choice for server-class systems. For many years, it was much more reliable, although the differences are negligible today. One advantage that SCSI drives maintain over their less expensive cousins is higher rotational speeds. Of all the improvements that I mentioned above, the most meaningful advance in IDE drives was the increase of rotational speed from 3,600 RPM to 7,200 RPM. That’s a 100% gain. SCSI drives ship with 10,000 RPM motors (~39% faster than 7,200 RPM) and 15,000 RPM motors (108% faster than 7,200 RPM!).
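Those percentages are nothing more than the ratio of the two spindle speeds. A quick sketch of the arithmetic, with a helper name of my own invention:

```python
def percent_faster(new_rpm: float, old_rpm: float) -> float:
    """Relative gain in rotational speed, expressed as a percentage."""
    return (new_rpm / old_rpm - 1.0) * 100.0


for old, new in ((3_600, 7_200), (7_200, 10_000), (7_200, 15_000)):
    print(f"{old} -> {new} RPM: {percent_faster(new, old):.1f}% faster")
```

This only compares how quickly the platters spin. It says nothing about seek times, caches, or interface differences between the drives.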

Spindle speed doesn’t address the reliability issue, though. Hard drives need many components, and a lot of them move. Mechanical failure due to defect or wear is a matter of “when”, not “if”. Furthermore, they are susceptible to things that other component designers don’t even think about. If you get very close to a hard drive and shout at it while it’s powered, you can cause data loss. Conversely, my solid-state phone doesn’t seem to suffer nearly as much as I do even after the tenth attempt to get “OKAY GOOGLE!!!” to work as advertised.

Due to the fragility of spinning disks, almost all server systems architects design them to use multiple drives in a redundant configuration (lovingly known as RAID). The side effect of using multiple disks like this is a speed boost. We’re not going to talk about different RAID types because that’s not important here. The real point is that in practically all cases, a RAID configuration is faster than a single disk configuration. The more unique spindles in an array, the higher its speed.

With SCSI and RAID, it’s trivial to achieve speeds that are many times faster than a single disk system. If we assume that fragmentation has ill effects and that defragmentation has positive effects, they are mitigated by the inherent speed boosts of this topology.

These Differences are Meaningful

When I began taking classes to train desktop support staff to become server support staff, I managed to avoid asking any overly stupid questions. My classmates weren’t so lucky. One asked about defragmentation jobs on server systems. The echoes of laughter were still reverberating through the building when the instructor finally caught his breath enough to choke out, “We don’t defragment server systems.” The student was mortified into silence, of course. Fortunately, there were enough shared sheepish looks that the instructor felt compelled to explain it. That was in the late ’90s, so the explanation was a bit different then, but it still boiled down to differences in usage and technology.

With today’s technology, we should be even less fearful of fragmentation in the datacenter, but my observations seem to indicate that the reverse has happened. My guess is that training isn’t what it used to be and we simply have too many server administrators that were promoted off of the retail floor or the end-user help desk a bit too quickly. This is important to understand, though. Edge cases aside, fragmentation is of no concern for a properly architected server-class system. If you are using disks of an appropriate speed in a RAID array of an appropriate size, you will never realize meaningful performance improvements from a defragmentation cycle. If you are experiencing issues that you believe are due to fragmentation, expanding your array by one member (or two for RAID-10) will return substantially greater yields than the most optimized disk layout.

Disk Fragmentation and Hyper-V

To conceptualize the effect of fragmentation on Hyper-V, just think about the effect of fragmentation in general. When you think of disk access on a fragmented volume, you’ve probably got something like this in mind:

[Figure: Jumpy Access]

Look about right? Maybe a bit more complicated than that, but something along those lines, yes?

Now, imagine a Hyper-V system. It’s got, say, three virtual machines with their VHDX files in the same location. They’re all in the fixed format and the whole volume is nicely defragmented and pristine. As the virtual machines are running, what does their disk access look like to you? Is it like this?

[Figure: Jumpy Access]

If you’re surprised that the pictures are the same, then I don’t think that you understand virtualization. All VMs require I/O and they all require their I/O more or less concurrently with I/O needs of other VMs. In the first picture, access had to skip a few blocks because of fragmentation. In the second picture, access had to skip a few blocks because it was another VM’s turn. I/O will always be a jumbled mess in a shared-storage virtualization world. There are mitigation strategies, but defragmentation is the most useless.
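For anyone who wants to poke at that claim, here is a toy model. It assumes three fixed VHDX files laid perfectly end to end, each VM reading its own file strictly in order, and the host serving one request per VM in round-robin fashion. The block numbers and the round-robin scheduling are invented for illustration; a real hypervisor's I/O scheduling is messier.

```python
from itertools import chain, zip_longest


def interleave(streams):
    """Round-robin merge: take one request from each VM in turn."""
    merged = chain.from_iterable(zip_longest(*streams))
    return [block for block in merged if block is not None]


def count_jumps(blocks):
    """How many times the next block is not the one right after the current one."""
    return sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)


# Hypothetical layout: each VM's file is contiguous on the volume.
vm1 = list(range(0, 8))        # blocks 0..7
vm2 = list(range(1000, 1008))  # blocks 1000..1007
vm3 = list(range(2000, 2008))  # blocks 2000..2007

alone = count_jumps(vm1)                           # one VM with the disk to itself
shared = count_jumps(interleave([vm1, vm2, vm3]))  # three VMs taking turns

print(f"jumps with one VM:    {alone}")
print(f"jumps with three VMs: {shared}")
```

In this model, a single VM never jumps, but the moment three VMs share the volume, every transition between requests is a jump, even though not one file is fragmented.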

For fragmentation to be a problem, it must interrupt what would have otherwise been a smooth read or write operation. In other words, fragmentation is most harmful on systems that commonly perform long sequential reads and/or writes. A typical Hyper-V system hosting server guests is unlikely to perform meaningful quantities of long sequential reads and/or writes.
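Here is a minimal sketch of why the length of the read matters. It models a file as a list of fragment lengths (in blocks) and counts how many fragment boundaries a read crosses. It assumes that every boundary costs one extra head movement. The file size and fragment count are made up.

```python
from itertools import accumulate


def boundaries_crossed(fragment_lengths, offset, length):
    """Count fragment boundaries inside a read of `length` blocks at logical `offset`."""
    # Logical offsets where one fragment ends and the next begins.
    edges = list(accumulate(fragment_lengths))[:-1]
    return sum(1 for edge in edges if offset < edge < offset + length)


fragments = [256] * 16  # hypothetical file: 4,096 blocks split into 16 fragments

long_read = boundaries_crossed(fragments, 0, 4096)   # read the whole file in order
short_read = boundaries_crossed(fragments, 1000, 8)  # a small read somewhere inside

print(f"long sequential read crosses {long_read} boundaries")
print(f"short read crosses {short_read} boundaries")
```

The long sequential read pays for every boundary in the file. The small read usually lands inside a single fragment and never notices that the file is fragmented at all.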

Disk Fragmentation and Dynamically-Expanding VHDX

Fragmentation is the most egregious of the copious, terrible excuses that people give for not using dynamically-expanding VHDX. If you listen to them, they’ll paint a beautiful word picture that will have you daydreaming that all the bits of your VHDX files are scattered across your LUNs like a bag of Trail Mix. I just want to ask anyone who tells those stories: “Do you own a computer? Have you ever seen a computer? Do you know how computers store data on disks? What about Hyper-V, do you have any idea how that works?” I’m thinking that there’s something lacking on at least one of those two fronts.

The notion fronted by the scare message is that your virtual machines are just going to drop a few bits here and there until your storage looks like a finely sifted hodge-podge of multicolored powders. The truth is that your virtual machines are going to allocate a great many blocks in one shot, maybe again at a later point in time, but will soon reach a sort of equilibrium. An example VM that uses a dynamically-expanding disk:

You create a new application server from an empty Windows Server template. Hyper-V writes that new VHDX copy as contiguously as the storage system can allow.
You install the primary application. This causes Hyper-V to request many new blocks all at once. A single large allocation results in the most contiguous usage possible.
The primary application goes into production.
- If it’s the sort of app that works with big gobs of data at a time, then Hyper-V writes big gobs, which are more or less contiguous.
- If it’s the sort of app that works with little bits of data at a time, then fragmentation won’t matter much anyway.
Normal activities cause a natural ebb and flow of the VM’s data usage (e.g., downloading and deleting Windows Update files). A VM will re-use previously used blocks because that’s what computers do.
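The sequence above can be sketched with a toy model of a dynamically-expanding disk. This is not how the VHDX format is actually implemented. It assumes a 32 MiB allocation unit, assumes new blocks are simply appended to the end of the file, and assumes the guest re-uses space it has already touched. The class name and the workload sizes are hypothetical.

```python
BLOCK_SIZE = 32 * 1024 * 1024  # assumed allocation unit (32 MiB)
GIB = 1024 ** 3


class DynamicDisk:
    """Toy model: a host block is allocated the first time the guest touches it."""

    def __init__(self):
        self.block_map = {}  # guest block index -> position in the host file

    def write(self, guest_offset: int, length: int) -> int:
        """Write a guest byte range; return how many new host blocks were allocated."""
        first = guest_offset // BLOCK_SIZE
        last = (guest_offset + length - 1) // BLOCK_SIZE
        newly_allocated = 0
        for index in range(first, last + 1):
            if index not in self.block_map:
                self.block_map[index] = len(self.block_map)  # append to end of file
                newly_allocated += 1
        return newly_allocated


disk = DynamicDisk()
template_blocks = disk.write(0, 10 * GIB)        # deploy the OS template
install_blocks = disk.write(10 * GIB, 4 * GIB)   # install the primary application
churn_blocks = disk.write(12 * GIB, 1 * GIB)     # later churn over already-used space

print(template_blocks, install_blocks, churn_blocks)
```

In this model, the template and the application install each arrive as one large run of appended blocks, and the later churn allocates nothing new because it lands on blocks that already exist. That is the equilibrium described above.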

How to Address Fragmentation in Hyper-V

I am opposed to ever taking any serious steps toward defragmenting a server system. It’s just a waste of time and causes a great deal of age-advancing disk thrashing. If you’re really concerned about disk performance, these are the best choices:

Add spindles to your storage array
Use faster disks
Use a faster array type
Don’t virtualize

If you have read all of this and done all of these things and you are still panicked about fragmentation, then there is still something that you can do. Get an empty LUN or other storage space that can hold your virtual machines. Use Storage Live Migration to move all of them there. Then, use Storage Live Migration to move them all back, one at a time. It will line them all up neatly end-to-end. If you want, copy in some “buffer” files in between each one and delete them once all VMs are in place. These directions come with a warning: you will never recover the time necessary to perform that operation.

38 thoughts on "Disk Fragmentation is not Hyper-V’s Enemy"

sol rosenberg says:

January 26, 2017 at 7:18 pm

https://www.linkedin.com/pulse/vmware-agrees-defrag-virtual-hosts-guests-san-peter-vervaene

How do you contradict this?

- Eric Siron says:
  
  January 26, 2017 at 8:43 pm
  
  For starters, it was written by the people that ~~bought~~ supplanted the company that fooled me so many years ago.
  Now just as then, they provide zero concrete evidence. I am still awaiting anyone’s reproducible pre- and post- performance traces.
  I don’t defragment, other than whatever Windows does with its behind-the-scenes automated stuff, and my performance traces show an average of 1ms access latency. How do they contradict that?
  
  - Richard Yao says:
    
    August 13, 2018 at 9:17 am
    
    If Windows is doing automatic defragmentation, then you are defragmenting your drives by letting that run.
    
    - Eric Siron says:
      
      August 13, 2018 at 3:16 pm
      
      Well, I am not doing anything. Windows is. And, as you can find in almost all complaints about that particular maintenance, Windows does not perform what most people think of as a defragmentation cycle. If that process was what people meant with the common vernacular usage of “defragment”, then I would not have written this article.
      
Erin Steadman says:

January 26, 2017 at 10:33 pm

Even Windows Server 2016 still has Defragment enabled for all disks except SSDs by default. Are you suggesting we should turn this off?

- Eric Siron says:
  
  January 26, 2017 at 10:40 pm
  
  Not at all. I’m suggesting that people leave things alone and not worry so much about them. The automated defrag process built into Windows/Windows Server isn’t what most of us traditionally think of when we talk about defrag. It doesn’t put the drives through a multi-day MTBF-rushing ordeal.
  
Erin Steadman says:

January 26, 2017 at 11:00 pm

Ahh, Ok. I’ve never been one for 3rd party defrag tools either. Windows built in always seems to be adequate. Although back in the day, Norton Speeddisk (I think it was called) was pretty cool.

- Eric Siron says:
  
  January 26, 2017 at 11:04 pm
  
  Why yes. Yes it was. I’m not sure how far back you’re going with “back in the day”, but I believe that Norton Utilities were some of the best software tools available right up until Symantec wrecked them.
  
Erin Steadman says:

January 27, 2017 at 12:30 am

Well at least Windows 95, maybe even 3.1 and DOS. I agree wholeheartedly. I avoid Symantec like the plague now. We didn’t have a lot of the functionality back then and had to have something like Norton Utilities just to do your job 🙂

David Suthers says:

May 25, 2017 at 8:15 pm

I will have to disagree with you on this one… and I have proof, at least on Desktops. We use an antivirus that because of multiple daily updates, fragments their definition files badly. Because these files are accessed constantly, there is a noticeable difference in performance if this file is fragmented. And I’m talking NOTICEABLE!

Also if you’ve ever had large Outlook .pst files you’ll also know the pain of fragmentation. It takes MINUTES to open Outlook if your mailbox is large, and fragmented.

You can argue that on servers, multi-disk RAID arrays eliminate a lot of this, and you would be at least partially correct. RAID arrays move the level of fragmentation you NOTICE to a much higher level of fragmentation. But VMs with thin provisioned storage create a condition I like to call “double fragmentation”. This is where both the VHDX file is fragmented on the host server AND the guest drives are fragmented. This gives us the wonderful condition where a 500GB VHDX file is placed at random across a 10TB RAID. With any level of disk usage among the other VMs this will create large amounts of disk thrashing.

- Eric Siron says:
  
  May 25, 2017 at 8:44 pm
  
  Just so we’re clear, your argument is that you choose to employ a crappy AV on single-spindle systems for users that treat Outlook like a SQL-based CRM, therefore the rest of us should all operate our datacenters like it’s 1999? I just want to make sure that I’m not misunderstanding you.
  I’m glad that you found a cutesy term to make a non-issue look cataclysmic. You should trademark that now before someone else does. I hear that it’s awful to be sued for using a term that you coined.
  But, next time you want to tell me to disbelieve my own experiences and performance traces just because you have a pretty phrase, bring your own performance charts and reproducible methodology. Your FUD has no power here.
  
Jake Snowbridge says:

May 28, 2018 at 2:27 pm

Wow Eric, I’m surprised at your response to David.
Regardless of whether I agree with him or not, I find it astonishing that you can on the one hand write such a detailed article, but then in contrast shoot him down immediately without any factual/productive counterargument whatsoever.
Adamant much?

- Eric Siron says:
  
  June 26, 2018 at 1:47 am
  
  I’ve been having the same arguments for years and I’m sick of them. People bring me anecdotes of systems being used poorly that behave poorly and act like that proves that fragmentation is the problem. I bet these same people argue with their mechanics about engine cleaning products being the best thing ever and prove it by showing how poorly their gasoline engine cars run on diesel until they’re cleaned out.
  
Richard Yao says:

August 13, 2018 at 9:10 am

I found your blog post in a google search for something unrelated, but after reading it, I would like to make a (partial) rebuttal.

Before I say why fragmentation can be a problem, I should say that fragmentation is not the worst that can happen. IOPS contention is a far bigger problem because it can delay system page faults (for those without the wisdom to have separate storage for their OS and their data) and blocking reads/writes for absurd amounts of time via queuing delay. Fragmentation turning sequential IO into random IO can worsen that, although the extent to which depends on how many actual times this happens in practice from random uncached reads.

You can also notice slowdowns from reduced bandwidth if your data is toward the end of the disk rather than at the start. It is about a factor of 2 difference.

“Likewise, most of the people that go on and on about fragmentation can’t demonstrate anything concrete that would qualify them as storage experts.”

I have been doing Linux filesystem development for several years. I am #2 on the contributor list here:

https://github.com/zfsonlinux/zfs/graphs/contributors

This is not to say that I “go on and on about fragmentation”. I rarely ever talk about it.

“I slowly came to understand that everything that convinced me to defragment was marketing material from a company that sold defragmentation software. I also forced myself to admit that I never could detect any post-defragmentation performance improvements.”

Back when I still used Windows, I noticed an improvement from defragmentation. NCQ definitely helped to make things less terrible, but a sequential read from the start of the disk always is better than a bunch of random reads. If you have things suffering from queuing delay because those random reads are causing IOPS saturation, there should be a drop in performance, especially for interactive workloads.

I noticed differences between defragmenters back when I still used Windows. The Windows built-in defragmenter was lousy and the one from Diskeeper did not work much better. PerfectDisk did well and I really liked JkDefrag.

With server workloads (excluding file servers and large databases), you hopefully have all of the things you need in the operating system’s in-memory cache so that you are not doing many read operations and fragmentation should not matter. With large databases, the answer is usually to get more RAM. With file servers, especially those that serve large amounts of large static content, you don’t have such an option.

“I am opposed to ever taking any serious steps to defragmenting a server system. It’s just a waste of time and causes a great deal of age-advancing disk thrashing.”

A user once asked me for help because his file server was performing poorly. It hosted large static content where the file contents were written in random order. His workload involved sequential reads of uncached files. He had set ZFS to use a 16KB record size if I recall because the data was being written in 16KB blocks and he wanted to avoid read-modify-write overhead (another thing that can cause slowdowns).

The way that ZFS works is that it tries to keep things in sequential order by writing the blocks in the order that it gets them. The files were written in random order and were absurdly fragmented. I suggested that he copy each of the files and delete the originals. His system performance doubled. After that, he wrote his files to a temporary directory and then copied them to the final location to avoid fragmentation.

“Fragmentation is the most egregious of the copious, terrible excuses that people give for not using dynamically-expanding VHDX.”

I do not use things like VHDX out of concern that it is yet another layer where things can go wrong (reliability wise). However, having another layer between your VM and storage certainly is not helping performance. When I do VMs, I put them on ZFS zvols and let ZFS worry about things.

Fragmentation affects ZFS, but the impact is not so egregious that I consider the lack of defragmentation in ZFS to be a problem. Others consider fragmentation to be more of a problem than I do and they have made efforts to minimize it in ZFS through more intelligent block placement. ZFS will write sequential writes to disk in sequential order when it can and its ARC algorithm is great at reducing read IOs, so our performance is fairly good even with the unavoidable fragmentation from doing CoW.

After thinking about it, using a dynamically expanding VMDK gives you 3 levels of fragmentation. You have fragmentation in the host file system, fragmentation in the VMDK and fragmentation in the guest filesystem. If I were to do thin provisioning, I would rely on ZFS for that, which would lower it to 2 levels. If you don’t do a dynamically expanding VMDK and you are using an in place filesystem or volume manager under your VM, you reduce the levels of fragmentation down to 1. Whether or not this matters depends on the extent to which the fragmentation worsens queuing delay for your workload. It isn’t as much as people think outside of extreme cases, but the effect is there.

In conclusion, it depends. Fragmentation is certainly not as bad as some people think, but it is not as harmless as your blog post seems to claim either. There are certainly rare workloads out there that are harmed by fragmentation (like the server that had all file blocks written in random order on ZFS with a 16KB recordsize setting).

Lastly, the fact that I am on Linux might make my remarks about how I use ZFS instead of a dynamically expanding VMDK somewhat less relevant to your audience than they could have been, although a Windows port of ZFS is in the early stages of development, so that could change in the future.

- Eric Siron says:
  
  August 13, 2018 at 5:25 pm
  
  I want to show the proper respect to someone with your talents that has taken the time to write such a carefully thought out response. So, if anything you read below seems dismissive or needlessly combative, then I apologize. You appear to deserve better and I will strive to respond with the due respect.
  As a preamble, I believe that you represent an edge case. As a person that works directly on file systems, you have to worry about all of the cases. In my opinion, 90th, 95th, or even the 99th percentile should not be good enough for you. I’m glad that people like you exist and work on these file systems. But, I do not believe that you are an “everyman” of the computing community. So, while I value your input, I do not believe that it brings much strength as a counterargument to the points that I am making here.
  To those points, I have been writing and discussing this topic for over a decade. 100% of the rebuttals that I’ve received have been speculations, discussions about the process, and anecdotes. Your speculations, process discussion, and anecdotes are of well-above-average quality, but still, they are only speculations, process discussions, and anecdotes.
  First, your anecdote. The title of my article is stated as an absolute, and like most absolutes, it will fall apart in the extreme cases. However, I believe that throughout the text, I gave those extremes proper attention. I do not believe your anecdote or any of the others that I have received through the years disprove what I’m saying in any way. I also wonder how this person came to the conclusion that the file server was performing poorly? Given that, even today, most file servers sit behind a gigabit connection at most, and that even a horrifically fragmented FAT32 volume with zero cache on a minimal RAID system can easily keep a gigabit connection happily stuffed, what exactly was going on there? The other thing there, is that you don’t even really suggest fragmentation as a problem or defragmentation as a solution in this particular anecdote. I’m not sure why that tale was even presented in this context.
  That, I believe, is the fundamental problem in all of the anecdotes that I’ve received. In some cases, defragmentation temporarily alleviated the symptoms. In a few cases, defragmentation cleaned up after something else had caused a major problem and that problem had been addressed. In no case did defragmentation ever directly fix the problem.
  Second, the reason that this article matters. People listen to those speculations, process discussions, and anecdotes, and set aside hours each X period to perform a full, end-to-end optimization of bit ordering. I fully stand by my assertion that they are accomplishing nothing, and at great cost. The improvements that they see, if any, are likely coming from the minimally-invasive optimization passes that Windows already does all on its own. In the handful of cases that remain, I postulate that some application or service is causing problems and it needs to be replaced if it is poorly architected or contained if it just can’t be helped.
  Third, these improvements. For years, people have been claiming to me that they can feeeeeel the difference that a defrag makes. My counter-assertion is that a computer can feeeeeel the negative effects of any performance degradation much more acutely than any human can. Therefore, their feeeeeeling should readily appear on a pre vs. post comparison of performance traces. For a human to really be able to distinguish the differences, they should be visible in the comparison chart from forty feet away. We’re talking 10% or greater improvements. And yet, in all of the years that I have been writing and talking about this, literally no one has ever produced even one comparison chart of even questionable quality. Not one. I set out the challenge, and the person gathers up their feeeeeelings and disappears forever. If I ever am given such a chart in the future, the very first question I’m going to ask is, “How do I replicate your methodology?” The second question that I will ask is, “How does this translate to real world usage?” And I think that’s what scares people the most. The emperor has no clothes, and nobody wants to see those dangly bits.
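A pre vs. post comparison of the kind asked for above can be sketched in a few lines. This is only an illustration, not a benchmarking methodology: the latency samples are made up, `percent_improvement` and `is_meaningful` are hypothetical helper names, and the 10% threshold is simply the figure mentioned in the comment.

```python
from statistics import mean

def percent_improvement(before, after):
    """Percent reduction in mean latency from `before` to `after`.

    Positive means the post-defrag trace is faster; negative means slower.
    """
    base = mean(before)
    if base == 0:
        raise ValueError("baseline mean latency is zero")
    return (base - mean(after)) / base * 100.0

def is_meaningful(before, after, threshold_pct=10.0):
    """True if the improvement reaches the (assumed) 10% visibility threshold."""
    return percent_improvement(before, after) >= threshold_pct

# Made-up average disk latency samples in milliseconds (assumption, not real data).
pre = [8.0, 9.0, 10.0, 9.0, 8.0, 10.0]
post = [8.0, 9.0, 9.0, 9.0, 8.0, 9.0]

print(f"improvement: {percent_improvement(pre, post):.1f}%")
print("visible from forty feet away:", is_meaningful(pre, post))
```

Real traces would need to be captured under the same load before and after, which is exactly the replication question raised in the comment.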
  
  To take things a bit more piecemeal:
  “You can also notice slowdowns from reduced bandwidth if your data is toward the end if the disk rather than at the start. It is about a factor if 2 difference.” — I disagree fundamentally with the usage of “bandwidth” here, but there is some merit to what you’re saying. To that:
  - I did talk about physical dimensions in the material. But, the answer has always been to prefer smaller disks — 2.5″ over 3.5″ disks in modern systems. There are a multitude of reasons, but in this article’s context, I can safely assert that you cannot make any guarantee that even the best defragmentation software ever written will place data intelligently enough to address this problem.
  - If the important data got to the end of the disk ahead of the non-important data, then your problem was not caused by fragmentation and will not be solved by defragmentation.
  - “factor of 2” sounds scary, but “‘too small to care’ times two” frequently equals “still too small to care”. So, what are we factoring by 2? As-is, this fails a basic sniff test.
  - Only half of the data can be toward the end of the disk, maximum. The other half, and arguably most of the data in all practical cases, will be near the front of the disk. So, you will encounter that dreaded “factor of 2” less than 50% of the time. But, that “factor of 2” does not start at the halfway line, but as you approach the edge — where it should be a data desert anyway. So, the “factor of 2” bit is essentially FUD regardless of what you’re factoring.
  “(for those without the wisdom to have separate storage for their OS and their data)” — Please definitively quantify this. My OS and data are segregated in nearly all cases because OS is local and data is remote, but my OS disk access routinely reads 0 IOPS out to at least the 90th percentile with the remainder being consumed almost completely during administrative access and backup cycles. I see no evidence to support the implied assertion behind this statement, much less the expressed disparagement leveled at people that architect differently.
  
  “but a sequential read from the start of the disk always is better than a bunch of random reads.” — Agreed in principle. But, this is just a process discussion. Nothing in the real world behaves that way. Random read/writes are the big rule, big sequential reads are the tiny exception. Even more so in shared storage environments like Hyper-V. Furthermore, “better than” is stated without context or quantification.
  
  “I do not use things like VHDX out of concern that it is yet another layer where things can go wrong (reliability wise).” — This is FUD. And, as much as I respect your contributions to making the computing world a better place, you greatly diminish yourself by spreading FUD. Abstraction made computer usage viable for the masses and enables everything that we do with it today. You cannot argue that the mere existence of an abstraction layer is evil without also arguing against the very fabric of modern computing. I can accept the possibility that a layer such as VHDX causes problems, but I will not act on mere possibility when I have years of direct experience that so far have not turned up any problems. Yes, I have encountered corrupted VHDXs, but only in conditions that would have resulted in corrupted data anyway. Furthermore, if a problem is found, I would postulate that the problem comes from an environmental, design, or implementation error, not the simple existence of the abstraction layer. I will only accept your statement as valid if you show your work.
  
  “Fragmentation affects ZFS, but the impact is not so egregious that I consider the lack of defragmentation in ZFS to be a problem.” — On this, we have full agreement. We probably differ in that I would extend that statement to cover every modern file system that I know to be in common use, even crusty old NTFS. If you want to say that ZFS is better at it than NTFS, I’ll willingly concede the point without even asking for quantification. However, I would challenge any claim that the “better” will matter for more than a minuscule fraction of the computing world.
  
  “After thinking about it, using a dynamically expanding VMDK gives you 3 levels of fragmentation.” — I am not seeing the basis of this thought process. Your middle tier is just a composite of the outer two. Take away one and you have only the other. Either way, this is a process discussion. To talk about it in practical terms builds on the premise that fragmentation is something that should be feared, a premise which I continue to reject due to lack of evidence. Academically, it is another matter, of course.
  
  —
  Again, I understand that as a file system architect, you must be intently mindful of performance concerns that most other people would never need to think about. I am not even qualified to engage in conversation with you at that level.
  However, if the average admin went into the average datacenter and randomly threw spoons at the rate of one spoon per second, he would be unlikely to strike a server with meaningful levels of disk performance concerns more than once every hour. I’m glad that you and your colleagues are working to make even those systems faster, but that regular admin should not be performing full defragmentation passes on the entire datacenter to deal with it. I stand by that assertion and will continue to do so until someone conclusively proves that I am wrong.
  
  Reply
Bart van de Beek says:

October 6, 2018 at 12:35 am

Came across this and just felt the urge to respond:
You weigh defragmentation jobs against the performance gain they supposedly bring. Totally agree on that one, as the gain is non-existent outside of extreme edge cases. However, with a thin VHDX, there is still a use for good old guest defrag: reclaiming VHDX space. If I just shut down a VM and run Optimize-VHD -Mode Full (mounting the VHDX read-only, etc.), I usually reclaim close to nothing. However, if I defragment the guest volume first, follow that with an immediate retrim (or wait until the guest Windows does one), and then run Optimize-VHD, I can actually reclaim all of the previously wasted disk space. I would like to know your opinion on this use case.

Reply
- Eric Siron says:
  
  October 8, 2018 at 3:14 pm
  
  I agree. VHDX compacting is one of the few places that full defragmentation has value.
  I did not put it into the article for two reasons. First, just like too many administrators have a neurotic need to keep their disks close to a 100% defragmentation level, too many administrators also have a neurotic need to keep their VHDXs as small as possible. I didn’t want to feed into that. Second, I intended to stick with the theme of the title. Compacting a VHDX should address a particular circumstance, not occur on some normative cycle. When we need to employ an uncommon procedure (compacting), we can feel free to enact an uncommon component (full defrag).
  
  Reply
Anand Franklin says:

October 10, 2018 at 3:30 pm

Hello Eric,

I read your complete article on defragmentation, which says that running it is not recommended on any Windows Server with an NTFS file system.

I also came across this page which provides Best Practices to improve Hyper-V and VM performance.
https://www.altaro.com/hyper-v/23-best-practices-improve-hyper-v-vm-performance/

Point #14 there says: “De-fragment Hyper-V Server regularly or before creating a virtual hard disk”.

Please advise on this.

Reply
- Eric Siron says:
  
  October 10, 2018 at 3:58 pm
  
  I did not write the referenced article. I wrote this one: https://www.altaro.com/hyper-v/best-practices-hyper-v-performance/. I have held the same position on defragmentation for years and have still never seen anything to the contrary except unsourced KB articles, FUDded-up “whitepapers”, and extraordinary anecdotes. Some people are comfortable administering their systems based on such things. I am not.
  
  Reply
  - Tejas says:
    
    January 22, 2019 at 1:35 pm
    
    Hi Eric, I read your article several times and you have provided technical information. Kudos!
    Suppose I have a physical server with 2TB of data running on the Windows operating system.
    1) Will the volume be fragmented?
    
    2) If I virtualize that server, should I not bother about fragmentation, since Hyper-V is not affected?
    
    Reply
    - Eric Siron says:
      
      January 22, 2019 at 7:35 pm
      
      I think maybe you missed something in the article?
      You cannot avoid fragmentation. If you use a disk in a computer system, its file system will fragment. That is a given. Virtual or physical, it will happen.
      What I am saying is that natural fragmentation is not a thing to be scared of.
      
      Reply
Paul Stearns says:

January 28, 2019 at 7:00 pm

So in reading this article, and the exchange between you and Richard, it leaves us lightweight network admins with one question; “Is my system part of the herd which requires no defragmentation, or is it an outlier that would benefit from defragmentation?” You both agree that at least in some cases it is useful.

So in my case, I will err on the side of caution and defrag my VHDX drive which is 99% fragmented. I am using Piriform’s Defraggler, as Hyper-V is lying about the VM’s drive being an SSD. Perhaps because the C: vhdx is “Dynamically expanding virtual hard disk” it thinks the d: is SSD, but the underlying Array is all spinning rust.

Reply
- Eric Siron says:
  
  January 28, 2019 at 7:29 pm
  
  Start by dispensing with the notion that there’s a magical X% where a disk is “too” fragmented. X% only means something when file access patterns are guaranteed linear to a high degree. You can almost never guarantee that. If you have multiple VMs sharing a physical location, then you can all but guarantee the opposite (that’s part of the article). Defragmentation is as likely to make things worse as it is to make them better. It’s most likely to have no noticeable impact.
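The point about access patterns can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not a description of how any real disk or file system behaves: it treats the disk as a one-dimensional row of blocks, counts only head travel, and ignores caching, rotational latency, request reordering, and RAID. `head_travel` and `compare` are hypothetical names.

```python
import random

def head_travel(layout, access_order):
    """Total distance the head moves, in blocks, on a toy one-dimensional disk.

    layout[i] is the physical position of logical block i;
    access_order is the sequence of logical blocks requested.
    """
    positions = [layout[block] for block in access_order]
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))

def compare(n_blocks=10_000, seed=1):
    """Head travel for two layouts under two access patterns."""
    rng = random.Random(seed)
    contiguous = list(range(n_blocks))
    fragmented = contiguous[:]
    rng.shuffle(fragmented)  # extreme case: every block placed at random

    sequential = list(range(n_blocks))
    scattered = [rng.randrange(n_blocks) for _ in range(n_blocks)]

    return {
        ("contiguous", "sequential"): head_travel(contiguous, sequential),
        ("fragmented", "sequential"): head_travel(fragmented, sequential),
        ("contiguous", "random"): head_travel(contiguous, scattered),
        ("fragmented", "random"): head_travel(fragmented, scattered),
    }

for (layout, pattern), distance in compare().items():
    print(f"{layout:>10} layout, {pattern:>10} access: {distance:>12,} blocks")
```

In this model the layout only matters for the purely sequential pattern. Once requests arrive in random order, as they tend to when several VMs share the same storage, the contiguous and fragmented layouts produce roughly the same amount of head travel.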
  
  You know by monitoring performance. The cheapest monitoring system is your user base. If they’re not experiencing performance problems, then you have no performance problems worth expending effort to address. A better way is to track disk performance. The easiest metric to judge by is “average disk queue length”. If average disk queue lengths grow as a disk becomes more fragmented, and if it crosses the line from an acceptably low average into trouble territory, then you know that fragmentation is a problem. If the average stays low, or more likely, does not meaningfully change, then you know that fragmentation is not a problem. I should make it clear that capturing queue length data for a few minutes does not tell you much. You really need longitudinal data.
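The queue length test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the samples are made up, the function names are hypothetical, and the `trouble_threshold` and `min_slope` values are placeholders that you would have to choose for your own hardware, since the comment does not give numbers.

```python
from statistics import mean

def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx, my = mean(xs), mean(ys)
    denom = sum((x - mx) ** 2 for x in xs)
    if denom == 0:
        raise ValueError("xs must vary to estimate a trend")
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / denom

def fragmentation_is_a_problem(samples, trouble_threshold, min_slope):
    """samples: (fragmentation_pct, avg_disk_queue_length) pairs gathered over
    weeks or months, oldest first. Returns True only if queue length both grows
    with fragmentation and has recently crossed into trouble territory."""
    frag = [s[0] for s in samples]
    queue = [s[1] for s in samples]
    grows_with_fragmentation = slope(frag, queue) >= min_slope
    recent_average = mean(queue[-3:])
    return grows_with_fragmentation and recent_average > trouble_threshold

# Made-up longitudinal samples (assumption, not real measurements).
flat = [(10, 0.4), (30, 0.5), (50, 0.4), (70, 0.5), (90, 0.4)]
growing = [(10, 0.5), (30, 1.0), (50, 2.0), (70, 3.5), (90, 5.0)]

print(fragmentation_is_a_problem(flat, trouble_threshold=2.0, min_slope=0.01))
print(fragmentation_is_a_problem(growing, trouble_threshold=2.0, min_slope=0.01))
```

Both conditions are required on purpose: a queue length that is high but does not track fragmentation points to some other cause, and one that tracks fragmentation but stays low is not worth acting on.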
  
  Reply
  - Paul Stearns says:
    
    January 28, 2019 at 10:13 pm
    
    Well there are performance issue complaints. This particular VM is somewhat unique in that it runs an Oracle 12c DB which is about 500GB. On a nightly basis, it gets dropped & cloned from a production DB.
    
    One of the mistakes made when configuring the host was to create a single large “C:” drive. I have a second Hyper-V host which we have upgraded to be able to handle all of the VMs temporarily, and I intend to fail over the VMs on this host, wipe it and reconfigure it with two RAID arrays on separate controllers, and move the VMs back, splitting the load between two arrays, which should improve performance.
    
    I believe the most important thing as a network wrangler is to know your load and let it dictate the configuration.
    
    Reply
    - Eric Siron says:
      
      January 28, 2019 at 10:27 pm
      
      I strongly doubt that defragmentation will make a bit of difference in this case. You could try to drop the database and defragment the remaining files so that the new DB gets created in a more contiguous space, but I still don’t think it will matter.
      Splitting into two separate arrays will make performance worse, not better. You will have performance characteristics that light some spindles up hot while others have nothing to do, versus spreading the work across all available spindles.
      Splitting C: from the other drives when all storage is local will likely not have any performance impact at all. There is some benefit in making logical separation, though, so do that. Partitions or logical array disks will grant more mileage than separate arrays.
      You might be able to get something by moving all the VMs to alternative storage and then migrating them back, starting with the DB guest. That will place its data closer to the beginning of the disks. I don’t know if anyone has quantified the benefit, but the closer to the beginning, the faster.
      It sounds to me like you have approached the limits of what this system can handle. Obviously I can’t know that without seeing performance charts, but it sounds like it’s time to consider scaling out.
      
      Reply
      - Paul Stearns says:
        
        January 28, 2019 at 11:16 pm
        
        Actually, my experience proves the opposite is true.
        
        I moved all of the VMs to the host that I mentioned I had upgraded with additional memory, then wiped the host to be upgraded. I added memory (it didn’t need more memory for the VMs it was running, but would if I wanted to be able to support both hosts’ VMs on one box), upgraded the OS from 2012 R2 to 2016, and moved the VMs back, using the knowledge gained about the load each machine created to place the VHDX files judiciously.
        
        The server is performing much better.
        
        At any rate, I will update you on whether defragging the host, and then the VM with the DB, makes a difference. Since I do a 500 GB transfer on a nightly basis via a 10 Gb link from a physical server running a DB on SSDs, I may be able to see a difference.
        
        BTW Defraggler allows some ability to place files at the end of the disk based on their size. I used that feature to move the tablespaces (DB container files) to the end of the disk, and everything else will get pushed to the beginning of the disk. Tonight it will wipe out the tablespaces and recreate them.
      - Eric Siron says:
        
        January 28, 2019 at 11:36 pm
        
        Please do update me, and with pre-/post- performance traces. I especially need to see where x/y disks are faster than x disks under the same load because that doesn’t even pass a sniff test. I have never seen smaller arrays outperform large arrays when all else was equal, and I’ve seen lots of performance traces. I think in your earlier migration, the additional memory did more than you expected (possibly alleviated paging I/O) or you unknowingly fixed something else.