AOC-IT

Do you want to join the vExpert Community?

Have you ever wanted to become a vExpert? Personally, I became a vExpert in 2019, thanks to a good mentor who helped me in my endeavours. Since joining the vExpert Pro community in 2023, I have worked with people who want to become vExperts, guiding them through the application, the requirements and how best to approach it. My good friend Wouter and I hatched an idea this year on how we could help people get over that hurdle more easily. Wouter has set up a call on Mondays from 8-9 PM CET, and I will host a session on Thursdays at the same time. If you can’t make those times for whatever reason, send a request to the email address below or to me on here and I will try to accommodate you where possible. From time to time, other people from the vExpert community will join these calls as well.

If you are interested in joining the vExpert Office Hours drop me an email at vexpertofficehours@gmail.com so I can add you to either or both of the calendar invitations.

FAQ

– How often will you run this? This will be weekly until applications close.

– How late is this in my time zone? You can use this link to convert the time to your local time.

Delete snapshots, consolidate disks… how long will it take?

Hi folks!

I think it has happened to every vSphere admin at some point… the huge VM you’re so proud of reports that it needs to consolidate a disk (of course it’s the large data disk), or the automatic snapshot from that backup job was not automatically removed (hello Veeam folks!). Usually removing snapshots or consolidating disks is no big deal; vCenter handles the process automatically in the background. I throw both processes in the same basket, as they’re basically the same thing.

Sometimes, however, things go wrong (before we get to the consolidation/removal processes): a disk is corrupted, or one or more snapshots, usually large to extra large, are orphaned and you need to clean up. In my experience the IT-devil will preferably hit the VMs with heavy, critical loads and disk sizes in the 2-digit TB range… The operating system crashed, or the VM crashed, or something else on the “no bueno” list happened. One of the things vCenter keeps screaming about is “Virtual machine disk consolidation is needed”. Or some peripheral application like Veeam ran into issues because of the orphaned snapshot. There are a multitude of reasons you may get into this situation.

So before you go on a Google frenzy to find the magic button for this, let me shed some light on this scenario and what you can do:

Tune it!

How do you clean up the situation as quickly as possible? Snapshot removal and disk consolidation can usually be run in any VM power state. If the VM is powered on, it will take longer, as OS disk I/O traffic will constantly interfere with the removal/consolidation task. So if you want it to run quicker, first power down the VM.

Don’t expect miracles: the process itself is rather… slow, and the time it’ll take depends almost entirely on the VM disk size and snapshot size, and to a lesser extent on hardware (ESXi server and storage). As a very rough estimate, expect the process to run for about one hour per TB of disk capacity. At one point I had to run consolidation on a VM with a 49TB disk and yes, it took a bit more than two days to finish.
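To put a number on that rule of thumb before you talk to the manager, here is a minimal Python sketch. The function names and the default of one hour per TB are just my illustration of the estimate above, not anything vCenter exposes, and real run times will vary with snapshot size, hardware and whether the VM is powered on.

```python
def estimate_consolidation_hours(disk_tb, hours_per_tb=1.0):
    """Very rough run-time estimate: about one hour per TB of disk capacity.

    hours_per_tb is an assumption taken from the rule of thumb in this post;
    adjust it to whatever you observe in your own environment.
    """
    if disk_tb < 0 or hours_per_tb <= 0:
        raise ValueError("disk_tb must be >= 0 and hours_per_tb must be > 0")
    return disk_tb * hours_per_tb


def format_duration(hours):
    """Turn a number of hours into a 'Xd Yh' string, rounded to whole hours."""
    days, remaining_hours = divmod(round(hours), 24)
    return f"{days}d {remaining_hours}h"


if __name__ == "__main__":
    # The 49TB disk from the example above.
    hours = estimate_consolidation_hours(49)
    print(f"Estimated consolidation time: {format_duration(hours)}")
```

For the 49TB disk this lands at roughly two days, which matches what I saw. Treat it as a planning figure, not a promise.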

Speed it up!

You started the consolidation/snapshot removal process and the manager is breathing down your neck, demanding the system to be up and running again asap?

You: “There must be a hidden turbo switch for this kind of task, right? Some way to speed it up?”

Me: I’m sorry, but no, there isn’t.

You: “But the manager wants the VM running NOW, so can we stop the consolidation process, start the VM and just deal with the cleanup later?”

Me: Once the consolidation process has started, it cannot be stopped; there is no specific control tool for this. Theoretically you can kill/restart the management services, but chances are very high that this will leave the disk in a corrupted state. Fixing that will be an enormous pain in the backside, not to speak of the manager… Have patience, it is done when it’s done.

Lab Environment Version 2

Getting started on all this, I thought a bit about what might be useful to put here. Since I work with VMware products and have done so for a while now (since ESX 1.5), I figured that describing my lab environment, and the plans I have for it, could be a good start.

The physical part of the environment is something like the following:

I have several servers, 8 in fact, of two different types: 3x Dell T130, which are the newer servers, and 5 older IBM x3250. The systems are generally built with the maximum memory available: 64 GB for the Dell servers and 32 GB for the IBM servers. This comes from an early time in virtualization, where I learned that CPU is usually not as important as memory.

The network consists of Cisco Catalyst switches (3750x), Cisco Routers (800 Series) and a SoHo FortiGate firewall.

Storage is made up of some Synology boxes hooked up as iSCSI devices, and the Dell servers also have some internal SSDs, which will later become useful for a (minimal, and most likely sluggish) vSAN configuration.

The current environment consists of the 3 Dell servers, with one Catalyst switch for network traffic and a second Catalyst switch for storage traffic. I did it this way on purpose, so I can restart the normal network without impacting the whole lab.

NW20190727

The three Dell servers primarily make up the management part of the network: 2 Windows Servers for AD, a 3-node vROps 6.6 cluster, a Log Insight server, vRA 7.6 and a few other bips and bops. Currently it suits most of my requirements, but I got thinking the other day… why not expand the setup a bit? What if I add 2 more sites? Use the old IBM servers, which are still supported by VMware with ESXi 6.7 U2. One could add an SRM site, add remote collectors to vROps (because every home needs that :)), create a payload cluster for vRA, deploy NSX-T?

NW20190727-future-idea

So the idea looks a little bit more like this: reconfigure the current 3 Dell servers and the network topology. Add a connection to where the IBM servers are located (using Devolo powerline adapters, i.e. Ethernet over the power line; they are usually so unstable that they will be quite good for simulating intermittent site failures in the house I live in, and also slow enough to emulate the connection to a secondary site :)). Then take the 5 IBM servers and split two of them off into a second management cluster, which would allow me to run SRM across sites, though I suspect only in an older version, as I am not sure there are Synology adapters for the appliance yet (more on that when the time comes around). The other three servers can then be used as payload, for example, or … whatever really comes to mind.

Anyway, these are the thoughts on a Saturday morning, well afternoon now.

One thing is for sure, the electricity company will love me…