
The Top 5 Successes and Failures of a VDI Migration

Top 5 VDI Successes

1. Always Perform an Assessment

An assessment is a method of gauging user consumption behavior on a current physical device. You should measure how the OS currently uses resources such as CPU, memory, IOPS, the network, applications, and device hardware. When this data is collected over an extended period (typically 30 days), you can understand the highs and lows of user consumption. When you apply it to a series of mathematical formulas, you can not only calculate what a group of users would require from a hypervisor perspective, but also understand growth rates, software inventory, and which hardware will require a disproportionate amount of resources. Common examples of such hardware include high-output audio/visual capture devices, high-resolution scanners, USB webcams, and DVD burners. An assessment is key to architecting exactly the environment you need to survive the next five years. If you build too much hardware, you have wasted resources. If you build too little, user experience suffers, or worse, the VDI project can collapse. A proper assessment will calculate your requirements for you, identify poor hardware, help you consolidate software, and even help you optimize your virtual images.
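
As a rough sketch of the math such an assessment performs, the snippet below sizes to a high percentile of 30 days of per-user samples rather than the average, so peaks like login storms are covered. The function name, sample data, and 20% headroom figure are illustrative assumptions, not taken from any particular assessment tool.

```python
# Hypothetical sizing sketch: names, sample data, and the headroom
# figure are illustrative, not from any specific assessment product.
from statistics import quantiles

def size_host_iops(per_user_iops_samples, users, headroom=0.2):
    """Estimate total host IOPS needed from assessment samples.

    Sizes to the 95th-percentile observation rather than the mean,
    so bursts are covered, then adds growth headroom.
    """
    p95 = quantiles(per_user_iops_samples, n=100)[94]  # 95th percentile
    return users * p95 * (1 + headroom)

# Example: bursty per-user samples over the assessment window.
samples = [5, 7, 6, 9, 30, 8, 6, 7, 25, 6] * 3
needed = size_host_iops(samples, users=100)  # sized to the bursts, not the mean
```

The same approach applies to CPU, memory, and network figures; the key design choice is percentile-based sizing, which keeps one user's worst day from being averaged away.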

2. Image Optimization is Key

One area where massive gains in user experience can be realized is image optimization. VDI is unlike any other kind of computing environment we have used to date. Many features built into the Windows operating system (Linux too!) simply have no place inside a VDI deployment. Examples include the Wireless service, Bluetooth support, BranchCache, Family Safety, Fax, and Internet Connection Sharing. None of these services should be used in a VDI environment, and as such, they should be set to disabled. There are literally hundreds of settings and services that should be removed from an operating system used in VDI, and they have a very large effect on how much RAM, IOPS, and CPU the system consumes even without a user logged in. The effect is additive: when you are hosting hundreds or thousands of VDI systems, multiple unnecessary services running in an image can mean the difference between housing 50 users on a hypervisor instead of 70 or 80.
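
As a hedged illustration, the snippet below generates the `sc config` commands an administrator might script against a golden image to disable the services named above. The internal service names shown are common on Windows desktop editions, but they vary by version; verify each name against your own image before disabling anything.

```python
# Illustrative sketch: the service list mirrors the article's examples.
# Internal service names vary by Windows version -- verify before use.
SERVICES_TO_DISABLE = [
    "WlanSvc",       # Wireless (WLAN AutoConfig)
    "bthserv",       # Bluetooth Support Service
    "PeerDistSvc",   # BranchCache
    "Fax",           # Fax
    "SharedAccess",  # Internet Connection Sharing
]

def disable_commands(services):
    """Emit the sc.exe commands to set each service's start type
    to disabled in the golden image (note: 'start= disabled' is
    the documented sc.exe syntax, with a space after '=')."""
    return [f'sc config "{name}" start= disabled' for name in services]

for cmd in disable_commands(SERVICES_TO_DISABLE):
    print(cmd)
```

Generating the commands from a reviewed list, rather than disabling services ad hoc, keeps the optimization repeatable each time the master image is rebuilt.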

3. Never Underestimate the Importance of Good Storage

The slowest subsystem in any modern computing system is local storage. This means that any bottleneck or slowdown in storage will be instantly felt by your users, resulting in an increase in help desk calls, a poor VDI experience, and a ‘black eye’ for the project. In recent years, solid state (flash) storage has become increasingly affordable, and many organizations have begun to deploy their VDI environments on a flash storage array. This solves many issues, as VDI is typically IOPS heavy and capacity light. When you design your VDI storage environment, remember that you must design for the worst case the environment will have to deal with. This means that the advertised IOPS an array or individual disk can deliver will rarely be realized in practice. Instead, focus on the minimum IOPS a disk or platform will deliver without any kind of cache acceleration. This will ensure that you have adequate IOPS to deal with any boot storms, login storms, or security storms in your environment. In many cases, VDI systems will bypass storage cache because user data is intrinsically unique. The thousands of documents, configuration settings, and databases your users touch daily are specific to each user, and may be loaded only once per day or week. This means that storage cache will be less effective in a VDI environment, and the load will shift to the disks behind it. If this is planned for during the assessment phase, your VDI systems will perform better than their physical desktop counterparts, which will help increase VDI adoption and project success.
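
A back-of-the-envelope version of that worst-case sizing might look like the following. All figures here are assumed for illustration, not vendor specifications; the point is sizing to the minimum uncached IOPS a disk guarantees, with no credit given to cache.

```python
# Worst-case storm sizing sketch (assumed figures, not vendor specs):
# assume zero cache hits and size to the disks' minimum sustained IOPS.
def required_spindles(users, peak_iops_per_user, min_iops_per_disk):
    """Disks needed to absorb a boot/login storm with no cache help."""
    total = users * peak_iops_per_user
    return -(-total // min_iops_per_disk)  # ceiling division

# e.g. 500 users peaking at 30 IOPS each, on disks that guarantee
# only 150 IOPS without cache acceleration:
disks = required_spindles(500, 30, 150)  # 15,000 IOPS -> 100 disks
```

Running the same numbers against a disk's *advertised* (cached) IOPS would produce a far smaller, and dangerously optimistic, array.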

4. Use Software Designed For VDI

VDI has presented a number of challenges for software vendors looking to support their products inside an ever-changing virtual landscape. Applications coded for physical machines often do not perform well when migrated to a VDI computing system. This can be due to many issues, such as coding for a discrete video card when no such device is present in VDI, lazy coding habits formed when resources in a physical machine were plentiful, and even assumptions about how to work with multiple CPUs. Look for updated versions of your common applications, and don’t forget to speak to your application vendors about your requirements. Many vendors now offer applications that are virtualization aware, or virtualization optimized. Another important aspect of software to consider within VDI is your security software. Antivirus solutions are terribly inefficient when their physical desktop versions are used in VDI. As the systems scale, there is too much duplicate work going on to justify their existence. Instead, look for security applications that take VDI’s economies of scale into account. A number of antivirus engines function at the hypervisor level and eliminate the bulk of this duplicate work. As an added benefit, having antivirus function at a layer above the guest operating system leads to a more secure environment.

5. Increase Productivity Through Profile Migration

In the past, it was common for organizations deploying VDI to create a ‘from scratch’ environment for their users. At first, this sounds like a good idea: in theory, you are eliminating any baggage retained from years, or even decades, of users working with their physical machines. However, this method has a massive side effect when migrating to VDI. In reality, it creates an environment where users must customize their new systems, spending hours, or even days, trying to make the new environment function like the old one. Settings vary from user to user, but they range from desktop backgrounds to browser favorites to application settings. This is a massive loss of employee productivity. Instead, look for profile migration tools that give you the best of both worlds. They allow you to retain the settings and files from the physical machines that your users need to perform their jobs, while eliminating things like MP3 collections or questionable applications from the profile. Your employees instantly get a recognizable platform, with all of their data where they expect to find it. The end result is that instead of spending a few days getting used to the new VDI environment, users can log into VDI and immediately start working.
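
The filtering idea can be sketched as below. The exclusion rules are purely hypothetical; real profile migration suites expose much richer policy controls, but the principle is the same: carry settings and documents forward, leave the baggage behind.

```python
# Hypothetical profile filter: exclusion rules are illustrative only.
import os

EXCLUDE_EXTENSIONS = {".mp3", ".avi", ".iso"}  # the "baggage"

def files_to_migrate(paths):
    """Keep user settings and documents; drop media files by extension."""
    keep = []
    for p in paths:
        if os.path.splitext(p)[1].lower() not in EXCLUDE_EXTENSIONS:
            keep.append(p)
    return keep

profile = ["Desktop/report.docx", "Music/track01.mp3", "AppData/app.ini"]
migrated = files_to_migrate(profile)
# -> ["Desktop/report.docx", "AppData/app.ini"]
```

In practice the same allow/deny idea also applies to registry settings and installed applications, not just files.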

Top 5 VDI Failures

1. Doing Things the Way You Always Have

In many environments where VDI did not succeed, we have found evidence calling into question business practices that hurt overall system stability. In one such example, a large company (which shall of course remain nameless!) had constructed its own software update platform. The platform was written in an older language and executed in DOS. The company created it in the mid-1990s, when enterprise-grade system update tools were still in their infancy. Unfortunately, this solution was never fully removed, even as the company brought in modern system update tools such as SCCM and Windows Update. Instead, three different system update solutions ran on each system at the same time: the DOS tool, SCCM, and Windows Update. Different teams managed the three tools, and all three required a significant amount of overhead to function. To add insult to injury, the VDI systems in this environment were limited to a single core (more on that later), and all three update tools were resident, consuming valuable resources. Because resources were so constrained, the stability of VDI would falter under any kind of additional load. On more than one occasion, a simple system update was enough to make every VDI system saturated and unusable. During the post-mortem of the environment, it was asked why three different system update tools were necessary, and why only one CPU core was used. The only answer given: “That’s the way we have always done it”. For many companies, VDI represents the ability to break away from policies and procedures that are antiquated or simply inefficient. Every aspect of the way you manage your machines should be called into question. If a policy or process you have implemented goes against best practices for VDI, you must examine it. Far too many companies have been burdened by failed VDI pilots simply because of the “that’s the way we have always done it” mentality.

2. Not Taking Advantage of Non-Persistent VDI

Non-Persistent VDI can be thought of as VDI 2.0. It’s more secure, as machines wipe any information not stored by a profile solution when the user logs off. This reduces the potential for adware, malware, spyware, ransomware, and viruses on your systems. It can also significantly decrease the number of hypervisor hosts in your environment, because you only need to keep the number of concurrently used machines running. It can massively reduce the storage space VDI requires, because linked clones all read from the same master image. Finally, Non-Persistent VDI can be updated almost immediately when installing system updates, antivirus updates, or base image software updates. Despite these huge advantages, many companies still struggle to deploy Non-Persistent VDI due to the amount of work needed to deploy applications on top of those systems. If you have not examined Non-Persistent VDI as a solution for your company, now is the time. Layering tools, for example, have never had better compatibility, ease of use, or performance. In many cases, these new layering tools make the long-term management of VDI easier than Persistent VDI.
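
The linked-clone storage advantage is easy to quantify with a rough model. The image and delta sizes below are assumed figures for illustration: every clone shares one master image and keeps only a small delta disk of its own.

```python
# Rough storage model with assumed sizes: full clones store the whole
# image per desktop; linked clones store the master once plus a delta.
def storage_gb(desktops, image_gb, delta_gb, persistent=True):
    if persistent:
        return desktops * image_gb          # one full image per desktop
    return image_gb + desktops * delta_gb   # shared master + per-desktop delta

full = storage_gb(1000, 40, 2, persistent=True)     # 40,000 GB
linked = storage_gb(1000, 40, 2, persistent=False)  #  2,040 GB
```

Under these assumptions, 1,000 non-persistent desktops need roughly one-twentieth of the capacity of their persistent counterparts, which is where much of the "VDI 2.0" economics comes from.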

3. Overloading an Environment

If you have gone through the proper steps of VDI architecture – including performing an assessment, optimizing your image, and selecting the correct hardware for your environment – then you are in an excellent place for the long-term care and feeding of VDI.  However, once the project has reached maturity, you may be tempted to add users to the environment that you did not plan or assess for.  This is a massive pitfall, and can quickly kill an environment.  VDI has had all of its resources carefully planned for short- and long-term use.  Adding an extra one or two hundred users may seem like a small amount in a one-thousand-user environment, but what works in the short term will almost certainly cause VDI to fail down the road.  This is because when VDI is initially designed, it is given slightly more resources than it needs, to account for long-term growth.  Most system resource consumption, including memory, CPU, and IOPS, grows by about five percent per year.  This happens naturally as your software and image updates use more resources.  As applications become more advanced, they tend to use more system resources.  Adding more users to the system may be fine now, but they will cause massive bottlenecks two or three years in the future.
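
The five-percent-per-year figure compounds, which is why an environment that absorbs extra users today fails later. A minimal sketch of that projection, using the article's growth rate as the default:

```python
# Compounding sketch for the ~5%/year resource creep described above.
# The rate is the article's rule of thumb, not a measured constant.
def projected_load(current_pct, years, annual_growth=0.05):
    """Project resource utilization forward with compound growth."""
    return current_pct * (1 + annual_growth) ** years

# A host at 70% CPU today crosses 80% in about three years
# before a single unplanned user is added:
year3 = projected_load(70, 3)  # ~81.0%
```

Any headroom consumed by unassessed users today is headroom the environment was counting on for years two and three.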

4. Too Much, Too Soon, Too Fast

Testing is an incredibly important aspect of rolling out a properly configured VDI environment.  Testing should apply to every level of a VDI deployment, from image optimization to application delivery, system updates, connection broker functionality, slow-link usability, and hundreds of other areas.  As VDI gains traction, you or your management may be tempted to start cutting corners.  This may mean going from a POC or Pilot environment to production early, before each piece of software in VDI has had a functionality test from its application owner.  Do not do this!  More VDI projects have failed from moving too quickly than from almost any other cause.  Proper VDI rollouts are usually executed in three phases:

  1. POC – Proof of Concept is where key engineers and architects work with the environment to establish a base level of VDI functionality.
  2. Pilot – The Pilot phase is where key employees are granted access to the VDI environment to gauge its use in the corporate environment.  These employees should report any issues they find within VDI, and should also be tolerant of them.
  3. Production – Production can only be reached once all the milestones of POC and Pilot have been met.  POC and Pilot phases can take months, or even years, to fully vet.

5. Artificially Restricting User Resources

One of the most common user experience issues in VDI has to do with the resources attached to the virtual machine.  Many of my clients install user experience solutions precisely to understand why and where bottlenecks in their VDI systems occur.  Across hundreds of environments, we have found users who were not given enough CPU cores or memory to perform their work.  Many clients also deploy VDI systems with only one virtual core.  This is a false economy.  When a modern version of Windows (Windows 7 or newer) has only one core, it uses the uniprocessor HAL (Hardware Abstraction Layer).  This HAL does not understand how to deal with more than one processor stream; it was meant for systems barely capable of running Windows 7 when it was released, a throwback included for older system compatibility.  The uniprocessor HAL is not very efficient, as modern versions of Windows were written to take advantage of multi-core, multi-threaded CPUs.  If you remove that capability from the operating system, you artificially stunt it and cause it to run slowly.  It must also be stated that a VDI system places more load on a CPU core than its physical counterpart.  Physical machines have hundreds of discrete components working together to create a computing environment.  All of those discrete circuits, from the system clock to the USB controller to the network and video cards, must be virtualized by the CPU in a virtual system.  Pushing all of this work to a single core is a recipe for failure.  Adding more cores is not a bad thing; ideally, you want a hypervisor CPU operating in the 60-80% range.  If your hypervisor CPUs are below this value and your guest virtual machines are queuing for CPU, then you are wasting a very expensive hypervisor CPU.
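
The sizing logic above can be condensed into a simple decision check. The 60-80% band and the verdict wording are the article's rule of thumb, not a vendor guideline, and the function is purely illustrative:

```python
# Illustrative check against the article's 60-80% hypervisor CPU band.
# Thresholds are the article's rule of thumb, not a vendor guideline.
def cpu_verdict(host_cpu_pct, guests_queuing):
    """Classify a host: idle host + queuing guests means the guests
    are starved for vCPUs, not that the hardware is undersized."""
    if host_cpu_pct < 60 and guests_queuing:
        return "undersized guests: add vCPUs, host capacity is idle"
    if host_cpu_pct > 80:
        return "host saturated: reduce density or add hosts"
    return "within the 60-80% target band"

print(cpu_verdict(45, guests_queuing=True))
```

The first branch is the single-vCPU trap described above: the expensive hypervisor CPU sits underused while every guest queues on its one core.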

6. Bonus – Not Defining a VDI Champion

Many times, we have found that larger organizations deploying VDI hang on to their old ways of doing things.  This means that departments may be heavily ‘siloed’, or kept in separate areas, almost like separate businesses.  Over the years, an ‘us versus them’ mentality may develop between these departments.  When VDI is introduced in an organization, these groups must find ways of working together seamlessly, and without friction.  To that end, they need an overarching executive manager to make sure that all the teams keep working together to solve problems.  They need a champion.  A champion will have the ability to create a VDI group, a team of subject matter experts from each of the IT teams implementing VDI.  The champion needs the authority to force groups to take on new work and change old ways of doing things.  Without this champion, the project may be relegated to inefficient ways of working, and the entire VDI effort will suffer as a result.

Liquidware Labs