2012-03-29

The Wind of Change

Of course the title of this post was blatantly borrowed from the Scorpions (I do love the song).

For the past 5 years I have designed, implemented and supported our corporate IT infrastructure. We have grown (sometimes too fast) and have established a solid and sound foundation for the ever-evolving market and business needs.

The time has come for a new adventure and change of focus.

Starting mid-April I will be moving to a new position as Platform Architect at NDS.

Part of my responsibilities will be the architectural design of our products (with a strong emphasis on the virtualization aspect). I will focus on what I am good at, designing the best solution for the customer, and take a step back from the support aspect.

So what can you expect to change? Well, one thing is for sure: I will continue blogging (so that won’t be it), and I will continue to share with you my views, my thoughts and my ideas.

One thing that will change is that I will now be looking at other solutions, and not working day-to-day with only VMware, which if you ask me is actually a good thing. I have said this more than once: the days of VMware being the one and only player in the market are soon to be over. Some will choose Hyper-V or KVM over vSphere for a number of reasons. Whether those reasons are justified or not does not matter.

I will try to continue to be as technical as possible (because that is my passion) but you can expect some other aspects to creep their way in here as well. Elasticity, Cloud, Automation will be my daily bread and butter.

So bear with me – it will be a good ride!!

2012-03-22

Quick and Dirty #PowerCLI to Repair a Disk Space Problem

I ran into a case of a datastore that ran out of disk space (why that happened is a whole different story – don’t go there… really…) and, through a strange chain of events, this caused a number of VM’s to become corrupted.

So how did they become corrupt? There were a number of VM’s that were in the middle of committing a snapshot and there was no space on the volume. The VM’s became corrupt. I could not power them on, I couldn’t do anything with them. The only thing I could do was to restore them from backup - and that is what I did. (There were a number of other VM’s on this datastore as well – but they froze and were waiting for a prompt)

I was asked how many VM’s were affected by this, and which ones. What I want to share with you is how I found the list of machines to restore.

The first step was to get all the events that happened between 22:00 and 24:00. I ran this at noon the next day, so that window is 14 to 12 hours back:

$a = Get-VIEvent -Start (Get-Date).Addhours(-14) -Finish (Get-Date).Addhours(-12) -MaxSamples 10000 


[12:00:07 PM] ~> (Get-Date).Addhours(-14) 
Wednesday, March 21, 2012 22:00:13 PM 

[12:00:13 PM] ~> (Get-Date).Addhours(-12) 
Thursday, March 22, 2012 00:00:19 AM 

I then looked for the events that happened on a machine that I knew had been affected:

$a | ? {$_.ObjectName -like "MACHINE1*"} 

EventTypeId          : com.vmware.vc.VmDiskFailedToConsolidateEvent
Severity             :                                                       
Message              :
Arguments            :
ObjectId             : vm-1346
ObjectType           : VirtualMachine
ObjectName           : MACHINE1
Fault                :
Key                  : 29721794
ChainId              : 29716968
CreatedTime          : 21/03/2012 22:56:59 PM
UserName             : NDR-IL\vi3admin
Datacenter           : VMware.Vim.DatacenterEventArgument
ComputeResource      : VMware.Vim.ComputeResourceEventArgument
Host                 : VMware.Vim.HostEventArgument
Vm                   : VMware.Vim.VmEventArgument
Ds                   :
Net                  :
Dvs                  :
FullFormattedMessage : event.com.vmware.vc.VmDiskFailedToConsolidateEvent.fullFormat
                                      (com.vmware.vc.VmDiskFailedToConsolidateEvent)
ChangeTag            :
DynamicType          :
DynamicProperty      :

One of the events was the one above. Now that we have found an EventTypeId that looks like the problematic one – I looked for all the machines that had this error. (VM Names of course have been altered to protect the innocent…)

$a | ? {$_.EventTypeId -eq "com.vmware.vc.VmDiskFailedToConsolidateEvent"} | select CreatedTime, ObjectName, FullFormattedMessage | ft 
CreatedTime             ObjectName      FullFormattedMessage
-----------             ----------      --------------------
21/03/2012 23:21:08 PM  MACHINE1        event.com.vmware.vc.VmDiskFailedToCo...
21/03/2012 23:06:10 PM  MACHINE2        event.com.vmware.vc.VmDiskFailedToCo...
21/03/2012 23:04:34 PM  MACHINE3        event.com.vmware.vc.VmDiskFailedToCo...
21/03/2012 23:03:38 PM  MACHINE4        event.com.vmware.vc.VmDiskFailedToCo...
21/03/2012 22:57:17 PM  MACHINE5        event.com.vmware.vc.VmDiskFailedToCo...
21/03/2012 22:57:17 PM  MACHINE6        event.com.vmware.vc.VmDiskFailedToCo...
21/03/2012 22:57:04 PM  MACHINE7        event.com.vmware.vc.VmDiskFailedToCo...

Next I got the list of VM’s and the datastore each one was located on, so that I could start a restore:

# For each affected VM, print its name and the datastore path of its .vmx file
$a | ? {$_.EventTypeId -eq "com.vmware.vc.VmDiskFailedToConsolidateEvent"} | select ObjectName | sort ObjectName | % {
  $x = Get-VM $_.ObjectName
  Write-Host "$($x.Name) - $($x.ExtensionData.Config.Files.VmPathName)"
}
MACHINE1 - [NFS_2] MACHINE1/MACHINE1.vmx
MACHINE2 - [NFS_3] MACHINE2/MACHINE2.vmx
MACHINE3 - [NFS_2] MACHINE3/MACHINE3.vmx
MACHINE4 - [NFS_3] MACHINE4/MACHINE4.vmx
MACHINE5 - [NFS_2] MACHINE5/MACHINE5.vmx
MACHINE6 - [NFS_2] MACHINE6/MACHINE6.vmx
MACHINE7 - [NFS_3] MACHINE7/MACHINE7.vmx

Quick and dirty (and I am sure that it is not really my best coding) – but Oh So Useful!!!!

You see why you have to learn PowerCLI??

2012-03-20

Open-SDK–Opening the vSphere Management SDK

When writing scripts (I am a fan of PowerCLI of course) there is many a time when I need to get something out of the vSphere SDK so I can dig in and get the details that I am looking for.

Of course you could always go out to the internet and look for what you would like.

But sometimes I do my coding when in transit (don’t worry I am not driving) - and have no internet connection - so I like to have the SDK with me on my laptop.

I downloaded the SDK Package from here.

I extracted the package to C:\Program Files\VMware\SDK

The following function opens the SDK documentation from the local disk and, if it is not installed there, tries to open the correct web page instead.

#==============================================================
# NAME: Open-SDK
# AUTHOR: Maish Saidel-Keesing
# DATE  : 20/03/2012
# COMMENT: Will open the vSphere SDK from the local disk or from the Internet
# SOURCE: http://bit.ly/GEb6mk
#==============================================================

function Open-SDK {
    # Local copy of the SDK documentation and the online fallback
    $localDoc  = "C:\Program Files\VMware\SDK\vSphereManagementSDKReadme.html"
    $onlineDoc = "http://pubs.vmware.com/vsphere-50/index.jsp?topic=/com.vmware.wssdk.apiref.doc_50/right-pane.html"

    if (Test-Path -Path $localDoc) {
        # The SDK is on the local disk, so open it from there
        Invoke-Item $localDoc
    } elseif (Test-Connection -ComputerName "pubs.vmware.com" -Count 1 -Quiet) {
        # Assumption: a single ping to the documentation host serves as the test for internet access
        Write-Host -ForegroundColor Yellow "You do not have the vSphere SDK installed on your System! Trying to open a web page"
        Start-Process $onlineDoc
    } else {
        Write-Host -ForegroundColor Red "You do not have the vSphere SDK installed on your System and no access to the internet!"
    }
}

##Entry point to script
Open-SDK

Just one of the few nice tools to make your life easier.

2012-03-19

Cause a Linux Kernel Panic or a Windows BSOD

In some testing I was doing with VM HA monitoring – and I would highly recommend Duncan’s post for some more information on the subject - I needed to crash a VM to test the functionality.

So in essence what does it do?

When this feature is enabled, VMware HA monitors the guest itself (through the VMware Tools heartbeat) for an operating system failure and, if one is recognized, reboots the VM (according to the defined threshold).

So how do you crash a Windows VM? And how do you crash a Linux VM?

Windows

This Microsoft KB gives you the answers

Methods to generate a manual memory dump file
There are several methods to generate a manual kernel or complete memory dump file. These methods include using the NMI, keyboard (PS2/USB), remote kernel, or NotMyFault.exe tools.
How to generate a manual memory dump by using the NotMyFault tool
If you can log on while the problem is occurring, you can use the Microsoft SysInternals NotMyFault tool. To do this, follow these steps:
  1. Download the NotMyFault tool from the following Microsoft Web site:

    http://download.sysinternals.com/Files/Notmyfault.zip

  2. Click Start, locate and right-click Command Prompt, and then click Run as administrator.
  3. At the command line, type NotMyfault.exe /crash, and then press ENTER.
Note This will generate a memory dump file and a "Stop D1" error.
How to generate a manual memory dump file by using the keyboard
  • If you are using a PS/2 keyboard, you have to create the

    CrashOnCtrlScroll

    registry entry. For more information about how to generate a memory dump file by using the keyboard, click the following article number to view the article in the Microsoft Knowledge Base:

    244139  Windows feature lets you generate a memory dump file by using the keyboard

  • If you are using a USB keyboard, this feature is not supported in Windows Server 2008 Service Pack 1 until you install hotfix KB 971284. For more information about using the hotfix, click the following article number to view the article in the Microsoft Knowledge Base:

    971284  A hotfix is available to enable crash on CTRL-SCROLL support on Vista SP1 and Windows Server 2008 on a USB keyboard

    However, it is supported in Windows Server 2008 Service Pack 2 or later versions. You must create the CrashOnCtrlScroll registry entry on the Windows Server 2008-based computer for this feature to work. To enable the feature on a computer that uses a USB keyboard, follow these steps:

    1. Start Registry Editor.
    2. Locate and then click the following registry subkey:

      HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kbdhid\Parameters

    3. On the Edit menu, click Add Value, and then add the following registry entry.

      Name : CrashOnCtrlScroll 
      Data Type : REG_DWORD 
      Value : 1 
    4. Exit Registry Editor.
    5. Restart the computer. (On a computer that uses a USB keyboard, you do not have to restart the computer. Unplugging the keyboard and plugging it back again is sufficient. After that, the Memory dump file can be generated.)
    Note The keyboard operation will generate a memory dump file and a "Stop E2" error.
    This hotfix is included in Service Pack 2 for Windows Vista and Windows Server 2008.

In my testing I could not get the keyboard method to work, so I tried the Sysinternals one, which worked well for me.

[Screenshot: Windows BSOD]

Linux

The easiest way I found to crash a Linux machine was to issue the following command at a root shell prompt:

echo c > /proc/sysrq-trigger

Which of course produces this:

[Screenshot: Linux Panic]
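
Because that one-liner takes the guest down immediately, it can be worth wrapping it in a guard when it lives in a test script. The following is only a sketch: the function name `crash_guest`, the `--yes-really-crash` flag and the `SYSRQ_TRIGGER` override are made up for illustration and are not part of any tool. Without the flag it only reports what it would do.

```shell
#!/bin/sh
# crash_guest [--yes-really-crash]
# Writes 'c' to the magic SysRq trigger, which makes the kernel crash at once.
# Needs root when pointed at the real trigger file. Without the flag, nothing is written.
crash_guest() {
    # SYSRQ_TRIGGER is a hypothetical override so the function can be tried on a scratch file.
    trigger="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

    if [ "${1:-}" != "--yes-really-crash" ]; then
        echo "dry run: would write 'c' to $trigger"
        return 0
    fi

    sync                      # flush dirty buffers first; the crash itself will not
    echo c > "$trigger"
}

# The default call is the harmless dry run.
crash_guest
```

Only a call with the flag, made as root and with `SYSRQ_TRIGGER` unset, actually panics the machine, which is what you want when testing VM monitoring.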

May these be the only crashes you encounter in your day.

Hope you enjoyed the ride…

2012-03-16

NetApp Virtual Storage Console 4.0 Released

Thanks to Christopher Wells for the heads up

#NetApp Virtual Storage Console (VSC) 4.0 for #VMware vCenter... Now available for download! http://bit.ly/xnoVVg (NOW account required) Fri Mar 16 02:09:09 via TweetDeck

So What’s New?

Virtual Storage Console 4.0 for VMware vSphere includes enhancements to all four capabilities.

    The Monitoring and Host Configuration capability adds support for the following:

    • Data ONTAP® for Cluster-Mode
    • Management of consolidated storage controller credentials
      These credentials apply to the Monitoring and Host Configuration capability, the Provisioning and Cloning capability, and the Optimization and Migration capability.
    • Scanning for and flagging indirect paths used to access data for Cluster-Mode based NFS exports, and providing a list of potential direct data paths
    • The NFS VAAI Plugin
      This is a software library that integrates with Virtual Disk Libraries and allows VMware software to execute certain primitives on NetApp storage controllers.

    The Provisioning and Cloning capability enables you to perform the following tasks:

    • Provision and manage NFS and VMFS datastores on storage systems running Data ONTAP for Cluster-Mode
    • Use the Monitoring and Host Configuration capability to add and remove storage systems

    A new capability, Optimization and Migration, enables you to perform the following tasks:

    • Review the alignment status of virtual machines
    • Perform online alignment of virtual machines by migrating them into optimized VMFS datastores
    • Migrate a group of virtual machines into new or existing datastores

    The Backup and Recovery capability enables you to perform the following task:

    • Back up and restore a virtual machine even if a VMware consistency snapshot fails

    Access to this tool requires a NOW account and of course a valid license.

    2012-03-14

    Don’t Let your Datacenter Turn into a Datayard

    Last night I tweeted a poll asking this question – (please feel free to add your vote)

    The choices that people left are more or less what I was expecting but I still think that this warrants a post explaining my thoughts – and also to get yours.

    Virtualization is a godsend!! We are finally able to decouple the operating system from specific hardware. For as long as I can remember this was emphasized (by myself as well, I admit) as one of the many benefits of using virtual machines. You no longer have to rely on specific hardware.

    So there are a large number of benefits – but what people sometimes overlook is that this decoupling also has a downside to it – and that is related to the poll above.

    But before that, another side to this post. What about the migration of old physical machines to VM’s? Consolidation? Does that sound familiar? Well it probably will. I cannot count the number of times I have seen on Twitter people tweeting about converting physical servers into VM’s – I myself have done this quite a number of times.

    So what are the reasons that we would convert physical servers to VM’s? Well, if you are reading this blog then you probably don’t need me to answer that question. But I would like to dwell on one specific reason.

    Old (ancient/dying/no support/no spare parts) hardware.

    You have an application running on an old server, which is being used by some part of the company, and it is critical to their day-to-day operations (isn’t everything????). But the hardware is old, perhaps failing, not reliable anymore. Moving this application to new hardware will require a re-install of the software. But alas, the company that sold you the software 10 years ago has since evaporated. I am sure this sounds familiar.

    So why is this problematic? You, the VIAdmin saved the day – and averted a large risk – and now have this ancient application running on a VM (until eternity) – the department that owns the application – can breathe easier.

    But is this a good thing? Look at the results again

    [Screenshot: Poll results]

    I know that not a lot of people voted, but I am pretty sure that the results would stay more or less the same no matter the size of the poll. And the bottom line comes down to the following.

    Because of the benefits of virtualization – application owners have less of an incentive to update their applications. And I would like to elaborate a little more on this point.

    Before the days of virtualization, when you needed to deploy an application, you would purchase a server with its 3-year warranty and service level (and perhaps even extend that service contract for another two years), install an OS and deploy the application. All parties involved were on board with the fact that the server, application and usually also the operating system would need to be re-deployed on new hardware, and perhaps on a new operating system, in 5 years’ time.

    In come the wonders of virtualization. You deploy a VM (no need to install the OS of course) and install the application on top of that. Now you start going into that grey area…

    How long will that application / Operating system stay active? 3 years? 5 years? 10? 15 years????

    Ladies and gentlemen, for those of you who are not aware, Windows NT 4 was released on 29 July 1996, and Windows 2000 on 17 February 2000. End of life for support on both operating systems has long gone by, many many years ago!!! Security patches are no longer being released, and you can only dream of getting support from Microsoft. And do you know what? Windows 2003 is not that far off. I am sure that the same goes for the older flavors of Linux as well.

    There will always be cases where a dying application needs to be moved to a VM and saved, but in the same breath that you take to revive this server, your next immediate question should be: how do you retire this server? The sooner the better.

    We are all guilty of hosting old applications and operating systems, I know for sure I am as well. There are benefits to virtualization – but there are drawbacks as well. These can be averted with proper planning – defined standards and a strong will. Without those your virtual infrastructure might one day look like this:

    [Image: datayard]

    I would like to thank @tscalzott @TimStephenson @egrigson @tednorris for joining in on the conversation.

    If you would like to share your opinions, ideas or your view on this post, on how you prevent your datacenter from turning into a datayard – please feel free to do so in the comments below.

    2012-03-06

    Removing ^M Characters from Files in ESXi

    As part of a build process for an ESXi server, one of the stages is to upload a valid SSL certificate to the ESXi server.
    When copying the certificate over to the host with pscp, for some reason the file is always malformed when it arrives. If you do a cat rui.crt you will see no issue, but if you do a vi rui.crt then you will see that each and every line has a ^M at the end of it. This is because the file is in DOS format, where each line ends with a carriage return followed by a line feed.
    [Screenshot: dos format]
    I finally found a way to remove them, and it was not easy to find.
    Neither tr nor dos2unix is available on ESXi, so sed has to do the job:
    sed 's/'"$(printf '\015')"'$//
    s/'"$(printf '\032')"'$//' rui.crt > rui.new

    That produces a clean file.
    [Screenshot: unix format]
    Phew – now I can rest easy…

    Update:
    Thanks to the comment received from JR below, it is even easier:

    sed 's/.$//' < rui.crt > rui.new
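
To see the difference between the two commands, here is a small sketch that can be run in any POSIX shell. The file names `dos.txt` and `unix.txt` and the helper `strip_cr` are made up for illustration; they are not part of the build process described above. The helper strips a trailing carriage return (octal 015) only when one is present, whereas `s/.$//` removes the last character of every line whatever it is, so the shorter form should only be used on a file that is known to be in DOS format.

```shell
#!/bin/sh
# strip_cr FILE: print FILE with a trailing carriage return removed from each line.
# Lines that do not end in a CR are left untouched.
strip_cr() {
    cr=$(printf '\015')          # a literal carriage return (^M)
    sed "s/${cr}\$//" "$1"
}

# Build a two-line DOS-format sample and an already-clean copy (hypothetical files).
workdir=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$workdir/dos.txt"
printf 'line one\nline two\n'     > "$workdir/unix.txt"

# The CR-specific pattern cleans the DOS file and leaves the Unix file alone.
strip_cr "$workdir/dos.txt"  > "$workdir/dos.new"
strip_cr "$workdir/unix.txt" > "$workdir/unix.new"

# The generic pattern eats a real character when there is no CR to remove.
sed 's/.$//' < "$workdir/unix.txt" > "$workdir/unix.chopped"
```

On these sample files, `dos.new` and `unix.new` both come out identical to `unix.txt`, while `unix.chopped` has lost the final letter of each line.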

    2012-03-05

    Set-UpdateToolsPolicy–For your VM’s

    Following on from William Lam’s post, Automating VMware Tools Upgrade Policy, and thanks to a comment left on that post, I wanted to provide the same functionality in PowerCLI.

    I present to you the Set-UpdateToolsPolicy Function

    Function Set-UpdateToolsPolicy {
    	<#
    		.SYNOPSIS
    			A function to change the update policy of one or more VM's 
    		.DESCRIPTION
    			This script will change the VMware Tools Update policy 
    			on one or many virtual machines
    		.PARAMETER  VM
    			The name of one or more virtual machines, this can be 
    			passed from the pipeline
    		.PARAMETER  Policy
    			The policy setting - has to be either manual (do not try 
    			and upgrade tools on each boot cycle) or upgradeAtPowerCycle 
    			(tools will be checked for upgrade at each power cycle)
    		.EXAMPLE
    			PS C:\> Get-VM foo | Set-UpdateToolsPolicy -Policy manual
    		.EXAMPLE
    			PS C:\> Set-UpdateToolsPolicy -VM foo -Policy upgradeAtPowerCycle
    		.NOTES
    			Author: Maish Saidel-Keesing
    		.LINK
    			http://technodrone.blogspot.com/2012/03/set-updatetoolspolicyfor-your-vms.html
    
    	#>
    	[CmdletBinding()]
    	Param(
    	[Parameter(Position=0,Mandatory=$True,ValueFromPipeline=$True)]
    	[String]
    	$VM,
    
    	[ValidateSet("manual","upgradeAtPowerCycle")]
    	[String]
    	$Policy
    	)
        
    	begin {
    		$config = New-Object VMware.Vim.VirtualMachineConfigSpec
    		$config.Tools = New-Object VMware.Vim.ToolsConfigInfo
    		$config.Tools.ToolsUpgradePolicy = $Policy   # the same spec is reused for every VM
    	}
    	process
    	{
    		foreach ($vmobject in (Get-VM $VM)) {
    			$vmobject.ExtensionData.ReconfigVM($config)   # apply the policy to this VM
    		}
    	}
    }
    Line 31: Using [ValidateSet("manual","upgradeAtPowerCycle")] as part of the definition of the parameter means that it can only be one of these two options. This is much easier than doing a switch on the variable.

    Line 36-40: This part is the same for all of the VM’s that will be processed, which is why it sits in the begin block and runs once at the start, before any VM is processed.

    I would also like to thank Damian Karlson for his post on this subject which was helpful.

    The reason why I wrote this small function? Well, that is for another post.

    As usual your comments are always welcome.