Windows Bugchecks on VMware ESXi with Xeon E5-2670 CPUs

UPDATE: Workaround available

Since writing the original post we have received a workaround from VMware. Their current suggestion is to force VMs to use software MMU virtualization. This change can be made per VM or at the host level; VMware recommended that we change it at the host level.

To force all guests on a host to use software MMU virtualization add the following line to /etc/vmware/config on all your hosts:


monitor.virtual_mmu = "software"

A reboot of the host may be necessary for this setting to take effect. Once it is in place, any guest that does not have the nested hypervisor setting enabled should switch to software MMU virtualization when it is powered on or VMotioned to the host.
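Editing /etc/vmware/config by hand on many hosts is error-prone, so a small idempotent helper can be handy. This is only a sketch: the set_software_mmu name and the guard logic are my own, not VMware tooling.

```shell
# Sketch: append the software-MMU override to an ESXi host's config
# file only if it is not already there (idempotent, safe to re-run).
set_software_mmu() {
  conf="$1"                                   # e.g. /etc/vmware/config
  line='monitor.virtual_mmu = "software"'
  grep -qF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"
}
# On each host (via SSH): set_software_mmu /etc/vmware/config
```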

To check what MMU setting is in effect SSH to the host on which the guest is running, change to the datastore and directory for the guest and run the following command:


/vmfs/volumes/datastoreGUID/vm1 # grep "HV Settings" vmware.log
2014-03-31T17:34:58.409Z| vmx| I120: HV Settings: virtual exec = 'hardware'; virtual mmu = 'software'

virtual mmu = 'software' will let you know that you are running in software MMU virtualization.
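To avoid eyeballing the log on every guest, the mode can be pulled out with a little grep and cut. A sketch, assuming the "HV Settings" log line format shown above; the mmu_mode helper name is my own:

```shell
# Sketch: extract the MMU mode ('software' or 'hardware') from a
# vmware.log, assuming the "HV Settings" line format shown above.
mmu_mode() {
  grep -o "virtual mmu = '[a-z]*'" "$1" | tail -1 | cut -d"'" -f2
}
# On the host, something like:
#   for log in /vmfs/volumes/*/*/vmware.log; do
#     echo "$log: $(mmu_mode "$log")"
#   done
```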

If the nested hypervisor setting is enabled on a particular guest then this host-level setting will not work, and neither will forcing it at the guest VM level. The nested hypervisor setting is enabled or disabled via a check box in the CPU section of the guest's settings in the vSphere Web Client.
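If you'd rather check from the shell than click through the Web Client, the checkbox is persisted in the guest's .vmx file. To my knowledge it maps to the vhv.enable key, but treat that as an assumption and verify against a test VM on your build:

```shell
# Sketch: report whether a guest's .vmx has nested HV (VHV) enabled.
# The vhv.enable key is my assumption for how the Web Client check
# box is stored; verify on a test VM first.
vhv_enabled() {
  grep -qi '^vhv\.enable *= *"TRUE"' "$1"
}
# On the host: vhv_enabled /vmfs/volumes/ds1/vm1/vm1.vmx && echo "VHV on"
```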

As general information, changing to software MMU virtualization may change baseline performance for certain workloads; see this VMware whitepaper: http://www.vmware.com/pdf/Perf_ESX_Intel-EPT-eval.pdf

I'll post any more updates as we receive them.

**** ORIGINAL ARTICLE BELOW ****

We recently upgraded most of the physical host hardware in our production VMware ESXi cluster. Since that time we've been having random bugchecks (Blue Screens of Death) in some of our guests, usually with codes 0x000000FC (ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY) or 0x0000000A (IRQL_NOT_LESS_OR_EQUAL). Everything we found while researching suggests these bugchecks are the result of hardware (bad memory, etc.) or driver problems.

All of the guests that have bugchecked were running Windows Server 2008 R2 with all current Microsoft and software updates as of December 2013. All of the hosts are running ESXi 5.1 Build 1312873 with all VMware updates as of December 2013 as well.

We opened cases with both Microsoft and VMware on these issues. Predictably, each side's first instinct was to say "it's the other guy's problem." This morning (8 Jan 2014) we received an email from the VMware support tech indicating that these issues may be the result of an Intel Xeon E5 CPU bug. They referenced these errata in the following Intel document:


BT39. An Unexpected Page Fault or EPT Violation May Occur After Another
Logical Processor Creates a Valid Translation for a Page


BT78. An Unexpected Page Fault or EPT Violation May Occur After Another
Logical Processor Creates a Valid Translation for a Page

http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-family-spec-update.pdf

It may be a combination of the hardware we're using in addition to a CPU bug. For reference our hosts have the following specs:


Dell R720
BIOS 2.0.19
(2) Intel Xeon E5-2670 10 core CPUs
384GB RAM
(2) Onboard Intel 1Gb I350 NICs
(2) Onboard Intel 10Gb 82599 NICs

We don't have a resolution yet, but I will update this article when we find the root cause and a solution.

Until then, I would advise caution when using E5-2670 CPUs in your virtual infrastructure.

Cheers,
Flux.

Receive missed errors with esxcli network command and Intel 10Gb NICs

We recently started upgrading the hosts in our production vSphere cluster. We replaced some hosts with new ones, and we upgraded some existing hosts with more RAM. All servers in our cluster will eventually have at least two (2) Intel 10Gb network adapters, which will be multi-purpose for VM traffic as well as VMotion traffic.

The addition of the new hosts went fine and the new 10Gb uplinks were speedy. When we started adding the new 10Gb NICs to the older hosts everything seemed fine at first, but when I checked the NIC stats via esxcli on those hosts, the Receive missed errors counter started to increment. I searched all over Google and VMware's sites and forums and found nothing on this counter or its meaning.
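For reference, this is roughly how the counter can be pulled out of the stats output. The "Name: value" one-counter-per-line layout matches what we saw, but treat the exact subcommand and format as assumptions that may vary by ESXi build:

```shell
# Sketch: extract the "Receive missed errors" value from NIC stats
# output, assuming counters print one per line as "Name: value".
rx_missed() {
  grep -i 'receive missed errors' | awk -F: '{gsub(/ /, "", $2); print $2}'
}
# On the host: esxcli network nic stats get -n vmnic4 | rx_missed
```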

For reference the new hosts hardware have onboard Intel 82599EB 10Gb SFI/SFP+ NICs and the add-in cards for the older hosts are Intel 10G 2P X520.

The only things that changed were the added memory and the new NICs. I checked VMware's site for ESXi 5.1 and there was a newer driver VIB for the Intel 10Gb NICs (they use the ixgbe vmkernel module), so I decided to give that a try.

The version of ixgbe current in ESXi 5.1U1 with all patches applied via vSphere Update Manager, as shown on our other hosts, is net-ixgbe 3.7.13.6iov-10vmw.510.1.20.1312873. You can see which VIBs are installed on a host by enabling SSH access or the ESXi Shell, logging in, and running:


esxcli software vib list
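To narrow that list down to just the 10Gb driver, a small grep/awk filter works. This is a sketch assuming the usual columnar vib list output, with the version string in the second field:

```shell
# Sketch: print the version column for a named VIB from
# 'esxcli software vib list' output (version assumed in field 2).
vib_version() {
  grep -i "^$1" | awk '{print $2}'
}
# On the host: esxcli software vib list | vib_version net-ixgbe
```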

I grabbed the VMware ESXi 5.0/5.1 Driver CD for Intel X540 and 82599 10 Gigabit Ethernet Controllers at: https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI5X-INTEL-IXGBE-3187&productId=285

I placed the host in maintenance mode, copied the offline bundle to one of our web servers, and manually updated the host with the following command line (note that -d expects a depot/offline-bundle zip URL, while -v expects a direct URL to a single .vib file):


esxcli software vib update -d http://vibhostingserver/path/to/offline-bundle.zip

NOTE: This update requires a reboot, so make sure you migrate all of your guests to another host and place this host in maintenance mode first.

Once rebooted, the host came back up fine, and after migrating machines to/from the host several times we saw no more errors. After the update your hosts should show the newer ixgbe driver from the driver CD (version 3.18.7) in place of the stock:


net-ixgbe 3.7.13.6iov-10vmw.510.1.20.1312873

I've also added this driver package to our VUM repository so any new hosts get this new driver. Check out this fine post for information on patching hosts via esxcli and adding patches to vSphere Update Manager http://blog.mwpreston.net/2012/01/16/installing-offline-bundles-in-esxi-5/

I'm going to try to contact VMware to get a better idea of what the Receive missed errors counter actually tells us, and I will update this post at a later date.

Hope this helps.

Cheers,
Flux.

ESX vs Hyper-V Networking - Really? I have to know VLAN IDs?

One of my resolutions for 2013 is to write and post articles more frequently. Here's my first attempt.

At my company we are heavy users of VMware/ESX/ESXi. It's what we have used for well over 8 years. One of the reasons why: it just works, it's stable, and (as it relates to this post) the virtual networking is just simpler (in my opinion).

How to kill stuck VMware Tools install from the command line ESXi 5.x

I was trying to put a host into maintenance mode in vCenter, and one machine wouldn't VMotion because it said VMware Tools were being installed or upgraded.

Searching Google turned up this article, which applies to ESX 4.x:

http://lonesysadmin.net/2009/12/11/how-to-cancel-a-stuck-vmware-tools-in...

Well, vmware-cmd is gone in ESXi 5.x. Here's how you do it now:


cd /vmfs/volumes/[datastore]/[vmname]
vim-cmd vmsvc/getallvms

Note the VmId of the machine having the issue. Then run:


vim-cmd vmsvc/tools.cancelinstall VmId
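If you have a lot of guests, fishing the VmId out of getallvms by hand gets old. A sketch that assumes the usual columnar output (Vmid in the first column, Name in the second) and a VM name without spaces; the vmid_of helper is my own:

```shell
# Sketch: print the VmId for a named VM from 'vim-cmd vmsvc/getallvms'
# output (assumes Vmid in field 1, Name in field 2, no spaces in name).
vmid_of() {
  awk -v name="$1" '$2 == name {print $1}'
}
# On the host:
#   vim-cmd vmsvc/tools.cancelinstall "$(vim-cmd vmsvc/getallvms | vmid_of vm1)"
```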

Citrix Netscaler VPX Boot Failure on ESX 3.5

We use Citrix Netscalers for web application security and acceleration. In order to test some things before a deployment of some new Netscalers, we wanted to test them using Citrix's Netscaler VPX virtual appliance (http://www.citrix.com/netscalerVPX).

Related Story: 

[UPDATE] Dell DSET and ESX 3.5.0 - DSET will reboot your host

UPDATE - May 25, 2010 - It appears that our Dell DSET problem is hardware-related. We seem to have some hardware problems which are causing the DSET utility to reboot our ESX host. If you are having this issue, you should call Dell and let them know (you probably are already in contact with Dell if you're running DSET).

We'll let you know what we find out with regard to our hardware.

---------------------------------------------------

Problem Deploying a Windows Server 2008 R2 VM Guest using a KMS Server with VMware vCenter

We've been having problems deploying Windows Server 2008 R2 virtual machines from templates using VMware vCenter and its built-in guest customization.

It appears there may be an issue either with Microsoft's built-in sysprep utility in 2008 R2, or with VMware's vCenter guest customization wizard when using a Key Management Server (KMS). A KMS is part of Microsoft's Volume Activation 2.0, as described here - http://www.microsoft.com/licensing/existing-customers/product-activation... and in this TechNet article - http://technet.microsoft.com/en-us/library/bb892849.aspx
