Wednesday, April 20, 2016

VMware: Basic Network and Storage Troubleshooting


Basic Network Troubleshooting
1. Using vNetwork Distributed Switch (vDS) configuration panel
The vNetwork Distributed Switch provides the same troubleshooting feature set as the Standard Switch. In addition, network and server admins will find that the vDS configuration panel (under Home > Inventory > Networking) provides a wealth of information useful in troubleshooting the virtual network.
When looking at the vDS for this example environment there are a few things to note:
  • Clicking on the actual DV Port Group (e.g. dv-management) will highlight the network path taken by those endpoints through the vDS. This will determine which dvUplinks and which vmnics are used for traffic to and from this DV Port Group.
  • Clicking on the “i” or information icon shows the properties and settings for that port group. In this example, you can see no VLAN is assigned (so native VLAN is in use) for the management ports and that they use dvUplink1 which maps to vmnic0 on all the hosts.



You can drill down further by selecting the actual port.
  • Clicking on the port (in this example vswif0 on 10.91.248.109) shows the actual network path through the vDS to the vmnic (vmnic0 on esx09a.tml.local in this example).
  • Clicking on the “i” information icon next to the port shows the Port ID, MAC address, IP address, and mask.

You can also look at the physical network connection.
  • Clicking on the “i” information icon next to the vmnic will show the CDP (Cisco Discovery Protocol) information picked up by that vmnic. In the next example, you can see vmnic0 on esx09a.tml.local is connected to interface “GigabitEthernet 0/9” on the adjacent physical Cisco switch. You can also see the management address of the switch and what features/capabilities are enabled (e.g. multicast, etc).
Note: You need a Cisco switch with CDP enabled for this information to show up on the VMware ESX and vCenter Server.
Each ESX host can be set to one of four CDP modes: “down”, “listen”, “advertise”, or “both”. CDP is configured through the ESX Service Console command line interface.
The mode is displayed by entering:
esxcfg-vswitch -b <vswitch>
The CDP mode is configured or changed by:
esxcfg-vswitch -B <mode> <vswitch>
where <mode> is one of “down”, “listen”, “advertise”, or “both”.
For more information on configuring CDP, refer to the ESX Configuration Guide.
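When CDP modes are set on many hosts from a script, it can help to validate the mode before building the command. The following is a minimal sketch, assuming the commands are assembled on an admin workstation and then run on the Service Console by whatever remote mechanism you already use. The function names and the switch name vSwitch0 are hypothetical; only the esxcfg-vswitch -b and -B options come from the text above.

```python
# The four CDP modes accepted by "esxcfg-vswitch -B", as listed above.
VALID_CDP_MODES = ("down", "listen", "advertise", "both")


def cdp_show_command(vswitch):
    """Return the argument list that displays the CDP mode of a vSwitch."""
    return ["esxcfg-vswitch", "-b", vswitch]


def cdp_set_command(vswitch, mode):
    """Return the argument list that sets the CDP mode of a vSwitch.

    Raises ValueError for anything other than the four documented modes,
    so a typo is caught before the command reaches the host.
    """
    if mode not in VALID_CDP_MODES:
        raise ValueError(
            "unknown CDP mode %r, expected one of %s"
            % (mode, ", ".join(VALID_CDP_MODES))
        )
    return ["esxcfg-vswitch", "-B", mode, vswitch]


if __name__ == "__main__":
    # "vSwitch0" is only an illustrative switch name.
    print(" ".join(cdp_show_command("vSwitch0")))
    print(" ".join(cdp_set_command("vSwitch0", "listen")))
```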



Basic Storage Troubleshooting
1. Detecting and Resolving Stale Lock on Storage Device
LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (1 tries left)
LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (0 tries left)
WARNING: LVM: OpenDevice:3777: Device mpx.vmhba1:C0:T1:L0:1 still locked. A host may have crashed during a volume operation. See vmkfstools -B command.
LVM: ProbeDeviceInt:5697: mpx.vmhba1:C0:T1:L0:1 => Lock was not free
This message in the vmkernel log means that a host may have crashed and left a stale lock on the device. To break the lock, ESX 4.0 introduced a new vmkfstools -B option.
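To find these messages in a collected vmkernel log, a small parser can pull out the device and the identifier that holds the lock. This is a rough sketch, assuming the log lines have exactly the layout shown in the sample above; the function name and the field names (including what the identifier inside the angle brackets represents) are my own labels, not VMware terminology.

```python
import re

# Matches the "Device <(...), ...> locked by ... at ... (N tries left)" part
# of the LVM OpenDevice message shown in the sample log.
LOCK_RE = re.compile(
    r"Device <\((?P<device>[^,]+), \d+\), (?P<lvm_id>[0-9a-f-]+)> "
    r"locked by (?P<owner>[0-9a-f-]+) at (?P<timestamp>\d+) "
    r"\((?P<tries>\d+) tries left\)"
)


def find_lock_messages(log_text):
    """Return one dict per 'locked by' message found in the log text."""
    found = []
    for line in log_text.splitlines():
        match = LOCK_RE.search(line)
        if match is None:
            continue
        found.append(
            {
                "device": match.group("device"),
                # Identifier printed after the device; its exact meaning
                # is an assumption, so it is kept as an opaque string.
                "lvm_id": match.group("lvm_id"),
                "owner": match.group("owner"),
                "timestamp": int(match.group("timestamp")),
                "tries_left": int(match.group("tries")),
            }
        )
    return found


if __name__ == "__main__":
    sample = (
        "LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), "
        "47e8b8f9-d92da758-86c4-001a6467a6de> locked by "
        "48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (0 tries left)"
    )
    for lock in find_lock_messages(sample):
        print(lock["device"], "locked by", lock["owner"])
```

A device that appears with 0 tries left is a candidate for the lock-breaking step, once you have confirmed that no other host is using it.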
The sample command to break the lock is:
# vmkfstools -B /vmfs/devices/disks/mpx.vmhba1\:C0\:T1\:L0:1
(you will get the following prompt)
VMware ESX Question:
LVM lock on device mpx.vmhba1:C0:T1:L0:1 will be forcibly broken. Please consult vmkfstools or ESX documentation to understand the consequences of this.
Please ensure that multiple servers aren’t accessing this device.
Continue to break lock?
0) Yes
1) No
Please choose a number [0-1]: 0
Successfully broke LVM device lock for /vmfs/devices/disks/mpx.vmhba1:C0:T1:L0:1
At this point, running either of the following commands will mount the VMFS volume that is on that LUN:
# esxcfg-rescan vmhba1
OR
# vmkfstools -V
To verify that the volume is mounted you may run:
# vdf
The output would look similar to this:
# vdf
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdc2 5044188 1696408 3091544 36% /
/dev/sda1 248895 50761 185284 22% /boot
/vmfs/devices 1288913559 0 1288913559 0% /vmfs/devices
/vmfs/volumes/48c693ef-c7f30e18-6073-001a6467a6de
142606336 6316032 136290304 4% /vmfs/volumes/vol143
2. Verifying the disk space used by thin provisioned disks
You may use this command:
# stat /vmfs/volumes/*/*/*-flat.vmdk
The output shows the number of allocated 512-byte blocks in the Blocks field:
File: `/vmfs/volumes/openfiler-nfs/BlockBridge/BlockBridge-flat.vmdk'
Size: 4294967296 Blocks: 8396808 IO Block: 4096 regular file
Multiply the Blocks value by 512 to get the space used in bytes, or divide it by 2 to get the size in KB.
Note: You may use du instead of stat. However, du is not an available command on VMware ESXi 4.0.
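The block arithmetic can be checked with a few lines of Python. This is a small sketch using the Blocks value from the sample output; the function names are hypothetical, and allocated_bytes() assumes a Unix-like system where os.stat() reports st_blocks in 512-byte units.

```python
import os

BLOCK_BYTES = 512  # stat reports allocated space in 512-byte blocks


def blocks_to_bytes(blocks):
    """Convert a stat 'Blocks' count to bytes."""
    return blocks * BLOCK_BYTES


def blocks_to_kb(blocks):
    """Convert a stat 'Blocks' count to KB (two 512-byte blocks per KB)."""
    return blocks // 2


def allocated_bytes(path):
    """Return the space actually allocated to a file, in bytes."""
    return blocks_to_bytes(os.stat(path).st_blocks)


if __name__ == "__main__":
    blocks = 8396808  # Blocks value from the sample stat output
    print(blocks_to_bytes(blocks))  # 4299165696
    print(blocks_to_kb(blocks))     # 4198404
```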
3. Troubleshooting Storage VMotion
The Virtual Machine’s log from the source side, vmware.log, is copied to the destination as vmware-0.log. Check that log for events prior to transferring the VM’s files to the destination volume.
To investigate destination power on failures, run a ‘tail’ command against the destination vmware.log files.
The proc node /proc/vmware/migration/history continues to exist in ESX 4 and provides very useful information on Storage VMotion operations as well as standard VMotion operations.
Virtual machines with a large number of virtual disks may time out during Storage VMotion. The error message could be:
Source detected that destination failed to resume
To increase the timeout value, locate and modify the following setting in the VM’s vmx file:
fsr.maxSwitchoverSeconds
The default value is 100 seconds.
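If the setting has to be changed on several virtual machines, the edit can be scripted. The following is a minimal sketch that works on the text of a vmx file, assuming the usual key = "value" line format. The helper name is hypothetical and the value 300 is purely illustrative, not a recommendation.

```python
def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with `key` set to `value`.

    An existing line for the key is replaced; otherwise a new line is
    appended. Assumes one 'key = "value"' entry per line.
    """
    new_line = '%s = "%s"' % (key, value)
    lines = vmx_text.splitlines()
    for index, line in enumerate(lines):
        name = line.split("=", 1)[0].strip()
        if name == key:
            lines[index] = new_line
            break
    else:
        # The key was not present, so add it at the end of the file.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    original = 'displayName = "vm1"\n'
    # 300 seconds is an example value only.
    print(set_vmx_option(original, "fsr.maxSwitchoverSeconds", 300))
```

Keep a copy of the original vmx file before writing the modified text back.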
4. LUN access is inconsistent across targets
The vmkernel log shows (truncated for readability):
NMP: nmp_SatpMatchClaimOptions: Compared claim opt ‘tpgs_on’; result 0.
ScsiPath: SCSIClaimPath:3478: Plugin ‘NMP’ claimed path ‘vmhba2:C0:T0:L7’
NMP: nmp_SatpMatchClaimOptions: Compared claim opt ‘tpgs_on’; result 1.
ScsiPath: SCSIClaimPath:3478: Plugin ‘NMP’ claimed path ‘vmhba2:C0:T1:L7’
Notice that for the path via target 0 (vmhba2:C0:T0:L7) the ‘tpgs_on’ claim option result is 0, while for the path via target 1 (vmhba2:C0:T1:L7) it is 1.
This means the PSA SATP is inconsistent across different targets on the same storage.
This array supports ALUA (Asymmetric Logical Unit Access) and also supports “active/passive” (AP) access. Which one is used depends on the initiator record’s configuration on the array. If the record shows that “TPGS” (Target Port Group Support) is enabled, that initiator uses ALUA. Otherwise, “active/passive” (AP) is used.
To resolve this inconsistency, reconfigure the initiator records for that Host on all Storage Controllers to be identical. After this is done, rescan and verify the logs show the matching claim options on all targets for a given LUN.
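Spotting the mismatch by eye is tedious in a long log, so the comparison can be automated. This is a rough sketch under two assumptions that go beyond the text: each "Compared claim opt" line is immediately followed by the "claimed path" line it belongs to (as in the sample above), and paths are grouped by adapter and LUN number, which is a simplification. The function names are hypothetical.

```python
import re
from collections import defaultdict

# ".?" tolerates either straight or typographic quotes around the values.
OPT_RE = re.compile(r"Compared claim opt .?tpgs_on.?; result (\d)")
PATH_RE = re.compile(r"claimed path .?(vmhba\d+:C\d+:T\d+:L\d+)")


def tpgs_by_lun(log_lines):
    """Map (adapter, lun) to {path: tpgs_on result} from vmkernel log lines."""
    results = defaultdict(dict)
    pending = None  # result from the most recent "Compared claim opt" line
    for line in log_lines:
        opt = OPT_RE.search(line)
        if opt:
            pending = int(opt.group(1))
            continue
        claimed = PATH_RE.search(line)
        if claimed and pending is not None:
            path = claimed.group(1)
            adapter, _channel, _target, lun = path.split(":")
            results[(adapter, lun)][path] = pending
            pending = None
    return results


def inconsistent_luns(log_lines):
    """Return only the LUNs whose paths disagree on the tpgs_on result."""
    return {
        key: paths
        for key, paths in tpgs_by_lun(log_lines).items()
        if len(set(paths.values())) > 1
    }


if __name__ == "__main__":
    sample = [
        "NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.",
        "ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T0:L7'",
        "NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.",
        "ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T1:L7'",
    ]
    for (adapter, lun), paths in inconsistent_luns(sample).items():
        print(adapter, lun, paths)
```

After the initiator records are corrected and the host is rescanned, running the same check against the new log should return an empty result for the affected LUN.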
You may run this command to verify the changes as well:
# esxcli nmp device list
The output would look like this if tpgs_on is set to 0 on a CLARiiON:
naa.60060160b4111600826120bae2e3dd11
Device Display Name: DGC Fibre Channel Disk (naa.60060160b4111600826120bae2e
Storage Array Type: VMW_SATP_CX
Storage Array Type Device Config: {navireg ipfilter}
Path Selection Policy: VMW_PSP_MRU
Path Selection Policy Device Config: Current Path=vmhba0:C0:T0:L0
Working Paths: vmhba0:C0:T0:L0
And would look like this on the same array if tpgs_on is set to 1:
naa.60060160b4111600826120bae2e3dd11
Device Display Name: DGC Fibre Channel Disk (naa.60060160b4111600826120bae2e
Storage Array Type: VMW_SATP_ALUA_CX
Storage Array Type Device Config: {navireg ipfilter}
Path Selection Policy: VMW_PSP_FIXED
Path Selection Policy Device Config: {preferred=vmhba0:C0:T0:L0;Current Path=vmhba0:C0:T0:L0}
Working Paths: vmhba0:C0:T0:L0
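To compare the two cases across many devices, the command output can be parsed into a dictionary per device. This is a minimal sketch, assuming the output follows the layout of the samples above: a device identifier on its own line, followed by "Name: value" lines. The function names are hypothetical, and the ALUA check simply looks for "ALUA" in the SATP name, as in VMW_SATP_ALUA_CX.

```python
def parse_nmp_device_list(output):
    """Parse 'esxcli nmp device list' style text into {device: {field: value}}."""
    devices = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ": " in line or line.endswith(":"):
            # A "Name: value" field belonging to the current device.
            if current is not None:
                name, _, value = line.partition(":")
                devices[current][name.strip()] = value.strip()
        else:
            # A line without "Name: value" starts a new device entry.
            current = line
            devices[current] = {}
    return devices


def uses_alua(fields):
    """Return True if the device's SATP name indicates ALUA."""
    return "ALUA" in fields.get("Storage Array Type", "")


if __name__ == "__main__":
    sample = """naa.60060160b4111600826120bae2e3dd11
Storage Array Type: VMW_SATP_ALUA_CX
Path Selection Policy: VMW_PSP_FIXED
Working Paths: vmhba0:C0:T0:L0
"""
    for device, fields in parse_nmp_device_list(sample).items():
        print(device, fields["Storage Array Type"], uses_alua(fields))
```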
All commands referenced in this document can be run via the vMA (vSphere Management Assistant) virtual appliance for both ESX and VMware ESXi 4.0 (except for “stat” and “vdf”, which are only available on the ESX 4.0 service console and in ESXi “techsupport mode”).
They can also be run via the Service Console, on ESX 4.0 only.
The logs can be collected via the vm-support script (COS), Export Diagnostic Information (vSphere Client), or by forwarding logs to a syslog server or vMA appliance.
