I recently spent some time building kickstart files, something I hadn’t done much since 2008. The most noticeable change in the process is that I’m older and fatter. I relearnt a few things and wrote my first ever Python script.
First up, William Lam has plenty of articles on the topic, and the official vSphere installation doco is pretty good too. While testing, it’s much easier to link to a kickstart file or scripts on a webserver than to embed them directly on the boot CD.
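As a minimal sketch of the webserver approach, the snippet below serves a folder of kickstart files over HTTP using only the Python standard library. The function names, the folder and the port are assumptions made for illustration; this is not the script mentioned above.

```python
import functools
import http.server
import threading


def make_kickstart_server(directory, port=8000):
    """Build an HTTP server that serves the files found in `directory`.

    Both `directory` and `port` are illustrative; port 0 lets the OS pick one.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("0.0.0.0", port), handler)


def serve_in_background(server):
    """Run the server in a daemon thread so the calling script keeps going."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread
```

To use it, call `make_kickstart_server("/path/to/kickstart")` followed by `serve_forever()` on the result, then point the installer at the file with a boot option along the lines of `ks=http://<server>:8000/ks.cfg`. Editing the file on the webserver and rebooting the test host is then enough to try a change, with no need to rebuild the boot CD.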
Coming in to manage a virtual environment that’s already up and running, you assume it’s set up correctly for the most part. As time goes on, you may pick up a few things here and there to improve it. But what got me recently was the business’s interpretation of VMware’s HA.
Know your environment, understand the options
During a switch failure that caused network isolation of a host, the business wanted to know why their VMs weren’t restarted on the remaining hosts.
For those using Dell hardware, when you log a job with Dell Support, they’ll ask you to run a DSET report. This collects various information about the server, including the service tag, all hardware devices, firmware versions and so on.
There are three ways to get DSET info.
1) Install DSET locally
2) Run DSET LiveCD
3) Run DSET remotely and create a report on a local server.
Each option has its pros and cons.
Dell OpenManage Server Administrator (OMSA) provides detailed information about the hardware. It’s handy for finding details of the physical drives and memory sticks, and for checking whether there are any failed components.
If you log a support call with Dell, chances are they will ask for more details, and possibly a DSET report. Having OMSA already installed makes life easier.
Dell also leverages the features of OMSA in other management packages such as OpenManage Essentials and the vSphere plugin.
It’s amazing how much is going on when you dig through logs. On this occasion I was looking at “tasks & events” of a host and noticed a lot of network errors.
Alarm ‘Network uplink redundancy lost’ on triggered an action
The error was occurring every 5 minutes. Log Insight, my new favourite tool, made the pattern easy to see.
I couldn’t find anything wrong with this particular ESXi host, vSwitch or uplink.
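A regular cadence like this can also be confirmed without a charting tool. Here is a rough sketch, assuming the alarm times have already been exported and parsed into `datetime` values; the function names and the sample timestamps are made up for illustration and are not taken from the real host.

```python
from datetime import datetime, timedelta


def recurrence_intervals(timestamps):
    """Return the gaps between consecutive events, oldest first."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def is_periodic(timestamps, period, tolerance=timedelta(seconds=30)):
    """True if every gap between events is within `tolerance` of `period`."""
    gaps = recurrence_intervals(timestamps)
    return bool(gaps) and all(abs(gap - period) <= tolerance for gap in gaps)


# Hypothetical alarm times, invented for this example.
sample = [
    datetime(2015, 1, 1, 9, 0, 2),
    datetime(2015, 1, 1, 9, 5, 1),
    datetime(2015, 1, 1, 9, 10, 3),
]

print(is_periodic(sample, timedelta(minutes=5)))
```

A fixed interval usually points at something scheduled, such as a poll or a health check, rather than a randomly flapping link, which helps narrow down where to look next.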