Tuesday, 12 November 2013

Constant Alarm 'Network Uplink Redundancy Lost'


It's amazing how much is going on when you dig through logs. On this occasion I was looking at the "Tasks & Events" of a host and noticed a lot of network errors.

Alarm 'Network uplink redundancy lost' on <servername> triggered an action

The error was occurring every 5 minutes, which became obvious once I visualised it with Log Insight, my new favourite tool.



2013-11-01 16:39:50.817
Alarm 'Network uplink redundancy lost': an SNMP trap for entity <servername> was sent
Fields: appname, source, hostname, vc_event_type, vc_alarm_type

2013-11-01 16:39:50.810
Alarm 'Network uplink redundancy lost' on <servername> triggered an action
Fields: appname, source, hostname, vc_event_type, vc_alarm_type
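
If you don't have Log Insight to hand, the "every 5 minutes" pattern can be confirmed from exported events with a few lines of Python. This is only a rough sketch: it assumes each event has been exported as one line starting with the date and time, and the function name, the host name `esx01` and the sample timestamps are all made up for illustration, not taken from my logs.

```python
from datetime import datetime

ALARM_TEXT = "Network uplink redundancy lost"


def alarm_intervals(lines, alarm_text=ALARM_TEXT):
    """Return the gaps, in seconds, between consecutive 'triggered an action' events."""
    stamps = []
    for line in lines:
        # Count only the trigger events, not the SNMP trap lines for the same alarm.
        if alarm_text not in line or "triggered an action" not in line:
            continue
        date_part, time_part = line.split()[:2]
        stamps.append(
            datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S.%f")
        )
    stamps.sort()
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]


# Hypothetical sample data: one event per line, timestamp first.
sample = [
    "2013-11-01 16:29:50.804 Alarm 'Network uplink redundancy lost' on esx01 triggered an action",
    "2013-11-01 16:34:50.807 Alarm 'Network uplink redundancy lost' on esx01 triggered an action",
    "2013-11-01 16:34:50.815 Alarm 'Network uplink redundancy lost': an SNMP trap for entity esx01 was sent",
    "2013-11-01 16:39:50.810 Alarm 'Network uplink redundancy lost' on esx01 triggered an action",
]

print([round(gap) for gap in alarm_intervals(sample)])
```

Gaps that all come out at roughly 300 seconds point to the alarm re-firing on a fixed schedule rather than a link that is genuinely flapping.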


I couldn't find anything wrong with this particular ESXi host, vSwitch or uplink. It had the same configuration as all the other hosts in the cluster.

The fix was to go to the top-level object where the alarm is defined, open Edit Settings, disable the alarm, and then go back and re-enable it.



After that, the errors stopped appearing.

2013-11-01 16:42:07.827
Reconfigured alarm 'Network uplink redundancy lost' on Datacenters
Fields: appname, vc_username, source, hostname, vc_event_type, vc_details

2013-11-01 16:41:48.137
Reconfigured alarm 'Network uplink redundancy lost' on Datacenters
Fields: appname, vc_username, source, hostname, vc_event_type, vc_details


4 comments:

  1. Hello Daunce,

    I have the same issue at the moment. The errors also stopped for me after following your steps;
    however, on checking the logs on the host, the issue is still there.

    So you should remove this blog post, which provides a wrong solution that only stops the events from appearing.

    BR

    Replies
    1. Hi BR.

      Of course you need to re-enable the alarm.

      In my case it was a false positive.

      Your situation may be different. If it continues, contact VMware support.

  2. Thanks! Turning this alarm off then back on fixed the same issue here. Wonder what might trigger that.

  3. Ran the most recent updates for 5.5 and this alarm triggered afterwards. Disabling then re-enabling still fixed it! Thanks for posting the tip.
