This is my second post for my CCIE Enterprise studies.
Whenever a switch detects certain error conditions on a port, it automatically disables the port. In this state the port is effectively shut down and no traffic passes through the interface, neither Rx nor Tx.
How do you recognize it? Issue the command "show interface Gi0/1" and you will see the port in the "err-disabled" state. If you are not in front of the CLI, you may also recognize it by the LED on the interface, which should be orange. If you are on the CLI when a port goes down due to an error condition, a message appears describing why the port went down. You can also check later by looking at the logs (show logging).
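As a quick sketch of these checks from the CLI (Gi0/1 is simply the port named above, and the exact output and available keywords can differ between platforms and software releases):

```
! List only the ports that are currently err-disabled, together with the reason
Switch#show interfaces status err-disabled
! Check the status of a single port
Switch#show interfaces Gi0/1 status
! Search the log buffer for err-disable events
Switch#show logging | include ERR_DISABLE
```

The filter string ERR_DISABLE matches the %PM-4-ERR_DISABLE message the switch logs when it puts a port into this state.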
It is important to mention that an error is not necessarily a permanent issue. It can also be temporary, like a flapping link or a user plugging the wrong device into the port. It would be very bad if the port stayed down in such a case. Imagine the uplink to a management switch in a remote branch far away going err-disabled only because "BPDU Guard" was enabled on the port by mistake: you would lose management access to all devices connected to this switch. To avoid this, Cisco switches provide a feature called "errdisable recovery", which re-enables the interface 300 seconds (a configurable value) after it went down, to see whether the issue has been resolved. This function has to be enabled separately for each error cause.
First of all, let us look at the situations in which a port goes into the err-disabled state.
Switch#show errdisable detect
ErrDisable Reason            Detection        Mode
-----------------            ---------        ----
arp-inspection               Enabled          port
bpduguard                    Enabled          port
channel-misconfig (STP)      Enabled          port
community-limit              Enabled          port
dhcp-rate-limit              Enabled          port
dtp-flap                     Enabled          port
ekey                         Enabled          port
gbic-invalid                 Enabled          port
iif-reg-failure              Enabled          port
inline-power                 Enabled          port
invalid-policy               Enabled          port
l2ptguard                    Enabled          port
link-flap                    Enabled          port
link-monitor-failure         Enabled          port
loopback                     Enabled          port
lsgroup                      Enabled          port
oam-remote-failure           Enabled          port
mac-limit                    Enabled          port
pagp-flap                    Enabled          port
port-mode-failure            Enabled          port
pppoe-ia-rate-limit          Enabled          port
psecure-violation            Enabled          port
security-violation           Enabled          port
sfp-config-mismatch          Enabled          port
sgacl_limitation:enforcem    Enabled          port
sgacl_limitation:multiple    Enabled          port
storm-control                Enabled          port
udld                         Enabled          port
unicast-flood                Enabled          port
vmps                         Enabled          port
psp                          Enabled          port
dual-active-recovery         Enabled          port
evc-lite input mapping fa    Enabled          port
vsl-and-non-vsl-port-pair    Enabled          port
Recovery command: "clear     Enabled          port
fasthello-and-non-fasthel    Enabled          port
Switch#
A further command shows the thresholds for flapping links.
Switch#show errdisable flap-values
ErrDisable Reason    Flaps    Time (sec)
-----------------    ------   ----------
pagp-flap            3        30
dtp-flap             3        30
link-flap            5        10
Switch#
With these values, a port whose link flaps 5 times within 10 seconds enters the err-disabled state.
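On many Catalyst platforms these thresholds can be tuned. The following is only a sketch: the values 10 and 30 are arbitrary examples, and whether the flap-setting command is available (and which ranges it accepts) depends on the platform and software release, so check the command help on your own switch first.

```
! Allow up to 10 link flaps within 30 seconds before err-disabling the port
! (illustrative values, not a recommendation)
Switch(config)#errdisable flap-setting cause link-flap max-flaps 10 time 30
Switch(config)#end
! Verify the new thresholds
Switch#show errdisable flap-values
```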
The causes for which a port can be recovered are largely the same as the reasons a port can go down.
Switch#show errdisable recovery
ErrDisable Reason            Timer Status
-----------------            --------------
arp-inspection               Disabled
bpduguard                    Disabled
channel-misconfig (STP)      Disabled
dhcp-rate-limit              Disabled
dtp-flap                     Disabled
gbic-invalid                 Disabled
inline-power                 Disabled
l2ptguard                    Disabled
link-flap                    Disabled
mac-limit                    Disabled
link-monitor-failure         Disabled
loopback                     Disabled
oam-remote-failure           Disabled
pagp-flap                    Disabled
port-mode-failure            Disabled
pppoe-ia-rate-limit          Disabled
psecure-violation            Disabled
security-violation           Disabled
sfp-config-mismatch          Disabled
storm-control                Disabled
udld                         Disabled
unicast-flood                Disabled
vmps                         Disabled
psp                          Disabled
dual-active-recovery         Disabled
evc-lite input mapping fa    Disabled
Recovery command: "clear     Disabled
Timer interval: 300 seconds
Interfaces that will be enabled at the next timeout:
Switch#
The recovery interval can be configured, as shown in the next output. If we do not want to wait until the timer expires, the interface can be returned to its original state by shutting it down and bringing it back up (shutdown followed by no shutdown).
Switch(config)#errdisable recovery interval ?
  <30-86400>  timer-interval(sec)
Switch(config)#errdisable recovery interval
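Both options could look like this. It is a minimal sketch: the interval of 60 seconds is an arbitrary value inside the allowed range, and Ethernet0/0 stands in for whichever port is err-disabled.

```
! Option 1: shorten the recovery timer to 60 seconds (allowed range: 30-86400)
Switch(config)#errdisable recovery interval 60
! Option 2: recover manually by bouncing the interface instead of waiting
Switch(config)#interface Ethernet0/0
Switch(config-if)#shutdown
Switch(config-if)#no shutdown
```

Keep in mind that the timer only acts on causes for which recovery has been enabled, while the manual bounce works regardless of the recovery settings.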
By default, recovery is disabled for all causes. Let us configure BPDU Guard on a link between two switches and see what happens. After enabling BPDU Guard, the port automatically went into the err-disabled state. The output below shows what appears on the console.
*Jun 23 21:18:09.183: %SPANTREE-2-BLOCK_BPDUGUARD: Received BPDU on port Et0/0 with BPDU Guard enabled. Disabling port.
*Jun 23 21:18:09.184: %PM-4-ERR_DISABLE: bpduguard error detected on Et0/0, putting Et0/0 in err-disable state
*Jun 23 21:18:09.503: %SYS-5-CONFIG_I: Configured from console by console
*Jun 23 21:18:10.189: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/0, changed state to down
*Jun 23 21:18:11.184: %LINK-3-UPDOWN: Interface Ethernet0/0, changed state to down
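The post does not show the configuration that led to these messages. A per-interface BPDU Guard configuration like the following would be one way to produce them; this is an assumption, since BPDU Guard can also be enabled globally for PortFast ports.

```
! Enable BPDU Guard directly on the link towards the other switch
Switch(config)#interface Ethernet0/0
Switch(config-if)#spanning-tree bpduguard enable
Switch(config-if)#end
```

Because the neighbor is a switch and sends BPDUs, the port is err-disabled as soon as the first BPDU arrives.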
The port detected an incoming BPDU frame on the interface and was automatically shut down. After enabling the errdisable recovery function for BPDU Guard, the bottom of the following output also shows how long it takes until the interface is re-enabled and why the port went down (interface Et0/0, reason: bpduguard, time left: 73 seconds).
Switch#show errdisable recovery
ErrDisable Reason            Timer Status
-----------------            --------------
arp-inspection               Disabled
bpduguard                    Enabled
channel-misconfig (STP)      Disabled
dhcp-rate-limit              Disabled
dtp-flap                     Disabled
gbic-invalid                 Disabled
inline-power                 Disabled
l2ptguard                    Disabled
link-flap                    Disabled
mac-limit                    Disabled
link-monitor-failure         Disabled
loopback                     Disabled
oam-remote-failure           Disabled
pagp-flap                    Disabled
port-mode-failure            Disabled
pppoe-ia-rate-limit          Disabled
psecure-violation            Disabled
security-violation           Disabled
sfp-config-mismatch          Disabled
storm-control                Disabled
udld                         Disabled
unicast-flood                Disabled
vmps                         Disabled
psp                          Disabled
dual-active-recovery         Disabled
evc-lite input mapping fa    Disabled
Recovery command: "clear     Disabled
Timer interval: 30 seconds
Interfaces that will be enabled at the next timeout:
Interface       Errdisable reason       Time left(sec)
---------       -----------------       --------------
Et0/0           bpduguard               73
Switch#
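The state shown above (bpduguard "Enabled", timer interval 30 seconds) could be reached with commands along these lines. This is a sketch reconstructed from the output, not the exact sequence that was typed.

```
! Enable automatic recovery only for ports err-disabled by BPDU Guard
Switch(config)#errdisable recovery cause bpduguard
! Set the recovery timer to the lowest allowed value
Switch(config)#errdisable recovery interval 30
Switch(config)#end
! Verify the cause status and the pending recoveries
Switch#show errdisable recovery
```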
After the 73 seconds have expired, the port is re-enabled. If the error condition still exists, the port goes into the err-disabled state again. The logs show that the system attempts to recover from the err-disable state, but BPDU frames are still arriving on that port.
*Jun 24 07:47:08.761: %PM-4-ERR_RECOVER: Attempting to recover from bpduguard err-disable state on Et0/0
*Jun 24 07:47:08.793: %SPANTREE-2-BLOCK_BPDUGUARD: Received BPDU on port Et0/0 with BPDU Guard enabled. Disabling port.
Let us add a BPDU filter on the opposite side.
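A minimal sketch of that change on the neighboring switch. The hostname Switch2 and the interface Ethernet0/0 are assumptions, since the post does not name the neighbor or its port.

```
! On the neighbor: stop sending (and processing) BPDUs on the link
Switch2(config)#interface Ethernet0/0
Switch2(config-if)#spanning-tree bpdufilter enable
```

This is fine for a lab demonstration, but a BPDU filter on a switch-to-switch link removes spanning-tree protection on that link, so it should be used with care in production.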
Switch#sh int status
Port      Name               Status       Vlan       Duplex  Speed Type
Et0/0                        err-disabled 1            auto   auto RJ45
Et0/1                        notconnect   1            auto   auto RJ45
Et0/2                        notconnect   1            auto   auto RJ45
Et0/3                        notconnect   1            auto   auto RJ45
Switch#
*Jun 24 07:51:38.603: %PM-4-ERR_RECOVER: Attempting to recover from bpduguard err-disable state on Et0/0
*Jun 24 07:51:40.608: %LINK-3-UPDOWN: Interface Ethernet0/0, changed state to up
*Jun 24 07:51:41.610: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/0, changed state to up
Switch#show int status
Port      Name               Status       Vlan       Duplex  Speed Type
Et0/0                        connected    1          a-full   auto RJ45
Et0/1                        notconnect   1            auto   auto RJ45
Et0/2                        notconnect   1            auto   auto RJ45
Et0/3                        notconnect   1            auto   auto RJ45
Switch#
Now the port remains up after recovering. It is important to understand where the recovery function should be implemented and where it should not. Continuously recovering a flapping link could cause issues in the network. Therefore it is sometimes better to leave the port in the err-disabled state until the problem has been fixed.
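One way to apply this is to enable recovery only for selected causes. The choice of causes below is just an example and not a general recommendation.

```
! Recover automatically from causes that are often temporary
Switch(config)#errdisable recovery cause bpduguard
Switch(config)#errdisable recovery cause psecure-violation
! Keep flapping links down until someone has looked at them
! (this is already the default; shown here to make the intent explicit)
Switch(config)#no errdisable recovery cause link-flap
```

A port that went down because of link-flap then stays err-disabled until it is bounced manually with shutdown and no shutdown.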