Upstream provider routing issues

Degraded Performance

published: 2019-05-16 10:30:00
started: 2019-05-16 10:02:00
resolved: 2019-05-16 15:23:00

One of our upstream providers in Manchester has suffered a router failure. This caused a brief period of packet loss for networks reached via that provider, until our BGP routers had reconverged.

We have taken the affected connection out of service until the provider has fully restored their service.


Timeline

2019-05-16 10:28:00

The provider has acknowledged our ticket.

2019-05-16 10:37:00

The provider’s senior engineers are reviewing the issue, and an update will follow within the next 30 minutes. We extend their apologies for any inconvenience this may have caused.

2019-05-16 10:51:00

The provider has sent a senior engineer to site and they are currently investigating their router.

The provider’s engineers have restored their core router and are seeing services recover, with all of their monitoring alerts clear. Their engineers are remaining onsite and will continue to investigate. Faelix will keep this link disabled until we receive an all-clear.

2019-05-16 12:28:00

We have received the following update from our provider:

A cold restart was required to restore engineer access, but hardware monitoring reported a component failure. Following the removal of a line card, a second cold restart was performed, and the router booted successfully with no further issues reported. At this point the line card was reinserted and a final cold restart performed. A full system health check was then carried out, with no further issues reported.

2019-05-16 15:23:00

All services should now be restored to full operation.