Core Ring at Risk Backbone

Event Started: 2022-11-07 09:00
Report Published: 2022-11-08 17:50
Last Updated: 2023-10-27 09:20
Event Finished: Ongoing

One of our core link providers is waiting on fibre works to be completed to reinstate our Manchester-London link.

While they are rectifying this, we are relying on connectivity via Leeds.

Timeline (most recent first)
  • 2022-11-17
    11:56:00

    The provider's NOC has again closed the ticket, and we have rejected the closure. Their NOC team lead called us, explaining that this is a provisioning issue rather than an operational issue, and as such should not be in the NOC's ticket queue. We have explained that provisioning and account management are not communicating with us about a fix, so we will not let the operational team close this ticket: it is a fault that is causing us operational issues.

  • 2022-11-17
    10:00:00

    The account manager at the provider says that the necessary works may take several days to be booked in. We have explained why this is unacceptable.

  • 2022-11-17
    09:30:00

    The provider is again trying to close our ticket without the promised updates about progress. We are continuing to push back and say that we need a resolution.

  • 2022-11-16
    11:29:00

    We have finally managed to get someone on the provider's provisioning team to talk to us.

  • 2022-11-16
    10:59:00

    We are getting no traction with the provider's provisioning team, so we have rejected the fault closure by the provider's operations team.

  • 2022-11-15
    23:36:00

    We have received a detailed email from our Manchester-London path provider's operations team. In it they state that, because this is a provisioning fault, they are unable to rectify service for us.

    [This will require the] Provision escalations team and the order manager for [service] for this issue to be resolved tomorrow. Unfortunately, they will not be online until 09:30 on 16.11.22 at the earliest.

  • 2022-11-15
    23:27:00

    Tests showed that our "path of last resort" was in our IGP after we configured it on 10th November. However, the link was not used during the problems we faced on our Manchester-Leeds-London path this evening, because the OSPF configuration seems to have broken at the London end of this link. At the other end, the interface is configured and OSPF is active on it:

    vyos@fossey.a.faelix.net:~$ show configuration commands | grep "vif 32"

    [...snip...]

    set interfaces ethernet eth2 vif 32 mtu '2976'

    vyos@fossey.a.faelix.net:~$ show ip ospf interface | grep eth2.32

    eth2.32 is up

    vyos@fossey.a.faelix.net:~$

    But in London, although the configuration is there, OSPF does not appear to be active on the interface:

    vyos@gunn.x.faelix.net:~$ show configuration commands | grep "vif 32"

    [...snip...]

    set interfaces ethernet eth1 vif 32 mtu '2976'

    vyos@gunn.x.faelix.net:~$ show ip ospf interface | grep "eth1.32"

    vyos@gunn.x.faelix.net:~$

  • 2022-11-15
    22:50:00

    We have escalated the fault.

  • 2022-11-15
    11:27:00

    We have raised a fault with the provider of our Manchester-London path as this circuit is not working.

  • 2022-11-15
    11:08:00

    The provider's NOC have finally moved some of our services across as originally ordered, which has restored resilience to our ring.

  • 2022-11-15
    11:08:00

    The provider of our Manchester-London path has handed over a certificate of completion. We will keep this link costed out until we are satisfied with its functioning.

  • 2022-11-10
    08:49:00

    We enabled a higher-latency backup path via two other providers, costed as a "path of last resort". The latency for this is 17ms rather than the usual ~6ms, but it will ensure the integrity of our ring should our Leeds-London path experience issues.

  • 2022-11-09
    13:59:00

    We believe that the optical issue at Leeds has been resolved with a replacement DWDM optic. We are monitoring the situation.

  • 2022-11-09
    05:00:00

    We are investigating a potential DWDM issue in Leeds. Engineers are going to site later today to assess whether an optic needs to be replaced there.

  • 2022-11-08
    17:50:00

    We currently do not have an ETA for when this will be resolved.
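
Note on the OSPF check from 2022-11-15 23:27: the comparison made by hand in that entry (is the interface listed in `show ip ospf interface` on each router?) could be automated along the following lines. This is a minimal sketch, not the tooling we actually run. The function names `ospf_interfaces` and `ospf_active` are hypothetical, and the parser assumes that each interface stanza starts with an unindented line of the form "<name> is up" or "<name> is down", as seen in the transcript above. How the command output is collected from each router is left out.

```python
from __future__ import annotations

import re

# Assumption: an interface stanza begins with an unindented line such as
# "eth2.32 is up". Indented detail lines underneath are ignored.
_STANZA = re.compile(r"(\S+) is (up|down)\b")


def ospf_interfaces(show_output: str) -> set[str]:
    """Return the interface names listed in `show ip ospf interface` output."""
    names = set()
    for line in show_output.splitlines():
        match = _STANZA.match(line)  # match() anchors at the start of the line
        if match:
            names.add(match.group(1))
    return names


def ospf_active(show_output: str, interface: str) -> bool:
    """True if OSPF lists the given interface at all (up or down)."""
    return interface in ospf_interfaces(show_output)


if __name__ == "__main__":
    # Outputs as captured in the timeline entry: one end lists the
    # interface, the London end returns nothing for eth1.32.
    fossey_output = "eth2.32 is up\n"
    gunn_output = ""

    print(ospf_active(fossey_output, "eth2.32"))  # True
    print(ospf_active(gunn_output, "eth1.32"))    # False
```

A check like this only confirms that OSPF knows about the interface. It does not confirm that an adjacency has formed or that the path would carry traffic.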