CenturyLink's latest outage one of the largest in history
The CenturyLink outage that occurred this past Sunday, August 30, was one of the largest Internet outages in history, according to Cloudflare, which saw its own services go down and recorded a 3.5% drop in total global Internet traffic.
The network outage also took down the services of big tech names including Amazon, Twitter, Hulu, Microsoft (Xbox Live), EA, Blizzard, Steam, Discord, Reddit, Starbucks, Chase, GoDaddy, Peloton, Venmo and many others. It took several hours to fix. In a blog post, network monitoring firm ThousandEyes noted that the outage started at approximately 6 a.m. ET and was not "fully resolved" until 11:30 a.m. ET.
CenturyLink, a Monroe, Louisiana-based ISP, first tweeted at 9:30 a.m. ET on Sunday to confirm that it was aware of the issue.
Ultimately, CenturyLink stated that the outage was caused by an incorrect Flowspec announcement, originating from the company's data center in Mississauga, Canada, that prevented Border Gateway Protocol (BGP) sessions from establishing correctly. Flowspec, an extension of BGP, is commonly used to push out network firewall rules.
As Cloudflare explained in its post-outage blog: "Because this outage appeared to take all of the CenturyLink/Level(3) network offline, individuals who are CenturyLink customers would not have been able to reach Cloudflare or any other Internet provider until the issue was resolved. We saw a 3.5% drop in global traffic during the outage, nearly all of which was due to a nearly complete outage of CenturyLink's ISP service across the United States."
To put the error into context, Cloudflare further said that while the Internet normally sees about 1.5 MB to 2 MB of BGP updates every 15 minutes, the number of BGP updates spiked to more than 26 MB in that same time frame. The cause of that BGP instability, according to CenturyLink, was the misconfigured Flowspec.
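For a rough sense of scale, the figures Cloudflare cited can be turned into a multiplier. The short Python sketch below does only that arithmetic; the variable names are illustrative, and because Cloudflare reported "more than" 26 MB, the results are lower bounds rather than measured values.

```python
# Figures as quoted from Cloudflare: BGP update volume per 15-minute window.
baseline_low_mb = 1.5   # low end of the normal range
baseline_high_mb = 2.0  # high end of the normal range
spike_mb = 26.0         # reported as "more than" this, so a lower bound

# Compare the spike with each end of the normal range.
ratio_low = spike_mb / baseline_high_mb
ratio_high = spike_mb / baseline_low_mb

print(f"At least {ratio_low:.1f}x to {ratio_high:.1f}x the normal volume")
```

In other words, BGP update traffic ran at roughly 13 to 17 times its usual level, at minimum.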
Occurring early on a Sunday morning, the multi-hour outage was less disruptive than it could have been, despite taking down some of the largest Internet services in existence.
But as ThousandEyes points out, the incident was still "extremely unusual." The firm further noted in its blog how enterprises can protect themselves from such a disruption going forward:
"During the course of the incident, some traffic routed through service providers other than Level 3 [CenturyLink] was reaching services, but getting dropped by Level 3 on the reverse path. Keeping in mind asymmetric routing, if enterprises had not only revoked advertisements to Level 3 (which were ignored by the provider), but also stopped accepting route announcements from Level 3 and shut down peering, they could have reduced the impact on their traffic."
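The point about asymmetric routing can be illustrated with a toy model. The Python sketch below is not ThousandEyes' method or a real router configuration: it assumes a single hypothetical enterprise that learns the same prefix from two upstreams, with made-up preference values and a made-up second provider name ("OtherTransit"), and it reduces BGP best-path selection to one comparison. It shows only why rejecting routes learned from the failing provider changes the path that outbound (reverse-path) traffic takes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str       # destination network
    provider: str     # upstream that announced the route (hypothetical names)
    local_pref: int   # higher value wins in this simplified selection


def best_route(routes, prefix, rejected_providers=frozenset()):
    """Pick the preferred route for a prefix, skipping rejected providers."""
    candidates = [
        r for r in routes
        if r.prefix == prefix and r.provider not in rejected_providers
    ]
    return max(candidates, key=lambda r: r.local_pref, default=None)


# 203.0.113.0/24 is a documentation-only prefix; preference values are invented.
routes = [
    Route("203.0.113.0/24", "Level3", 200),
    Route("203.0.113.0/24", "OtherTransit", 100),
]

# While routes from Level 3 are still accepted, replies leave via Level 3.
before = best_route(routes, "203.0.113.0/24")

# After the enterprise stops accepting Level 3's announcements,
# replies fall back to the other upstream.
after = best_route(routes, "203.0.113.0/24", rejected_providers={"Level3"})

print(before.provider)
print(after.provider)
```

In this model, withdrawing the enterprise's own advertisements would only influence how traffic reaches it; the outbound choice changes only once routes learned from the failing provider are filtered out, which is the step ThousandEyes highlights.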
At 11:17 a.m. ET, CenturyLink's help team tweeted that all problems were resolved (although that was followed by several replies from customers saying that they were still down).
Last outage triggered FCC investigation
At the time, CenturyLink told Light Reading that the issue was "a faulty network management card from a third-party equipment vendor."
The network outage got the attention of FCC Chairman Ajit Pai and triggered an investigation.
In its report on the incident, issued in August 2019, the FCC recommended a series of best practices that it said "if implemented, could have prevented the outage." Those included turning off or disabling idle system features; having network monitoring memory and processor utilization alarms that are "regularly audited to ensure functionality and evaluated to improve early detection and calibration"; and "having standard operating procedures for network repair that address cases where normal networking monitoring procedures are inoperable or otherwise unavailable."
— Nicole Ferraro, contributing editor, Light Reading