Optus finally makes call on cause of outage chaos

Optus is blaming a software upgrade for an outage that left millions without phone connections and internet services.

A 12-hour outage on Wednesday left 10 million individuals and businesses unable to make or receive calls or complete transactions.

The company said in a statement on Monday that the cause was now known and steps had been taken to ensure it would not happen again.

“At around 4.05am Wednesday morning, the Optus network received changes to routing information from an international peering network following a routine software upgrade,” the company said.

“These routing information changes propagated through multiple layers in our network and exceeded preset safety levels on key routers which could not handle these.

“This resulted in those routers disconnecting from the Optus IP Core network to protect themselves.”
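Optus's statement does not name the mechanism behind the "preset safety levels". For illustration only, the sketch below models one common kind of safeguard, a limit on how many routes a router will accept from a neighbour before shutting the session down. The class name, the limit of three routes and the sample addresses are all hypothetical and are not drawn from Optus's network.

```python
from dataclasses import dataclass, field


@dataclass
class PeerSession:
    """Toy model of a routing session with a maximum-route safeguard."""

    max_prefixes: int  # hypothetical preset safety level
    prefixes: set = field(default_factory=set)
    connected: bool = True

    def receive_updates(self, announced):
        """Accept route announcements; disconnect if the limit is exceeded."""
        if not self.connected:
            return  # a session that has shut down ignores further updates
        self.prefixes.update(announced)
        if len(self.prefixes) > self.max_prefixes:
            # Protective shutdown: drop learned routes and stay down
            # until someone intervenes.
            self.prefixes.clear()
            self.connected = False

    def manual_reset(self):
        """Stand-in for an operator restarting the session."""
        self.connected = True


# A burst of route changes pushes the session past its limit.
session = PeerSession(max_prefixes=3)
session.receive_updates(["10.0.0.0/8", "172.16.0.0/12"])
session.receive_updates(["192.168.0.0/16", "198.51.100.0/24"])
print(session.connected)
```

In this toy model the session stays down until `manual_reset()` is called, which loosely mirrors the statement's point that recovery was not automatic.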

Restoring the system took longer than anticipated because some of the routers needed to be physically rebooted, requiring Optus staff to be sent to a number of sites across the country.

“The restoration of the network was at all times our priority and we subsequently established the cause working together with our partners,” the telco said.

“We have made changes to the network to address this issue so that it cannot occur again.”

Optus chief executive Kelly Bayer Rosmarin last week dismissed suggestions the outage was caused by a software update.

“It’s highly unlikely, our systems are actually very stable,” she told ABC Radio Sydney last Wednesday morning.

Optus will co-operate with reviews launched by the government and the Senate.

RMIT University associate professor Mark Gregory said the company’s statement confirmed the outage was caused by human error, rather than a hardware failure or cyber attack.

“Optus has not explained what went wrong with the test process that should have occurred before the routing software upgrade occurred,” he said.

“Also, there is no explanation as to why there appears to have been a lack of redundancy of the key routers, so that if there was a problem the key routers would swap to the redundant routers, which you would expect to be running the previous iteration of software.”

Communications Minister Michelle Rowland found out about the problem from media reports last Wednesday.

She has asked her department for a post-incident review.


