Atlassian says that this month’s two-week-long cloud outage has impacted almost double the number of customers it initially estimated after learning of the incident.
As the company’s Chief Technology Officer Sri Viswanath revealed on April 14th, nine days after the incident started, a maintenance script accidentally wiped hundreds of customer sites due to communication issues between two Atlassian teams working on deactivating a legacy app.
Instead of being provided the ID required to disable the app, the deactivation team was sent the IDs for the cloud sites where the app was installed.
Also, the script was launched using the wrong execution mode (i.e., permanent deletion of data instead of deleting it with a recovery failsafe).
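The two failure points described above (the wrong kind of ID and the wrong execution mode) can be illustrated with a minimal Python sketch. This is not Atlassian's script: every name in it (`IdKind`, `Mode`, `TargetId`, `plan_deactivation`, `confirm_permanent`) is hypothetical and only shows, under those assumptions, how a maintenance script could reject site IDs when app IDs are expected and default to a recoverable deletion.

```python
from dataclasses import dataclass
from enum import Enum


class IdKind(Enum):
    """Hypothetical tag recording what an identifier refers to."""
    APP = "app"
    SITE = "site"


class Mode(Enum):
    """Hypothetical execution modes for the script."""
    MARK_FOR_DELETION = "mark_for_deletion"  # recoverable (failsafe)
    PERMANENT_DELETE = "permanent_delete"    # irreversible


@dataclass(frozen=True)
class TargetId:
    kind: IdKind
    value: str


def plan_deactivation(targets, expected_kind=IdKind.APP,
                      mode=Mode.MARK_FOR_DELETION, confirm_permanent=False):
    """Validate the request and return the planned actions without running them."""
    # Guard 1: refuse IDs of the wrong kind (e.g. site IDs instead of app IDs).
    wrong = [t.value for t in targets if t.kind is not expected_kind]
    if wrong:
        raise ValueError(
            f"expected {expected_kind.value} IDs, got other kinds: {wrong}"
        )
    # Guard 2: permanent deletion needs an explicit, separate confirmation.
    if mode is Mode.PERMANENT_DELETE and not confirm_permanent:
        raise PermissionError("permanent deletion requires confirm_permanent=True")
    return [(mode.value, t.value) for t in targets]


if __name__ == "__main__":
    # An app ID passes and is only marked for deletion by default.
    print(plan_deactivation([TargetId(IdKind.APP, "legacy-app")]))
    # A site ID is rejected before anything is deleted.
    try:
        plan_deactivation([TargetId(IdKind.SITE, "site-1")])
    except ValueError as err:
        print("rejected:", err)
```

In this sketch the function only produces a plan, so a reviewer could inspect the list of targets and the mode before any deletion is carried out.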
The 14-day-long outage impacted a very small set of Atlassian customers between April 5th and April 18th. The first set of impacted sites was restored by April 8th and the rest of the affected customer sites by April 18th.
During the incident, the following Atlassian products were unavailable for impacted customers: the entire Jira family of products, Confluence, Atlassian Access, Opsgenie, and Statuspage.
We have now restored our customers impacted by the outage and have reached out to key contacts for each affected site. https://t.co/ZvAFZ2pq8A
— Atlassian (@Atlassian) April 17, 2022
The outage impacted a total of 775 customers
While Atlassian told us when we first reported on this outage that the sites of roughly 400 out of its over 200,000 cloud customers were wiped, Viswanath revealed on Friday that the actual number was almost double.
After analyzing the data gathered during the incident’s investigation, Atlassian’s estimate also changed to include impacted inactive, free, or small accounts with low numbers of active users.
“The result was an immediate deletion of 883 sites (representing 775 customers) between 07:38 UTC and 08:01 UTC on Tuesday, April 5th, 2022,” Viswanath said.
“Although this was a major incident, no customer lost more than five minutes of data. In addition, over 99.6% of our customers and users continued to use our cloud products without any disruption during the restoration activities.”
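The figures quoted in this article can be cross-checked with a short calculation. The sketch below uses only numbers reported here and assumes exactly 200,000 cloud customers as a lower bound for "over 200,000"; Atlassian's own 99.6% figure covers "customers and users" and may be computed on a different basis.

```python
from datetime import date

initial_estimate = 400      # customers, as first reported
customers_affected = 775    # customers, revised figure
sites_deleted = 883         # sites belonging to those customers
total_customers = 200_000   # assumption: lower bound of "over 200,000"

# Share of cloud customers affected and unaffected.
affected_share = customers_affected / total_customers
unaffected_share = 1 - affected_share

# Revised figure relative to the initial estimate ("almost double").
growth = customers_affected / initial_estimate

# Outage length, counting both April 5th and April 18th.
outage_days = (date(2022, 4, 18) - date(2022, 4, 5)).days + 1

print(f"affected:   {affected_share:.4%}")    # 0.3875%
print(f"unaffected: {unaffected_share:.4%}")  # 99.6125%
print(f"revised vs. initial estimate: {growth:.4f}x")  # 1.9375x
print(f"outage length: {outage_days} days")   # 14 days
```

Under that assumption, roughly 0.39% of customers were affected, which is consistent with the "over 99.6%" quoted above.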
Although a small number of Atlassian customers whose Confluence or Insight databases were restored lost up to five minutes’ worth of data, Atlassian says it was able to recover that data and is working on getting all of it restored.
“We have since recovered the remainder of the data, contacted the customers affected by this, and are helping them apply changes to further restore their data,” Viswanath added.
Outage timeline (Atlassian)
Not the result of a cyberattack
Atlassian initially estimated that the restoration efforts would not take more than several days and confirmed to BleepingComputer that there was no unauthorized access to the customers’ data since this outage was not caused by a cyberattack or an internal malicious act.
“More broad, public communications surrounding the outage, along with the repetition of the critical message that there was no data loss and this was not the result of a cyberattack, would have been the correct approach,” Viswanath said.
“Rather than wait until we had a full picture, we should have been transparent about what we did know and what we didn’t know.
“Providing general restoration estimates (even if directional) and being clear about when we expected to have a more complete picture would have allowed our customers to better plan around the incident.”
The outage impacted customers using the company’s cloud products and came after Atlassian announced in October 2020 that it would no longer sell licenses for on-premises products starting February 2021.
One of Atlassian’s co-founders and Co-CEOs, Scott Farquhar, also added that support for already active licenses would be discontinued three years later, on February 2nd, 2024.