Widespread Application Outage

Apps   7 hours, 25 minutes   59 hours, 58 minutes
Data   3 hours, 46 minutes   67 hours, 15 minutes
Tools  2 hours, 46 minutes   54 hours, 48 minutes

Follow-up Report

Activity

  • Resolved

    We have fully restored git services for all applications and App Operations are functioning as expected. We will provide a postmortem for this incident once we've completed our internal review.

    Posted 14 years ago, Apr 24, 2011 12:21 UTC

  • Update

    At this time all shared databases and all dedicated databases are restored. App Operations are functioning as expected.

    We're still working to restore git push functionality to a very small percentage of applications and expect to have this functionality restored soon.

    Posted 14 years ago, Apr 24, 2011 03:39 UTC

  • Update

    At this time most applications are online and we're making progress recovering the remaining affected shared databases. We do not have an ETA but the pace of recovery has increased.

    We're still working to restore deploy functionality for all applications.

    Posted 14 years ago, Apr 23, 2011 23:30 UTC

  • Update

    At this time most applications are online. Deploys have been restored for a majority of applications too. We're still hard at work recovering the remaining affected shared databases and deploy functionality. We do not at this time have an ETA but are dedicated to fully restoring all functionality as quickly as possible.

    Posted 14 years ago, Apr 23, 2011 19:22 UTC

  • Update

    We are continuing to work with all available resources on recovering shared databases and restoring git push functionality to any affected applications. We do not have an ETA at this time.

    Posted 14 years ago, Apr 23, 2011 08:11 UTC

  • Update

    We have successfully recovered a number of databases. Additional applications running on a few shared database servers are still being recovered. We are continuing to work on recovering these shared databases with all available resources. We do not at this time have an ETA for the full recovery of these shared databases.

    Posted 14 years ago, Apr 22, 2011 22:26 UTC

  • Update

    We are aware of the incredible difficulties this downtime is causing many customers.

    Current Operational Status:
    All dedicated database applications are fully online.
    The majority of shared databases are online.
    The majority of applications can deploy via git.
    New app creation is fully working.

    Next steps:
    We are working with our service provider to restore both deployments and operation to affected shared databases as quickly as possible. In parallel, we are working on alternative recovery options.

    Posted 14 years ago, Apr 22, 2011 13:54 UTC

  • Update

    We have restored all dedicated databases and are continuing to work on the affected shared databases and deploy tools.

    Posted 14 years ago, Apr 22, 2011 09:19 UTC

  • Update

    We are still working through restoring the affected databases and restoring full deploy capabilities.

    Posted 14 years ago, Apr 22, 2011 06:07 UTC

  • Update

    We have successfully brought up our core services and begun restoring service to applications. Many applications are now fully operational. The remaining affected apps' databases are being brought online now. We will continue to work to bring the remainder of applications up as quickly as possible.

    API services are now fully restored, and all gem commands are now working. Deploys are working for some applications. We will continue to work on restoring deploys for the remaining applications.

    Posted 14 years ago, Apr 22, 2011 03:27 UTC

  • Update

    We have successfully brought up our core services and begun restoring service to applications. Many applications are now fully operational. The remaining affected apps' databases are being brought online now. We will continue to work to bring the remainder of applications up as quickly as possible.

    API services are now fully restored, and all gem commands are now working. We are working on restoring deploys.

    Posted 14 years ago, Apr 22, 2011 01:42 UTC

  • Update

    We are continuing to restore service to applications. In some cases the process of bringing many applications online simultaneously has created intermittent availability and elevated error rates. We continue to work to fully restore availability as quickly as possible.

    Posted 14 years ago, Apr 21, 2011 23:15 UTC

  • Update

    We have successfully brought up our core services and begun restoring service to applications. Many applications are now fully operational. We will continue to work to bring the remainder of applications up as quickly as possible and then restore API and git services.

    Posted 14 years ago, Apr 21, 2011 22:07 UTC

  • Update

    We have been able to successfully boot new servers and are in the process of restoring our core services. Once our core services come online we will be able to start to bring app operations back online. We will post further updates as soon as we have additional information.

    Posted 14 years ago, Apr 21, 2011 21:15 UTC

  • Update

    We continue to experience widespread connectivity issues that are preventing us from booting servers. We are working with our service provider to resolve this as soon as possible. We do not currently have an estimate for when this will be resolved. We will post further updates as soon as we have additional information.

    Posted 14 years ago, Apr 21, 2011 20:43 UTC

  • Update

    We are continuing to work with our service provider to resolve outstanding connectivity issues. We will continue to update every half hour or as new information becomes available.

    Posted 14 years ago, Apr 21, 2011 20:01 UTC

  • Update

    We are continuing to work with our service provider to resolve outstanding connectivity issues. We will continue to update every half hour or as new information becomes available.

    Posted 14 years ago, Apr 21, 2011 19:31 UTC

  • Update

    We are continuing to work with our service provider to resolve outstanding connectivity issues. We will continue to update every half hour or as new information becomes available.

    Posted 14 years ago, Apr 21, 2011 19:01 UTC

  • Update

    We are continuing to work with our service provider to resolve outstanding connectivity issues. We will continue to update every half hour or as new information becomes available.

    Posted 14 years ago, Apr 21, 2011 18:30 UTC

  • Update

    We are continuing to work with our service provider to resolve outstanding connectivity issues. We will continue to update every half hour or as new information becomes available.

    Posted 14 years ago, Apr 21, 2011 17:58 UTC

  • Update

    We are continuing to work with our service provider to resolve outstanding connectivity issues. We will continue to update every half hour or as new information becomes available.

    Posted 14 years ago, Apr 21, 2011 17:28 UTC

  • Update

    We are continuing to work with our service provider to resolve outstanding connectivity issues. We will continue to update every half hour or as new information becomes available.

    Posted 14 years ago, Apr 21, 2011 16:58 UTC

  • Update

    We are continuing to work with our service provider to resolve outstanding connectivity issues. We will continue to update every half hour or as new information becomes available.

    Posted 14 years ago, Apr 21, 2011 16:25 UTC

  • Update

    We are continuing to work with our service provider to resolve outstanding connectivity issues. We will continue to update every half hour or as new information becomes available.

    Posted 14 years ago, Apr 21, 2011 15:54 UTC

  • Update

    We are continuing to work with our service provider to resolve outstanding connectivity issues. We will continue to update every half hour or as new information becomes available.

    Posted 14 years ago, Apr 21, 2011 15:23 UTC

  • Update

    There is nothing new to report. We're continuing to work to get full connectivity restored and will continue updating every half hour.

    Posted 14 years ago, Apr 21, 2011 14:51 UTC

  • Update

    There is nothing new to report. We're continuing to work to get full connectivity restored and will continue updating every half hour.

    Posted 14 years ago, Apr 21, 2011 14:21 UTC

  • Update

    There is nothing new to report. We're continuing to work to get full connectivity restored and will continue updating every half hour.

    Posted 14 years ago, Apr 21, 2011 13:50 UTC

  • Update

    There is nothing new to report. We're continuing to work to get full connectivity restored and will continue updating every half hour.

    Posted 14 years ago, Apr 21, 2011 13:20 UTC

  • Update

    We're still seeing elevated error rates due to connectivity issues and are working with our service provider to fully restore service.

    Posted 14 years ago, Apr 21, 2011 12:50 UTC

  • Update

    We're still seeing elevated error rates due to connectivity issues and are working with our service provider to fully restore service.

    Posted 14 years ago, Apr 21, 2011 12:21 UTC

  • Update

    We're still seeing elevated error rates due to connectivity issues and are working with our service provider to fully restore service.

    Posted 14 years ago, Apr 21, 2011 11:50 UTC

  • Update

    We're still seeing elevated error rates due to connectivity issues and are working with our service provider to fully restore service.

    Posted 14 years ago, Apr 21, 2011 11:20 UTC

  • Update

    We're still seeing elevated error rates due to connectivity issues and are working with our service provider to fully restore service.

    Posted 14 years ago, Apr 21, 2011 10:50 UTC

  • Update

    Connectivity issues are causing applications and tools to work intermittently. We're working with our network service provider to fully restore service. There is nothing new to report at this time.

    Posted 14 years ago, Apr 21, 2011 10:20 UTC

  • Update

    Connectivity issues are causing applications and tools to work intermittently. We're working with our network service provider to fully restore service at this time.

    Posted 14 years ago, Apr 21, 2011 09:50 UTC

  • Update

    We do not have anything new to report at this time. We're still working with our network service provider to fully restore connectivity. Applications and tools are working intermittently. We'll continue to update at 30-minute intervals unless we have something new to report sooner.

    Posted 14 years ago, Apr 21, 2011 09:21 UTC

  • Update

    We're continuing to work with our network service provider to fully restore connectivity. Applications and tools are working intermittently at this time.

    Posted 14 years ago, Apr 21, 2011 09:07 UTC

  • Update

    The elevated error rates are due to connectivity issues. We're continuing to work with our network service provider to fully restore connectivity. Applications and tools are working intermittently at this time.

    Posted 14 years ago, Apr 21, 2011 08:52 UTC

  • Update

    Error rates appear to have stabilized. Both applications and tools are functioning as expected at this time. We're continuing to keep a close eye on the situation as we investigate the root cause.

    Posted 14 years ago, Apr 21, 2011 08:34 UTC

  • Issue

    We are investigating high error rates. We'll post an update when we know more.

    Posted 14 years ago, Apr 21, 2011 08:15 UTC
