SUMO community discussions

[RESOLVED] [OUTAGE] Temporary issues with accessing SUMO

  1. You may be experiencing difficulties in using the site.

    We received this information from the MOC team:

    One of the three database instances serving support.mozilla.org was failing, causing long response times and 38 minutes of outage over the course of the incident so far. Site stability has greatly improved, but database load is still not at normal levels. Troubleshooting involving MOC, WebOps, and Pythian is ongoing.

    Bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1313012 (restricted access during troubleshooting).

  2. Another update:

    Root Cause: A long-running query causing heavy load on the database is believed to be the source of the issues on support.mozilla.org. Additional research is being conducted on how the query interacts with the database to validate this analysis. The teams involved will complete an RCA to provide next steps to prevent further outages. The site's response time and availability have been stable since 04:00a, and the MOC team will continue to monitor the site closely for any issues.

    End date/time: 2016-10-26 04:00a

    Duration: 193 minutes

    Bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1313012
