Error on login
Incident Report for signNow
Postmortem

On Wednesday, August 23rd, 2023, signNow experienced an incident that prevented users from logging in for 3 hours and 5 minutes, starting at approximately 12:30 PM ET. As of 3:30 PM ET, service has been fully restored. If you are still having difficulty logging in or are seeing any abnormal performance degradation, our support teams are here to assist you.

Below is a recap of the incident, along with steps being taken to ensure this does not happen again and how we plan to improve communications with affected customers should future incidents occur.

Incident timeline

Aug 23, 2023, approximately 12:30 PM ET - signNow support received customer complaints about login issues.

12:34 PM - We identified an issue with one of our databases and its encryption key. The encryption key had been deactivated, and without it the cluster was stuck in an inaccessible state.

12:37 PM - The encryption key was restored and re-enabled.

1:06 PM - The database cluster had still not recovered, so we pursued several recovery steps in parallel, including rebuilding the cluster on alternate AWS resources per the disaster recovery plan.

3:30 PM - We restored the functionality of the website.

3:55 PM - Incident closed.

What went wrong:

The problem was caused by an encryption key that was accidentally disabled. This caused our Amazon Relational Database Service (RDS) cluster to switch to the backup, which is a typical safety measure to ensure high availability for our databases. However, the data on this backup couldn’t be decrypted because the required encryption key was unavailable. As a result, the database cluster became unavailable, affecting our services and user logins.
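
The failure mode described above can be told apart from other database outages by checking the key state and the cluster status together. The following is a minimal sketch, not signNow's actual tooling: it assumes the key is an AWS KMS key and that the clients behave like boto3's `kms` and `rds` clients (`describe_key` and `describe_db_clusters`). The function name `diagnose`, the returned labels, and the set of status strings are illustrative assumptions.

```python
from typing import Any

# Assumption: RDS status strings that indicate the cluster cannot reach its
# encryption key, as listed in AWS documentation.
KEY_BLOCKED_STATUSES = {
    "inaccessible-encryption-credentials",
    "inaccessible-encryption-credentials-recoverable",
}


def diagnose(kms_client: Any, rds_client: Any, key_id: str, cluster_id: str) -> str:
    """Classify the situation from the key state and the cluster status.

    The clients are passed in (for example boto3.client("kms") and
    boto3.client("rds")) so the logic can be exercised with stand-ins.
    """
    key_state = kms_client.describe_key(KeyId=key_id)["KeyMetadata"]["KeyState"]
    clusters = rds_client.describe_db_clusters(DBClusterIdentifier=cluster_id)
    cluster_status = clusters["DBClusters"][0]["Status"]

    if key_state != "Enabled":
        # The key itself is disabled or pending deletion.
        return "key-unusable"
    if cluster_status in KEY_BLOCKED_STATUSES:
        # The key is enabled again, but the cluster has not picked it up.
        return "cluster-blocked-on-key"
    if cluster_status != "available":
        return "cluster-unhealthy"
    return "healthy"
```

The second branch corresponds to the situation in this incident after the key was re-enabled: the key looked healthy, yet the cluster remained inaccessible.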

Restoration challenge:

Restoring service proved harder than expected. Although the key’s state was restored, our RDS database cluster still could not access it, which delayed our efforts to restore service. The investigation into what caused the deactivation of the encryption key is ongoing, and its findings will be necessary to prevent such incidents from recurring.
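
One way to bound the delay described above is to wait for recovery only up to a fixed deadline and then fall back to the disaster recovery path. The sketch below is illustrative only and does not reflect signNow's actual procedure; `wait_for_recovery`, its parameters, and the health-check callable are hypothetical names.

```python
import time
from typing import Callable


def wait_for_recovery(
    is_healthy: Callable[[], bool],
    timeout_s: float,
    interval_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_healthy until it returns True or timeout_s elapses.

    Returns True if the resource recovered in time, False if the caller
    should stop waiting and escalate (for example, rebuild the cluster).
    """
    deadline = clock() + timeout_s
    while True:
        if is_healthy():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller might pass a health check that queries the cluster status, and start the rebuild on alternate resources as soon as the function returns False. The clock and sleep functions are injectable so the loop can be tested without real waiting.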

Here’s what was done to remediate this incident:

  • A new database cluster was deployed and made ready for production.
  • Service configurations were updated to use the new resource.
  • Testing was carried out to verify the implemented solution.

Here are the steps we’re taking to ensure this doesn’t happen again:

  • Review existing encryption key management practices and policies. Additional measures will be added to prevent unintentional key deactivation in the future.
  • More resilient architecture. We’ll assess our application architecture to make it more stable and more resilient to failure.
  • Robust testing. We’re enhancing our pre-production infrastructure tests to proactively catch and fix problems before they affect you.
  • Timely communication with our customers. Should any similar situation arise, we’re implementing new protocols to proactively notify impacted customers and inform them about our progress with the best available information.
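
As an illustration of the kind of measure that can prevent unintentional key deactivation, the sketch below builds an AWS policy document that denies the `kms:DisableKey` and `kms:ScheduleKeyDeletion` actions to every principal except one designated role. This is an assumption about one possible safeguard, not a description of the measures signNow is adopting; the function name, the statement ID, the role ARN, and the account ID are hypothetical placeholders.

```python
import json


def build_key_guard_policy(break_glass_role_arn: str) -> dict:
    """Return a deny policy guarding KMS keys against deactivation.

    Only the principal matching break_glass_role_arn (a hypothetical
    emergency role) remains able to disable a key or schedule its deletion.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyKeyDeactivationExceptBreakGlass",
                "Effect": "Deny",
                "Action": ["kms:DisableKey", "kms:ScheduleKeyDeletion"],
                "Resource": "*",
                "Condition": {
                    "ArnNotLike": {"aws:PrincipalArn": break_glass_role_arn}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN using an example account ID; replace with a real role.
    policy = build_key_guard_policy(
        "arn:aws:iam::111122223333:role/ExampleBreakGlass"
    )
    print(json.dumps(policy, indent=2))
```

A policy of this shape would typically be attached at the organization or account level so that routine operator roles cannot disable a key by mistake, while a deliberately assumed emergency role still can.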

We want to reiterate that we understand the importance of our services to your business. The relationships we’ve built with you are paramount to us, and we are using this incident to improve the service we provide to valued customers like you.

Posted Aug 29, 2023 - 13:12 EDT

Resolved
This incident has been resolved.
Posted Aug 23, 2023 - 15:04 EDT
Update
We are continuing to monitor the issue and apply additional fixes.
Posted Aug 23, 2023 - 14:24 EDT
Monitoring
We are monitoring the results of the fix.
Posted Aug 23, 2023 - 13:46 EDT
Identified
We are polishing the fix and will apply it in the shortest time possible.
Posted Aug 23, 2023 - 13:07 EDT
Monitoring
A fix has been applied and we are monitoring the results.
Posted Aug 23, 2023 - 13:00 EDT
Update
We are continuing to work on a fix for this issue.
Posted Aug 23, 2023 - 11:40 EDT
Identified
The issue has been identified and we are implementing the fix.
Posted Aug 23, 2023 - 11:40 EDT
Update
We are continuing to investigate this issue.
Posted Aug 23, 2023 - 11:34 EDT
Investigating
We are currently investigating the issue.
Posted Aug 23, 2023 - 11:33 EDT
This incident affected: Web, Mobile apps, and API.