Delays in displaying tickets, invoices, memos, and payments via the Subscriber Transaction and Logs in the Visp Web App
Incident Report for Visp
Postmortem

RESOLVED: Primary Database Performance Degradation and Brief RADIUS Outage [April 19, 2024]

Greetings!

A RADIUS outage occurred on April 19, 2024, and lasted approximately three hours. The cause was degraded performance in the primary database: its automatic scaling mechanism could not allocate resources effectively because database storage usage had neared the maximum threshold.

Here’s a timeline of the incident:

  1. 10:43 AM PDT: New payments or invoices were not immediately visible in the Transaction table but were still recorded in the Logs for each subscriber.
  2. 12:08 PM PDT: A second incident occurred, preventing users from saving new invoices or storing payment details.
  3. 12:30 PM PDT: Intermittent RADIUS authentication issues started occurring.

Root Cause Analysis (RCA)

Root Cause 1: Issue with Database Storage and Automatic Scaling Mechanism: The primary database relies on an automated system to scale its resources based on demand. However, the database storage size approached the maximum storage threshold, which prevented the system from triggering the necessary scaling steps during the incident. 
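
As a rough illustration of this failure mode (not Visp's actual configuration; the function name, the step size, and all sizes below are hypothetical), a storage autoscaler that grows in fixed steps can only act while the next step still fits under the configured maximum:

```python
from typing import Optional


def next_allocation_gib(allocated_gib: int, max_gib: int, step_gib: int) -> Optional[int]:
    """Return the allocation after one scaling step, or None when the step
    would exceed the configured maximum (hypothetical model, not Visp's system)."""
    proposed = allocated_gib + step_gib
    if proposed > max_gib:
        # No room left under the maximum threshold: scaling cannot trigger.
        return None
    return proposed


# Plenty of headroom: the scaling step is allowed.
print(next_allocation_gib(allocated_gib=800, max_gib=1000, step_gib=100))
# Close to the maximum: the step is refused and storage stays where it is.
print(next_allocation_gib(allocated_gib=950, max_gib=1000, step_gib=100))
```

In this simplified model, raising the maximum is what makes scaling possible again, which matches the mitigation described in this report.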

Contributing Factor 1: Unexpected Database Surge: A higher-than-anticipated surge in database activity occurred prior to the outage. The automatic scaling mechanism should have absorbed this surge, but the storage limitation described in Root Cause 1 prevented it from doing so, resulting in database performance issues.

Contributing Factor 2: Synchronization delays between the Primary and Replica databases compounded the performance degradation already caused by the scaling issue and contributed to additional delays in certain billing queries and processes. 

Mitigation Steps:

To mitigate the identified issues, the team took immediate steps to increase the maximum storage threshold and the allocated storage of our database. However, the optimization process required several hours to complete, prolonging the resolution of the RADIUS authentication issue.

Action Items:

  1. The team will conduct a thorough investigation and implement solutions to ensure that the automatic scaling mechanism functions as expected. The team will also review and, if needed, revise the thresholds and triggers for automatic scaling so that they are adequate for anticipated traffic and database load.
  2. The team will evaluate whether the database storage capacity can handle potential future surges in demand.
  3. There’s already an ongoing project to migrate the RADIUS database to a separate instance and isolate it from other services.
  4. The team will conduct a post-mortem review with relevant stakeholders to discuss lessons learned and identify opportunities for improvement in:

    1. Our monitoring and alerting procedures for database performance.
    2. Mass notifications of stakeholders or App Users in the event of an outage. 
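
To make the monitoring item above more concrete, here is a minimal sketch of a storage headroom check. It assumes alerts are based on the ratio of used storage to the maximum storage threshold; the function name and the 80% and 90% ratios are illustrative assumptions, not Visp's actual alerting rules.

```python
def storage_alert_level(used_gib: float, max_gib: float,
                        warn_ratio: float = 0.80,
                        critical_ratio: float = 0.90) -> str:
    """Classify storage usage against the maximum threshold.

    The ratios are hypothetical defaults chosen for illustration only.
    """
    if max_gib <= 0:
        raise ValueError("max_gib must be positive")
    ratio = used_gib / max_gib
    if ratio >= critical_ratio:
        return "critical"
    if ratio >= warn_ratio:
        return "warning"
    return "ok"


# Example: report the alert level for a few hypothetical usage figures.
for used in (700, 850, 950):
    print(used, storage_alert_level(used, max_gib=1000))
```

An alert at the warning level would give the team time to raise the maximum before automatic scaling runs out of room.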

If you have any questions or concerns, feel free to reach out to your Visp Client Success team via success@visp.net, or call 541-955-6900.

Posted Apr 23, 2024 - 23:57 UTC

Resolved
The team has implemented a fix. Updates in the Subscriber Tickets, Logs, and Transaction panels should display quickly.

We apologize for the inconvenience and appreciate your patience and understanding while the team was working to resolve the issue.
Posted Apr 20, 2024 - 00:08 UTC
Update
Along with the reported delays, RADIUS issues were also identified, and the team is still working on them. The servers are not down, but responses may be slow. We expect this to be resolved within an hour. Please bear with us.
Posted Apr 19, 2024 - 20:19 UTC
Update
We are continuing to work on a fix for this issue.
Posted Apr 19, 2024 - 20:17 UTC
Update
As the team works on this issue, you may continue to experience delays in displaying invoices, memos, or payments in the Transaction panel of the subscriber account. However, the transactions will be visible in the Subscriber Logs.

For those who are creating invoices and adding new item charges, the functionality will continue to work as expected.

Feel free to contact your Visp Client Success team if you have questions or concerns. Updates will be posted once available.
Posted Apr 19, 2024 - 19:31 UTC
Identified
The team has identified the issue and a fix is being implemented.
Posted Apr 19, 2024 - 19:09 UTC
Investigating
We received reports of delays in displaying newly added transactions or items like invoices, custom invoices, credit memos, and payments in the subscriber transaction panel in the Visp Web App. The team is investigating the issue and will provide an update once available.
Posted Apr 19, 2024 - 17:58 UTC
This incident affected: VISP.