Resolved -
This incident has now been resolved. We can confirm that the log ingestion backlog was fully cleared by approximately 12:40 a.m. on Friday, 3 April.
Apr 3, 01:00 BST
Monitoring -
We have identified the root cause of the issue: a tenant’s failed deployments generated an unusually high volume of logs, exceeding what OpenSearch could process. The issue has now been resolved, and ingestion rates have returned to normal. The remaining backlog will continue to clear gradually over the weekend. We will monitor progress over the bank holiday and into the following week. Thank you for your patience.
Apr 2, 17:59 BST
Update -
We have identified that the issue is caused by a spike in log ingestion, which has degraded overall performance and created a backlog in Logstash processing. We are currently investigating remediation options.
Apr 2, 11:28 BST
Investigating -
We have received reports of OpenSearch logs not being processed in real time. The team is currently investigating. Thank you for your patience.
Apr 2, 10:32 BST