Elevated API Errors
Incident Report for Minds
Postmortem

Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.common.util.concurrent.TimedRunnable@d0b23c0 on QueueResizingEsThreadPoolExecutor[name = R2O1pwk/search, queue capacity = 5001, min queue capacity = 5001, max queue capacity = 5001, frame size = 2000, targeted response rate = 1s, task execution EWMA = 26.9ms, adjustment amount = 50, org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutor@638ddaeb[Running, pool size = 13, active threads = 13, queued tasks = 5071, completed tasks = 2224192729]]
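
The rejection message above already contains the state of the search thread pool at the moment of failure. As a rough illustration (not part of the original investigation), the Python sketch below pulls the relevant counters out of that message and compares them. The helper name `parse_pool_state` and the choice of fields are assumptions made for this example. The message text is copied from the log line above, starting at the executor name.

```python
import re

# Thread pool portion of the rejection message, copied from the log line above.
MESSAGE = (
    "QueueResizingEsThreadPoolExecutor[name = R2O1pwk/search, "
    "queue capacity = 5001, min queue capacity = 5001, max queue capacity = 5001, "
    "frame size = 2000, targeted response rate = 1s, task execution EWMA = 26.9ms, "
    "adjustment amount = 50, "
    "org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutor@638ddaeb"
    "[Running, pool size = 13, active threads = 13, queued tasks = 5071, "
    "completed tasks = 2224192729]]"
)

# Counters of interest for this illustration (hypothetical selection).
FIELDS = ("queue capacity", "pool size", "active threads", "queued tasks")


def parse_pool_state(message):
    """Return the selected integer counters found in a rejection message."""
    state = {}
    for field in FIELDS:
        # Require ", " before the field name so that "queue capacity" does not
        # match inside "min queue capacity" or "max queue capacity".
        match = re.search(r"(?<=, )" + re.escape(field) + r" = (\d+)", message)
        if match is None:
            raise ValueError("field not found: " + field)
        state[field] = int(match.group(1))
    return state


state = parse_pool_state(MESSAGE)

# Every worker thread was busy when the task was rejected.
saturated = state["active threads"] == state["pool size"]

# Number of queued tasks beyond the reported queue capacity.
overflow = state["queued tasks"] - state["queue capacity"]

print(state)
print("all threads busy:", saturated)
print("queued tasks over capacity:", overflow)
```

Read this way, the message reports all 13 search threads active and 5071 queued tasks against a queue capacity of 5001, which is the condition under which the search request was rejected.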

Posted Apr 07, 2021 - 18:11 UTC

Resolved
We are no longer seeing the 500 errors caused by the rolling restarts.
Posted Apr 07, 2021 - 18:10 UTC
Monitoring
Elasticsearch exceeds its active session quota during production rolling restarts, despite all 4 nodes having a 5000 connection limit. The issue usually resolves itself.
Posted Apr 07, 2021 - 18:05 UTC
Investigating
We're experiencing an elevated level of API errors and are currently looking into the issue.
Posted Apr 07, 2021 - 18:02 UTC
This incident affected: API.