
We are currently experiencing a service issue with crossmint.com and our API. Our team is working to identify the root cause and fix the issue. Users may experience dropped requests.
The issue has been resolved. We’re sorry for the inconvenience caused. Please reach out to support@crossmint.com with any questions.
We are currently experiencing a service degradation on our main API server (production). Our team is working to identify the root cause and fix the issue.
The issue was resolved at 12:05 EST. We’re sorry for the downtime caused.
Impact: the production API server for crossmint.com had an intermittent outage lasting 1 hour 20 minutes.
Root cause:
- The AWS ECS agent (this is Amazon software), running on EC2, behaved incorrectly. It stopped releasing EC2 instances when recycling them, leaving them in an abandoned state. We learned from our infra provider (flightcontrol) that this is a recent issue others have been experiencing as well.
- Our deploy pipeline was not resilient to being unable to obtain an instance from our pool, and it crashed. Due to faulty logic, we stopped serving traffic from our last deploy until the issue was resolved.
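As a rough illustration of the second point, here is a minimal sketch of a deploy step that keeps the last good release serving traffic when no instance can be obtained. This is not our actual pipeline code: `NoInstanceAvailable`, `exhausted_pool`, and `deploy` are hypothetical names used only for this example.

```python
from typing import Callable


class NoInstanceAvailable(Exception):
    """Hypothetical error: the pool could not supply an instance."""


def exhausted_pool() -> str:
    """Stand-in for a pool whose instances are all stuck in an abandoned state."""
    raise NoInstanceAvailable("no free instance in pool")


def deploy(acquire_instance: Callable[[], str], current_release: str, new_release: str) -> str:
    """Return the release that should serve traffic after this deploy attempt."""
    try:
        acquire_instance()
    except NoInstanceAvailable:
        # Assumption for this sketch: on failure we fall back to the
        # previous release instead of crashing and serving nothing.
        return current_release
    return new_release
```

With this shape, a failed deploy degrades to "no new version yet" rather than to an outage.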
Fix:
The issue has been resolved permanently:
- We deployed an infra change that cleans up the state of EC2 instances if this situation occurs again
- We doubled our pool size
- We set new alerts that will fire if the issue seems to be manifesting again, before the pool is exhausted
- We will escalate the issue to AWS
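To make the alerting item concrete, the sketch below shows one way an early-warning check could work: it fires when the share of unavailable instances crosses a threshold, before the pool is fully exhausted. The function name, its parameters, and the 80% threshold are assumptions for illustration, not our production configuration.

```python
def pool_alert(in_use: int, abandoned: int, pool_size: int, threshold: float = 0.8) -> bool:
    """Return True when the pool is close to exhaustion.

    Abandoned instances count as unavailable, because they cannot be
    handed out to a new deploy until their state is cleaned up.
    """
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    unavailable = in_use + abandoned
    # The 0.8 default is an assumed value chosen for this example.
    return unavailable / pool_size >= threshold
```

For example, `pool_alert(in_use=5, abandoned=4, pool_size=10)` fires, while the same usage after doubling the pool to 20 does not.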
