Hello again,
We have analyzed your logs, and there appear to be a few possible causes of this intermittent downtime:
- Infrastructure Issues:
  - Resource contention (e.g., memory, CPU, or I/O bottlenecks).
  - Network instability affecting API endpoints or backend services (LDAP, database).
- Cache and Session Management:
  - Inefficient cache usage, leading to frequent lookups in the database or LDAP directory, causing delays or overload.
- LDAP and Persistence:
  - Null LDAP responses and delays in fetching user data could cause failures in authentication and token issuance.
- High Request Volume:
  - Increased API traffic during specific periods could exacerbate the above issues, causing system-wide instability.
I think the server is facing the [same](https://support.gluu.org/outages/11796/inaccessibility-of-the-gluuigree-application/#at89469) issue again.
Could you please share the following:
- What is the output of `dsreplication status`?
- When the server is down, which process is using the most CPU and memory?
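
If it helps, here is a minimal sketch for capturing the CPU and memory information while the server is down. It assumes a Linux host with the procps version of `ps` (the `--sort` option is not available on every platform), and the function name `snapshot` is only illustrative:

```shell
#!/bin/sh
# Print a timestamped snapshot of the heaviest processes.
# Assumption: Linux with procps `ps` (supports -eo and --sort).
snapshot() {
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
    echo "--- top 5 by CPU ---"
    # NR<=6 keeps the header line plus the first five processes.
    ps -eo pid,user,%cpu,%mem,comm --sort=-%cpu | awk 'NR<=6'
    echo "--- top 5 by memory ---"
    ps -eo pid,user,%cpu,%mem,comm --sort=-%mem | awk 'NR<=6'
}

snapshot
```

Running this a few times during an outage and attaching the output here should be enough for us to see which process is consuming the resources.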