Inbox

4 unread alerts

Unread — 4

Deployment failed — Phi-3.8B-mini

error

Node gpu-07 ran out of memory during model load. OOM error at 94% allocation. Retry or select a larger GPU instance.

Deployment · Phi-3.8B-mini · us-east-1 · 1 h ago

GPU quota at 91% — eu-west-1

warning

Your dedicated GPU allocation in eu-west-1 is at 91% capacity. Consider requesting a quota increase to avoid throttling.

Quota Warning · eu-west-1 · 1 h ago

p95 latency spike — Llama-3.1-70B

warning

p95 latency is 140 ms above the 7-day baseline. Current p95: 568 ms. Investigate GPU saturation or request queue depth.

Metric Anomaly · meta-llama/Llama-3.1-70B · us-east-1 · 2 h ago

Quota increase request — pending approval

info

Your request for +8 A100 GPUs in us-east-1 is pending review by the infrastructure team. Expected SLA: 24 h.

Quota Request · us-east-1 · 3 h ago

Read — 8

Deployment started — Gemma-2-9B-it

info

Model deployment initiated on A10G × 2 in us-east-1. Estimated time to ready: ~8 minutes.

Deployment · google/gemma-2-9b-it · us-east-1 · 4 h ago

Error rate exceeded threshold — Mistral-7B

error

Error rate reached 4.2% over the last 15 minutes, exceeding the 2% alert threshold. Top error: HTTP 503 (upstream timeout).

Metric Anomaly · mistralai/Mistral-7B-v0.3 · ap-southeast-1 · 4 h ago

Token quota at 78% — shared environment

warning

Monthly token consumption in the shared dev cluster has reached 78% of the allocated quota. Resets on June 1.

Quota Warning · 6 h ago

Deployment successful — Qwen3.6-35B-A3B

success

Model is live and serving traffic on A100 × 4 in us-east-1. Current RPS: 18.4. Health checks passing.

Deployment · Qwen/Qwen3.6-35B-A3B · us-east-1 · 9 h ago

Quota increase approved — ap-southeast-1

success

+4 A10G GPUs approved for ap-southeast-1. Quota is now active. You can deploy additional model instances immediately.

Quota Request · ap-southeast-1 · 1 d ago

Throughput anomaly detected — Qwen3.6-35B

info

Throughput dropped 22% between 09:00 and 09:30 UTC. Likely correlated with the upstream load balancer's health check interval. Auto-resolved.

Metric Anomaly · Qwen/Qwen3.6-35B-A3B · us-east-1 · 1 d ago

Deployment failed — Falcon-40B

error

Scheduling failed: no eligible nodes with H100 GPUs available in eu-west-1. All H100 slots are currently occupied.

Deployment · tiiuae/Falcon-40B · eu-west-1 · 1 d ago

Quota request requires additional info

warning

Your H100 quota increase request for eu-west-1 requires a business justification. Please update the request with a use-case description.

Quota Request · eu-west-1 · 2 d ago