Senior Site Reliability Engineer, AI Infrastructure

Andromeda Cluster

San Francisco · 2026-04-30 · $0–$0

AI Summary

You will design, operate, and debug large-scale GPU infrastructure used for distributed training and inference, working directly with customers pushing the limits of modern AI systems.

What You'll Own

GPU Cluster Architecture: Design and evolve multi-provider, multi-region GPU compute clusters optimized for large-scale training.

Job description
