Catalitium
May be filled

MTS - Distributed Inferencing Software Engineer - AI Models

AMD

Austin, Texas, United States | 2025-12-05 | US$ 143,280.00 - US$ 214,920.00 per year

Job description

The Person

Strong technical and analytical skills in C++/Python AI development, including solving performance problems and investigating scalability on multi-GPU, multi-node clusters.

Key Responsibilities

Enable and benchmark AI models on distributed systems.
Work in a distributed computing setting to optimize for scale-up (multi-GPU), scale-out (multi-node), and scale-across systems.
Collaborate with internal GPU library teams to analyze and optimize distributed workloads for high throughput and low latency.
Apply expertise in parallelization strategies for AI workloads to get the best performance from each configuration.
Contribute to distributed model management, model zoos, monitoring, benchmarking, and documentation.

Preferred Experience

Knowledge of GPU computing (HIP, CUDA, OpenCL).
AI framework engineering experience (vLLM, SGLang, Llama.cpp).
Understanding of KV cache transfer mechanisms and options (Mooncake, NIXL/RIXL) and of expert parallelism (DeepEP/MORI/PPLX-Garden).
Excellent C/C++/Python programming and software design skills, including debugging, performance analysis, and test design.
