Machine Learning Engineer - Inference
Together AI - San Francisco
Posted Jun 6, 2024
Benefits
- Parental leave
- Not verified
- Non-birth-parent leave
- Not verified
- Family-building benefits
- Fertility benefits: Not verified
- Adoption assistance: Not verified
- Surrogacy assistance: Not verified
- Mental health support
- Not verified
- Relocation assistance
- Not verified
- Childcare support
- Not verified
- Learning budget
- Not verified
- Verification
- Not verified
- Salary
- Not verified - source not recorded; timestamp not recorded
- 401(k) match
- Not verified
Schedule
- Shift type
- Not verified
- Weekend work
- Not verified
Application
- Cover letter
- Not verified
- Assessment
- Not verified
- Deadline
- Not stated
Where they hire
State eligibility is not yet verified.
About this role
Together AI is seeking a Machine Learning Engineer to join our Inference Engine team, focusing on optimizing and enhancing the performance of our AI inference systems. This role involves working with state-of-the-art large language models and ensuring they run efficiently and effectively at scale. If you are passionate about AI inference, PyTorch, and developing high-performance systems, we want to hear from you. This position offers the chance to collaborate closely with AI researchers and engineers to create cutting-edge AI solutions. Join us in shaping the future at Together AI!
Responsibilities
- Design and build the production systems that power the Together AI inference engine, enabling reliability and performance at scale.
- Develop and optimize runtime inference services for large-scale AI applications.
- Collaborate with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world.
- Conduct design and code reviews to ensure high standards of quality.
- Create services, tools, and developer documentation to support the inference engine.
- Implement robust and fault-tolerant systems for data ingestion and processing.
Requirements
- 3+ years of experience writing high-performance, well-tested, production-quality code.
- Proficiency with Python and PyTorch.
- Demonstrated experience in building high-performance libraries and tooling.
- Excellent understanding of low-level operating system concepts including multi-threading, memory management, networking, storage, performance, and scale.
- Preferred: Knowledge of existing AI inference systems such as TGI, vLLM, TensorRT-LLM, and Optimum.
- Preferred: Knowledge of AI inference
Read the full description at job-boards.greenhouse.io. FewerJobs shows a source-linked preview and links to the original posting.
Apply link not verified; last-live date unavailable.
What verified means
Verified means a displayed claim has a recorded source field, a source URL when available, and a timestamp showing when FewerJobs checked or enriched the evidence.