Machine Learning Engineer - Inference

Together AI - San Francisco

Posted Jun 6, 2024

Benefits

Parental leave: Not verified
Non-birth-parent leave: Not verified
Family-building benefits: Fertility benefits: Not verified
Adoption assistance: Not verified
Surrogacy assistance: Not verified
Mental health support: Not verified
Relocation assistance: Not verified
Childcare support: Not verified
Learning budget: Not verified
Verification: Not verified
Salary: Not verified (source not recorded; timestamp not recorded)
401(k) match: Not verified

Schedule

Shift type: Not verified
Weekend work: Not verified

Application

Cover letter: Not verified
Assessment: Not verified
Deadline: Not stated

Where they hire

State eligibility is not yet verified.

About this role

Machine Learning Engineer - Inference, San Francisco

About the Role

Together AI is seeking a Machine Learning Engineer to join our Inference Engine team, focusing on optimizing and enhancing the performance of our AI inference systems. This role involves working with state-of-the-art large language models and ensuring they run efficiently and effectively at scale. If you are passionate about AI inference, PyTorch, and developing high-performance systems, we want to hear from you. This position offers the chance to collaborate closely with AI researchers and engineers to create cutting-edge AI solutions. Join us in shaping the future at Together AI!

Responsibilities

- Design and build the production systems that power the Together AI inference engine, enabling reliability and performance at scale.
- Develop and optimize runtime inference services for large-scale AI applications.
- Collaborate with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world.
- Conduct design and code reviews to ensure high standards of quality.
- Create services, tools, and developer documentation to support the inference engine.
- Implement robust and fault-tolerant systems for data ingestion and processing.

Requirements

- 3+ years of experience writing high-performance, well-tested, production-quality code.
- Proficiency with Python and PyTorch.
- Demonstrated experience in building high-performance libraries and tooling.
- Excellent understanding of low-level operating systems concepts including multi-threading, memory management, networking, storage, performance, and scale.
- Preferred: Knowledge of existing AI inference systems such as TGI, vLLM, TensorRT-LLM, Optimum
- Preferred: Knowledge of AI inference

Read the full description at job-boards.greenhouse.io. FewerJobs shows a source-linked preview and links to the original posting.

Apply at job-boards.greenhouse.io

Apply link not verified; last-live date unavailable.

What verified means

Verified means a displayed claim has a recorded source field, a source URL when available, and a timestamp showing when FewerJobs checked or enriched the evidence.

Related jobs

Systems Engineer - (Execution) - Level 3/4
Northrop Grumman - United States-Alabama-Huntsville

Business Analyst (Top Secret cleared)
ICF International INC - Washington, DC

Engineering Project Specialist II (Full Time) - United State
Cisco - San Jose, California, US

Automation AI Ops Engineer
Cisco - 2 Locations
