Multimodal Generative AI Researcher

Stability AI - Remote

Posted Jan 30, 2026

Benefits

Parental leave: Not verified
Non-birth-parent leave: Not verified
Family-building benefits: Fertility benefits: Not verified
Adoption assistance: Not verified
Surrogacy assistance: Not verified
Mental health support: Not verified
Relocation assistance: Not verified
Childcare support: Not verified
Learning budget: Not verified
Verification: Not verified
Salary: Not verified
401(k) match: Not verified


Schedule

Shift type: Not verified
Weekend work: Not verified

Application

Cover letter: Not verified
Assessment: Not verified
Deadline: Not stated

Where they hire

State eligibility is not yet verified.

About this role

Location: Remote

About the Role

We're looking for a Research Scientist with deep expertise in training and fine-tuning large Vision-Language and Language Models (VLMs / LLMs) for downstream multimodal tasks. You'll help push the next frontier of models that reason across vision, language, and 3D, bridging research breakthroughs with scalable engineering.

What You'll Do

- Design and fine-tune large-scale VLMs / LLMs - and hybrid architectures - for tasks such as visual reasoning, retrieval, 3D understanding, and embodied interaction.
- Build robust, efficient training and evaluation pipelines (data curation, distributed training, mixed precision, scalable fine-tuning).
- Conduct in-depth analysis of model performance: ablations, bias / robustness checks, and generalisation studies.
- Collaborate across research, engineering, and 3D / graphics teams to bring models from prototype to production.
- Publish impactful research and help establish best practices for multimodal model adaptation.

What You Bring

- PhD (or equivalent experience) in Machine Learning, Computer Vision, NLP, Robotics, or Computer Graphics.
- Proven track record in fine-tuning or training large-scale VLMs / LLMs for real-world downstream tasks.
- Strong engineering mindset - you can design, debug, and scale training systems end-to-end.
- Deep understanding of multimodal alignment and representation learning (vision-language fusion, CLIP-style pre-training, retrieval-augmented generation).
- Familiarity with recent trends, including video-language and long-context VLMs, spatio-temporal grounding, agentic multimodal reasoning, and Mixture-of-Experts (MoE) fine-tuning.
- Awareness of 3D-aware multimodal models - using NeRFs, Gaussian splatting, or differentiable renderers for

Read the full description at stability.ai. FewerJobs shows a source-linked preview and links to the original posting.

Apply at stability.ai

Apply link not verified; last-live date unavailable.

What verified means

Verified means a displayed claim has a recorded source field, a source URL when available, and a timestamp showing when FewerJobs checked or enriched the evidence.

