FewerJobs.

ML Research Scientist I/II, Multimodal Data Extraction

Lila Sciences - Cambridge, MA USA

Posted Nov 3, 2025

Benefits

Parental leave
Not verified
Non-birth-parent leave
Not verified
Family-building benefits
  • Fertility benefits: Not verified
  • Adoption assistance: Not verified
  • Surrogacy assistance: Not verified
Mental health support
Not verified
Relocation assistance
Not verified
Childcare support
Not verified
Learning budget
Not verified
Verification
Not verified
Salary
Not verified - source not recorded; timestamp not recorded
401(k) match
Not verified


Market context

Median wage (BLS OEWS)
$111,944 national median
Projected growth (BLS Employment Projections)
+13.7% - Much faster than average

114% above the BLS national median for the Data and ML aggregate.

Matched to SOC 15-1252 - Data and ML aggregate by role bucket.

Source: U.S. Bureau of Labor Statistics, OEWS, May 2024 and Employment Projections, 2024-2034.

Schedule

Shift type
Not verified
Weekend work
Not verified

Application

Cover letter
Not verified
Assessment
Not verified
Deadline
Not stated

Where they hire

State eligibility is not yet verified.

About this role

Your Impact at LILA

As an ML Research Scientist - Multimodal Data Extraction, you will advance Lila's vision of scientific superintelligence by developing foundation models that autonomously read, interpret, and structure scientific knowledge across text, images, and experimental data in the physical sciences. Your research will help unify the world's scientific information into machine-understandable form, powering reasoning, prediction, and autonomous discovery across materials science and chemistry.

What You'll Be Building

  • Research and develop AI systems that extract and structure knowledge from diverse scientific sources.
  • Design and fine-tune large language, multimodal, and specialized models for factual, interpretable data extraction.
  • Build scalable pipelines for unstructured and heterogeneous scientific data, integrating text, tables, and visuals.
  • Collaborate with domain experts to align extracted data with real-world discovery workflows.
  • Publish research that advances the state of the art in multimodal understanding and AI-driven knowledge extraction.

What You'll Need to Succeed

  • PhD (or equivalent research experience) in Computer Science, Chemistry, Materials Science, or a related field.
  • Expertise in machine learning, NLP, and vision-language modeling using PyTorch and Hugging Face Transformers.
  • Proven ability to train, fine-tune, and evaluate LLMs and multimodal models for scientific data extraction.
  • Strong understanding of data structures and representations used in the physical sciences.
  • Demonstrated research impact through publications, preprints, or open-source work (e.g., NeurIPS, ICLR, ICML, ACL, EMNLP, scientific journals).

Bonus Points For

  • Experience with multimodal fusion

Read the full description at job-boards.greenhouse.io. FewerJobs shows a source-linked preview and links to the original posting.

Apply at job-boards.greenhouse.io

Apply link not verified; last-live date unavailable.

What verified means

Verified means a displayed claim has a recorded source field, a source URL when available, and a timestamp showing when FewerJobs checked or enriched the evidence.
