Machine Learning Safety: Evaluation Research Engineer
Apple - San Francisco, United States of America
Posted Mar 16, 2026
Benefits
- Parental leave: Not verified
- Non-birth-parent leave: Not verified
- Family-building benefits
  - Fertility benefits: Not verified
  - Adoption assistance: Not verified
  - Surrogacy assistance: Not verified
- Mental health support: Not verified
- Relocation assistance: Not verified
- Childcare support: Not verified
- Learning budget: Not verified
- Verification: Not verified (last checked Jun 13, 2026)
- Salary: Not verified (source not recorded; timestamp not recorded)
- 401(k) match: Listed. Source: EMPLR_CONTRIB_INCOME_AMT. Last checked Jun 13, 2026.
Schedule
- Shift type: Not verified
- Weekend work: Not verified
Application
- Cover letter: Not verified
- Assessment: Not verified
- Deadline: Not stated
Where they hire
State eligibility is not yet verified.
About this role
This role supports the design and development of safety evaluation methodologies for generative and agentic AI features that enable users across the globe to interact with our media products and services. You will play an impactful role: shaping responsible AI and safety policies, evaluating fidelity to product safety requirements, creating risk assessments and taxonomies, curating exemplar safety evaluation datasets, and ensuring that evaluation frameworks are culturally and linguistically grounded.
An ideal candidate possesses a strong understanding of issues in responsible AI and in AI and society, and of technology evaluation design principles and practices, and brings experience designing evaluations to support policies and/or product requirements, classification systems, and annotation and/or study participant guidelines.
Taxonomy Development: Design, refine, and maintain safety-relevant taxonomies that capture risk categories, content types, and policy distinctions, achieved through collaborations with subject matter experts who bring knowledge across languages and cultural contexts. You will work collaboratively to ensure taxonomies are comprehensive, internally consistent, and actionable for downstream evaluation work.
Policy-to-Data Translation: Develop and validate exemplar sets that illustrate taxonomy categories, edge cases, and boundary conditions. Collaborate with language and cultural experts to ensure exemplars are culturally appropriate and representative across target markets. Partner with policy, product, and engineering teams to translate responsible AI policies and guidelines into concrete data requirements, annotation schemas, and evaluation criteria that can be operationalized across markets. Develop and maintain synthetic data generation pipelines to augment evaluation coverage, stress-test safety boundaries, and support evaluation
Read the full description at jobs.apple.com. FewerJobs shows a source-linked preview and links to the original posting.
Apply link not verified; last-live date unavailable.
What verified means
Verified means a displayed claim has a recorded source field, a source URL when available, and a timestamp showing when FewerJobs checked or enriched the evidence.
Related jobs
- Systems Engineer - (Execution) - Level 3/4
  Northrop Grumman - United States-Alabama-Huntsville
- Business Analyst (Top Secret cleared)
  ICF International INC - Washington, DC
- Engineering Project Specialist II (Full Time) - United State
  Cisco - San Jose, California, US
- Automation AI Ops Engineer
  Cisco - 2 Locations