Software Engineer, Site Reliability

Fal - San Francisco

Posted Feb 23, 2026

Benefits

Parental leave: Not verified
Non-birth-parent leave: Not verified
Family-building benefits:
Fertility benefits: Not verified
Adoption assistance: Not verified
Surrogacy assistance: Not verified
Mental health support: Not verified
Relocation assistance: Not verified
Childcare support: Not verified
Learning budget: Not verified
Verification: Not verified
Salary: Not verified (source not recorded; timestamp not recorded)
401(k) match: Not verified


Schedule

Shift type: Not verified
Weekend work: Not verified

Application

Cover letter: Not verified
Assessment: Not verified
Deadline: Not stated

Where they hire

State eligibility is not yet verified.

About this role

fal is the generative media ecosystem powering the next generation of AI products. We build the infrastructure, tools, and model access that teams need to move from idea to production, and do it at scale without compromise. For developers and enterprises, fal is the foundation that makes generative media not just possible, but practical: a unified platform where high-performance inference, orchestration, and observability come together to unlock new categories of AI-native products. As generative media reshapes industries across a market projected to grow by hundreds of billions over the next decade, fal is becoming the ecosystem that ambitious teams build on.

You are a seasoned SRE who keeps production infrastructure running at scale. You own the reliability and availability of customer-facing systems, from Kubernetes clusters to deployment pipelines to the networking layer that connects it all. You think in SLOs, automate ruthlessly, and treat every incident as a chance to make the system better.

Key Responsibilities

- Own and operate our Kubernetes infrastructure: cluster lifecycle, upgrades, networking, and multi-tenant isolation for customer workloads
- Build and maintain CI/CD pipelines and deployment infrastructure
- Leverage AI to an extreme level to automate analysis and resolution of production issues, and improve software development speed, reliability and maintainability
- Build dashboards, alerting, and anomaly detection across our systems
- Define and enforce SLOs and build out incident response processes
- Manage and improve our networking, load balancing, and service mesh configurations
- Drive

Read the full description at job-boards.greenhouse.io. FewerJobs shows a source-linked preview and links to the original posting.

Apply at job-boards.greenhouse.io

Apply link not verified; last-live date unavailable.

What verified means

Verified means a displayed claim has a recorded source field, a source URL when available, and a timestamp showing when FewerJobs checked or enriched the evidence.

Related jobs

Manufacturing Technician - Entry Level
Northrop Grumman - United States-Mississippi-Iuka

Sales Development Representative (SDR) Program (Multiple Openings!)
Viavi Solutions INC - Home Office, USA

Staff System Architect
Northrop Grumman - United States-Illinois-Rolling Meadows

Branch Customer Service
Accendra Health INC - FL LAKE CITY
