Member of Technical Staff, Performance Optimization
Fireworks AI - San Mateo, CA
Posted May 6, 2025
Benefits
- Parental leave: Not verified
- Non-birth-parent leave: Not verified
- Family-building benefits:
  - Fertility benefits: Not verified
  - Adoption assistance: Not verified
  - Surrogacy assistance: Not verified
- Mental health support: Not verified
- Relocation assistance: Not verified
- Childcare support: Not verified
- Learning budget: Not verified
- Verification: Not verified
- Salary: Not verified (source not recorded; timestamp not recorded)
- 401(k) match: Not verified
Market context
- Median wage (BLS OEWS): $116,543 national median
- Projected growth (BLS Employment Projections): +9.8%, much faster than average
69% above the BLS national median for software engineering aggregate.
Matched to SOC 15-1252 - Software Engineering aggregate by role bucket.
Source: U.S. Bureau of Labor Statistics, OEWS, May 2024 and Employment Projections, 2024-2034.
Schedule
- Shift type: Not verified
- Weekend work: Not verified
Application
- Cover letter: Not verified
- Assessment: Not verified
- Deadline: Not stated
Where they hire
State eligibility is not yet verified.
About this role
Member of Technical Staff, Performance Optimization
San Mateo, CA

About Us:
At Fireworks, we're building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry. We've been independently benchmarked as the leader in LLM inference speed and are driving cutting-edge innovation through projects like our own function calling and multimodal models. Fireworks is a Series C company valued at $4 billion and backed by top investors including Benchmark, Sequoia, Lightspeed, Index, and Evantic. We're an ambitious, collaborative team of builders, founded by veterans of Meta PyTorch and Google Vertex AI.

The Role:
We're looking for a Software Engineer focused on Performance Optimization to help push the boundaries of speed and efficiency across our AI infrastructure. In this role, you'll take ownership of optimizing performance at every layer of the stack, from low-level GPU kernels to large-scale distributed systems. A key focus will be maximizing the performance of our most demanding workloads, including large language models (LLMs), vision-language models (VLMs), and next-generation video models.

You'll work closely with teams across research, infrastructure, and systems to identify performance bottlenecks, implement cutting-edge optimizations, and scale our AI systems to meet the demands of real-world production use cases. Your work will directly impact the speed, scalability, and cost-effectiveness of some of the most advanced generative AI models in the world.

Key Responsibilities:
- Optimize system and GPU performance for high-throughput AI workloads across training and inference
- Analyze and improve latency, throughput, memory
Read the full description at job-boards.greenhouse.io. FewerJobs shows a source-linked preview and links to the original posting.
Apply link not verified; last-live date unavailable.
What verified means
Verified means a displayed claim has a recorded source field, a source URL when available, and a timestamp showing when FewerJobs checked or enriched the evidence.