Employer profile
Together AI
54 open roles indexed with location, benefit, and apply-link signals where available.
Open roles
Showing the most recent indexed roles for this employer.
-
Data Center Operations Coordinator
San Francisco
unspecified $150K-$200K
About the Role
We're looking for a detail-oriented Data Center Operations professional to manage and track all break/fix activities across multiple data center locations. This role acts as the central point of coordination for hardware incidents, vendor dispatches, ticket management, asset tracking, and operational reporting to ensure maximum uptime and fast issue resolution.
Responsibilities
- Track and manage all break/fix incidents across multiple data centers
- Monitor ticket queues and ensure SLA compliance for incident response and resolution
- Coordinate with on-site technicians, remote hands teams, vendors, and engineering groups
- Maintain accurate records of failed hardware, replacements, RMAs, and repair status
- Escalate critical outages and recurring infrastructure issues to leadership and engineering teams
- Schedule and oversee maintenance windows and emergency repair activities
- Provide daily/weekly operational status reports and incident summaries
- Ensure all work follows data center operational procedures and change management policies
- Identify trends in hardware failures and recommend process improvements
Requirements
- Experience working in data center operations, IT infrastructure, or hardware support
- Strong understanding of server, storage, and networking hardware
- Experience with ticketing systems such as ServiceNow, Jira, or Remedy
- Ability to manage multiple priorities across several sites simultaneously
- Excellent communication and organizational skills
- Familiarity with SLA management and incident escalation processes
- Proficiency with Excel, reporting dashboards, and inventory tracking tools
Preferred Qualifications
- Experience supporting enterprise or hyperscale data centers
- Knowledge of remote hands operations and vendor management
-
Staff Engineer, Distributed Storage and HPC & AI Infrastructure
San Francisco
unspecified $250K-$300K
About the Role
In this role, you will design and deliver multi-petabyte storage systems purpose-built for the world's largest AI training and inference workloads. You'll architect high-performance parallel filesystems and object stores, evaluate and integrate cutting-edge technologies such as WekaFS, Ceph, and Lustre, and drive aggressive cost optimization, routinely achieving 30-50% savings through intelligent tiering, lifecycle policies, capacity forecasting, and right-sizing. You will also build Kubernetes-native storage operators and self-service platforms that provide automated provisioning, strict multi-tenancy, performance isolation, and quota enforcement at cluster scale. Day-to-day, you'll optimize end-to-end data paths for 10-50 GB/s per node, design multi-tier caching architectures, implement intelligent prefetching and model-weight distribution, and tune parallel filesystems for AI workloads.
Responsibilities
- Design multi-petabyte AI/ML storage systems; integrate WekaFS, Ceph, etc.; lead capacity planning and cost optimization (30-50% savings via tiering, lifecycle policies, right-sizing).
- Design/optimize RDMA, InfiniBand, 400GbE networks; tune for max throughput/min latency; implement NVMe-oF/iSCSI; troubleshoot bottlenecks; optimize TCP/IP for storage.
- Build Kubernetes storage operators/controllers; enable automated provisioning, self-service abstractions, multi-tenant isolation, quotas; create reusable Helm/Terraform patterns.
- Deliver 10-50 GB/s per GPU node; optimize caching (weights/datasets/checkpoints), parallel filesystems, and data paths; troubleshoot with profiling tools; scale to thousands of nodes.
- Build multi-tier caches (local NVMe, distributed, object); optimize data locality and model-weight distribution; implement smart prefetching/eviction.
- Implement monitoring, alerting, SLOs; design DR/backups with runbooks; run chaos engineering; ensure 99.9%+ uptime via proactive/automated remediation.
- Partner with ML/SRE teams; mentor on storage
-
Manager, Infrastructure Strategy & Operations
San Francisco
unspecified $220K-$260K
About The Role
Together AI is rapidly scaling its compute infrastructure across multiple sites and deployment types. The Manager, Infrastructure Strategy & Operations role will be the analytical backbone of the Infrastructure Strategy team, owning the research, benchmarking, and decision frameworks that shape how we source, evaluate, and deploy compute at scale. You will sit at the center of real-time sourcing and vendor decisions, owning the market intelligence, site comparisons, and operational analysis that drive infrastructure strategy. You will produce the domain-specific inputs that inform cross-functional decisions with Finance, Infra Eng, and leadership.
Responsibilities
- Conduct strategic analysis on how to scale and deploy Together's compute infrastructure, translating complex operational data into clear, actionable recommendations for leadership.
- Build dashboards and reporting infrastructure that give the team real-time visibility into compute utilization, infrastructure costs, deployment status, and vendor pipelines.
- Identify opportunities to optimize how infrastructure is allocated and operated across workloads through compute utilization analysis.
- Develop and maintain comparison frameworks for infrastructure sourcing decisions (own vs. lease, location strategy, vendor selection, site evaluation) and synthesize vendor proposals and market data into decision-ready recommendations.
- Run ad hoc analyses to support capacity planning decisions.
- Develop a tracker to monitor critical compute costs across existing and future providers.
- Research and evaluate data center sites and energy sourcing options, comparing power availability, connectivity, permitting timelines, deployment readiness, and reliability.
- Champion process improvements across the Infrastructure Strategy function, collaborating cross-functionally with Engineering, Data,
-
Lead/Manager Together Cloud Infrastructure
Amsterdam
unspecified Salary not disclosed
About the Role
Together AI is building the AI Acceleration Cloud, an end-to-end platform for the full generative AI lifecycle, combining the fastest LLM inference engine with state-of-the-art AI cloud infrastructure. As a Lead/Manager, you will play a key role in building the Together cloud platform engineering team in the Netherlands. We run a highly available, global, blazing-fast cloud infrastructure that virtualizes cutting-edge ML hardware (GB200s/GB300s, BlueField DPUs) and enables state-of-the-art ML practitioners with self-serve AI cloud services, such as on-demand + managed Kubernetes and Slurm clusters. This platform serves both our internal SaaS products (inference, fine-tuning) and our external cloud customers, spanning dozens of data centers across the world.
Some of what you'll work on:
- Work on a distributed GPU scheduling system for the on-demand clusters product, Instant Clusters.
- Build out a global management plane for managing our data center compute, networking, and storage.
- Design and build new customer-facing cloud platform services, delivering killer enterprise AI cloud features.
Hybrid working 2 days a week at our offices in Amsterdam
Responsibilities
- Lead/manage a team of 8 Together Cloud infrastructure engineers in Amsterdam
- Identify, design, and develop foundational backend services that power Together's commerce platform
- Analyze and improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure
- Partner with product teams to understand functional requirements and deliver solutions that meet business needs
- Write clear, well-tested, and maintainable software and IaC for both new and existing
-
Customer Support Engineer (Inference)
San Francisco, CA
unspecified Salary not disclosed
About the Role
As a Customer Support Engineer at a pioneering AI company, you'll be the first line of defense to support customers as they build out training, fine-tuning, and inference solutions with Together AI. You'll dive deep into complex technical challenges, providing swift and effective solutions while serving as a product expert. As a part of the Customer Experience organization, you will collaborate closely with product and sales, driving continuous improvement of our offerings. This is an exciting opportunity for a deeply technical professional passionate about AI and customer success to make a significant impact in a fast-paced, innovative environment.
Responsibilities
- Engage directly with customers to tackle and resolve complex technical challenges involving our cutting-edge GPU clusters and our inference and fine-tuning services; ensure swift and effective solutions every time.
- Become a product expert in all of our Gen AI solutions, serving as the last line of technical defense before issues are escalated to Engineering and Product teams.
- Collaborate seamlessly across Engineering, Research, and Product teams to address customer concerns; collaborate with senior leaders both internally and externally to ensure the highest levels of customer satisfaction.
- Transform customer insights into action by identifying patterns in support cases and working with Engineering and Go-To-Market teams to drive Together's roadmap (e.g., future models to support)
- Maintain detailed documentation of system configurations, procedures, troubleshooting guides, and FAQs to facilitate knowledge sharing with team and customers.
- Be flexible in
-
Senior Technical Recruiter, AI/ML Research
San Francisco
unspecified $165K-$210K
About the Role
Together AI is building the AI Native Cloud - an end-to-end platform for the generative AI lifecycle, integrating fast, reliable inference and model-shaping services with cutting-edge AI cloud infrastructure. We are looking for a seasoned Senior Technical Recruiter to partner closely with AI Research and Engineering leadership to scale world-class research teams across kernels, inference optimization, applied AI, and model shaping. This role is ideal for someone who understands the unique dynamics of recruiting top-tier AI researchers and research engineers in a highly competitive market and can operate as a strategic talent partner to technical leadership.
Responsibilities
- Partner with executives, research leadership, and hiring managers to define and execute hiring strategies
- Lead full-cycle recruiting for specialized AI talent, including researchers, research engineers, applied scientists, and ML systems engineers
- Build and nurture relationships with top AI talent across academia, open-source communities, research labs, and industry networks
- Drive exceptional candidate experiences from initial engagement through offer close, with a strong focus on relationship building and long-term talent cultivation
- Provide market intelligence on AI talent trends, compensation, competitive hiring landscapes, and emerging research organizations to influence hiring strategy and organizational planning
- Collaborate cross-functionally with sourcing, coordination, people operations, and leadership teams to continuously improve recruiting processes and operational excellence
- Design and refine interview processes tailored for research hiring, including technical evaluations, publication reviews, and research presentation loops
Requirements
- 5+ years of technical recruiting experience at high-growth
-
Junior Technical Program Manager — Infrastructure Operations
San Francisco
unspecified $150K-$175K
About the Role
Together AI runs one of the most demanding GPU fleets in the industry. Keeping that fleet healthy - every node online, every GPU performing, every datacenter transition running on schedule - is operationally complex and genuinely high-stakes. We're looking for a Junior TPM to own that operational reality. This is not a coordination or status-reporting role. You will own the end-to-end node lifecycle - from the moment a node goes down through repair, return, and re-integration - and you'll drive the cross-functional work to close every gap as fast as possible. You'll manage datacenter bring-ups, hunt down GPU utilization loss, and build the processes and dashboards that make our fleet operations more visible and accountable over time.
The environment moves fast and doesn't always come with a clear playbook. Much of what you'll work on is genuinely novel - you'll be figuring things out alongside engineers who are building at the frontier. If that sounds like an obstacle, this isn't the right role. If it sounds like the best possible way to learn, keep reading.
Responsibilities
- Own the end-to-end node lifecycle - from failure through repair, return, and re-integration - across provider ticketing, internal tooling, and the state machine that governs each stage
- Drive node remediation to resolution with urgency, eliminating gaps in ownership at every handoff
- Manage project timelines for new datacenter bring-ups, coordinating across internal teams and external providers to keep milestones on
-
Staff Platform Engineer, Voice AI
San Francisco
unspecified $220K-$280K
About the Role
Together AI is defining the infrastructure layer for the next generation of voice applications. Our Voice AI platform powers production-grade, real-time voice agents at scale - and we're looking for a Staff Platform Engineer to own the architecture that makes it possible. This isn't a role about maintaining what exists. You'll set the technical direction for how developers interact with Together's voice platform - from the real-time API primitives they build on, to the autoscaling systems that keep latency SLOs intact under unpredictable load, to the multi-provider abstraction layer that makes our platform uniquely powerful.
Voice infrastructure is categorically harder than text inference: bidirectional audio streams, stateful long-lived connections, millisecond latency requirements, and complex multi-model routing don't forgive architectural shortcuts. You'll bring the judgment to get this right the first time, at scale. This is a foundational hire on a small, high-conviction team. The decisions you make in this role will define the platform architecture for years.
Responsibilities
- Own the architecture and reliability of Together's real-time API layer - set the technical direction for WebSocket and HTTP streaming APIs powering STT and TTS at scale; establish the reliability bar (connection lifecycle, backpressure, graceful degradation, reconnection) that production voice agents - contact centers, AI agents, communication platforms - depend on.
- Lead autoscaling architecture for latency-sensitive voice workloads - design and ship orchestration systems that handle bursty, real-time traffic across tens of thousands of GPUs; solve the hard problems at
-
Staff Machine Learning Engineer, Voice AI
San Francisco
unspecified $220K-$280K
About the Role
Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications - serving speech-to-text and text-to-speech models with best-in-class latency and reliability. We're looking for a Staff ML Engineer to drive the model serving layer for voice workloads. You'll work hands-on with inference engines like TRT-LLM and SGLang to optimize how we serve models like Whisper, Parakeet, Orpheus, and Kokoro - pushing latency and throughput to the frontier. You'll profile GPU utilization, design batching strategies for streaming audio, and ensure new model architectures can go from research to production quickly.
This is a foundational hire on a small, high-impact team. Voice inference has unique challenges - streaming audio, tokenization, real-time latency budgets - that require dedicated ML engineering focus. You'll shape how Together serves voice models as the industry moves from pipeline architectures (ASR → LLM → TTS) toward end-to-end speech-to-speech.
- Own the model serving stack that powers Together's voice platform across STT, TTS, and speech-to-speech.
- Work directly with state-of-the-art accelerators (H100s, H200s, B200s) to optimize voice model inference.
- Collaborate with model partners (Cartesia, Deepgram, Rime, and others) to bring their models to production on Together's infrastructure.
- Build quality evaluation frameworks that guide model selection for customers and inform the roadmap.
- Join a small, early-stage team with outsized impact on a fast-growing product area.
Responsibilities
- Own the voice inference roadmap end-to-end -
-
AI Infrastructure Engineer
San Francisco
unspecified $190K-$270K
As an AI Infrastructure Engineer at Together, you are responsible for keeping all user-facing services and production systems running smoothly. You are a blend of a pragmatic operator and a software engineer who applies sound engineering principles, operational discipline, and mature automation to our operating environments and codebase. You specialize in systems (operating systems, storage subsystems, networking), while implementing best practices for availability, reliability, and scalability, with varied interests in algorithms and distributed systems.
Responsibilities
- Participate in on-call rotation (PagerDuty) to respond to production incidents
- Build and run our infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users
- Build monitoring systems to ensure the highest quality service for our customers
- Design and implement operational processes (such as deployments and upgrades)
- Debug production issues across all services and levels of the stack
- Identify improvements for the product architecture from the reliability, performance, and availability perspectives
- Plan the growth of Together AI's infrastructure
Requirements
- 5+ years of professional AI Infra or related experience
- Bachelor's degree in Computer Science or a related field or equivalent work experience
- Knowledge of Ansible (roles, playbooks), Terraform, and Kubernetes
- Proficiency in programming/scripting languages
- Direct experience in monitoring and observability practices
- Knowledge of cloud services
- Ability to thrive in a collaborative environment involving different stakeholders and subject matter experts
About Together AI
Together AI is a research-driven artificial intelligence company. We believe
-
Infrastructure Design Engineer
San Francisco
unspecified $210K-$250K
About The Role
Together AI is building its infrastructure footprint at scale, and this role is central to making that happen. As an Infrastructure Design Engineer, you will own the design, planning, and technical execution of whitespace environments (where servers, storage, and network equipment are deployed) across our AI data center portfolio. You are the in-house expert who ensures that rack layouts, power distribution, cooling strategy, structured cabling, and physical infrastructure design are all built to support the density, redundancy, network and reliability requirements of large-scale AI GPU clusters. You will serve as the lead engineer across our DC portfolio, creating white space designs, reviewing partner and contractor designs, and ensuring plans are executed to spec. You will work closely with the Infrastructure Strategy, Infrastructure Engineering, and Operations teams, as well as external MEP consultants, general contractors, and data center partners. This is a technical role on a small, high-accountability team where your judgment directly shapes our ability to bring capacity online on time and to spec.
Responsibilities
- Architect HPC clusters by designing whitespace layouts, including rack placement, aisle configuration, hot/cold aisle containment, equipment density, and airflow strategy for high-density GPU deployments
- Collaborate with electrical and mechanical engineers to integrate power and cooling infrastructure into whitespace environments
- Collaborate with Network Engineering to define and validate physical layer requirements (structured cabling, pathway planning, port density) for high-speed AI cluster interconnects, ensuring design compatibility with both physical and logical network architectures.
- Advise Data
-
Sr. Revenue Accountant
San Francisco
unspecified $160K-$190K
About the Role
Together AI is looking for a Senior Revenue Accountant to join our Revenue Accounting team and help scale our revenue operations as we grow. This role will own three core areas: end-to-end accounts receivable, invoicing, and collections, as well as revenue recognition in compliance with ASC 606. You will partner closely with FP&A, Sales, Customer Success, and Commerce Engineering to ensure revenue is recorded accurately, in compliance with US GAAP. The ideal candidate is detail-oriented and collaborative, with a strong foundation in revenue accounting and hands-on experience across the full order-to-cash cycle. If you thrive in a fast-paced environment with the opportunity to lead, influence, and innovate across both day-to-day operations and high-impact accounting policy decisions, we want to talk to you.
Responsibilities
- Prepare monthly and quarterly journal entries for revenue, deferred revenue, and AR; lead close workstreams including reconciliations, roll-forwards, and flux analysis.
- Reconcile revenue accounts, AR aging, deferred revenue balances, and other revenue-related accounts on a monthly basis; support revenue disclosures for management reporting and audits.
- Own the end-to-end AR process: invoice generation, cash application, aging management, and collections follow-up; monitor overdue accounts and drive timely resolution.
- Partner with Customer Success and Sales to resolve billing disputes, credits, and customer-specific payment terms; maintain AR KPIs including DSO and collections effectiveness.
- Ensure invoices are issued accurately and in alignment with executed contracts; review order forms to validate billing terms, pricing, and revenue treatment prior to invoicing.
-
Infrastructure Accounting Manager
San Francisco
unspecified $180K-$220K
About the Role
Together AI is looking for an Infrastructure Accounting Manager to help shape the future of the department by building scalable and reliable accounting processes for our rapidly growing capital asset and lease portfolios. This position will report to the Director of FinOps and Accounting and will own end-to-end accounting for Together AI's fixed assets, construction-in-progress (CIP), and lease obligations supporting our AI infrastructure. The ideal candidate is a highly skilled, detail-oriented professional with deep technical accounting expertise and a proactive, cross-functional mindset. If you thrive in a fast-paced environment with the opportunity to lead, influence, and innovate across both day-to-day operations and high-impact accounting policy decisions, we want to talk to you.
Responsibilities
- Own end-to-end accounting for Together AI's fixed asset portfolio, including GPU infrastructure, networking equipment, racks, and related data center assets; help build and maintain capitalization policy and asset classification guidelines, including useful life determinations, in accordance with ASC 360.
- Review and approve asset additions, transfers, disposals, and retirements; ensure accurate depreciation calculations and subledger-to-GL reconciliations; maintain and continuously improve capitalization policies and related internal controls.
- Manage accounting for Construction in Progress (CIP) related to GPU infrastructure deployments and data center buildouts, tracking asset commissioning and ensuring timely capitalization once assets are placed into service.
- Partner with engineering, infrastructure, and business operation teams to validate deployment milestones, cost allocations, and in-service dates; monitor CIP aging and ensure appropriate classification and capitalization timing.
- Manage accounting and
-
Forward Deployed Engineer (Inference & Post-Training)
San Francisco
unspecified $270K-$300K
About the role
As a Forward Deployed Engineer (FDE) focused on Inference & Post-Training, you will be a hands-on technical partner to our most strategic customers - production AI teams looking to leverage high-quality models and do inference at scale. For us, FDE is not a replacement for a Solutions Architect; you will partner with our SAs as a deep-domain specialist in inference optimization, fine-tuning pipelines, and production deployment. As key contributors to the CX, Engineering, and Sales organizations, FDEs add tremendous value by ensuring we can meet the requirements of our most complex POCs, facilitate successful platform adoption, and guide tailored optimization efforts - directly impacting customer success, company growth, and the hardening of our core platform.
Responsibilities
- Inference Engine Optimization: Select, configure, and optimize the inference engine based on hardware, model architecture, and workload profile
- Configuration & Performance Tuning: Develop configuration updates to win critical POCs and benchmarks and to optimize customer deployments; tune KV cache, apply speculative decoding, determine optimal tensor parallelism, and determine quantization strategy to hit throughput and latency targets.
- Post-Training & Fine-Tuning: Drive hands-on RL training runs and optimize system design; guide customers through LoRA, SFT, DPO, RLHF, and GRPO pipelines from experimentation through production.
- Strategic Customer Alignment: Act as the primary technical point of contact for aligned strategic accounts - monitoring and optimizing endpoint configurations, helping customers get the most out of the platform, and collaborating to ensure we hit critical milestones.
-
Technical Account Manager (TAM), GPU Cluster
San Francisco
unspecified $260K-$290K
About the role
As a TAM at Together AI, you will serve as the named technical owner for one of our most strategic customer relationships. You will be the primary technical point of contact across all infrastructure domains - compute, networking, storage, and facilities - ensuring flawless delivery and operational health of large-scale GPU deployments. This role sits at the intersection of deep infrastructure expertise and high-stakes customer partnership, making you a critical driver of both customer success and company growth.
Responsibilities
- Serve as the named technical point of contact for a dedicated strategic customer, owning the end-to-end technical relationship across compute, networking, storage, and facilities
- Drive structured engagement through regular cadences including status reporting, technical steering meetings, and executive business reviews
- Translate customer operational feedback into actionable input for Engineering, Product, and Infrastructure roadmaps
- Lead issue lifecycle management, escalation, and RCA authorship across all infrastructure domains in partnership with Support, SRE, DC Ops, and Engineering teams
- Own end-to-end RMA coordination and hardware lifecycle management, including acceptance testing, spare inventory management, and hardware health reporting for large-scale GPU deployments
- Maintain deep technical expertise across the customer's infrastructure stack - GPU compute, high-speed fabric, and large-scale storage systems - advising on configuration, operational best practices, and incident resolution
- Own the observability strategy for the customer estate, including alert policy definition, dashboard development, and proactive health management across all infrastructure layers
- Coordinate DC operations and facilities
-
Finance Analytics Engineer
San Francisco
unspecified $200K-$240K
About the Role
This is the first dedicated data hire on Together AI's Finance team. You will own the data layer that Finance runs on - building from scratch the models, pipelines, and reporting infrastructure that allow Strategic Finance, FP&A, and Accounting teams to get reliable answers quickly. The person in this role will have direct exposure to every part of the Finance organization and a real opportunity to shape how data-driven decision-making develops here as the company scales. A significant portion of your work will touch the data behind the economics of Together's infrastructure, which sits behind nearly every financial question we ask.
You will work closely with Together's Data and Commerce engineering team, which owns the underlying billing pipelines and data warehouse. Your job is to define and build the modeling and reporting layer that turns raw operational data into finance-grade datasets - aligning on data contracts, representing Finance's requirements in data design decisions, and ensuring the metrics Finance depends on are correct, documented, and trusted.
Responsibilities
- Own and evolve the dbt transformation layer for Finance: design, build, test, document, and maintain models covering billing, financial performance, compute unit economics, and operational metrics
- Author and maintain Airflow DAGs (Astronomer-managed) that orchestrate Finance dbt runs, data quality checks, and downstream dependencies reliably
- Deliver dashboards and reporting in Hex for the executive team covering financial performance, utilization, and key operating metrics
- Partner with Strategic Finance, FP&A, and Accounting teams on the
-
Staff Backend Engineer - Commerce
San Francisco
unspecified $230K-$270K
About the Role
Together AI is seeking a Staff Backend Engineer to own the technical vision, architecture, and execution of the commerce platform powering Together's Cloud products. As a staff engineer on the Commerce Engineering team, you will set the engineering direction for mission-critical capabilities - including usage-based billing, payment processing, customer-facing analytics, and product entitlements - while raising the bar for the engineers around you. This role is ideal for a seasoned engineer who can operate at multiple altitudes: driving system design and long-term architecture decisions while staying hands-on in code. You'll be the connective tissue between engineering, product, finance, and go-to-market - translating complex business requirements into durable, scalable backend primitives that directly impact revenue and customer experience. You'll be expected to define not just what we build, but how we build it well.
Requirements
- 8+ years of experience building large-scale, fault-tolerant API-driven services and backend systems
- Demonstrated track record of tech lead or staff-level scope - owning system architecture decisions, influencing team direction, and delivering multi-quarter initiatives
- Deep expertise in designing relational database schemas and backend APIs that support complex product and business requirements at scale
- Strong experience leading and influencing cross-functional teams - comfortable partnering with product, finance, and non-technical stakeholders
- Expert-level proficiency in one or more of Golang, TypeScript, Python, C++, or Java
- Deep knowledge of distributed systems tradeoffs: consistency, availability, fault tolerance, and performance at scale
- Strong understanding of low-level
-
Director, Data Center Operations
San Francisco
unspecified $250K-$300K
About the Role
Together AI is scaling its physical AI infrastructure rapidly - and we're looking for a Director of Data Center Operations to help us build it right. This is a ground-floor opportunity to own the operational foundation of Together's growing data center portfolio across the US and Asia. You'll be responsible for designing and commissioning white space deployments - taking pre-built environments and fitting them out with the power distribution, cooling distribution, and systems infrastructure needed to run high-density GPU workloads at scale. At the same time, you'll be building the break-fix and smart hands team from scratch: hiring, defining the playbook, and standing up the function that keeps our sites running around the clock.
This is not a steady-state operations role. It's a builder role. You'll be joining a small but fast-moving team, with real ownership over outcomes and the autonomy to shape how Together AI operates its physical infrastructure for years to come. If you've scaled data center infrastructure through hypergrowth before and want to do it again with more ownership - this is that opportunity.
Responsibilities
- Own the design, fit-out, and commissioning of white space sites across the US and Asia, with a focus on power distribution (PDUs), cooling distribution (CDUs), and IT-adjacent infrastructure
- Build and lead a ~20-person break-fix and smart hands team from scratch - define the operating model, hire the initial team, and establish the processes and playbooks that keep sites running
- Manage
-
Customer Support Engineer (GPU Cluster)
San Francisco
unspecified Salary not disclosed
About the Role
As a Customer Support Engineer at a pioneering AI company, you'll be the first line of defense to support customers as they build out training, fine-tuning, and inference solutions with Together AI. You'll dive deep into complex technical challenges, providing swift and effective solutions while serving as a product expert. As part of the Customer Experience organization, you will collaborate closely with product and sales, driving continuous improvement of our offerings. This is an exciting opportunity for a deeply technical professional passionate about AI and customer success to make a significant impact in a fast-paced, innovative environment.
Responsibilities
- Engage directly with customers to tackle and resolve complex technical challenges involving our cutting-edge Kubernetes GPU clusters; ensure swift and effective solutions every time.
- Become a product expert in our GPU Cluster service, serving as the last line of technical defense before issues are escalated to Engineering and Product teams.
- Collaborate seamlessly across Engineering, Research, and Product teams to address customer concerns; collaborate with senior leaders both internally and externally to ensure the highest levels of customer satisfaction.
- Transform customer insights into action by identifying patterns in support cases and working with Engineering and Go-To-Market teams to drive Together's roadmap (e.g., future models to support).
- Maintain detailed documentation of system configurations, procedures, troubleshooting guides, and FAQs to facilitate knowledge sharing with the team and customers.
- Be flexible in providing support coverage during holidays, nights and
-
Analytics Engineer — Data Warehouse
San Francisco
unspecified $130K-$170K
About the Role
Together AI is building high-performance inference compute and the software platform around it. We're looking for an early-career Analytics Engineer with strong fundamentals and high growth potential to grow into a technical lead over time. You'll contribute to designing and operating our data warehouse, ETL pipelines, and orchestration, work on core data models and metrics, and help raise the bar on data quality and governance across the org - with mentorship and support from experienced engineers.
Requirements
- 0-4 years of professional experience (or strong internships/projects) working with data warehouses, pipelines, or analytics engineering.
- Solid SQL fundamentals - you're comfortable writing queries and have some exposure to window functions or dimensional modeling concepts.
- Some hands-on experience with dbt or Airflow, or strong eagerness to learn - coursework and personal projects count.
- Basic Python for scripting and data tooling; any exposure to Spark (PySpark/SQL) is a plus.
- Familiarity with data modeling concepts like SCD2 or star schemas - even if only from coursework.
- Good communication skills: you can ask clarifying questions, explain your reasoning, and work with stakeholders to understand their needs.
- High standards for data quality, reliability, and maintainability - you care about getting things right.
Responsibilities
- Contribute to building and maintaining a medallion/curated data warehouse stack (bronze/silver/gold) for product, usage, billing, and operational data.
- Build and maintain Airflow-orchestrated pipelines and dbt transformation projects (modular, tested, documented).
- Help design analytics-ready