Last updated: 2025-05-22

47 Site Reliability Engineer jobs in San Jose.

Hiring now: Platformsite Reliability @ Zoox, Site Reliability Engr @ Replit, Infrastructure Team Membe @ Neuralink, Sr Software Engr @ Personalis, Production Engr @ Meta, Staff Software Engr Relia @ Robinhood, Performance Reliability E @ Aerospike, Site Reliability Engr SRE @ Coupang, Software Engr In Reliabil @ NewsBreak, Staff Software Engr @ Intuit. Explore more at kaamvaam.com

🔥 Skills

Kubernetes (21) DevOps (12) SRE (11) AWS (10) Python (9) Site Reliability Engineering (8) monitoring (8) automation (8) Terraform (7) scalability (7)

šŸ“ Locations

Palo Alto (7) San Jose (7) Mountain View (6) Santa Clara (6) Foster City (4) Menlo Park (4) Fremont (3) Redwood City (3) San Mateo (3) Sunnyvale (3)

Zoox

Skills & Focus: site reliability engineer, uptime, autonomous vehicles, fault-tolerant systems, deployment, operation, data-processing pipelines, compute-intensive tasks, CPUs, GPUs
About the Company: Zoox is a robotics company focused on developing autonomous vehicles with an ethos of automation throughout the infrastructure components they build.

Replit

Skills & Focus: Site Reliability Engineering, SRE, Infrastructure Automation, Monitoring Solutions, Infrastructure as Code, CI/CD Pipelines, Incident Management, Performance Optimization, Distributed Systems, Cloud-native Technologies
About the Company: Replit is the fastest way to turn ideas into software. With our powerful AI-powered Agent and Assistant, anyone can create and launch apps from natural languag…
Experience: 3+ years of experience in Site Reliability Engineering or similar roles (DevOps, Systems Engineering, Infrastructure Engineering)
Type: Full-time
Benefits: Flexible Work Hours, Competitive Salary & Equity, Home Office Set-Up Stipend, Health, Dental, Vision and Life Insurance…

Zoox

Skills & Focus: IT Technical Operations, real-time command center, monitoring services, Site Reliability Engineering (SRE), Technical Operations Engineering, stability, live robot missions, strategic initiatives, innovative solutions, reliability and performance
Skills & Focus: Site Reliability Engineering, Autonomous Vehicles, Microservice Architecture, Kubernetes, Data Pipelines, Performance Metrics, Linux, Python, C/C++, AWS
About the Company: Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the…
Experience: 2+ years
Salary: $210,000 to $250,000
Type: Full-time
Benefits: A comprehensive package including paid time off, health insurance, long-term and short-term disability insurance, life …

Neuralink

Skills & Focus: software engineering, cloud architecture, infrastructure, networking protocols, Linux systems, hybrid cloud, security fundamentals, IAC tools, cryptographic protocols, systems administration
About the Company: We are creating devices that enable a bi-directional interface with the brain. These devices allow us to restore movement to the paralyzed, restore sight to th…
Experience: Experience building hybrid cloud/on-prem infrastructure, software engineering skills, and system administration experience.
Salary: $35/hr USD
Type: Full-time
Benefits: An opportunity to change the world, growth potential, excellent medical/dental/vision insurance, paid holidays, commute…

Personalis, Inc

Skills & Focus: software engineering, LIMS, CI/CD pipelines, Python, Java, PostgreSQL, MySQL, Flask, Django, site reliability engineering
About the Company: Personalis is transforming the active management of cancer through breakthrough personalized testing, focusing on cancer management and patient care.
Experience: 5+ years of experience in software engineering, site reliability engineering, and/or devops.
Salary: $147,000 to $180,000 per year
Type: Full-time
Benefits: Competitive compensation package and benefits including medical, dental, vision, 401(k) match, ESPP, tuition reimbursem…

Neuralink

Skills & Focus: software engineering, networking protocols, Linux systems, cloud infrastructure, system administration, DevOps, automating processes, cryptographic protocols, production environments, Brain-Computer Interface (BCI)
About the Company: We are creating devices that enable a bi-directional interface with the brain. These devices allow us to restore movement to the paralyzed, restore sight to th…
Experience: Robust software engineering skills, experience in Linux systems, cloud/on-prem infrastructure.
Salary: $116,000 - $235,000 USD
Type: Full-time
Benefits: Medical, dental, and vision insurance, paid holidays, commuter benefits, meals provided, equity + 401(k) plan, parental…

Meta

Skills & Focus: Production Engineering, DevOps Engineer, Site Reliability Engineer, UNIX, TCP/IP, Python, Kubernetes, Terraform, MySQL, Infrastructure Management
About the Company: Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Ap…
Experience: 2+ years of experience in UNIX and TCP/IP network fundamentals, 2+ years of coding experience
Salary: $117,000/year to $173,000/year + bonus + equity + benefits
Type: Full-time
Benefits: Meta offers various benefits including bonuses and equity options.

Robinhood Markets

Skills & Focus: reliability, scalability, performance, security, distributed systems, programming languages, Linux, networking, incident metrics, monitoring
About the Company: Robinhood Markets was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood…
Experience: 8+ years
Salary: $217,000 - $255,000 USD
Type: Full-time
Benefits: 100% paid health insurance for employees with 90% coverage for dependents; Annual lifestyle wallet for personal wellnes…
Skills & Focus: reliability, software engineering, systems operations, incident metrics, production readiness, black box monitoring, infrastructure, Kubernetes, cloud computing, system resilience
About the Company: Join a leading fintech company that’s democratizing finance for all. With customers at the heart of our decisions, Robinhood is lowering barriers and providing…
Experience: 8+ years experience
Salary: $217,000 - $255,000 USD
Type: Full-time
Benefits: 100% paid health insurance for employees with 90% coverage for dependents, annual lifestyle wallet for personal wellnes…
Skills & Focus: reliability, scalability, performance, security, software engineering, distributed systems, incident metrics, operational excellence, mentoring, infrastructure
About the Company: Robinhood Markets was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood…
Experience: 8+ years experience in designing, building, and maintaining large-scale, distributed systems
Salary: $217,000 - $255,000 USD (Zone 1); $190,000 - $224,000 USD (Zone 2); $169,000 - $199,000 USD (Zone 3)
Type: Full-time
Benefits: 100% paid health insurance for employees with 90% coverage for dependents; Annual lifestyle wallet for personal wellnes…

Aerospike

Skills & Focus: performance engineering, reliability, distributed systems, database concepts, performance tuning, Linux/Unix, observability tools, problem-solving, collaboration, communication
About the Company: Aerospike, a leader in next-generation, always-on, hyperscale data solutions, enables extreme-scale, real-time applications for various industry leaders.
Experience: Experience with distributed systems or large-scale services, preferably in a production setting.
Salary: $140,000 - $175,000
Type: Full-time
Benefits: Equal Opportunity Employer, commitment to a non-discriminatory environment.

Coupang

Skills & Focus: Site Reliability Engineering, Automation, Infrastructure Automation, Cloud-based Infrastructure, DevOps, CI/CD, Kubernetes, Observability, Large-Scale Systems, E-commerce
About the Company: Coupang is a large-scale e-commerce company, operating complex systems to deliver mission-critical services.
Experience: 10+ years of industry experience building and operating large-scale distributed systems.
Type: Full-time

NewsBreak

Skills & Focus: AWS, Kubernetes (EKS), EMR (Elastic MapReduce), service reliability, fault-tolerant architectures, Infrastructure-as-Code (IaC), CI/CD pipelines, monitoring tools (Prometheus, Grafana), high-availability strategies, incident response
About the Company: NewsBreak is redefining the way users interact with local news and their communities. By bridging local users, local content creators, and local businesses, ou…
Experience: 2+ years in SRE, DevOps, or Infrastructure Engineering roles
Salary: $130,000 – $260,000 USD
Type: Full-time
Benefits: Discretionary bonus and options may also be available; overall rewards package designed to attract top talent.

Intuit

Skills & Focus: Kubernetes, AWS, DevOps, Platform Engineering, Reliability Engineering, Cloud Architecture, Automation, Observability, Incident Management, Data Analysis
About the Company: Intuit is the global financial technology platform that powers prosperity for the people and communities we serve. With approximately 100 million customers wor…
Experience: 7+ years
Salary: $184,500 - $250,000
Type: Full-time
Benefits: Cash bonus, equity rewards and benefits

Coupang

Skills & Focus: observability solutions, monitoring, alerting, logging, tracing, Kubernetes, DevOps, SRE practices, cloud-based infrastructure, performance indicators
About the Company: Coupang is a leading force in South Korean commerce, known for its exceptional customer service and innovative approach to retail and e-commerce. The company b…
Experience: Strong experience in implementing and managing observability solutions in large-scale, complex environments.
Salary: $159,000 - $324,000/year
Type: Full-time
Benefits: Medical/Dental/Vision/Life insurance, Flexible Spending Accounts, Long-term/Short-term Disability, Employee Assistance …
Skills & Focus: site reliability engineering, performance, distributed systems, large-scale systems, project management, security, privacy, compliance, stakeholders, scalability
About the Company: One of the fastest-growing retail companies, disrupting the commerce industry from South Korea and combining startup culture with large global resources.
Experience: Minimum 12 years managing large-scale cross-functional projects
Salary: $159,000 - $324,000 per year
Type: Full-time
Benefits: Medical/Dental/Vision/Life, AD&D insurance, FSA & HSA, Disability insurance, EAP, 401K with match, PTO, public holidays…

Moody's Shared Services, Inc.

Skills & Focus: Design, Build, Operate, System operation, Monitoring, Hardware upgrades, Disaster recovery, Vendor communication, Big data Spark clusters, Kubernetes
Experience: At least two (2) years as a Systems Engineer or related role
Salary: $110,032 - $220,250/yr
Type: Full-time
Benefits: Medical, dental, vision, parental leave, paid time off, 401(k), life, disability, and accident insurance, stock purchas…

Hippocratic AI

Skills & Focus: infrastructure automation, deployment pipelines, monitoring, scalable systems, cloud platforms, Kubernetes, Terraform, Ansible, Jenkins, security compliance
About the Company: Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare. The company believes that a safe LLM can dramatically improve healthca…
Experience: At least 5 years of professional experience in DevOps engineering or a related field
Type: Full-time
Skills & Focus: ML Infrastructure, Kubernetes, Terraform, multi-cloud environments, orchestration platform, cloud platforms, resource optimization, automation, system health monitoring, capacity planning
About the Company: Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare. The company believes that a safe LLM can dramatically improve healthca…
Experience: 3-5 years
Type: Full-time
Skills & Focus: infrastructure automation, Kubernetes, DevOps, monitoring, scalability, cloud platforms, security compliance, deployment pipelines, disaster recovery, mentorship
About the Company: Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare, aiming to improve accessibility and outcomes by applying deep healthca…
Experience: At least 5 years of professional experience in DevOps engineering or a related field
Type: Full-time

Luma AI

Skills & Focus: Site Reliability Engineer, SRE, Infrastructure, GPU clusters, H100 GPUs, Monitoring tools, Management tools, Performance problems, Maintenance problems, Data Processing

Glean

Skills & Focus: SRE, cloud infrastructure, automation, monitoring, incident management, performance optimization, scalability, security compliance, software development, cloud platforms
About the Company: We’re on a mission to make knowledge work faster and more humane. We believe that AI will fundamentally transform how people work.
Experience: 8+ years of experience in a senior-level role within Site Reliability Engineering or similar role
Salary: $155,000 - $250,000 annually
Type: Full-time
Benefits: Competitive compensation, Medical, Vision and Dental coverage, Flexible work environment and time-off policy, 401k, Com…

Luma AI

Skills & Focus: SRE, GPU, infrastructure, monitoring, cloud providers, automation, scalability, containerization, observability, problem-solving
Experience: 5+ years
Type: Full-time
Skills & Focus: SRE, Infrastructure, GPU clusters, H100 GPUs, Training, Data Processing, Monitoring, Management tools, Performance, Maintenance

Celonis

Skills & Focus: Site Reliability Engineering, SRE principles, observability, automation, incident prevention, cloud platforms, Java, Python, Kubernetes, error budgets
About the Company: Celonis helps some of the world’s largest and most esteemed brands make processes work for people, companies, and the planet. With over 5,000 enterprise custom…
Experience: 8+ years of experience in software engineering or SRE roles.
Salary: $195,000 - $235,000 USD
Type: Full-time
Benefits: Great compensation and benefits packages (equity, life insurance, time off, generous leave for new parents from day one…

Box

Skills & Focus: SRE, reliability, scalability, cloud-native, Kubernetes, AWS, GCP, observability, automation, distributed systems
About the Company: Box (NYSE:BOX) is the leader in Intelligent Content Management. Our platform enables organizations to fuel collaboration, manage the entire content lifecycle, …
Experience: 5+ years of working experience designing, developing, and operating large-scale, customer-facing products or services
Type: Full-time
Benefits: Equity and benefits including healthcare benefits.

Celonis

Skills & Focus: Site Reliability Engineering, Microservices, Kubernetes, Automation, Incident management, Cloud computing, Java, Python, Observability, CI/CD
About the Company: Celonis helps some of the world’s largest and most esteemed brands make processes work for people, companies and the planet. With over 5,000 enterprise custome…
Experience: Minimum of 5 years of experience building and maintaining cloud-based software applications.
Salary: $160,000 - $210,000 USD
Type: Full-time
Benefits: Great compensation and benefits packages (equity, life insurance, time off, generous leave for new parents, etc.).

Astronomer

Skills & Focus: Reliability Engineering, SRE, Cloud-native, Automation, Observability, Scalability, Incident Management, Service Uptime, Distributed Systems, Team Leadership
About the Company: Astronomer empowers data teams to bring mission-critical software, analytics, and AI to life and is the company behind Astro, the industry-leading unified Data…
Experience: 10+ years in software engineering, SRE, or DevOps roles; 5+ years in technical leadership
Salary: $260,000 - $290,000 plus equity
Type: Full-time

NetApp, Inc.

Skills & Focus: Cloud, SRE, Kubernetes, Automation, Incident Response, Monitoring, Security, Troubleshooting, Infrastructure, DevOps
About the Company: NetApp is the intelligent data infrastructure company, turning a world of disruption into opportunity for every customer. They help their customers identify an…
Experience: 8-12 years
Type: Full-time
Benefits: Volunteer time off, employee assistance programs, health care, life and accident plans, legal and financial services, p…

Leading Destination For Short-Form Mobile Video

Skills & Focus: Site Reliability Engineering, service lifecycle, cloud-managed infrastructure, Kubernetes, Redis, MySQL, Flink, automate scaling systems, distributed systems, problem solving
About the Company: The largest unicorn startup and the leader in short-form video hosting, with approximately 1.5 billion monthly active users worldwide.
Experience: 5+ years
Type: Full-time
Benefits: 100% premium coverage for employee medical insurance, approximately 75% premium coverage for dependents, dental, vision…

NetApp

Skills & Focus: Cloud, Software Engineering, SRE, Incident Management, Observability, Application Security, Python, Golang, DevSecOps, Virtualization
About the Company: NetApp is the intelligent data infrastructure company, turning a world of disruption into opportunity for every customer. No matter the data type, workload or …

NetApp, Inc.

Skills & Focus: Kafka, Cloud Operations, Linux, AWS, Azure, GCP, Docker, Python, Java, Cluster Operations
About the Company: NetApp is the intelligent data infrastructure company, turning a world of disruption into opportunity for every customer. They help customers identify and real…
Experience: Minimum of 5 years; 5 to 8 years highly desirable
Type: Full-time
Benefits: Volunteer time off, health care, life and accident plans, emotional support resources, legal services, financial saving…

GoodLeap

Skills & Focus: Site Reliability Engineer, software engineering, system engineering, automation, monitoring, incident response, infrastructure management, DevOps, observability, AWS
About the Company: GoodLeap is a technology company delivering best-in-class financing and software products for sustainable solutions, from solar panels and batteries to energy-…
Salary: $97,000 - $141,000 a year
Type: Full-time

Arkose Labs

Skills & Focus: Platform Engineering, Infrastructure, Site Reliability, Cloud Infrastructure, Incident Response, AWS, Azure, Distributed Systems, CI/CD, Infrastructure-as-Code
About the Company: Arkose Labs protects enterprises from cybercrime and abuse, offering the world's first $1M warranties for credential stuffing and SMS toll fraud. They have a s…
Experience: 5+ years of leadership experience in Platform, Infrastructure, SRE, or related fields; 10+ years of experience in software engineering.
Salary: $270,000 - $350,000
Type: Full-time
Benefits: Competitive salary + Equity; 401k plan; Robust benefits package (85% medical, dental, vision for employees; 75% for dep…

Xero

Skills & Focus: Product SRE, SRE engineers, reliability, Observability, high performing services, Engineering, high performing teams, Product SRE strategy, transformation, expert communicator
About the Company: Xero helps businesses by automating routine tasks and connecting them with the right data, advisors, and apps, ultimately contributing to a stronger economy.
Experience: Strong Engineering background, deep experience in SRE

Sustainable Talent

Skills & Focus: Infrastructure, Data Centers, Hardware, Software, Networking, Troubleshooting, DevOps, Maintenance, Collaboration, Testing
About the Company: Sustainable Talent is a staffing agency partnered with Nvidia, focusing on providing talent for tech roles in infrastructure and data centers.
Experience: 4+ years of equivalent experience in a Lab or Datacenter environment.
Salary: $70/hr - $80/hr
Type: Full-time
Benefits: Full benefits, PTO, and amazing company culture.

Palo Alto Networks

Skills & Focus: DevOps, Site Reliability Engineering, Cortex, Security, Engineering Management, Cloud, Platforms, Production Operations, AI, Software Development
About the Company: Palo Alto Networks is a cybersecurity company that offers advanced firewalls and cloud-based security services to secure the digital transformation.
Type: Full-time
Skills & Focus: Site Reliability Engineering, DevOps, cloud-native applications, AWS, GCP, Terraform, Kubernetes, automation, programming languages, CI/CD
About the Company: Palo Alto Networks is a cybersecurity company that aims to redefine protection and security in the digital age. Their mission is to be the cybersecurity partne…
Experience: 4+ years as an engineer in Infrastructure, Operations, DevOps, or System Engineering; 2+ years building high availability, scalable cloud-native applications on AWS and GCP
Type: Full-time
Benefits: FLEXBenefits wellbeing spending account, mental and financial health resources, personalized learning opportunities

ServiceNow

Skills & Focus: Machine Learning, AI, infrastructure, platform, deployment, observability, GPU, scalable, code reviews, SRE
About the Company: PLATO (Platform Engineering and AI Technology Organization) at ServiceNow is a customer-focused innovative group building intelligent software using a variety …

Palo Alto Networks

Skills & Focus: DevOps, SRE, Cloud infrastructure, Automation, Terraform, Kubernetes, GitLab CI/CD, Monitoring, Security, Reliability
Skills & Focus: Site Reliability Engineer, DevOps, Cloud infrastructure, Automation, Kubernetes, GCP, AWS, Python, Docker, Terraform
About the Company: Palo Alto Networks is a cybersecurity company committed to protecting our digital way of life. The company aims to redefine cybersecurity standards and focuses…
Experience: BS or MS in Computer Science, a related field, or equivalent professional experience
Salary: $160,000 - $225,000/YR
Type: Full-time
Benefits: FLEXBenefits wellbeing spending account, mental and financial health resources, personalized learning opportunities

Apple

Skills & Focus: Postgres, Database, AWS, Kubernetes, High Availability, Replication, Performance Tuning, Disaster Recovery, Backup, Cloud Infrastructure
About the Company: Apple Inc. is a leading technology company known for its innovative products and services.
Experience: 5-15 years supporting Postgres databases in a high volume environment
Type: Full-time

Google

Skills & Focus: software development, site reliability development, coding, algorithms, complexity analysis, large-scale systems, automation, system capacity, performance optimization, team collaboration
About the Company: Google is a global technology company that specializes in Internet-related services and products, which include search engines, online advertising technologies…
Experience: Experience with data structures/algorithms and software development in one or more programming languages.
Salary: $118,000-$170,000
Type: Full-time
Benefits: bonus + equity + benefits