Job Description: Database SRE (Bridge)
The Bridge Engineering team at Instructure is looking for an SRE with a special focus on databases and stateful services to help us grow our product, scale our systems and empower our feature teams.
This role is a blend of SRE, software engineering and database administration designed to bring special expertise on all things data to our team. If you have less experience managing databases but demonstrate strong SRE skills, we still want to hear from you!
What We Do
Bridge is a tool that helps people find their place at work, form meaningful relationships with peers and managers, and forge a path towards growth. We’re helping our customers create work cultures people love.
Who We’re Looking For
- A problem solver who asks questions to get at the core issue that the team is grappling with before deciding on a solution.
- A pragmatist who knows how to make trade-offs to solve challenges while building an architecture that scales for the future.
- A systematic thinker who can understand how the larger system operates and knows when to take a step back and consider alternative approaches.
- A team player who loves teaching and learning from others.
What We Offer
- Experience working on a highly available business-to-business (B2B) software as a service (SaaS) product with thousands of active customers.
- Competitive compensation package
- Flexible work environment
- Quarterly hack week events
What You’ll Be Doing
- Owning cloud operations for dozens of PostgreSQL databases backing services in multiple regions, environments and language stacks.
- Optimizing databases for speed and reliability for large datasets spread out across multiple schemas and clusters.
- Configuring database observability systems to identify incidents before they happen.
- Discovering database-related problem areas (e.g. slow queries, high resource saturation) and working with service owners to resolve or mitigate them.
- Managing other stateful services such as ElastiCache and DynamoDB.
- Implementing automation to reduce toil and enable healthy data systems by default.
- Working alongside a highly skilled SRE team running services in multiple Kubernetes clusters.
- Building tools and resources for upskilling other engineering teams to make database creation and maintenance self-service.
- Optimizing the cost of cloud data operations.
- Responding to incidents and contributing to a culture of continuous improvement, with occasional participation in 24/7 on-call rotations.
- Helping to architect data analytics pipelines and data warehouse operations.
- Shaping data-related policies such as backup cadence, retention, security best practices and disaster recovery plans.
What You’ll Need
- At least 3-5 years of experience running production systems at scale as an SRE or senior engineer.
- Deep understanding of at least one modern programming language (Ruby, Go, Java, etc.).
- Deep knowledge of SQL database operations and optimization. Experience with PostgreSQL is a plus.
- Knowledge of cloud providers (AWS preferred; Azure or Google Cloud also welcome).
- Familiarity with cloud networking configuration (VPCs, security groups, load balancers, DNS, etc.).
- Familiarity with system observability through monitoring and alerting tools such as Datadog or Sentry.
- Ability to work with a globally distributed team in multiple time zones.
- Experience with configuration-as-code tools such as Terraform.
- Experience with Kubernetes or other container orchestration systems.
- Experience with streaming data services like Pulsar, Kafka or Kinesis is a plus.
- Ability to speak fluent English.