The Senior Customer Reliability Engineer is responsible for owning complex customer and internal technical inquiries end-to-end, ensuring timely resolution while maintaining a high standard of technical accuracy and communication.
This role operates at the intersection of customers, Engineering, and Product, and requires both deep technical expertise and the ability to translate complex system behavior into clear, actionable guidance. Beyond resolving escalations, the Senior CRE proactively detects patterns across cases, drives root cause analysis, and partners with Engineering and Product to influence improvements.
The role contributes to multiple CRE pillars: Advanced Technical Support (independent mastery of complex cases), Product Quality Guardian (incident detection, communication review), Cross-Functional Hub (escalation point for internal teams), Operational Improvement (10–20% Productivity Engineer track), and early Product Contribution (small fixes, feature feedback).
Responsibilities
Pillar ① Advanced Technical Support
- Independently resolve complex technical inquiries across all product areas, taking the lead on the most difficult issues and customer situations.
- Own and resolve complex technical support cases and escalations across data pipelines, workflows, and integrations.
- Perform deep-dive investigations using SQL (Presto/Trino), logs, APIs, and internal tools to identify root causes.
- Get up to speed quickly on new products and newly emerging issues to ensure a timely response.
- Use observability tooling (e.g., Splunk, Datadog) to monitor application and system behavior in real time.
- Collaborate with global Customer Reliability teams (Japan, UK, US, Canada) for follow-the-sun coverage.
Pillar ② Product Quality Guardian
- Detect anomalies from inquiry patterns and initiate incident escalation with detailed technical analysis.
- Participate actively in incident response, coordinating customer communications and cross-team impact assessment.
- Review and refine customer-facing communications during incidents.
- Create and maintain internal and external documentation, including knowledge base articles and runbooks.
Pillar ③ Cross-Functional Hub
- Serve as an escalation point for internal inquiries from Sales, Customer Success, GTM, and other customer-facing teams.
- Mentor CREs at the same or lower tier; conduct case reviews and provide technical guidance.
Pillar ④ Operational Improvement
- Contribute to operational improvement projects (Productivity Engineer track: 10–20%), including AI-driven support workflows, prompt engineering, and KB optimization for AI retrieval.
- Participate in incident response and provide post-mortem analysis when required.
Requirements
- Hands-on troubleshooting experience with cloud-based data infrastructure, including SQL, APIs, and data pipelines
- Experience conducting research and analysis using AI tools
- Experience with scripting languages (Python, Ruby, Shell, etc.)
- Strong written and verbal communication skills with both technical and non-technical audiences
- Demonstrated track record of mentoring junior team members
- 2–5 years of relevant experience
Nice-to-haves
While not specifically required, tell us if you have any of the following.
- Experience with a Customer Data Platform (CDP)
- Hands-on experience with RESTful APIs using Postman or cURL
- Knowledge of distributed systems and data platforms
- Experience with observability tools (Splunk, Datadog, etc.)
- Familiarity with ticketing tools such as Jira or Zendesk