Senior AIOps Engineer, Incident Response

Technical Requirements

AWS, Jira, Confluence, DevOps, Observability, Incident Management, AI/LLM-powered systems

About Us

Quanata is on a mission to help ensure a better world through context-based insurance solutions. We are an exceptional, customer-centered team with a passion for creating innovative technologies, digital products, and brands.

The Role

We’re looking for an experienced production operations and reliability leader to help evolve Quanata’s operational support model through AI-driven automation and intelligent agent workflows. This role will own production health, incident response, and operational reliability while partnering closely with engineering and AI orchestration teams to improve scalability, reduce operational toil, and accelerate issue resolution.

Your Day-to-Day

  • Own production health, reliability, and operational support processes across critical systems and services
  • Lead incident response efforts, stakeholder communication, root cause analysis, and post-incident reviews
  • Identify patterns in production issues and drive improvements to reduce recurring incidents and operational overhead
  • Design and implement AI-driven agents and workflows that automate support and operational tasks
  • Partner with engineering, product, and AI orchestration teams to improve system resilience and operational efficiency
  • Build and maintain operational runbooks, documentation, and knowledge base content for both human and AI-assisted workflows
  • Support observability, monitoring, and troubleshooting efforts across cloud-based production environments
  • Participate in on-call rotations and continuously improve operational readiness and response processes

About You

  • 6–8 years of experience in production operations, site reliability engineering, technical support engineering, or similar operational roles
  • Strong background in incident management, root cause analysis, and production system troubleshooting
  • Experience working within modern SDLC, DevOps, and change management environments
  • Familiarity with operational tooling such as Jira, Confluence, and observability/monitoring platforms
  • Strong analytical and problem-solving skills

Bonus Points

  • Experience building or working with AI/LLM-powered systems, intelligent agents, or workflow automation tools
  • Familiarity with cloud platforms such as AWS and modern observability ecosystems
  • Experience with event-driven architectures, orchestration frameworks, or operational automation platforms