Senior Network Engineer

Technical Requirements

Cumulus Linux, BGP, EVPN, VXLAN, Python, Ansible, Terraform, RDMA

Who We Are

Lightning AI is the company behind PyTorch Lightning. Founded in 2019, we build an end-to-end platform for developing, training, and deploying AI systems—designed to take ideas from research to production with less friction.

Through our merger with Voltage Park, a neocloud and AI Factory, Lightning AI combines developer-first software with cost-efficient, large-scale compute. Teams get the tools they need for experimentation, training, and production inference, with security, observability, and control built in.

What We’re Looking For

Lightning AI is seeking a Senior Network Engineer with hands-on Cumulus Linux expertise to build and scale the network backbone behind our AI infrastructure platform. You’ll play a critical role in designing highly reliable, automated data center networks that support some of the most demanding AI workloads in the world.

What You'll Do

  • Design and deploy scalable spine/leaf network architectures for AI data centers
  • Engineer high-performance Ethernet fabrics supporting GPU clusters and AI workloads
  • Build and maintain EVPN/VXLAN, BGP, and high-speed routing environments
  • Optimize east-west traffic flows for AI training and inference operations
  • Support RoCE/RDMA networking and low-latency transport technologies
  • Support backbone, DCI, WAN, and edge connectivity solutions
  • Collaborate with compute, storage, AI platform, and operations teams to deliver integrated infrastructure solutions
  • Develop automation and Infrastructure-as-Code (IaC) solutions for network provisioning and operations
  • Troubleshoot complex network, performance, and congestion issues across distributed environments
  • Improve network observability, telemetry, and operational visibility

Required Qualifications

  • Experience with Cumulus NOS
  • 5+ years of experience in large-scale data center networking
  • Experience in spine-leaf architectures and L3 fabrics
  • Experience with BGP, EVPN, and VXLAN
  • Experience operating high-performance computing (HPC) or GPU-dense environments
  • Experience designing networks for hyperscalers, neoclouds, or high-scale SaaS infrastructure
  • Experience in automation with Python, Ansible, or Terraform
  • Experience with network observability tooling and telemetry pipelines

Ideal Experience

  • Familiarity with NVIDIA networking (Spectrum, Quantum, BlueField, etc.)
  • Familiarity with RDMA, RoCE, or InfiniBand fabrics
  • Experience with multi-region backbone design
  • Exposure to bare-metal provisioning systems