Software Engineer – SRE

Software Development
  • India

Maersk

About the Role

We are looking for a highly skilled Software Engineer with strong AI/ML expertise and a foundational understanding of SRE principles to help transform reliability engineering through intelligent, automation-driven solutions.

This role is not just about applying AI; it’s about applying an engineering mindset and AI capabilities to reliability problems. You should be comfortable writing clean, maintainable code and have an understanding of SRE principles such as observability, incident response, and automation. By combining software skills with practical knowledge of operational challenges, you’ll help eliminate toil, drive proactive reliability improvements, and embed intelligence into day-to-day engineering workflows.

Your efforts will directly contribute to unifying reliability efforts across teams, enabling consistent engineering standards, and fostering a shared accountability model for service health. By driving operational discipline and aligning reliability goals with business priorities, you will help create a culture where platform stability, developer productivity, and customer experience go hand in hand. These contributions will play a vital role in supporting the organization’s broader strategy, enabling faster innovation, scalable growth, and a resilient technology foundation aligned with long-term business outcomes.

Key Responsibilities

· Support initiatives to enhance SRE capabilities using AI/ML, ensuring strong foundations in reliability engineering and operational excellence.

· Leverage AI and machine learning technologies to architect and implement solutions that advance the overall SRE agenda—improving reliability, automation, observability, and operational efficiency across complex systems.

· Contribute to incident management, change management, and release processes—bringing structure, automation, and intelligent insights to drive stability, safety, and velocity.

· Participate in and drive key SRE practices and routines, including initiating and facilitating an SRE Community of Practice (CoP), aligning SLAs/SLOs, launching error budget governance, and enabling data-driven process improvements across reliability areas.

· Partner effectively with SREs, platform engineers, and data teams to develop production-grade, measurable, and reliable models and tools.

· Develop and maintain internal frameworks and tooling to accelerate AI/ML adoption across reliability use cases.

· Partner on, understand, and assist in driving Zero-Touch Operations by enabling platforms to detect, analyze, and resolve issues autonomously.

· Utilize metrics, logs, and historical incident data to build actionable insights and reliability dashboards.

· Actively participate in on-call rotations, improving incident response processes and escalation management.

· Integrate security best practices into workflows and collaborate with security teams to ensure platform stability.

· Contribute significantly to shaping the AI-in-SRE strategy and mentor junior team members.

Required Skills & Qualifications

· 3–5 years of experience as a software engineer or platform engineer, with a strong focus on building production-grade systems, developer tooling, or intelligent automation.

· LLM-Native Development Approach: Proficiency in leveraging LLM-powered tools (e.g., for research, code generation, or automation). Demonstrated experience building AI-assisted workflows or custom automations that enhance engineering efficiency, reduce manual effort, or accelerate operational tasks.

· Proficient in Python, Go, or equivalent, with strong software engineering fundamentals—testing, version control, CI/CD, and clean code practices.

· Understanding of core SRE principles (SLIs/SLOs, incident response, error budgets), with the ability to partner with SREs to productionize reliability tooling.

· Hands-on experience with cloud platforms (AWS, GCP, Azure), containers/orchestration (Docker, Kubernetes), and infrastructure-as-code patterns.

· Familiarity with observability and telemetry systems—building or integrating with tools like Prometheus, OpenTelemetry, or Elastic stack.

· Comfortable working with Linux-based systems, debugging performance issues, and understanding systems-level behavior.

· Ability to translate operational pain points into intelligent, automated solutions using software, AI, and data-driven techniques.

Preferred Qualifications

· Advanced SRE Practice Exposure: Familiarity with operating in mature SRE environments, such as participating in production readiness reviews, chaos engineering exercises, capacity planning, error budget governance, and operational health reviews.

· Exposure to building AI-assisted tools using LLMs, vector databases, or prompt engineering techniques to streamline engineering or operational workflows would be a big plus.

Maersk is committed to a diverse and inclusive workplace, and we embrace different styles of thinking. Maersk is an equal opportunities employer and welcomes applicants without regard to race, colour, gender, sex, age, religion, creed, national origin, ancestry, citizenship, marital status, sexual orientation, physical or mental disability, medical condition, pregnancy or parental leave, veteran status, gender identity, genetic information, or any other characteristic protected by applicable law. We will consider qualified applicants with criminal histories in a manner consistent with all legal requirements.

We are happy to support your need for any adjustments during the application and hiring process. If you need special assistance or an accommodation to use our website, apply for a position, or to perform a job, please contact us by emailing [email protected].

To apply for this job please visit www.maersk.com.
