About the Role
We are looking for an experienced Site Reliability Engineer (SRE) to join the SAP for Me team. SAP for Me is SAP's strategic customer portal and serves as the central digital entry point for a customer's entire SAP relationship.
The platform provides a personalized and unified view of key customer assets and interactions, including licenses, cloud consumption, support cases, and system landscapes.
In this role, you will collaborate closely with platform engineers, product owners, and operations teams to drive reliability, scalability, observability, and automation across SAP for Me. You will support the SRE Champion in shaping and executing the SRE roadmap, ensuring sustainable service operations and continuous improvement.
Key Responsibilities
Support and advise product and platform teams in developing and maintaining highly available, reliable, and scalable systems.
Design and implement automation and tooling to improve operational efficiency in areas such as:
Monitoring and observability
Release management and deployment automation
Risk analysis and proactive problem prevention (e.g., design reviews, resource forecasting)
Participate in incident management and on-call rotations as needed, including root cause analysis and post-incident improvements.
Contribute to DevOps and SRE practices, including:
Kubernetes cluster management
CI/CD pipeline automation (GitHub Actions, ArgoCD)
Infrastructure as Code principles
Partner with cross-functional teams to define and maintain service-level indicators (SLIs) and service-level objectives (SLOs).
Promote a culture of reliability, resilience, and continuous improvement within the engineering organization.
Required Skills & Experience
6+ years of experience in SAP or related enterprise cloud environments.
Proven experience with observability and monitoring tools (Dynatrace, Grafana, Octobus).
Strong hands-on knowledge of Kubernetes, SAP Gardener, and SAP Kyma.
Solid understanding of DevOps methodologies, CI/CD pipelines (GitHub Actions, ArgoCD), and Infrastructure as Code.
Strong scripting and automation skills (e.g., Bash, Python).
Proficient in Linux-based systems administration and troubleshooting.
Excellent communication and collaboration skills, with the ability to work across multiple teams and stakeholders.
Fluent in English (both written and spoken).
Nice to Have
Familiarity with AIOps concepts and tools.
Experience working within large-scale enterprise environments or complex hybrid cloud architectures.
Consultant • Khet Pathum Wan, Bangkok