We are seeking a skilled and experienced Specialist DevOps Engineer who is responsible for building, managing, and scaling our cloud infrastructure, automating the deployment pipeline, and ensuring the reliability and availability of our services.
- Design, build, and maintain scalable cloud infrastructure on platforms like AWS, Azure, or Google Cloud.
- Implement Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or Ansible.
- Monitor, manage, and optimize system performance and resource utilization.
- Develop and manage Continuous Integration and Continuous Deployment (CI/CD) pipelines for automated build, test, and deployment processes.
- Collaborate with development teams to improve deployment efficiency and reduce manual intervention.
- Ensure pipelines are reliable, fast, and secure.
- Automate routine operational tasks using shell scripts, Python, or other relevant scripting languages.
- Implement automation for configuration management, scaling, monitoring, and system health checks.
- Set up, configure, and maintain monitoring tools such as Prometheus, Grafana, Datadog, or CloudWatch.
- Proactively identify performance bottlenecks, troubleshoot issues, and participate in root cause analysis.
- Ensure high availability and reliability of applications and services.
- Implement security best practices in cloud infrastructure, including firewalls, VPNs, encryption, and identity access management (IAM).
- Ensure compliance with relevant security standards and frameworks (e.g., ISO 27001, SOC 2, etc.).
- Perform security assessments, vulnerability scans, and patch management.
- Work closely with development, QA, and product teams to ensure smooth code releases and system scalability.
- Document processes, configurations, and system designs for cross-team knowledge sharing.
- Support the on-call rotation for production incidents and escalations.
- Analyze system performance and implement tuning for optimal efficiency.
- Automate scaling solutions to handle spikes in traffic and ensure high availability.