Job Title | [Remote] Site Reliability Engineering |
Company | NSC Software - Premier Software Development Company |
Job Location | |
Workplace Type | |
Job Type | Full-time |
Job Category | Engineering and Information Technology |
Min Pay | 0 |
Max Pay | 0 |
Pay Currency | |
Pay Cycle | |
Last Seen | 2 day(s) ago |
Description | Company Description
NSC Software was founded with the belief that highly qualified Vietnamese IT resources could be provided to enterprises of all sizes worldwide. Since our inception, we have worked with the best talent in the country to deliver solutions that exceed our clients' business needs and expectations. We continuously expand our resource pool, improve our offerings, optimize our delivery processes, and master new cutting-edge technologies to achieve this goal.

Why should you join us?
- Remote work, flexible working environment
- Attractive salary package
- Technical training and certifications
- Global career opportunities
- Enhance autonomy and independence

Responsibilities
As a Site Reliability Engineer at NSC, you will have the opportunity to work remotely with fixed working hours from 3PM to 11PM Vietnam time. You will take ownership of our AWS-based infrastructure, automate reliability practices, and ensure our platform meets high standards of uptime, observability, and scalability. You will collaborate closely with software engineers, particularly on Node.js-based services, to design, build, and operate production systems with a strong focus on automation and resiliency.

Main responsibilities
- Architect, build, and maintain infrastructure on AWS using best practices (VPC, EC2, RDS, S3, IAM, ALB, etc.)
- Manage EKS (Elastic Kubernetes Service) clusters, implement Helm charts, and ensure smooth deployment of containerized services
- Work with Node.js applications in production environments, ensuring high availability, performance, and smooth deployments
- Design and maintain CI/CD pipelines using tools like GitLab CI, CodePipeline, or Jenkins
- Set up and manage monitoring, logging, and alerting systems using CloudWatch, Prometheus/Grafana, or third-party tools like Datadog
- Define and monitor SLIs/SLOs, manage error budgets, and participate in on-call rotations
- Implement and manage infrastructure as code using Terraform, CloudFormation, or Pulumi
- Perform root cause analysis and postmortems of production incidents to drive reliability improvements
- Implement cost optimization strategies across AWS services
- Collaborate with development and QA teams to improve release velocity while ensuring system stability

Job Requirements
Skills & Qualifications
- 3+ years of experience in SRE, DevOps, or cloud infrastructure roles
- Hands-on experience managing infrastructure on AWS Cloud
- Proficient in managing Kubernetes (EKS) clusters in production environments
- Strong experience with Node.js in production systems (debugging, performance tuning, deployment best practices)
- Strong scripting skills in Python, Bash, or similar
- Solid understanding of Linux systems, networking, and cloud security concepts
- Experience with Terraform, Ansible, or other IaC tools
- Familiarity with system observability and alerting (CloudWatch, ELK, Prometheus, Grafana)
- Strong knowledge of CI/CD workflows and DevOps culture

Nice-to-have skills and experience
- AWS certifications (e.g. AWS Certified DevOps Engineer, Solutions Architect Associate)
- Experience with AWS Lambda, API Gateway, Step Functions, or other serverless services
- Experience with ArgoCD, Flux, or other GitOps tools
- Experience with chaos engineering, load testing, or capacity planning
- Familiarity with security best practices, IAM policies, secret management

WHY YOU WILL LOVE WORKING WITH US
Compensation and Benefits
- Competitive Compensation: up to 2500 USD (negotiable)
- Flexible Work Arrangements: Work remotely
- Working time: 5 days/week (Monday to Friday)
- Attractive Benefits: 13th-month bonus, social insurance
- Opportunity to work within a professional and multicultural environment
- Enhance English skills daily with global team
- Assistance and support through all aspects of the onboarding process

Personal Growth
- Company Team Building Trip every year
- Training sponsorship programs
- Professional and dynamic working environment
- Mental health support at work

Health care and Annual paid leave
- Private health insurance
- Social insurance
- Unemployment Insurance
- Parental paid Leave: 5 days
- Vacation Leave: 12 days per year
- Medical Leave: 8 days per year

Email: duong.na@nscsoftware.com |