Senior DevOps Engineer

Remote: Yes
Seniority Level: Mid-Senior level
Employment Type: Full-time
Locations: San Jose, Costa Rica
Department: Information Technology

What We Are Looking For

Neverfail is looking for a Sr. DevOps Engineer to join our team. This role blends security, DevOps, and application monitoring. Candidates must have a solid understanding of the entire container lifecycle.

You will be responsible for DevOps support and maintenance, and for automating manual processes to improve operational efficiency.

This role includes working within existing infrastructure, creating automation scripts, supporting new infrastructure, and providing 24x7 operational support as needed for the client in a dynamic and challenging environment.

Job Description

  • Experience with Infrastructure-as-Code and Release Management. Support the existing infrastructure and application stack, and automate processes to improve overall efficiency.
  • Improve operational performance of the existing platform, including upstream and downstream applications, and provide end-to-end hardware management and distributed systems support.
  • Monitor and diagnose system health, fix issues, and keep operations running smoothly to support application development and deliverables.
  • Partner with development and other project teams to improve system processes and provide the necessary support in day-to-day operations.
  • Perform regular maintenance tasks and monitor the health of automated and reporting systems.
  • Provide required on-call support and deliver assigned tasks on time.
  • Develop and implement standardized, automated operational and quality control processes to provide accurate data and timely reporting to meet SLAs.
  • Experience in performance tuning and timely troubleshooting of issues.
  • The ideal candidate should be experienced in designing and implementing entire software stacks that include application and system layers.
  • Experience supporting and maintaining a 24x7 production environment across multiple data centers and software/hardware infrastructure.
  • Perform periodic health checks and monitor the management tools for a 24x7 support environment.
  • Should be able to work in an efficient operations environment and support various critical tasks in a large, complex distributed system.

Minimum Requirements

  • Experience in DevOps Engineering.
  • Experience in Application (Software & Hardware) support and troubleshooting.
  • Experience in Amazon Web Services Ecosystem (EC2, RDS, S3, etc.).
  • Experience with Ansible, Bash/Shell/PowerShell.
  • Experience with cloud, container security, and PKI would be a plus.
  • Experience with EKS required.
  • Experience with Terraform, managing infrastructure as code.
  • Experience with Splunk or Sumo Logic.
  • Experience with PAM using HashiCorp Vault and AWS Secrets Manager.

Preferred Skills

  • AWS architect experience or certification(s), and familiarity with the Amazon Web Services ecosystem (EC2, RDS, S3, etc.).
  • Must have experience with Kubernetes and Docker, including a solid understanding of the entire container lifecycle.
  • Experience with system automation on RHEL 6/7 or other CentOS-flavored systems, and with Bash/Shell/PowerShell.
  • Should have prior experience configuring and maintaining systems.
  • Should possess very strong verbal and written communication skills.
  • Experience with GitHub Actions.