Kubernetes SRE (Private Cloud)

Career Techniques Inc
Published
February 5, 2025
Location
Dallas, TX - 5 days/week In-Office

Description

As a Kubernetes Site Reliability Engineer (SRE) focused on private cloud and on-premises environments, you will design, implement, and maintain highly reliable and scalable Kubernetes clusters. You will work closely with developers, infrastructure engineers, and security teams to ensure that these clusters meet performance, security, and compliance standards.

Responsibilities:

  • Kubernetes Architecture and Deployment: Design, implement, and manage Kubernetes clusters in private cloud/on-prem environments, optimizing for performance and scalability.
  • Infrastructure Optimization: Continuously evaluate and enhance on-prem Kubernetes infrastructure to ensure high performance and efficient resource utilization.
  • Automation and Deployment: Use Helm charts and other automation tools to streamline the deployment process and ensure consistent builds.
  • Monitoring and Troubleshooting: Actively monitor the health of Kubernetes clusters, quickly troubleshoot issues, and minimize downtime.
  • Security and Compliance: Ensure robust security measures, including proper implementation of TLS/SSL certificates and mutual TLS, and compliance with industry standards.
  • Documentation and Reporting: Maintain comprehensive technical documentation for private cloud/on-prem Kubernetes environments, including architecture diagrams and operating procedures.

 

Requirements:

  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
  • 3-4 years of hands-on experience with Linux, containerization, and orchestration tools (e.g., Docker, Podman, Kubernetes).
  • Proven experience managing Kubernetes clusters in private cloud/on-prem environments.
  • Proficiency with TLS/SSL certificates and mutual TLS.
  • Strong understanding of Kubernetes resources like Deployments, StatefulSets, ConfigMaps, and RBAC.
  • Familiarity with observability and GitOps practices.
  • Strong problem-solving and analytical skills, with a focus on resolving complex infrastructure issues.
  • Excellent communication and collaboration skills, with experience working in cross-functional teams.
  • Ability to adapt quickly in a fast-paced, evolving technological landscape.
  • Experience with Service Mesh tools (e.g., Linkerd, Istio).
  • Familiarity with DevOps tools (e.g., Jenkins, Ansible, Terraform).
  • Knowledge of storage and networking solutions for on-prem environments.
  • Self-motivated, with the ability to work independently.
  • Strong team player with a collaborative approach.
  • Detail-oriented, with a focus on operational excellence and process documentation.
  • Adaptable and able to manage shifting priorities in a fast-paced environment.
