Kubernetes SRE (Private Cloud)

Career Techniques Inc
Published
February 5, 2025
Location
Dallas, TX - 5 days/week In-Office

Description

As a Kubernetes Site Reliability Engineer (SRE) focused on private cloud and on-premises environments, you will design, implement, and maintain highly reliable and scalable Kubernetes clusters. You will work closely with developers, infrastructure engineers, and security teams to ensure that these clusters meet performance, security, and compliance standards.

Responsibilities:

  • Kubernetes Architecture and Deployment: Design, implement, and manage Kubernetes clusters in private cloud/on-prem environments, optimizing for performance and scalability.
  • Infrastructure Optimization: Continuously evaluate and enhance on-prem Kubernetes infrastructure to ensure high performance and efficient resource utilization.
  • Automation and Deployment: Use Helm charts and other automation tools to streamline the deployment process and ensure consistent builds.
  • Monitoring and Troubleshooting: Actively monitor the health of Kubernetes clusters, quickly troubleshoot issues, and minimize downtime.
  • Security and Compliance: Implement robust security measures, including correctly configured TLS/SSL certificates and mutual TLS, and maintain compliance with industry standards.
  • Documentation and Reporting: Maintain comprehensive technical documentation for private cloud/on-prem Kubernetes environments, including architecture diagrams and operating procedures.

Requirements:

  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
  • 3-4 years of hands-on experience with Linux, containerization, and orchestration tools (e.g., Docker, Podman, Kubernetes).
  • Proven experience managing Kubernetes clusters in private cloud/on-prem environments.
  • Proficiency with TLS/SSL certificates and mutual TLS.
  • Strong understanding of Kubernetes resources like Deployments, StatefulSets, ConfigMaps, and RBAC.
  • Familiarity with observability and GitOps practices.
  • Strong problem-solving and analytical skills, with a focus on resolving complex infrastructure issues.
  • Excellent communication and collaboration skills, with experience working in cross-functional teams.
  • Ability to adapt quickly in a fast-paced, evolving technological landscape.
  • Experience with Service Mesh tools (e.g., Linkerd, Istio).
  • Familiarity with DevOps tools (e.g., Jenkins, Ansible, Terraform).
  • Knowledge of storage and networking solutions for on-prem environments.
  • Self-motivated, with the ability to work independently.
  • Strong team player with a collaborative approach.
  • Detail-oriented, with a focus on operational excellence and process documentation.
  • Adaptable and able to manage shifting priorities in a fast-paced environment.

Related Jobs

Site Reliability Engineer - Kubernetes   Dallas, TX - 5 days/week In-Office
January 23, 2025
Sr. Network Engineer (Datacenter)   Dallas, United States of America
January 10, 2025
Cloud Infrastructure Engineer   Dallas, United States of America
January 9, 2025
Cloud Architect   Portsmouth, Orlando, Miami, Chicago, Atlanta, Dallas, Ottawa Canada, United States of America
January 9, 2025
Sr. Cloud Platform Engineer   Evanston IL, Princeton NJ, Philadelphia PA, United States of America
January 1, 2025