Core - Ceph expert for cluster re-design
Randstad (Switzerland) Ltd.
Published: 14 December 2024
Workload: 100%
Place of work: Worblaufen
Job details
Initial situation:
We have a Ceph solution in place that is complex, not scalable, and unstable. We cannot refactor it because the risk of bringing it down is too high.
We have new machines and would like to set up a new Ceph cluster based on a new architecture, using the latest Ceph components and best practices, to arrive at a less complex and more stable cluster that we can confidently scale, operate, and maintain.
Deliverables:
- Support in architecting and deploying a high-performance, highly available, Kubernetes-based Ceph storage solution for S3 and K8s persistent volumes, supporting internal ETL, data analytics, and AI use cases.
- Hands-on work automating deployment artefacts and developing a monitoring and alerting stack to operate our cluster.
- Provide LCM plans, including a set of automations that allow upgrades in rolling-restart mode without affecting users.
Definition of Done:
- Architectural choices are proposed and validated
- Deployment artefacts are tested to achieve an efficient deployment from scratch
- Monitoring and alerting stack is deployed on PROD
- LCM plans and automated upgrades of all components across the cluster are designed, implemented, and tested on PROD
- Ceph solution is proven in the production environment
Professional and technical requirements:
- Proficiency in Ceph technology deployed on Kubernetes
- Has designed, deployed, operated, and enhanced multiple Ceph solutions