Job Description
Introduction
IBM Cloud Object Storage is designed for the most demanding data needs. Our team is dedicated to building and maintaining a highly available and scalable storage solution that powers businesses around the world. We are looking for a passionate DevOps Engineer to join our team and help ensure the reliability, performance, and efficiency of our services.
Your Role and Responsibilities
As an engineer within the IBM Cloud Object Storage (COS) group, you will play a critical role in deploying, maintaining, and improving the reliability and performance of our cloud storage solutions. You will leverage your systems administration, DevOps, and networking expertise to ensure our systems are robust, scalable, and efficient. The successful candidate will work directly with their own team and with various teams within IBM Cloud to ensure the stability and availability of the Cloud Object Storage offering. IBM is seeking creative and responsible talent who can work well with others and build an enterprise-class service. If you want the opportunity to grow your technical and professional skills while helping customers succeed, IBM is the place for you. If you are excited about the opportunity to dig into challenging operational issues and to help customers build the next generation of web applications, IBM Cloud is a great fit.
Key Responsibilities:
The DevOps Engineer should continuously consider the best way to deploy and maintain our storage offerings. You will actively look for ways to drive success and communicate to leadership how to get there:
- System Administration: Manage and maintain Linux-based servers, ensuring optimal performance, security, and availability.
- DevOps Practices: Implement CI/CD pipelines, automate deployment processes, and manage infrastructure as code using tools like Jenkins, GitHub and Ansible.
- Monitoring and Incident Management: Implement monitoring tools to proactively identify and resolve issues. Participate in incident response.
- Networking Expertise: Troubleshoot and optimize network configurations, ensuring low-latency and high-availability connections across distributed systems.
- Collaboration and Communication: Work closely with development teams to ensure smooth deployments and provide feedback for continuous improvement.
- Documentation: Maintain clear and comprehensive documentation for systems, processes, and incident resolutions using Atlassian JIRA, Confluence, and/or Monday.
- Audits: Be prepared to support audits by providing evidence or being interviewed as required.
Required Technical and Professional Expertise
Required Skills:
- Technical: First and foremost, a strong grasp of computer science and a deep technical understanding of cloud storage infrastructure.
- Communicative: Candidate needs to have good communication skills and be able to explain to the development teams why they must jump through extra security hoops.
- Collaborative: Candidate needs to be able to collaborate with architects, developers, and non-technical stakeholders to drive security solutions across the organization.
- Respected: Candidate should have a good track record as a security professional in the industry. They will be expected to establish trust and respect with the COS service development teams.
- Growth Mindset: The world of security is highly dynamic, and IBM is a company that thrives on innovation and maturation. Our Security and Compliance Lead must possess a growth mindset to keep up with the ever-changing security landscape and seek opportunities to increase the breadth and depth of their security knowledge.
Required Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- 5+ years of experience in a Site Reliability Engineering, Systems Administration, or DevOps role.
- Strong knowledge of Linux systems administration.
- Strong scripting experience (e.g., Bash, Python).
- Experience with cloud platforms (IBM Cloud, AWS, Azure, etc.) and cloud storage technologies.
- Proficiency in CI/CD tools (e.g., Jenkins, GitLab CI) and infrastructure automation (e.g., Terraform, Ansible).
- Familiarity with networking concepts (TCP/IP, DNS, load balancing, firewalls).
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack).
- Proficient in using Atlassian JIRA for project management and issue tracking.
- Strong problem-solving skills and a proactive approach to identifying and resolving issues.
- Excellent communication skills and the ability to work collaboratively in a team environment.
Preferred Technical and Professional Expertise
Preferred Qualifications:
- Familiarity with serverless services, containerization and other cloud technologies.
- Developing, maintaining and managing storage systems.
- Knowledge of Information Lifecycle Management and tiered storage models.
- Experience with containerization and container orchestration tools (e.g., Docker, Kubernetes).
- Knowledge of object storage systems and APIs (e.g., S3-compatible storage).
- Certifications in cloud technologies or relevant fields (e.g., AWS Certified Solutions Architect, Google Cloud Professional DevOps Engineer).
- Expert ability to communicate highly technical topics to executives, IT staff, and auditors.
- Expert experience with various scripting languages (Python, Ruby, Bash, etc.).
- 7+ years of demonstrated experience in system or application administration roles.