1. Organisational & Project Management
- Create a communication matrix for business partners covering daily syncs, escalation paths, and contact details (email, Slack, and SPOC).
- Develop documentation for software packages, including installation steps and how-to guides.
2. Operational & Infrastructure
1. *Software & System Monitoring*
- Manage software components: install, upgrade, configure, and maintain.
- Keep the SuperPOD up to date with the necessary patches and upgrades.
- Ensure peak performance and functionality of the cluster.
- Improve cluster monitoring with customised scripts for user-friendly dashboards.
- Automate performance and acceptance tests.
2. *Application Packaging & Deployment*
- Package and deploy applications on the cluster infrastructure.
- Conduct testing to ensure smooth operation.
3. *Ongoing Support*
- Investigate and resolve reported issues and tickets, collaborating with NVIDIA (if necessary).
- Address users' system-related requests and establish a clear problem management process.
3. AI-Applications Advisory
- Offer guidance on best practices and techniques for training, inference, and optimising delivery of deep learning applications.
- Assist in designing application architectures that prioritise performance and optimisation, including GPU optimisation, quantisation, and inference parallelisation.