Job Purpose:
To ensure the effective design and implementation of observability solutions that support the Unified IT Command Center in monitoring and improving the performance of IT services and infrastructure.
Responsibilities:
• Define and recommend observability principles and architectures, ensuring seamless monitoring across applications, infrastructure, and networks.
• Assist in developing service level indicators (SLIs), service level objectives (SLOs), and service level agreements (SLAs) that align with organizational objectives and enhance operational resilience.
• Guide the integration of monitoring technologies into workflows to provide end-to-end visibility and early detection of performance issues.
• Regularly assess the effectiveness of observability systems and recommend improvements to enhance service reliability and efficiency.
• Collaborate with cross-functional teams to ensure observability systems meet the needs of the Unified IT Command Center.
• Provide technical guidance and mentorship to staff on observability best practices and methodologies.
• Conduct ongoing evaluations of observability processes and architectures to ensure scalability with organizational growth.
• Lead initiatives that optimize the use of monitoring technologies for proactive incident resolution and performance management.
• Architect and deploy Dynatrace Managed in on-premises environments, ensuring optimal performance, scalability, and security.
• Lead the design and implementation of enterprise-wide Dynatrace dashboarding strategies, including custom dashboards for IT and business stakeholders.
• Define best practices for IT operations management (ITOM) integrations between Dynatrace and IT service management (ITSM) platforms (ServiceNow, BMC) for automated incident handling.
• Develop API-based integrations to automate data extraction, event correlation, and reporting across observability platforms.