Infrastructure and Operations Lead
At Horizontal Digital, we hold ourselves to one key belief: You’re only as good as your worst customer experience.
This mantra is what drives our digital consultancy to think beyond the easy answers and instead create websites,
apps, portals, and other experiences that solve customer needs for Fortune 500 companies in intuitive and
empathetic ways. And we make this lofty standard a reality by fusing strategy, data, design, and technology
together to arrive at solutions that set the bar higher for everyone.
We use these values to fuel superior results:
Lock arms
We forge relationships that make our impact 1,000x stronger. This means working across departments and
engaging both our clients and our communities to deliver the greatest good.
Show hustle
We’re not ones to sit on our hands and wait. Instead, we anticipate opportunities, collectively roll up our sleeves
and find ways to execute the exceptional.
Embrace change
From new technologies to workplace philosophies, we welcome the unexpected and constantly anticipate what’s
next.
Elevate empathy
We listen before we take action. This means understanding a variety of perspectives and holding ourselves to a
higher standard of accountability.
Never settle
We motivate each other to push past the easy answers and collectively arrive at bigger, more inspiring ideas.
But enough about us. Let’s talk about you.
As an Infrastructure and Operations Lead, you will be part of a global team of experienced Application and
Infrastructure engineers. You will have the flexibility to develop your own talents by sharpening your analysis
and troubleshooting skills, with the option to work closely with other team disciplines and expand your
knowledge in other domains of web application and environment management.
What you’ll do:
• Use ITSM and collaboration tools like BMC Remedy (Amer Portal), Jira Service Desk, Webex, and Microsoft Teams
to communicate directly with clients or team members regarding findings and recommendations.
• Use monitoring tools like AppDynamics and New Relic to proactively monitor customers' platforms and
respond to the alerts these tools generate, via emails and tickets, so that SLAs are met.
• Analyze and review system logs to troubleshoot and determine the root cause of issues reported by
monitoring tools or clients.
• Work with the customer’s IT teams and other internal teams on infrastructure maintenance activities like SSL
certificate updates, Windows security patching, and server reboots.
• Work with the customer’s IT Teams on error/incident resolution on a per-ticket basis.
• Work with the customer’s IT teams and other internal teams to support their planned and unplanned
activities like push notifications, DR failover, and failback.
• Educate yourself on each client’s infrastructure platform so you can offer solutions in the moment as well as
forward-thinking recommendations.
• Learn and develop your experience with tools, technologies, and platforms like Vercel, Netlify, Sitecore,
Coveo, Solr, Contentful, OrderCloud, JSS, and a variety of custom solutions.
• Collaborate with development teams and other stakeholders to identify potential risks around planned
update/upgrade activities.
• Work with the customer’s teams and internal teams to manage and supervise performance optimization of the
current platform, which may include, but is not limited to, distributing traffic evenly across load-balanced
servers, finding and resolving bottlenecks, ensuring high availability (HA), and monitoring.
• Work with the customer’s team to recommend improvements to the end-user experience, which may include, but
is not limited to, working on caching strategies and assessing whether tools like a CDN or WAF are required
or, if already in place, how they can be optimized.
• Work with the customer’s team on security audits and scans, and work with the internal team to schedule and
apply resolutions or mitigations for any vulnerabilities found.
• Work with the customer’s team on activities like DR testing and failover, and validate backups by restoring
them in a DR or other isolated environment.
• Work with the customer’s team and take part in platform updates/deployments using CI/CD pipelines, and make
sure those pipelines run smoothly.
• Manage and update the on-call and shift rotations for team members.
• Manage and supervise daily team activities and other scheduled activities, and take an active part in them.
• Manage technical escalations within the team.
• Ensure efficient knowledge transfer (KT) and documentation within the team.
Who you are:
• A collaborative individual who is not afraid to work directly with team members or leaders in pursuit of
high-quality solutions, and who communicates effectively and provides clear feedback.
• A driven self-starter who is interested in expanding their knowledge base and makes good use of learning
materials during downtime.
• A friendly and communicative partner for our clients who understands that not all clients are as technically
oriented as you are.
• A seasoned leader who can mentor and guide junior team members.
• An analytical engineer who does not shy away from difficult problems and prides themselves on fully
understanding the situation faced before recommending a solution.
What you bring:
• Minimum of 6-8 years of experience managing self-hosted infrastructure.
• Minimum of 6-8 years of experience working with multiple OS flavors like Windows and Linux.
• Minimum of 2-3 years of experience managing a team of engineers is a must.
• Experience with CI/CD technologies is a must.
• Experience with Azure DevOps, GitHub, or other code repository tools is a plus.
• Experience with Sitecore or CMS is a plus.
• Experience with Vercel or Netlify is a plus.
• Experience with firewalls, WAF, and CDN is a plus.
• Experience with SecOps is a plus.
The above description is not designed to cover or contain a comprehensive listing of the activities, duties, or
responsibilities that are required of the employee for this job. Duties, responsibilities, and activities may change at
any time with or without notice.