At Nielsen, we are passionate about our work to power a better media future for all people by providing powerful insights that drive client decisions and deliver extraordinary results. Our talented, global workforce is dedicated to capturing audience engagement with content - wherever and whenever it’s consumed. Together, we are proudly rooted in our deep legacy as we stand at the forefront of the media revolution. When you join Nielsen, you will join a dynamic team committed to excellence, perseverance, and the ambition to make an impact together. We champion you, because when you succeed, we do too. We enable your best to power our future.
We are seeking a strong, resourceful, and committed support engineer to maintain and ensure the proper working status of our AWS-based production system and services. The candidate must have a proven record of supporting AWS-based backend systems by setting up monitoring and alerting and by troubleshooting as needed. The candidate must have hands-on experience analyzing large data sets for trends in order to pinpoint potential production issues. Creating reports from data in S3 buckets or SQL databases using queries, both on demand and on an ongoing basis, is a strong requirement for the position. The candidate must demonstrate and expand the use of best practices for AWS cost control and production efficiency.
Responsibilities
Monitor, maintain and support the team's production assets in AWS, which include Collections and Crediting
Perform reviews of our systems, track AWS updates and notifications, assess whether such updates impact our systems, and generally ensure the health and functional order of all our nonprod, stage and production assets in AWS
As this is a large system processing large amounts of data 24x7 with daily downstream delivery contracts, ensure that daily delivery commitments are met per SLAs with NO delays or misses
Automate any manual processes or introduce new processes and best practices to ensure the proper functional order of our AWS-based production environment
Execute post-deployment impact analysis and monitoring following our production releases to ensure no quality escapes
Perform data analysis at the request of both our own team’s senior members and external stakeholders such as Product and Data Science. Such analysis will involve collecting, combining, and joining multiple data sets from multiple tables in our databases, S3 buckets or MDL (Media Data Lake) environments and highlighting any anomalies or trend breaks
Engage appropriate members of the team (leaders, developers) when alerts are reported
Perform first-level troubleshooting and analysis of problems, assisting and supporting further in-depth analysis by the developers
Monitor AWS systems performance and advise of any necessary infrastructure changes; ensure our assets (especially the production ones) are not impacted
Be available whenever a production alert is reported, assess its severity and engage the team as needed
Document processes and procedures in RunBooks or any other type of documentation (for support and audit purposes)
Keep and maintain good records and logs of releases, upgrades, issues and actions
Key Skills
Bachelor's degree in Computer Engineering, Computer Science or another related technical field
At least 4 years of professional, hands-on experience in supporting and maintaining large AWS systems and services, ensuring quality and proper working order
Good knowledge of the AWS ecosystem and services
SQL programming for writing queries for data checks and analysis
Knowledge of Python programming
Knowledge of AWS security best practices, including IAM roles and security groups
Very good knowledge of AWS Monitoring, Alerting and Automation concepts and tools, with experience in both setting up and using such tools on an ongoing basis
Any prior experience in designing datasets and visualizations with tools such as Superset or Grafana is a definite plus
Knowledge of Unix and Windows environments
Resourceful, proactive self-starter and team player. Able to quickly assess possible problems and make quick decisions to protect our environment and our data. The successful candidate will own support and is expected to work independently without hand-holding
Detail-oriented problem solver
Effective communication; ability to describe and explain potential or existing issues and problems efficiently and accurately. Very good writing skills
Experience working in an Agile environment (scrum teams)