We are looking for a Sr. Data Engineer. Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large-scale production systems with high efficiency and availability, using a combination of software and systems engineering practices. We need an excellent Sr. Data Engineer with extensive data engineering experience to support the data science and reporting needs surrounding NVIDIA's cloud services.
SRE's culture of diversity, intellectual curiosity, problem solving and openness is important to our success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to build an environment that provides the support and mentorship needed to learn and grow.
What you’ll be doing:
Design and deliver essential, high-performance services and libraries
Build streaming data pipelines that collect and process data from multiple sources, from the point of ingestion to useful insights
Design and build data architecture
Partner with our other engineering and business teams to integrate your amazing innovations and algorithms into our production systems
Automate everything for measuring, testing, updating, monitoring and alerting
What we need to see:
Bachelor’s or Master’s degree in Computer Science or a related technical field (or equivalent experience)
8+ years of software engineering experience
Passion for big data and large-scale distributed systems
Expert knowledge of building and operating multi-petabyte data lakes
Excellent software development skills in one or more of Java, Scala, Python or Go
Experience building real-time streaming applications with Kafka or similar technologies
Strong interpersonal skills, including the ability to identify and communicate data-driven insights
Ways to stand out from the crowd:
Contributions to open source
Experience operating large-scale distributed systems with strong SLAs