We are a dedicated team developing Large Language Models in one of the most prestigious and advanced ongoing Natural Language Processing projects in the world. Our team is responsible for the entire data engineering pipeline, from the collection of raw text data, through preprocessing and storage, to serving the data for model training and deployment. Additionally, our Data Engineering team is actively involved in Optical Character Recognition and image processing.
Our focus is on the rapid development and deployment of state-of-the-art information retrieval systems to meet complex information needs. As a Data Engineer, you will play a critical role in our team, owning the core data engineering tasks in our product pipeline. You will collaborate closely with cross-functional teams to provide innovative solutions to real-world problems.
To succeed in this role, you'll need a results-driven mindset, a passion for excellence, and a continuous desire to learn and improve. Your key responsibilities in this project will include: