Design the target high-level data architecture and data management solutions for the to-be data platform
Design, develop and maintain data architectures that align with the organisation's strategic goals
Recommend the best-fit technology components for the data platform in coordination with the Solution Architect
Construct robust data pipelines to support scalable data ingestion, transformation and delivery
Deploy integration systems to collect data from various sources using real-time or batch ingestion technologies
Establish data governance policies and procedures to ensure data quality, consistency, privacy, and compliance with relevant regulations
Implement data security measures to safeguard sensitive information and ensure data is protected from unauthorised access or breaches
Participate in product roadmap planning and recommend solutions
Oversee the creation of data standards
Develop the conceptual data model: a high-level description of entities and their relationships that serves as the basis for the data design implemented across all layers of the data architecture
Develop and implement data management policies, standards, and procedures
Collaborate with client stakeholders to understand their data needs and requirements
Lead and mentor junior architects and engineers, fostering a culture of technical excellence and continuous learning and innovation
Desired Experience and Qualifications
Bachelor’s degree in computer science, information technology, or a related field (an advanced degree such as a master’s or PhD in a relevant area is a plus)
At least 8 years of proven experience as a Data Architect, with a focus on Data & AI Transformation Architecture
Experience in building data pipelines with medallion (multi-hop) architectures
In-depth knowledge of Data Management concepts and best practices
Knowledge of industry-standard enterprise architecture frameworks (e.g., TOGAF, Zachman)
In-depth understanding of industry-standard data management maturity frameworks (e.g., DAMA-DMBOK, DCAM, CMMI CERT-RMM, IBM Data Governance Council, Stanford Data Governance, and Gartner’s Enterprise Information Management)
Technical Expertise:
Programming Languages: High proficiency in Python, Java, Scala, and R.
Data Management & Databases: Oracle, SAP, SQL, NoSQL, and data warehousing solutions.
Big Data Technologies: Apache Hadoop, Spark, Kafka, and other big data tools.
Cloud Platforms:
Microsoft Fabric
Azure (Synapse Analytics, Databricks, Machine Learning, AI Search, Functions, etc.)
Databricks on Azure
AWS (S3, Redshift, SageMaker, etc.)
Google Cloud Platform (BigQuery, Dataflow, etc.)
Data Governance: Ab Initio, Informatica, Collibra, Purview, IBM InfoSphere, Great Expectations, Deepchecks, Databricks’ Unity Catalog, Delta Sharing, Catalog Explorer, Audit Logging, and Identity Management.
Data Visualization: Matplotlib, Seaborn, Tableau, and Power BI.
DevOps & MLOps: CI/CD principles, Docker, MLflow, and Kubernetes.
Strong knowledge of data modelling, database design, data integration, and ETL processes
Proficiency in SQL and NoSQL databases
Proficiency in web technologies and languages
Relevant certifications (e.g., Azure or AWS Certified Solutions Architect, TOGAF, etc.) are a plus
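As an illustration of the medallion (multi-hop) pipeline experience referenced above, the sketch below walks raw records through bronze (raw landing), silver (cleaned and deduplicated), and gold (business aggregate) layers. It is a minimal, standard-library-only sketch; the helper names and sample records are hypothetical, and production medallion pipelines would typically run on Spark or Databricks rather than plain Python.

```python
from collections import defaultdict

def bronze_ingest(raw_rows):
    """Bronze: land raw records as-is, tagging simple lineage metadata."""
    return [{"_source": "orders_api", **row} for row in raw_rows]

def silver_clean(bronze_rows):
    """Silver: validate, deduplicate on the business key, standardise types."""
    seen, cleaned = set(), []
    for row in bronze_rows:
        if row.get("order_id") is None:
            continue  # drop records that fail validation
        if row["order_id"] in seen:
            continue  # deduplicate on order_id
        seen.add(row["order_id"])
        cleaned.append({
            "order_id": row["order_id"],
            "region": str(row.get("region", "unknown")).lower(),
            "amount": float(row.get("amount", 0)),
        })
    return cleaned

def gold_aggregate(silver_rows):
    """Gold: business-level aggregate, e.g. revenue per region."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)

# Hypothetical sample input: one duplicate and one invalid record.
raw = [
    {"order_id": 1, "region": "EU", "amount": "10.5"},
    {"order_id": 1, "region": "EU", "amount": "10.5"},  # duplicate
    {"order_id": 2, "region": "US", "amount": 20},
    {"order_id": None, "region": "US", "amount": 5},    # invalid
]

gold = gold_aggregate(silver_clean(bronze_ingest(raw)))
print(gold)  # {'eu': 10.5, 'us': 20.0}
```

Each hop refines the previous layer rather than re-reading the source, which is what makes the architecture "multi-hop": bronze preserves raw fidelity for replay, silver enforces quality rules, and gold serves consumption-ready aggregates.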