Qualification :
Bachelor’s or Master’s degree in Computer Science, Information Technology, Data Engineering, or a related field.
Responsibilities :
- Design, develop, and optimize data ingestion, transformation, and storage pipelines on AWS.
- Manage and process large-scale structured, semi-structured, and unstructured datasets efficiently.
- Build and maintain ETL/ELT workflows using AWS native tools such as Glue, Lambda, EMR, and Step Functions (a minimal Glue job sketch follows this list).
- Design and implement scalable data architectures leveraging Python, PySpark, and Apache Spark.
- Develop and maintain data models and ensure alignment with business and analytical requirements.
- Work closely with stakeholders, data scientists, and business analysts to ensure data availability, reliability, and quality.
- Manage on-premises and cloud data warehouses and optimize their performance.
- Stay updated with emerging trends and technologies in data engineering, analytics, and cloud computing.
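To illustrate the kind of Glue/PySpark ETL work described above, here is a minimal sketch of an AWS Glue job written in PySpark. It is an illustration only: the catalog database ("sales_db"), table ("raw_orders"), column names, and S3 output path are hypothetical placeholders, not details from this posting.

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext
from pyspark.sql import functions as F

# Standard Glue job bootstrap: the job name is supplied by the Glue runtime.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a raw table registered in the Glue Data Catalog
# ("sales_db" / "raw_orders" are hypothetical names).
orders = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders"
).toDF()

# Example transformation: de-duplicate and derive a partition column.
cleaned = (
    orders.dropDuplicates(["order_id"])
          .withColumn("order_date", F.to_date("order_ts"))
)

# Write curated output back to S3 as partitioned Parquet
# (the bucket/prefix is a placeholder).
cleaned.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-curated-bucket/orders/"
)

job.commit()
```

The same read-transform-write shape carries over to EMR or plain Spark deployments; only the job bootstrap and catalog access differ.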
Requirements :
- Mandatory: Proven hands-on experience with the AWS Data Engineering stack, including but not limited to:
- AWS Glue, S3, Redshift, EMR, Lambda, Step Functions, Kinesis, Athena, and IAM (a short Athena example follows this list).
- Proficiency in Python, PySpark, and Apache Spark for data transformation and processing.
- Strong understanding of data modelling principles and the ability to design and maintain conceptual, logical, and physical data models.
- Experience working with one or more modern data platforms such as Snowflake, Dataiku, or Alteryx (good to have, not mandatory).
- Familiarity with on-prem/cloud data warehouse systems and migration strategies.
- Solid understanding of ETL design patterns, data governance, and best practices in data quality and security.
- Knowledge of DevOps for Data Engineering: CI/CD pipelines and Infrastructure as Code (IaC) using Terraform/CloudFormation (good to have, not mandatory).
- Excellent problem-solving, analytical, and communication skills.
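As a small, self-contained illustration of the AWS tooling listed above, the sketch below submits a query to Athena through boto3 and prints the results. The database name, table, query, and S3 result location are hypothetical placeholders, and the example assumes AWS credentials and IAM permissions are already configured.

```python
import time
import boto3

# Hypothetical database, query, and result location -- placeholders only.
DATABASE = "analytics_db"
OUTPUT_LOCATION = "s3://example-athena-results/"
QUERY = "SELECT order_date, COUNT(*) AS orders FROM curated_orders GROUP BY order_date"

athena = boto3.client("athena")

# Submit the query; Athena runs it asynchronously against data in S3.
response = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": DATABASE},
    ResultConfiguration={"OutputLocation": OUTPUT_LOCATION},
)
query_id = response["QueryExecutionId"]

# Poll until the query finishes (a simple loop, for illustration only).
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

if state == "SUCCEEDED":
    # Fetch the first page of results; row 0 is the header row.
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:
        print([col.get("VarCharValue") for col in row["Data"]])
else:
    print(f"Query ended in state: {state}")
```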