AWS Data Engineer (Senior)

Gurugram, Haryana, India
Jul 26, 2024
Jul 26, 2025
Onsite
Full-Time
6 Years
Job Description

We are seeking a highly skilled and motivated Data Engineer to join our dynamic team. The ideal candidate will have extensive experience in ETL, Data Modeling, and Data Architecture; proficiency in Scala; and hands-on experience in stream processing using Spark, Kafka, and Spark Structured Streaming.

Key Responsibilities

  1. Develop Data Platforms. Build and maintain data platforms, including a Data Lake, a cloud Data Warehouse, APIs, and both batch and streaming data pipelines.
  2. Data Processing. Develop batch and stream processing solutions using Apache Spark, Kafka, and Spark Structured Streaming (a minimal sketch follows this list).
  3. Orchestration. Utilize Airflow for automating and managing data workflows.
  4. Data Transformation. Transform and cleanse raw data using Spark, SQL, PL/SQL, and Scala.
  5. Data Storage. Implement data storage solutions with Parquet/ORC formats on platforms like PostgreSQL, SQL Server, Teradata, and RDS.
  6. Data Modeling. Optimize data storage and retrieval performance through advanced data modeling techniques, including Relational, Dimensional, and E-R modeling.
  7. ETL Processes. Maintain data integrity and quality with robust ETL validation and error handling.
  8. Deployment Automation. Automate deployment processes with CI/CD tools like Jenkins and Spinnaker.
  9. Monitoring & Troubleshooting. Use Datadog and Splunk to monitor and troubleshoot data pipelines, ensuring system reliability.
  10. Agile Methodologies. Participate in Agile practices such as Scrum/Kanban, including sprint planning, daily stand-ups, and retrospectives.
  11. Code Review. Conduct code reviews to maintain coding standards and best practices.
  12. Documentation. Maintain comprehensive documentation of data pipelines, schemas, and processes using Confluence.
  13. On-Call Support. Provide on-call support for production data pipelines and resolve issues promptly.
  14. Collaboration. Work with cross-functional teams including developers, data scientists, and operations teams to tackle complex data challenges.
  15. Continuous Improvement. Stay current with emerging technologies and industry trends to enhance data engineering processes and tools.
  16. Reusable Components. Contribute to the development of reusable components and frameworks to streamline data engineering tasks.
  17. Version Control. Manage codebase with Git and use IntelliJ IDEA for efficient development and debugging.
  18. Security Best Practices. Ensure data security by implementing access controls and handling sensitive data responsibly.
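
To make the core of items 2, 4, and 5 concrete, here is a minimal Scala sketch of one common pipeline shape: Spark Structured Streaming reads JSON events from a Kafka topic, parses and cleanses them, and lands them as Parquet. Everything specific in it is an assumption for illustration only: the broker address, the orders topic, the event schema, and the S3 paths are placeholders, and the job assumes the spark-sql-kafka connector is on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.streaming.Trigger
import org.apache.spark.sql.types._

object OrdersStreamSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orders-stream-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical event schema; a real pipeline would manage this in a schema registry.
    val schema = new StructType()
      .add("order_id", StringType)
      .add("amount", DoubleType)
      .add("event_time", TimestampType)

    // Read raw events from Kafka (broker address and topic name are placeholders).
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "orders")
      .load()

    // Parse the JSON payload and drop records that fail to parse.
    val events = raw
      .select(from_json($"value".cast("string"), schema).as("e"))
      .select("e.*")
      .filter($"order_id".isNotNull)

    // Land cleansed events as Parquet; checkpointing lets the file sink recover after failures.
    events.writeStream
      .format("parquet")
      .option("path", "s3a://example-bucket/orders/")                    // placeholder path
      .option("checkpointLocation", "s3a://example-bucket/_chk/orders/") // placeholder path
      .trigger(Trigger.ProcessingTime("1 minute"))
      .start()
      .awaitTermination()
  }
}
```

The checkpoint location is what allows the file sink to restart cleanly and avoid duplicate output after a failure; in practice the schema and paths would come from configuration rather than being hard-coded.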

Good-to-Know Skills

  1. Programming Languages. Python, Bash/Unix/Linux
  2. Big Data Technologies. Hive, Avro, Apache Iceberg, Delta Format
  3. Cloud Services. EC2, ECS, S3, SNS, SQS, CloudWatch
  4. Databases. DynamoDB, Redis
  5. Containerization and Orchestration. Docker, Kubernetes
  6. Developer Tools. GitHub Copilot
  7. Additional Skills. Maven, CLI/SDK

Nice-to-Have Skills

  1. Networking. Subnets, Routes
  2. Big Data Technologies. Flink

Experience

  • Years of Experience. 6-8 years

Skills

  1. Primary Skill. Data Engineering
  2. Sub Skills. AWS-EKS, AWS-CloudFormation, AWS-Apps, AWS-Infra, AWS-DBA
  3. Additional Skills. Python, Apache Hive, SQL

How to Apply

If you are passionate about data engineering and want to work in a cutting-edge environment, please submit your resume and a cover letter detailing your experience and qualifications.