We are seeking a highly skilled and motivated AWS Data Engineer to join our dynamic team. In this role, you will play a critical part in designing, building, and maintaining scalable data pipelines, real-time streaming applications, and robust data products in a cloud-based environment. You will work alongside a collaborative team to deliver cutting-edge data solutions that drive meaningful business outcomes. This role demands a strong technical background, problem-solving expertise, and excellent teamwork skills.
Key Responsibilities
Data Pipeline Development
- Design, develop, and maintain highly scalable and reliable data pipelines using tools like Spark and Kafka or similar platforms.
- Build and optimize data engineering pipelines for both batch and streaming processes in production environments.
Cloud Integration
- Develop and deploy data pipelines and products in AWS cloud or hybrid cloud infrastructures.
- Leverage AWS services such as EKS and CloudFormation to build efficient, cost-effective solutions.
Team Collaboration
- Work collaboratively in a team of at least 5 engineers, adhering to agile methodologies, version control systems, and best coding practices.
- Actively participate in brainstorming sessions, code reviews, and peer programming exercises.
Data Processing Expertise
- Demonstrate proficiency in streaming and batch processing patterns, ensuring the ability to handle complex data scenarios.
- Maintain expertise in modern data access patterns, balancing traditional relational database systems (RDBMS) with distributed systems.
Continuous Improvement
- Implement CI/CD pipelines for streamlined deployments using tools like GitHub Actions, Jenkins, or equivalent.
- Perform thorough code reviews, manage version control, and contribute to enhancing the overall engineering process.
Required Qualifications
Technical Proficiency
- Hands-on experience with Spark, Kafka, Airflow 2.0, PySpark, and Python.
- Expertise in building data pipelines on AWS, with a proven record of deploying in production environments.
- Strong skills in at least one programming language, such as Java, Scala, or Python, with working knowledge of additional languages.
- Familiarity with modern data access patterns in both RDBMS and distributed systems.
- Project Delivery. Demonstrated ability to design and deliver projects using at least one data processing pattern (stream or batch).
- CI/CD Practices. Knowledge of continuous integration and deployment methodologies, with experience in tools like GitHub Actions or Jenkins.
Skills and Competencies
- Analytical Thinking. Strong analytical and problem-solving skills to tackle complex data challenges.
- Teamwork. Excellent communication and collaboration skills to thrive in a team-based environment.
- Adaptability. A proactive and self-driven approach to learning and adapting to evolving technologies.
- Quality Focus. Commitment to delivering high-quality, reliable, and efficient data solutions.
Preferred Experience
- Years of Experience. 6-8 years in the field of data engineering.
- Primary Skills. Expertise in data engineering tools and techniques.
- Additional Skills. Proficiency in AWS EKS, CloudFormation, Apache Hive, SQL, and other relevant technologies.
About Infogain
Infogain is a leading human-centered digital platform and software engineering company headquartered in Silicon Valley. We specialize in engineering business outcomes for Fortune 500 companies and digital-native organizations across industries such as technology, healthcare, insurance, travel, telecom, and retail & CPG. Leveraging cutting-edge technologies, including cloud, microservices, automation, IoT, and AI, we accelerate experience-led transformations for our clients.