Global Fashion Group
Copac Square Building, 12 Ton Dan, Ho Chi Minh City
Company Size: 25-99
Job description
Overview of job
Global Fashion Group is the leading fashion and lifestyle destination in growth markets across LATAM, SEA and ANZ. From our people to our customers and partners, we exist to empower everyone to express their true selves through fashion. Our three e-commerce platforms: Dafiti, ZALORA and THE ICONIC connect an assortment of international, local and own brands to over 800 million consumers from diverse cultures and lifestyles. GFG’s platforms provide seamless and inspiring customer experiences from discovery to delivery, powered by art & science infused with unparalleled local knowledge. As part of the Group’s vision to be the #1 online destination for fashion & lifestyle in growth markets, we are committed to doing this responsibly by being people and planet positive across everything we do.
ABOUT THE TEAM
Our vision to bring affordable and sustainable fashion to customers in our markets is powered by technology and data. The GFG Data Team is an integral part of GFG and is essential to enabling a data-driven decision-making culture.
ABOUT THE ROLE
As the Lead Data Engineer in the Data team, you will work on all aspects of data, from platform and infrastructure build-out to pipeline engineering and writing tooling/services that augment and front the core platform. You will be responsible for building and maintaining a state-of-the-art data lifecycle management platform, covering acquisition, storage, processing and consumption channels. The team works closely with Data Science, Data Analytics, Business Intelligence and business stakeholders across GFG to understand their needs and tailor the offerings to them.
THE IMPACT YOU’LL MAKE
- Build and manage the data assets using some of the most scalable and resilient open-source big data technologies, such as Airflow, Spark and Kafka
- Build and manage a highly scalable, efficient data and ML infrastructure by adopting a microservices-driven design and architecture with proper DevOps principles and practices
- Design and deliver the next-gen data lifecycle management suite of tools/frameworks, including ingestion and consumption on top of the data lake, to support real-time as well as batch use cases
- Help the team integrate various data sources across GFG's group verticals
- Build and expose a metadata catalog for the data lake for easy exploration, profiling and lineage requirements
Job Requirement
WHAT WILL YOU BRING TO THE TEAM
- 3+ years of relevant experience developing scalable, secure, fault-tolerant, resilient and mission-critical big data platforms.
- Able to maintain and monitor the ecosystem with high availability.
- Sound understanding of all big data components and administration fundamentals; hands-on experience building a complete data platform using various open-source technologies.
- Good fundamental hands-on knowledge of Linux and of building a big data stack on top of AWS/GCP using Kubernetes.
- Strong understanding of big data and related technologies such as HDFS, Spark, Presto, Airflow, Kafka and Apache Atlas.
- Good knowledge of complex event processing systems such as Spark Streaming, Kafka, Apache Flink and Beam.
- Able to drive DevOps best practices such as CI/CD, containerization, blue-green deployments and secrets management in the data ecosystem.
- Able to develop an agile platform that can auto-scale up and down, both vertically and horizontally.
- Able to build observability and monitoring for all components of the data ecosystem.
- Proficiency in at least one of Java, Scala, Python or Go.
Languages
- English: Speaking - Intermediate, Reading - Intermediate, Writing - Intermediate
Technical Skills
- AWS
- GCP
- Big Data
- Java
- Linux
- Python
- HDFS
- Apache Spark
- Scala
- Golang
- Apache Kafka
- Apache Airflow
- Apache Flink
- Apache Beam
- Presto
- CI/CD