Global Fashion Group
Copac Square Building, 12 Ton Dan, Ho Chi Minh City
Company Size: 25-99
Job description
Overview of job
Global Fashion Group is the leading fashion and lifestyle destination in growth markets across LATAM, SEA and ANZ. From our people to our customers and partners, we exist to empower everyone to express their true selves through fashion. Our three e-commerce platforms: Dafiti, ZALORA and THE ICONIC connect an assortment of international, local and own brands to over 800 million consumers from diverse cultures and lifestyles. GFG’s platforms provide seamless and inspiring customer experiences from discovery to delivery, powered by art & science infused with unparalleled local knowledge. As part of the Group’s vision to be the #1 online destination for fashion & lifestyle in growth markets, we are committed to doing this responsibly by being people and planet positive across everything we do.
ABOUT THE TEAM
Our vision to bring affordable and sustainable fashion to customers in our markets is powered by technology and data. The GFG Data Team is an integral part of GFG and is essential to enabling a data-driven decision-making culture.
ABOUT THE ROLE
As the Lead Data Engineer in the Data team, you will work on all aspects of data, from platform and infrastructure build-out to pipeline engineering and writing tooling/services that augment and front the core platform. You will be responsible for building and maintaining a state-of-the-art data lifecycle management platform, covering acquisition, storage, processing and consumption channels. The team works closely with Data Science, Data Analytics, Business Intelligence and business stakeholders across GFG to understand their needs and tailor the offerings to them.
THE IMPACT YOU’LL MAKE
- Build and manage the data assets using some of the most scalable and resilient open-source big data technologies, such as Airflow, Spark and Kafka
- Build and manage a highly scalable, efficient data and ML infrastructure by adopting a microservices-driven design and architecture with proper DevOps principles and practices
- Design and deliver the next-gen data lifecycle management suite of tools/frameworks, including ingestion and consumption on top of the data lake, to support real-time as well as batch use cases
- Help the team integrate various data sources across GFG's group verticals
- Build and expose a metadata catalog for the data lake for easy exploration, profiling and lineage requirements
Job Requirement
WHAT WILL YOU BRING TO THE TEAM
- 3+ years of relevant experience developing scalable, secure, fault-tolerant, resilient and mission-critical big data platforms.
- Able to maintain and monitor the ecosystem with high availability.
- Sound understanding of all big data components and administration fundamentals; hands-on experience building a complete data platform using various open-source technologies.
- Good fundamental hands-on knowledge of Linux and of building a big data stack on top of AWS/GCP using Kubernetes.
- Strong understanding of big data and related technologies such as HDFS, Spark, Presto, Airflow, Kafka and Apache Atlas.
- Good knowledge of complex event processing systems such as Spark Streaming, Kafka, Apache Flink and Beam.
- Able to drive DevOps best practices such as CI/CD, containerization, blue-green deployments and secrets management in the data ecosystem.
- Able to develop an agile platform that can auto-scale up and down, both vertically and horizontally.
- Able to build observability and monitoring for all components of the data ecosystem.
- Proficiency in at least one of Java, Scala, Python or Go.
Languages
- English: Speaking - Intermediate, Reading - Intermediate, Writing - Intermediate
Technical Skills
- AWS
- GCP
- Big Data
- Java
- Linux
- Python
- HDFS
- Apache Spark
- Scala
- Golang
- Apache Kafka
- Apache Airflow
- Apache Flink
- Apache Beam
- Presto
- CI/CD