Location: Remote
Experience Level: 5 to 8 years in Data Engineering, Data Pipelines, and Cloud-based Data Platforms
Department: Data & AI Engineering
Compensation: PKR 600,000 to 850,000 (based on experience)
Role Summary
The Data Engineer will design and build large-scale, high-performance data pipelines to support segmentation, pricing simulation, and offer decisioning. They will ensure efficient ingestion of data from telco systems (CDRs, usage, recharge, and offer purchases), its transformation, and its integration with ML models and orchestration modules.
Key Responsibilities
Design and develop scalable ETL/ELT data pipelines to process 50M+ customer records daily.
Ingest data from OCS, CRM, DWH, and Adobe RT-CDP or other customer data platforms.
Build and maintain a Customer Profile Store and Feature Store for real-time and batch processing.
Implement data validation, quality, and lineage frameworks.
Optimize query performance and cost efficiency for batch and streaming workloads.
Collaborate with Data Scientists to prepare model training datasets and deploy inference pipelines.
Integrate outputs with Decision Engine and Real-Time Offer Orchestration Module.
Automate pipelines using CI/CD and maintain environment configurations across Dev, UAT, and Prod.
Required Skills
Strong proficiency in SQL, PySpark, and DataFrame APIs for data transformation.
Expertise in Data Modeling (customer-level, event-level, offer-level).
Understanding of data partitioning, schema evolution, and performance tuning.
Experience with stream processing (Kafka, Spark Streaming, Kinesis).
Knowledge of data quality frameworks (e.g., Great Expectations, Deequ).
Familiarity with ETL orchestration tools (Airflow, dbt, or Dagster).
Ability to work with cloud-native data platforms and object storage.
Tools & Technologies
Data Platform: Databricks, AWS Glue, Azure Data Factory, Snowflake, BigQuery
Streaming: Kafka, Kinesis, Spark Streaming
Storage: S3, Delta Lake, Parquet, Hive
Workflow Orchestration: Airflow, dbt, Dagster, Prefect
Scripting: Python, SQL, PySpark
DevOps: Git, Jenkins, Terraform
Monitoring & Validation: Great Expectations, Deequ, DataDog
Preferred (Nice-to-Have)
Experience with telecom datasets (Recharge, Usage, Balance, Offer Subscription).
Knowledge of DecisionRules.io, n8n, or KNIME for orchestration workflows.
Familiarity with Adobe AEP data schemas (XDM) or Pricefx integration.
Exposure to real-time microservices (REST/GraphQL APIs) for data access.