Sr. Data Engineer Azure Databricks

Fusemachines • Islamabad, Islamabad Capital Territory, Pakistan

16 days ago

Job description

About Fusemachines

Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. We have a robust presence in four countries and a dedicated team of over 400 full-time employees, committed to fostering AI transformation journeys for businesses worldwide.

Location: Remote (Full-time)

About The Role

This is a remote, contract position responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization and Advanced Analytics).

We are looking for a skilled Senior Data Engineer with a strong background in Python, SQL, PySpark, Azure, Databricks, Synapse, Azure Data Lake, DevOps and cloud-based large-scale data applications, and a passion for data quality, performance and cost optimization. The ideal candidate will work in an Agile environment, contributing to the architecture, design, and implementation of data products in the aviation industry, including migration from Synapse to Azure Data Lake. This role involves hands-on coding, mentoring junior staff and collaborating with multidisciplinary teams to achieve project objectives.

Qualification & Experience

Must have a full-time Bachelor's degree in Computer Science or similar
At least 5 years of experience as a data engineer with strong expertise in Databricks, DevOps, and Azure or other hyperscalers
5+ years of experience with Azure DevOps, GitHub
Proven experience delivering large-scale projects and products for Data and Analytics as a data engineer, including migrations
The following certifications:

Databricks Certified Associate Developer for Apache Spark

Databricks Certified Data Engineer Associate

Microsoft Certified: Azure Fundamentals

Microsoft Certified: Azure Data Engineer Associate

Microsoft Exam: Designing and Implementing Microsoft DevOps Solutions (nice to have)

Required Skills / Competencies

Strong programming skills in Python (required) and Scala, with the ability to write efficient, optimized code for data integration, migration, storage, processing and manipulation

Strong understanding and experience with SQL and writing advanced SQL queries

Thorough understanding of big data principles, techniques, and best practices

Experience with scalable and distributed data processing technologies such as Spark/PySpark (Azure Databricks), dbt and Kafka

Databricks development experience with Python, PySpark, Spark SQL, Pandas, NumPy in Azure

Designing and implementing efficient ELT/ETL processes in Azure and Databricks; developing custom integration solutions as needed

Data integration from APIs, databases, flat files, event streaming

Data cleansing, transformation, and validation

Relational databases (Oracle, SQL Server, MySQL, Postgres) and NoSQL databases (MongoDB or similar)

Data modeling and database design principles; design efficient schemas for data architecture

Data warehousing, data lake and lakehouse solutions in Azure and Databricks

Delta Lake, Unity Catalog, Delta Sharing, Delta Live Tables (DLT)

SDLC knowledge, Agile methodologies

Experience with SDLC tools: Azure DevOps, GitHub; project management (Jira/Azure Boards), source control, CI/CD (GitHub Actions, Azure Pipelines), artifact management

DevOps principles: CI/CD, IaC (Terraform, ARM), configuration management, automated testing, performance tuning, cost optimization

Cloud computing in Microsoft Azure for data and analytics: ADF, Databricks, Synapse, Data Lake, Data Lake Storage, SQL Database, etc.

Orchestration with Databricks workflows and Apache Airflow

Data structures and algorithms; strong software engineering practices

Migrating from Azure Synapse to Azure Data Lake or other technologies

Analytical skills to identify and address issues, bottlenecks, and failures

Debugging and troubleshooting in complex data environments

Data quality and governance, including data quality checks and monitoring

Experience with BI solutions (Power BI) is a plus

Strong communication skills for cross-functional collaboration

Documentation of processes and deployment configurations

Security practices: network security groups, Azure Active Directory, encryption; compliance

Ability to implement security controls in data and analytics solutions

Mentoring and coaching of team members; willingness to stay updated with trends

Ability to work independently in a rapidly changing environment

Focus on architecture, observability, testing, and reliable data pipelines

Responsibilities

Architect, design, develop, test and maintain high-performance, large-scale data architectures for data integration (batch and real-time), storage, processing, orchestration and infrastructure; ensure scalability, reliability, and performance (Databricks and Azure)

Contribute to detailed design, architectural discussions, and customer requirements

Participate in design, development, and testing of big data products

Construct and optimize Apache Spark jobs and clusters within Databricks

Migrate from Azure Synapse to Azure Data Lake or other technologies

Design schemas and data models to support modern analytics (descriptive to prescriptive)

Develop clear, maintainable code with automated testing

Collaborate with cross-functional teams to understand data requirements and deliver reusable components

Evaluate and implement new technologies to improve data integration, processing, storage and analysis

Design, implement and maintain data governance: cataloging, lineage, data quality

Monitor and optimize workloads and clusters for performance

Mentor junior team members and share best practices

Maintain documentation of solutions and configurations

Promote best practices in data engineering, governance, and quality

Ensure data quality and security

Be an active Agile team member and contribute to continuous improvement

Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws.
