
Ansar H
Data Engineering, Azure, AWS, Databricks, Lakehouse, Spark, Fabric
Skills

View my services


Portfolio
Work Experience
Senior Data Engineer
DIS • Full time
Sep 2023 - Jan 2026 • 2 yrs 4 mos
- Led Databricks platform onboarding, including a cost analysis comparing AWS Glue and Databricks compute.
- Designed and managed end-to-end data pipelines using Databricks and AWS Glue.
- Used Apache Spark for batch and stream processing, ensuring data quality and consistency.
- Integrated data from PostgreSQL and Amazon S3; performed ETL/ELT operations and maintained Delta Lake-based data lakes.
- Developed and deployed a DIS Analytics alerting process leveraging Amazon S3 event triggers, AWS Lambda, and Amazon SNS to automatically detect and notify on duplicate records and missing source files during data ingestion in Databricks.
- Built a conversion layer using Databricks Genie AI.
- Exposed data via the Databricks SQL API and Delta Sharing with secure access controls (RBAC, authentication).
- Used Spark OCR to extract data from financial invoice PDFs stored in S3 and loaded the results into Delta tables.
- Implemented data governance practices with Unity Catalog.
- Optimized Spark job performance, reducing costs by 50%; monitored and tuned clusters for performance and reliability.
- Integrated Databricks with Power BI for reporting and dashboards.
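The alerting flow above (S3 event trigger, Lambda, SNS) can be sketched in plain Python. This is a minimal, hypothetical illustration, not the actual DIS implementation: the expected file names, the event fields, and the `publish` stand-in for SNS are all assumptions.

```python
# Illustrative sketch of an S3 -> Lambda -> SNS ingestion check.
# EXPECTED_FILES and the event shape are hypothetical, not the real DIS setup.
EXPECTED_FILES = {"orders.csv", "customers.csv"}  # assumed daily source files

def find_issues(arrived_files, record_keys):
    """Return alert messages for missing source files and duplicate records."""
    issues = []
    missing = EXPECTED_FILES - set(arrived_files)
    if missing:
        issues.append(f"Missing source files: {sorted(missing)}")
    seen, dupes = set(), set()
    for key in record_keys:
        if key in seen:
            dupes.add(key)
        seen.add(key)
    if dupes:
        issues.append(f"Duplicate record keys: {sorted(dupes)}")
    return issues

def lambda_handler(event, context=None, publish=print):
    """Entry point invoked by the S3 event trigger.

    `publish` stands in for boto3's sns.publish so the logic runs locally;
    record_keys would normally be read from the landed files themselves.
    """
    arrived = [r["s3"]["object"]["key"] for r in event.get("Records", [])]
    issues = find_issues(arrived, event.get("record_keys", []))
    for msg in issues:
        publish(msg)  # in AWS this would be an SNS topic publish
    return {"issues": len(issues)}
```

In a real deployment the handler would read the landed objects via boto3 and publish to an SNS topic ARN; here those calls are stubbed so the detection logic stands alone.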
Senior Data Consultant
Systems Limited (Regeneron) • Full time
Jul 2019 - Aug 2023 • 4 yrs 1 mo
- Delivered big data solutions to various clients using AWS data platform services.
- Ingested data from SharePoint, SFTP, and cloud storage into the data lake using Apache NiFi and PySpark.
- Performed data transformations using PySpark; deployed scripts via Jenkins and scheduled DAGs in Airflow, running on EMR clusters.
- Collaborated with the Databricks team to implement solutions using AWS Databricks E2.
- Provisioned S3 buckets for audit logs and Terraform state files using Terraform scripts; built CI/CD pipelines integrated with Bitbucket.
- Created and managed Databricks workspaces and policies; configured Unravel with Databricks for Hive and S3 access control.
- Worked with Informatica Cloud to move data into Amazon Redshift.
- Developed Kafka producer/consumer applications on clusters managed with ZooKeeper; leveraged Kafka APIs and connectors (MySQL, PostgreSQL, MongoDB) for smooth message processing.
- Used Alteryx to build ETL pipelines; created functional diagrams and documented data flows in Confluence.
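The Kafka consumer side of the work above can be sketched as follows. This is a hypothetical example assuming the kafka-python client; the topic name, message schema, and routing rule are illustrative, not the actual applications delivered.

```python
import json

def parse_event(raw_bytes):
    """Deserialize one Kafka message and tag it for downstream routing.

    The field names ('source', 'payload') are assumed, not a real schema;
    connector-originated events are routed separately from app events.
    """
    event = json.loads(raw_bytes)
    sources = {"mysql", "postgresql", "mongodb"}  # the connectors mentioned above
    event["route"] = "cdc" if event.get("source") in sources else "app"
    return event

def run_consumer(bootstrap_servers, topic):
    """Hypothetical wiring: requires kafka-python and a reachable broker."""
    from kafka import KafkaConsumer  # imported lazily; not needed to test parse_event
    consumer = KafkaConsumer(topic,
                             bootstrap_servers=bootstrap_servers,
                             auto_offset_reset="earliest")
    for msg in consumer:
        event = parse_event(msg.value)
        # hand off to the appropriate sink, e.g. a Redshift staging area or S3
        print(event["route"], event.get("payload"))
```

Separating the pure `parse_event` step from the broker wiring keeps the message-processing logic testable without a running cluster.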
Data Engineer
The ENTERTAINER • Full time
Jul 2019 - Oct 2021 • 2 yrs 3 mos
- Worked with structured, semi-structured, and unstructured data.
- Designed and implemented a scalable Big Data architecture using Databricks, Data Lake, and Delta Lake, incorporating automated data quality and governance controls.
- Delivered end-to-end data engineering solutions leveraging Databricks, Delta Lake, and Azure Synapse Analytics to support enterprise analytics needs.
- Developed high-performance batch data pipelines using PySpark and Spark SQL to ingest and transform data from MongoDB into the enterprise data lake.
- Built, scheduled, and orchestrated ETL/ELT pipelines in Azure Data Factory for Azure Synapse Analytics environments.
- Designed and implemented real-time streaming data pipelines using Azure Event Hubs, enabling near real-time ingestion into Delta Lake.
- Created executive and operational dashboards using Tableau Desktop and Tableau Server, supporting automated and scheduled reporting.
- Utilized advanced MongoDB aggregation and indexing techniques to support ad hoc and analytical reporting use cases.
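The MongoDB aggregation work above can be illustrated with a small pipeline builder. This is a sketch under assumed names: the collection, the `redeemed_at` and `merchant_id` fields, and the "redemptions per merchant per month" metric are hypothetical, chosen only to show the aggregation shape.

```python
from datetime import datetime, timezone

def monthly_redemptions_pipeline(year):
    """Build a MongoDB aggregation pipeline counting redemptions per merchant
    per month for one year. Field names (redeemed_at, merchant_id) are assumed.
    """
    start = datetime(year, 1, 1, tzinfo=timezone.utc)
    end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    return [
        # restrict to the target year; an index on redeemed_at keeps this fast
        {"$match": {"redeemed_at": {"$gte": start, "$lt": end}}},
        # one bucket per (merchant, month) pair
        {"$group": {"_id": {"merchant": "$merchant_id",
                            "month": {"$month": "$redeemed_at"}},
                    "redemptions": {"$sum": 1}}},
        {"$sort": {"redemptions": -1}},
    ]

# usage (hypothetical, with pymongo):
#   db.redemptions.aggregate(monthly_redemptions_pipeline(2021))
```

Building the pipeline as plain Python dicts keeps it inspectable and unit-testable independently of a live database connection.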