Salman Azhar, Developer in Lahore, Pakistan
Salman is available for hire

Salman Azhar

Verified Expert in Engineering

Data Engineering Developer

Location
Lahore, Pakistan
Toptal Member Since
June 6, 2022

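Salman is a challenge-driven data engineer with 7+ years of professional experience developing big data solutions and working with various Amazon cloud services, including AWS Step Functions, Amazon Redshift, and AWS Data Pipeline. In addition, he has experience designing architecture flows and implementing data lakes, data ingestion workflows, and data warehouses on AWS. Salman is always looking for challenging opportunities and takes pride in going the extra mile on every project.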

Portfolio

Scoutbee
Python, Apache Kafka, Kafka Streams, Data Lakes, Data Lake Design, Delta Lake...
Zaavya
SQL, Python, AWS Step Functions, Amazon Simple Queue Service (SQS), AWS Lambda...
Systems Limited
SQL, AWS Step Functions, AWS Lambda, Python, Pentaho, Microsoft Power BI...

Experience

Availability

Part-time

Preferred Environment

Slack, Amazon Web Services (AWS), Data Pipelines, PySpark, SQL, Data Warehousing, Pipelines, Python, Databases, Data Analytics, Database Design, Data Analysis

The most amazing...

...ingestion pipeline I've developed can ingest data of any type, format, and structure in real time, and it supports both streaming and batch loads.

Work Experience

Senior Data Engineer

2020 - 2022
Scoutbee
  • Designed and developed a data lake on AWS using single source of truth (SSOT) practices.
  • Developed a REST API microservice that serves all machine learning models, so linked services no longer need updating when a model version changes.
  • Designed and implemented a multi-source, multi-sink streaming ingestion platform.
Technologies: Python, Apache Kafka, Kafka Streams, Data Lakes, Data Lake Design, Delta Lake, Databricks, Elasticsearch, Neo4j, Graph Databases, Spark, Data Engineering, Data Architecture, ETL, Snowflake, Programming, Analytics, Databases, Data Structures, Data Modeling, Python 3, Data Pipelines, PySpark, SQL, Data Warehousing, Data Analytics, Data, Amazon Web Services (AWS), Data Warehouse Design, Data Quality, Data Cleaning, Git, Redshift, Apache Spark, Amazon Athena, AWS Glue, Amazon Simple Queue Service (SQS), Database Design, Big Data, Big Data Architecture, Data Analysis, Data Matching, Spark Streaming, Spark Structured Streaming, Exploratory Data Analysis

Senior Data Engineer

2019 - 2020
Zaavya
  • Designed a cloud-native solution based on AWS serverless architecture.
  • Created configuration-driven data pipeline workflows that use multiple components to orchestrate business processes and data flows.
  • Migrated data from on-premises databases to the company's data hub on AWS.
  • Automated the cataloging of incoming data in AWS Glue.
  • Designed and implemented a transaction monitor and error-handling flows.
Technologies: SQL, Python, AWS Step Functions, Amazon Simple Queue Service (SQS), AWS Lambda, AWS Glue, Elasticsearch, Graph Databases, Amazon Kinesis, Amazon S3 (AWS S3), Data Engineering, Data Architecture, ETL, Programming, Analytics, Databases, Data Structures, Data Modeling, Python 3, Data Pipelines, PySpark, Data Warehousing, Data Analytics, Data, Spark, Data Lakes, Data Lake Design, Amazon Web Services (AWS), Data Warehouse Design, Data Quality, Data Cleaning, Git, Apache Spark, Amazon Athena, Delta Lake, Database Design, Big Data, Big Data Architecture, Data Analysis, Data Matching, Spark Streaming, Spark Structured Streaming, Exploratory Data Analysis

Senior Data Engineer

2018 - 2019
Systems Limited
  • Built data models for The Associated Press's US election data.
  • Designed workflows and implemented them using Pentaho.
  • Created interactive dashboards in Power BI using SQL as the source.
  • Implemented a hybrid cloud and on-premises data ingestion platform using AWS.
Technologies: SQL, AWS Step Functions, AWS Lambda, Python, Pentaho, Microsoft Power BI, Amazon QuickSight, Amazon DynamoDB, Elasticsearch, Data Engineering, Data Architecture, ETL, Programming, Analytics, Databases, Data Structures, Data Modeling, Python 3, Data Pipelines, PySpark, Data Warehousing, Data Analytics, Data, Spark, Data Lakes, Data Lake Design, Amazon Web Services (AWS), Data Warehouse Design, Data Quality, Data Cleaning, Git, Redshift, Apache Spark, Delta Lake, Database Design, Big Data, Big Data Architecture, Data Analysis, Data Matching, Exploratory Data Analysis

Data Engineer

2016 - 2018
NorthBay Solutions
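  • Developed data lakes on AWS S3 and worked with the Apache Spark framework to process terabytes of data.
  • Visualized and analyzed data using Amazon QuickSight, Amazon Athena, and Amazon Redshift, and cataloged data in AWS Glue.
  • Took part in architecture discussions and developed pipelines to bring an on-premises data warehouse into Amazon Redshift.
  • Performed data modeling and developed data marts and views on Amazon Redshift.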
Technologies: SQL, Python, Spark, AWS Glue, AWS Lambda, Amazon EC2, Amazon RDS, Redshift, Git, Jira, Amazon S3 (AWS S3), Apache Spark, Amazon QuickSight, Amazon Athena, Data Engineering, ETL, Programming, Analytics, Databases, Data Modeling, Python 3, Data Pipelines, PySpark, Data Warehousing, Data Analytics, Data, Data Lakes, Data Lake Design, Amazon Web Services (AWS), Data Warehouse Design, Data Quality, Data Cleaning, Data Architecture, Database Design, Big Data, Big Data Architecture, Data Analysis, Exploratory Data Analysis

Software Engineer

2015 - 2016
Netsol
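  • Wrote SQL scripts, data definition language (DDL) and data manipulation language (DML) statements, and stored procedures for the Netsol financial suite.
  • Developed product benchmarks for all of Netsol's financial products.
  • Worked with the team to automate the entire accounting system.
  • Developed data manipulation and processing scripts for all ongoing accounting events.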
Technologies: SQL, Databases, Data, Jira, Scrum, Data Engineering, Programming, Analytics, Data Modeling, Python 3, Data Pipelines, Amazon Web Services (AWS), Data Warehousing, Data Analytics, Spark, Data Lakes, ETL, Data Quality, Data Cleaning, Git, Exploratory Data Analysis

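Neiman Marcus – Intelligent Data Platform

Neiman Marcus, a retail company, wanted to create an operational data hub using multiple data storage technologies, each chosen according to the structure and consumption of the enterprise's data. The data hub consists of many processing components connected by data pipelines.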

Medicare – Data Platform

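Medicare is a US federal health insurance agency with billions of insurance, pharmaceutical, and medical records. They needed an efficient platform to ingest, manage, and query their data. The final product had to be in the FHIR format to comply with US government standards.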

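The Associated Press – Election Campaign

I worked on a project for The Associated Press, a US-based nonprofit news agency. The project focused on building data models and designing pipelines to ingest and store election campaign data. I used Microsoft SQL Server and Amazon Redshift to store the data and connected them to Power BI to enable real-time reporting.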

S&P Global Ratings – Data Lake

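S&P Global Ratings is an American credit rating agency that publishes financial research and analysis. They had more than 15 TB of data in Oracle, and the volume of data being ingested was causing performance issues. I created a data lake on Amazon S3 that multiple data sources use to dump their data. The ETL framework for the project was designed and developed in PySpark.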

Languages

Python 3, SQL, Python, Snowflake

Frameworks

Spark, Apache Spark, Spark Structured Streaming

Libraries/APIs

PySpark, Spark Streaming

Tools

AWS Glue, Git, AWS Step Functions, Jira, Amazon Simple Queue Service (SQS), Amazon Athena, Slack, Microsoft Power BI, Kafka Streams, Amazon QuickSight

Paradigms

ETL, Database Design, Scrum

Platforms

AWS Lambda, Amazon Web Services (AWS), Amazon EC2, Pentaho, Apache Kafka, Databricks

Storage

Databases, Data Pipelines, Redshift, Data Lakes, Data Lake Design, Elasticsearch, Amazon S3 (AWS S3), Amazon DynamoDB, Graph Databases, Neo4j

Other

Programming, Analytics, Data Modeling, Data Warehousing, Data Analytics, Data, Delta Lake, Data Engineering, Data Warehouse Design, Data Cleaning, Data Architecture, Big Data, Big Data Architecture, Data Analysis, Data Matching, Exploratory Data Analysis, Data Structures, Machine Learning, Amazon RDS, Data Quality, Computer Science, Pipelines, Amazon Kinesis, Streaming

2018 - 2020

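Master's Degree in Computer Science

Lahore University of Management Sciences - Lahore, Pakistan

2011 - 2015

Bachelor's Degree in Computer Engineering

National University of Computer and Emerging Sciences - Lahore, Pakistan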

JULY 2020 - JULY 2023

AWS Certified Solutions Architect

Amazon Web Services

AUGUST 2017 - AUGUST 2020

AWS Certified Developer

Amazon Web Services
