Lead Data Engineer | Dual Data Scientist | Building Scalable Data Platforms & Intelligent Systems | Gen AI & Advanced Analytics | 7x Certified
Airflow Certified /DAGAUTH
GCP Cloud Engineer and Data Engineer Certificate (ACG)
Certification
Specialization in Statistics and Deep Learning & Specialization in Python and Ex. Data Science (2016)
I have extensive experience with cloud and data services across Amazon Web Services (AWS) and Google Cloud Platform (GCP), along with infrastructure and data orchestration tools. On AWS, I am skilled in using Amazon EC2, S3, EBS, RDS, and DynamoDB. My expertise includes machine learning with SageMaker, analytics with Athena, QuickSight, and EMR, and data warehousing with Redshift. I manage security using IAM, KMS, and GuardDuty, and have experience with AWS Lambda, VPC, CloudFormation, Amazon Q for natural language querying, and Amazon Bedrock for generative AI applications.
On GCP, I have significant experience with Compute Engine, Cloud Storage, Persistent Disk, Cloud SQL, Firestore, and BigQuery. I utilize AI Platform for machine learning and Dataflow and Dataproc for analytics. My security management skills include Cloud IAM, KMS, and Security Command Center.
Additionally, I am proficient in using Terraform for infrastructure as code across both AWS and GCP, enabling automated and consistent infrastructure deployment. I have experience with Apache Airflow for orchestrating complex data workflows, dbt (data build tool) for transforming data within data warehouses, and Fivetran for automated data integration. These skills enable me to architect, deploy, and manage comprehensive cloud and data solutions.
Tools: Airflow, BigQuery, Cloud Functions, AWS Lambda, AWS Redshift, Microsoft Power BI, SAP, dbt
Languages: Python, SQL
Built a scalable data platform using AWS, Serverless, Snowflake, and AWS-specific tools such as Lambda, Glue, S3, Lake Formation, and DataZone, enabling centralized analytics for manufacturing, operations, and finance teams. Led design and orchestration of end-to-end data pipelines with dbt and Airflow, including a metadata-driven architecture for data governance and lifecycle management. Collaborated cross-functionally with data scientists, analysts, and engineering teams to align data infrastructure with business priorities and compliance standards. Enabled ML and GenAI initiatives by integrating Amazon Bedrock, LangChain, and Streamlit to deploy LLM-powered AI agents and Q&A chatbots for internal automation. Optimized analytics use cases across supply chain and production workflows, directly contributing to increased revenue through improved real-time insights.
Key Tech: AWS (Lambda, Redshift, Glue, ECS, EKS, Bedrock, Q Business), Python, SQL, dbt, Terraform, Kubernetes, Airflow, Streamlit, HuggingFace, Diffusion Models.
I have experience managing and developing data pipelines within cloud services, specifically AWS and GCP. My role includes actively monitoring these pipelines using management tools like Airflow. I also have expertise in implementing secure and scalable API endpoints that provide access to the processed data. I take a proactive approach to improving our infrastructure by suggesting, experimenting with, and adopting new tools and technologies. I am committed to development best practices and to ensuring code quality by mentoring junior team members. I also believe in the importance of knowledge sharing, whether within the team, across the company, or with stakeholders. My ultimate aim is to improve marketing efficiency by automating decision-making and reporting. I focus on meeting campaign and acquisition cost targets, specifically CPA, which has generated millions of euros at various levels. Tools and technologies I am proficient in include Terraform for Infrastructure as Code (IaC), Google BigQuery, Cloud Functions, Vertex AI, Looker, GitHub Actions CI/CD, Airflow, and dbt. I have also worked with Docker and Atlantis as part of my infrastructure management and development tasks. In addition, I am part of the Hiring/Tech Interview Committee for Delivery Hero Data Engineers/Analysts.
GenAI: Amazon GenOS / AWS Bedrock / Amazon Q (preview based) / Amazon OpenSearch Service
Work in a cross-functional team of engineers and data scientists focused on improving the efficiency of marketing activities. Create new products to measure the impact of campaigns across different channels. Develop a new data-science-backed solution to optimize the spending of the marketing budget. Improve the existing models and data products owned by the squad, such as marketing attribution, online and offline measurement, customer lifetime value, media mix models, and more. Create data pipelines in an environment with VMs and cloud services (AWS and GCP). Proactively improve the infrastructure by suggesting, trying, and adopting new tools. Embrace development best practices and ensure code quality. Share work and knowledge at many levels (team, company, stakeholders). All this with the goal of increasing marketing efficiency through automated decisions and reporting. Tech Stack: Python 3 (plus data packages, e.g. pandas), SQL and NoSQL, Airflow, Docker, Terraform, Spark, AWS, and Google Cloud services.
Use predictive modeling to increase and optimize customer experiences, revenue generation, ad targeting, and other business outcomes. Implement data structures using best practices in data modeling, processes, and technologies. Design, develop, and test BI solutions such as databases, data warehouses, queries and views, reports, and dashboards. Perform data conversions, imports, and exports of data within and between internal and external software systems. Merge BI platforms with enterprise systems and applications. Enhance the performance of business intelligence tools by defining data to filter and index. Document new and existing models, solutions, and implementations.
Workflows, Glue Jobs, ETL automation, Data Migrations, Dashboard Manipulation, Machine Learning Analytics, AWS Services, Algorithmic Intelligence. Through Makeen
Activities and societies: contact: [email protected]
C01 - Developments in Computer Science
C02 - Parallel and Distributed Systems
C03 - Image Processing/Pattern Recognition and Graphics
C04 - Software Systems and Languages
C05 - Information Processing and Management
C06 - Scientific Computation and Algorithms
C07 - Artificial Intelligence and Human-Machine Communication
DROP OUT
Grade: A
Activities and societies: Programming, sportsmanship, hockey, boxing, basketball, gymnastics
3 semesters: 3.6+ CGPA / 4.0+ (post-major semesters, after the 4th semester) out of 8 semesters total. Programming, development, helping juniors, developing skills, and sports. If you think university matters, then you can pick based on my qualification. EchSee cam
This profile is based on publicly available information. Shafay is not affiliated with or endorsed by Clera.