Open to Work  ·  Pune, India
Saikat Das
Senior Data Engineer  ·  7.5+ Years

I architect petabyte-scale data platforms — Medallion Lakehouses, real-time streaming pipelines, and AI-augmented ETL workflows — across Databricks, Snowflake, AWS & Azure.

PySpark · Databricks · Snowflake · Delta Lake · Kafka · Airflow · Claude Code · Oracle SQL · Medallion Arch
Databricks Certified ✦
3PB+ Data Built
$45K Cost Saved
99.9% Reliability
3PB+ Lakehouse Architected
70% BI Latency Reduced
$45K Annual Cloud Savings
35% Modelling Time Cut via AI
45% DBU Cost Reduction
99.5% Data Accuracy Rate
01 — Stack

Technology

🏗️
Data Platforms
Databricks · Snowflake · Delta Lake · DBT
Processing
PySpark · Apache Spark · Advanced SQL · PL/SQL
🔄
Streaming
Kafka · Spark Structured Streaming · Delta Live Tables
⏱️
Orchestration
Apache Airflow · Informatica IICS · SnapLogic
☁️
Cloud
AWS S3 · Azure Databricks · GCP · Snowpipe
🤖
AI Tooling
Claude Code · Agentic AI · Cursor AI · MLflow
🗄️
Databases
Oracle RDBMS · AWS Redshift · Alation
🛡️
Governance & BI
Unity Catalog · Great Expectations · Power BI
02 — Experience

Career

Proton.ai
Senior Data Engineer  ·  US-Based SaaS  ·  Remote
Sep 2025 – Present
🌐 Remote
  • Architected a 3PB+ Data Lakehouse on AWS S3 with Medallion Architecture, enabling scalable BI and ML workloads at petabyte scale.
  • Reduced DBU consumption by 45% and improved BI query latency by 70% via Liquid Clustering and optimized shuffle partitions.
  • Built real-time feature engineering pipelines ensuring zero-downtime recommendation engine refreshes.
Databricks · AWS S3 · PySpark · MLflow · Liquid Clustering
Workday Inc.
Senior Data Engineer  ·  Enterprise Architecture & AI
Oct 2024 – Sep 2025
📍 Pune, India
  • Architected a real-time Medallion Lakehouse on Snowflake with Bronze→Silver→Gold zones, integrating Oracle ERP & Salesforce feeds.
  • Built dimensional & 3NF models across finance and HR; standardized business entities enterprise-wide.
  • Leveraged Claude Code & Agentic AI to auto-generate DDLs, schema evolution scripts & PySpark transforms — cutting modelling time by 35%.
  • Built Snowpipe + Python connector pipelines for Oracle extracts with full incremental capture & schema consistency.
Snowflake · Oracle ERP · Snowpipe · Claude Code · 3NF Modelling · Agentic AI
UBS
Senior Data Engineer  ·  Asset Management  ·  Gold Certified
Mar 2022 – Oct 2024
📍 Pune, India
  • Designed a Databricks Medallion Lakehouse processing 500GB+ of daily trade data from Oracle RDBMS & on-prem warehouses.
  • Established gold-zone semantic models & lineage tracking for full MiFID II & Basel III compliance.
  • Migrated Oracle PL/SQL workloads to vectorized PySpark, reducing annual cloud costs by $45,000.
  • Implemented a data contract framework via Great Expectations, preventing 200+ daily schema drift incidents.
Azure Databricks · Oracle PL/SQL · MiFID II · Basel III · Great Expectations
Infosys
Data Engineer  ·  Sys Eng → Sr Sys Eng → Tech Analyst  ·  3 yrs 4 mos
Nov 2018 – Feb 2022
📍 Bengaluru, India
  • Improved pipeline efficiency by 45% using Informatica IICS & SnapLogic for automated data ingestion and integration.
  • Delivered ETL migration in PySpark/SparkSQL with a 99.5% data accuracy rate.
  • Built interactive Power BI dashboards with AI-driven Q&A, boosting stakeholder adoption by 65% and marketing ROI by 25%.
  • Cut data discrepancies by 96% using Alation-based validation and regular data audits.
PySpark · Informatica IICS · Power BI · Alation · SnapLogic
03 — Impact

By The Numbers

🏗️
3PB+
Data Lakehouse Architected at Proton.ai
AWS S3 · Medallion Architecture · Real-Time ML Pipelines
70%
BI Query Latency Reduced
Liquid Clustering · Partitioning
💰
$45K
Annual Cloud Savings
Oracle → PySpark · UBS
🤖
35%
Modelling Time Cut via AI
Claude Code · Agentic Automation
📉
45%
DBU Cost Reduction
Databricks Optimization
99.5%
Data Accuracy Rate
ETL Pipelines · Infosys
04 — Credentials

Certifications

Certified Data Engineer Professional
Databricks
Associate Developer — Apache Spark 3.0
Databricks
SnowPro Core Certified
Snowflake
Exploratory Data Analysis for ML
IBM / Coursera
ETL Pipelines — Shell, Airflow & Kafka
IBM / Coursera
Power BI Fundamentals for Finance
Microsoft
Innovation Award — Agentic AI Automation for ETL
Workday Inc. · 2024
Employee of the Quarter (Q2) & Best Team Delivery 2022–23
UBS Asset Management
Gold UBS Certified Data Engineer
UBS · Internal Certification
Let's build something great

Open to Senior Data Engineering roles, contracts & consulting — globally. Always happy to connect.