About

Data-AI

About Me

Shicong Ying

Senior Data & AI Engineer

Data Engineering, ML / AI, MLOps

~/pipeline

> dbt build --select mart.customer_360+

Tech Stack

Spark, PyTorch, Airflow, Kafka, K8s, Flink, MLflow, Kestra, Terraform, InfluxDB, Docker

Location

Singapore

Focus

Data & AI

Contact Me

💻 SELECT insight FROM chaos WHERE clarity = 'engineered'

Author

Big Data Engineer with 3+ years of experience designing and optimizing batch and real-time data pipelines, improving petabyte-scale data warehouse performance, and delivering analytics that support commercial decisions. Proficient in big data frameworks including Spark, Flink, and Kafka; experienced with cloud platforms such as Alibaba Cloud, Google Cloud, and Amazon Web Services; and practiced in dimensional modeling with a strong focus on analytics efficiency and governance.

Work Experience

🏢 Tamira Tech,Singapore — Data Warehouse Engineer (Full-time)
02/2026 – Present
- Built real-time and offline data pipelines for image search models, improving recommendation satisfaction and raising user NPS (+3%).
- Unified the commercial and core metrics pipelines and developed A/B testing datasets, improving analysis efficiency (+25%).
- Rebuilt the algorithm data warehouse models, boosting BI efficiency (+30%) and reducing maintenance time (-10%).
- Developed Spark log analysis tools to locate and resolve resource bottlenecks (+50% optimization efficiency).
🏢 Poizon (DeWU),Shanghai — Data Warehouse Engineer (Full-time)
07/2024 – 02/2026
- Built real-time and offline data pipelines for image search models, improving recommendation satisfaction and raising user NPS (+3%).
- Unified the commercial and core metrics pipelines and developed A/B testing datasets, improving analysis efficiency (+25%).
- Rebuilt the algorithm data warehouse models, boosting BI efficiency (+30%) and reducing maintenance time (-10%).
- Developed Spark log analysis tools to locate and resolve resource bottlenecks (+50% optimization efficiency).
🏢 Poizon (DeWU),Shanghai — Big Data Developer (Intern)
05/2023 – 09/2023
- Migrated and optimized 500+ big data tasks on the Galaxy platform, improving execution efficiency (+20%).
- Verified post-migration data accuracy using SQL and Python; supported warehouse model design for the search algorithm team.
🏢 NetEase Cloud Music — Big Data Developer (Intern)
09/2022 – 04/2023
- Designed core metrics for user and behavior analysis, contributing to strategy development.
- Optimized event tracking and introduced cold data archiving, reducing storage costs (-15%).

Projects

EMR Spark Task Performance & Error Analysis Tool — Feature Development
02/2025 – 04/2025
- Developed automated log parsing tools to locate performance bottlenecks and errors.
- Built analysis modules that sped up debugging and optimization.
Image Search Evaluation & NPS Feedback Mechanism — Development
11/2024 – 12/2024
- Designed a batch sampling platform for algorithm evaluation, saving 31 person-days per month.
- Built an NPS feedback mechanism pipeline to enhance user experience evaluation.
Poizon Push New Product Commercialization — Data Development
08/2024 – 10/2024
- Developed 15 ADS reports and established key commercial metrics (PVR, ASN), reaching 97%+ SLA attainment.
- Supported data pipeline enhancements, ensuring stable and scalable operations.

Skills

Data Warehousing & Modeling: Dimensional modeling (star and snowflake schemas) and design of PB-scale data solutions.
Big Data Tools: Proficient in Spark, Flink, Kafka, Hadoop, and other distributed frameworks.
Programming Languages: Java, Python, and SQL for big data analysis; MySQL and PostgreSQL for high-performance queries.
Data Governance: Skilled in pipeline orchestration, metadata management, and data monitoring.
Soft Skills: Agile development, cross-team collaboration, and clear technical documentation.

Privacy and Comments

This website does not track visitor behavior or require sensitive personal information (e.g., real names or phone numbers).