Wenhao Pei

Data Scientist & Engineer with expertise in AI workflow development, agent systems, and scalable data pipeline architecture. Currently building a browser-native agent runtime as an independent research project.

Professional Experience

Independent AI Systems Developer

Self-directed Research & Development

Aug 2024 – Present
  • Designed and built a browser-native agent runtime from scratch — turning AI chat interfaces into full execution workspaces with 50+ integrated tools
  • Engineered a custom protocol (ΩHERE/ΩBATCH) that solves the content-fidelity and escaping problems that break standard agent tool-calling pipelines
  • Built production automation pipelines: news video generation (AI script → TTS → image gen → FFmpeg → YouTube), tutorial recording, and multi-service orchestration
  • Developed Chrome Extension architecture intercepting SSE/WebSocket streams for real-time model output parsing and tool dispatch
  • Managed cloud infrastructure (Oracle Cloud ARM servers) with PM2 process management, SSH automation, and Cloudflare Workers deployment
  • Technologies: TypeScript, Node.js, Chrome Extension APIs, CDP, FFmpeg, Next.js, MCP, Python

Data Engineer Intern

URMC Wilmot Cancer Center, Rochester, NY

Mar 2024 – Jul 2024
  • Designed and optimized data pipelines using Apache Spark for biomedical research datasets, reducing processing time by 30%
  • Developed automated data collection frameworks integrating structured and unstructured clinical data sources
  • Built API-based data access infrastructure enabling seamless retrieval for research study teams
  • Collaborated with researchers and clinicians to ensure data accuracy and compliance with research data standards

Lead of Data Group

SUNY Buffalo State – Professional Data Lab

Feb 2024 – May 2024
  • Led a web scraping and API project, directing a team that built scalable data collection pipelines in Python
  • Architected end-to-end data workflows integrating multiple external APIs for business intelligence applications
  • Mentored team members on version control, code documentation, and reproducible data science practices

Automation Engineer (Remote)

May Digital Music Publisher, Seattle, WA

Feb 2017 – Dec 2022
  • Led development of automated deployment pipelines, reducing deployment time by 50% through workflow optimization
  • Designed monitoring dashboards tracking system performance, model accuracy, and data quality metrics
  • Developed automated testing and validation processes ensuring data integrity across distributed systems
  • Collaborated cross-functionally to identify bottlenecks and optimize workflows

Education

M.S. in Data Science & Analytics

SUNY Buffalo State University, Buffalo, NY

Jan 2023 – May 2024

Advanced Statistical Methods, Machine Learning, Data Architecture, Big Data Analytics, Database Management

B.A. in Music Composition

Henan Normal University, China

2006 – 2010

Interdisciplinary foundation combining creative problem-solving with analytical thinking

Technical Skills

AI & Agent Systems
LLM integration, agent runtime design, prompt engineering, MCP, TTS/STT pipelines, tool orchestration
Programming
TypeScript, Python, JavaScript, Node.js, R, SQL, Bash
Data Engineering
Apache Spark, Airflow, Kafka, ETL/ELT pipelines, data warehousing, PostgreSQL, MongoDB
Web & Browser
Chrome Extension, CDP, SSE/WebSocket, Next.js, React, REST APIs
Infrastructure
Docker, Oracle Cloud, AWS Lambda, Cloudflare Workers, PM2, Linux server admin
ML & Analytics
TensorFlow, PyTorch, predictive modeling, NLP, deep learning, statistical analysis

Target Roles

AI / ML Engineer, Applied AI Engineer, Data Engineer, Data Scientist, AI Infrastructure Engineer, Full-Stack Engineer (AI focus)

Genspark Agent Runtime

After completing my master’s degree, I dedicated my time to an ambitious independent project: building a browser-native agent runtime that transforms AI chat interfaces into fully executable workspaces. This system includes custom protocol design, multi-plane task dispatch, 50+ integrated tools, and production-running automation pipelines — from automated video generation to multi-service orchestration. It represents the intersection of my data engineering background, AI expertise, and systems thinking.
