ML Engineer / Independent Researcher / Vibe Coder · Toronto

Hi, I'm Arthur.

Research-oriented ML engineer working on agent systems, multimodal learning workflows, and graph-based representation learning. Interested in post-training, in-context learning (ICL), RL, imitation learning, evolution-based approaches, and world-model-inspired research.

Agentic Coding · Long-Horizon Task Automation · RL · World Models · Evolution

01 Blog

02 About

I'm an ML engineer working on large-scale applied ML across post-training, agent systems, in-context learning, and graph-based representation learning. Earlier in my career I built ML systems for finance and conducted HCI / human-in-the-loop ML research focused on multimodal interaction.

My current interests sit at the intersection of agentic coding, long-horizon task automation, RL, world models, and evolution-based methods. I'm particularly interested in how agents can plan, execute, and self-correct over horizons longer than a single prompt — and in foundational architectures that let them learn from their own trajectories.

When I'm not training models, I write about agentic coding workflows and tooling. Recent notes live below.

03 Experience

Machine Learning Software Engineer

Google · Toronto · Aug 2022 — Present

Large-scale applied ML across post-training, agent systems, in-context learning, and graph-based representation learning. Core areas: retrieval / search / clustering, representation learning, GNNs, agents, post-training, and RL.

  • Agent systems, RL & imitation learning. Built end-to-end LLM agents for production workflows (sequential, parallel, sub-agent, and critic-agent designs) and ran large-scale RL / imitation learning experiments on 10B+ user-interaction trajectories across offline and online settings.
  • Post-training & ICL. Fine-tuned Gemini encoder models via SFT pipelines in JAX for downstream tasks; improved agent quality with retrieval over multimodal embeddings and vector stores for few-shot ICL.
  • Infrastructure & scale. Operate ML systems at Google scale: petabyte-scale graph and feature pipelines with second-level latency; distributed training in JAX on TPU pods; online ANN serving over 1B+ learned vectors; and production data flows on Pub/Sub, Kafka, and Spanner. Comfortable owning the full path from raw event streams through training, evaluation, and online serving.
  • Representation learning & retrieval. Built GNN-based representation learning on heterogeneous graphs with 50B+ edges, with ANN retrieval over 1B+ learned vectors for downstream agent and recommendation surfaces.
  • Agentic coding influence. Have been using and promoting the Plan–Define–Act workflow for agentic coding since May 2025; authored an internal doc on Gemini CLI and agentic coding that reached 12k+ unique views within Google/Alphabet, and delivered tech talks on GenAI developer tooling.
JAX · TensorFlow · TF-GNN · Gemini · ADK · Spanner · Pub/Sub · Kafka · GCP

Machine Learning Engineer (Senior Associate)

J.P. Morgan · Hong Kong · Dec 2021 — Aug 2022

ML systems for large-scale financial time-series, focused on anomaly detection, feedback-driven model improvement, and end-to-end production ML pipelines.

  • Designed large-scale anomaly detection workflows for financial time-series in collaboration with Portfolio Managers and Quants.
  • Built a feedback-loop system to address distribution shift and pattern discovery, iteratively improving model accuracy and stability.
  • Engineered end-to-end ML pipelines: data processing, feature engineering, training, tuning, evaluation, versioning, and serving.
Python · PyTorch · PySpark · Pandas · SageMaker

HCI & Human-in-the-Loop ML Researcher

HKUST-DT SyMLab · Hong Kong · Jun 2019 — Sep 2021

HCI and human-in-the-loop ML research on multimodal interaction, wearable input, and human-drone control — focused on subtle one-handed drone interaction using touch, force, and IMU sensing.

  • Designed and evaluated four multimodal control schemes for one-handed drone interaction, identifying a method that outperformed a two-handed commercial baseline on task completion time.
  • Built a ring-sized wearable controller enabling joystick-like drone navigation through subtle thumb interaction and wrist pitch/roll gestures.
  • Conducted user studies on task completion time, perceived workload, and interaction footprint to derive design guidelines for compact multimodal interfaces.

Computer Vision Research Intern

ASTRI · Hong Kong · Jun 2017 — Aug 2017

CV research and engineering for visual recognition on edge devices — dataset construction, transfer learning, model compression, and deployment to mobile and embedded platforms. Worked on car model classification with Inception-v3 transfer learning.

04 Projects & Research

World Models, Multimodal Learning & Agents

2021 — Present

Self-directed study and implementation work on world models, LLMs, and agent systems. Reimplemented ideas from World Models (Ha & Schmidhuber, 2018) with VAE-based latent representations, and explored TinyWorld, a minimal Genie-1-inspired replication.

World Models · VAE · VLM · VLA

NeurIPS 2024 LLM Merging Competition

2024 · Project Lead

Led a small team in the NeurIPS 2024 LLM Merging Competition. Implemented and benchmarked Linear, SLERP, TIES, Git Re-Basin, and MoE-based merging; explored evolutionary merging and FunSearch-inspired search.

LLM Merging · SLERP · TIES · PyTorch
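Of the merging methods above, SLERP is the simplest to state: instead of averaging two checkpoints linearly, it interpolates along the arc between them so the interpolated weights keep a comparable norm. A minimal pure-Python sketch, assuming flattened weight vectors (in practice this is applied per tensor):

```python
import math

def slerp(a, b, t):
    """Spherical linear interpolation between weight vectors a and b at fraction t."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Clamp for numerical safety before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    w_a = math.sin((1 - t) * theta) / s
    w_b = math.sin(t * theta) / s
    return [w_a * x + w_b * y for x, y in zip(a, b)]
```

At t = 0.5 on two orthogonal unit vectors this returns a unit-norm midpoint, whereas a linear average would shrink the norm to about 0.71 — the main reason SLERP is preferred over linear merging for normalized weight directions.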

Deep RL for Sonic — PPO & Transfer Learning

2018 — 2019

Re-implemented DQN, A2C, and PPO from scratch. Built a parallel data-collection and evaluation pipeline; improved sample efficiency with reward / feature engineering and a VAE-based world encoder for sparse-reward environments.

PPO · DQN · A2C · VAE

AI Guide Dog

2017 — 2018

Multimodal assistive vision system for visually impaired users — image classification, object detection, and image-to-text scene description deployed across mobile, cloud, and embedded components.

CV · TensorFlow · Edge · Firebase

05 Publications

  1. How subtle can it get? A Trimodal Study of Ring-sized Interfaces for One-handed Drone Control

    A. Yau, Lik Hang Lee, Zheng Li, Tristan Braud, Yi-Hsuan Ho, Pan Hui

    UbiComp · ACM International Joint Conference on Pervasive and Ubiquitous Computing, 2020

  2. MyoKey: Surface Electromyography and Inertial Motion Sensing-based Text Entry in AR

    Young D. Kwon, Kirill A. Shatilov, Lik-Hang Lee, Serkan Kumyol, Kit-Yung Lam, A. Yau, Pan Hui

    PerCom · IEEE International Conference on Pervasive Computing and Communications, 2020

  3. One-thumb Text Acquisition on Force-assisted Miniature Interfaces for Mobile Headsets

    Lik Hang Lee, Yiming Zhu, A. Yau, Tristan Braud, Xiang Su, Pan Hui

    PerCom · IEEE International Conference on Pervasive Computing and Communications, 2020

  4. HIBEY: Hide the Keyboard in Augmented Reality

    Lik Hang Lee, Kit Yung Lam, A. Yau, Tristan Braud, Pan Hui

    PerCom · IEEE International Conference on Pervasive Computing and Communications, 2019