Yunong Liu
Research Engineer at Luma AI
Recently, my work has focused on multimodal generation and structured visual generation that connects pixels to code, agents, and people. At Luma AI, I lead Layering, which turns flat generated or uploaded media into editable components across vector, text, and pixel representations, so outputs can be revised, verified, and reused across multiple turns. Alongside this, I work on post-training, reward modeling, and evaluation: how to define, measure, and improve what "good" output means for real user workflows. Before Layering, I built the RL workflow for Ray3 and worked on diffusion RL across Ray3 and Uni-1, including experiments on video-generation conditioning, data ablations, caption variance, and reward signals for OCR, aesthetics, and motion, along with early data and evaluation work.
Previously, I completed my M.S. in Computer Science at Stanford, advised by Jiajun Wu, where I led work on 4D grounding of assembly instructions in internet videos. I received my BEng in Electronics and Computer Science from the University of Edinburgh, ranking 2nd in my cohort.
Structured visual generation
Reward modeling
Multi-turn multimodal agents
4D grounding