Yunong Liu
Research Engineer at Luma AI
👨‍💻 I am a Research Engineer at Luma AI, where I work on video and image generation and understanding — spanning large-scale training of diffusion and flow matching models, multi-modal conditioning, and controllable visual synthesis. I am broadly interested in agentic models capable of tool use and multi-step reasoning, more controllable and expressive generative systems, and editable infographics and visual media. My past work has also explored applying online reinforcement learning (GRPO) to align flow matching models with human preferences across compositional and aesthetic objectives, including designing reward models and developing techniques to mitigate reward over-optimization.
🎓 Previously, I obtained my Master's degree in Computer Science at Stanford University (GPA: 3.99), advised by Jiajun Wu. I worked closely with Manling Li, Weiyu Liu, and Cristobal Eyzaguirre on projects including grounding 4D assembly instructions in internet videos and probing emergence in generative video models. I also had the pleasure of working with Juan Carlos Niebles and Stefan Stojanov.
🏫 I received my BEng in Electronics and Computer Science from the University of Edinburgh (joint degree with Honours), where I ranked 2nd in my degree class. During my time there, I was fortunate to work with Bonnie Webber on natural language processing research.
🌍 I also spent a wonderful semester and summer at the University of Texas at Austin for my exchange program and summer research internship, where I worked with Dhiraj Murthy on misinformation detection.
🎭 Fun fact: I have a twin sister who works in a completely different field: museology! It's fascinating how we've taken such divergent paths in our careers.
Research Interests
- Reinforcement Learning for Generative Models: Applying RL to improve video and image generation models — using reward signals to shape generation quality, controllability, and alignment with human intent
- Agentic AI Systems: Building and studying agentic models with tool-calling capabilities for complex multi-step reasoning and autonomous interaction with external environments
- Controllable Image and Video Generation: Developing more precise control mechanisms for visual synthesis, enabling fine-grained, user-guided generation with greater semantic and structural fidelity
- Editable Infographics and Visual Media: Creating systems for interactive editing of infographics and generated visual content, bridging the gap between AI generation and human-centric design workflows
Publications
CaptionQA: Is Your Caption as Useful as the Image Itself?
Shijia Yang*, Yunong Liu*, Bohan Zhai*, Ximeng Sun, Zicheng Liu, Emad Barsoum, Manling Li, Chenfeng Xu.
(*equal contribution)
Introduced CaptionQA, a utility-based benchmark evaluating image captions across 4 domains (Natural, Document, E-commerce, Embodied AI) with 33,027 multiple-choice questions. Found substantial gaps (9.2–32.4%) between image and caption utility even for state-of-the-art MLLMs, revealing that standard QA-on-image benchmarks miss the information lost during caption generation.
Taming generative video models for zero-shot optical flow extraction
Seungwoo Kim*, Khai Loong Aw*, Klemen Kotar*, Cristobal Eyzaguirre, Wanhee Lee, Yunong Liu, Jared Watrous, Stefan Stojanov, Juan Carlos Niebles, Jiajun Wu, Daniel L. K. Yamins.
Prompted a generative video model to extract optical flow with zero labels and no fine-tuning. KL-tracing computes the KL-divergence between clean and perturbed prediction logits to trace motion — a statistical counterfactual probe building on perturb-and-track (CWM). Applied to LRAS, whose local tokenizer and random-access decoding enable fine-grained control, the method achieves state-of-the-art results on TAP-Vid and generalizes to challenging in-the-wild YouTube clips, outperforming specialized baselines.
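The core of the KL-tracing idea can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: `kl_from_logits` and `trace_flow` are hypothetical names, and a real video model predicts token logits per patch rather than a dense per-pixel grid as assumed here.

```python
import numpy as np

def kl_from_logits(clean_logits, perturbed_logits):
    """Per-location KL(perturbed || clean) over next-frame prediction logits.

    Both inputs are (H, W, V) arrays (V = vocabulary size). High KL marks
    where the injected perturbation "reappears" in the model's prediction.
    """
    def softmax(x):
        x = x - x.max(axis=-1, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=-1, keepdims=True)

    p = softmax(perturbed_logits)
    q = softmax(clean_logits)
    # Small epsilon keeps the logs finite for near-zero probabilities.
    return (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=-1)

def trace_flow(clean_logits, perturbed_logits, src_yx):
    """Flow at the perturbed source = displacement to the KL map's argmax."""
    kl_map = kl_from_logits(clean_logits, perturbed_logits)
    dst_yx = np.unravel_index(np.argmax(kl_map), kl_map.shape)
    return np.array(dst_yx) - np.array(src_yx)  # (dy, dx)
```

The choice of KL over (say) an L2 difference of samples is the point of the method: comparing full predictive distributions gives a denoised, statistical readout of where the perturbation moved, rather than a single noisy sample.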
IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos
Yunong Liu, Cristobal Eyzaguirre, Manling Li, Shubh Khanna, Juan Carlos Niebles, Vineeth Ravi, Saumitra Mishra, Weiyu Liu*, Jiajun Wu*.
Created the first dataset aligning real-world assembly videos with 3D models and instruction manuals. Developed methods for dense spatio-temporal correspondences in unconstrained videos.
COVID-19 Misinformation Detection: Machine-Learned Solutions to the Infodemic
Yunong Liu*, Nikhil Kolluri*, Dhiraj Murthy.
(*equal contribution)
Developed a hybrid framework combining machine learning with crowdsourced annotations to combat COVID-19 misinformation. Built a systematic comparison across classical models (SVM, LR, BNB) and pre-trained transformers (BERT, RoBERTa, XLNet) on 7 dataset combinations.
Toy Projects that I enjoyed doing
EMo-Mask: Emotional Controllable Motion Generation
Course Project for CS348I: Computer Graphics in the Era of AI (Winter 2024) | Project Page
Just Dance Everywhere (Best Course Project Award)
EE379K: Introduction to Computer Vision (Spring 2022) at UT Austin | Project Page
Gates Building 3rd Floor Render (A+ Grade & Course Showcase Feature)
CS 148: Introduction to Computer Graphics and Imaging (Autumn 2024) | Project Page | Course Showcase
Awards
- UT Austin Cockrell School of Engineering Fellowship (declined), April 2023
- Turing Scheme Funding, January 2022
- Leadership in Student Opportunities Edinburgh Award, July 2021
- 1st Year Class Medal, awarded to the top overall student in the first-year Electronics and Electrical Engineering discipline at the University of Edinburgh, July 2020
Other Experience
Student Tech Intern, Semiconductor Manufacturing Anomaly Detection, NXP Semiconductors, Tianjin, China, May 2021 - Aug 2021
Deep Learning Intern, Wind Farm Performance AI Optimization, Zealen AI (Startup), Beijing, China, Aug 2021 - Sep 2021
Research Assistant, NLP Project, University of Edinburgh, worked with Prof. Bonnie Webber on discourse relation analysis, Jul 2022 - Jan 2023
Teaching Demonstrator, University of Edinburgh, InfBase, Sep 2022 - Jan 2023
Programme Representative, University of Edinburgh, Sep 2022 - June 2023
Global Buddies Program Mentor, University of Edinburgh, Sep 2021 - June 2022
Volunteer, Sri Lanka Wildlife Conservation Society (SLWCS), Jul 2019 - Aug 2019