Selena Song

About

I am an incoming PhD student at the University of California, Santa Cruz, working with Prof. Yuyin Zhou and Prof. Cihang Xie. My research interests lie in Video Generation, Agentic Models, and AI for Healthcare.

Previously, I received my Master's degree from The University of Tokyo in 2025 and my Bachelor's degree in Physics from Fudan University in 2023.

Research Interests

Video Generation

Developing controllable and physically consistent video generation methods that model scene dynamics, temporal coherence, and long-horizon visual behavior.

Agentic Models

Studying agentic AI systems that can reason, plan, interact with multimodal environments, and adapt their behavior through structured feedback.

AI for Healthcare

Applying machine learning and multimodal foundation models to healthcare problems, with an emphasis on reliable, interpretable, and clinically meaningful AI systems.

Selected Publications

Learning Plug-and-play Memory for Guiding Video Diffusion Models

Selena Song, Ziming Xu, Zijun Zhang, Kun Zhou, Jiaxian Guo, Lianhui Qin, Biwei Huang

A plug-and-play memory module for video diffusion models that enhances physical rule adherence and video fidelity through targeted guidance using low-/high-pass filters.

MMA: Benchmarking multi-modal large language model in ambiguity contexts

Selena Song, Ru Wang, Liang Ding, Mingming Gong, Yusuke Iwasawa, Yutaka Matsuo, Jiaxian Guo

ICLR 2025 Workshop on Navigating and Addressing Data Problems for Foundation Models

A benchmark evaluating multi-modal large language models' ability to resolve ambiguities in text using visual context, revealing significant performance gaps.

Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization

Ru Wang, Wei Huang, Selena Song, Haoyu Zhang, Yusuke Iwasawa, Yutaka Matsuo, Jiaxian Guo

CPAL 2025

Investigates Chain-of-Thought reasoning to enhance out-of-distribution generalization in language models, revealing the importance of CoT granularity and sample efficiency.