Hello! I'm Seung Hyun Lee, a Ph.D. candidate working with Prof. Stella X. Yu in CSE at the University of Michigan. Before I joined Stella's group, my research focused on using ambient sound for visual synthesis. I also worked as a Google Student Researcher, exploring ways to improve image generation and the aesthetic quality of image cropping. My research goal is to develop a physically understandable foundation model in an unsupervised manner, bridging human perception and machine understanding. Feel free to reach out to chat more about this.

Contact: seungle [at] umich [dot] edu

News

Segment Hierarchy for Depth Estimation Oct 2025

Passed the prelim in UMich's CSE department on this topic.

Publications

📚 Foundation Models

SHED Light on Segmentation for Depth Estimation

Under review

Authors: Coming Soon

⭐ Quality Signals

Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation (work done during an internship at Google Research)

ECCV 2024 Oral (2.3%)

This work introduces a novel multi-reward RL framework to optimize text-to-image generation, balancing different quality signals.

Authors: Seung Hyun Lee, Yinxiao Li, Junjie Ke, Innfarn Yoo, Han Zhang, Jiahui Yu, Qifei Wang, Fei Deng, Glenn Entis, Junfeng He, Gang Li, Sangpil Kim, Irfan Essa, Feng Yang

Cropper: Vision-Language Model for Image Cropping through In-Context Learning (work done during an internship at Google Research)

CVPR 2025

A VLM with in-context learning enhances free-form, subject-aware, and aspect-ratio-aware cropping without any training.

Authors: Seung Hyun Lee*, Jijun Jiang*, Yiran Xu*, Zhuofang Li*, Junjie Ke, Yinxiao Li, Junfeng He, Steven Hickson, Katie Datsenko, Sangpil Kim, Ming-Hsuan Yang, Irfan Essa, Feng Yang

🎶 Sound-Guided Visual Synthesis

Sound-Guided Semantic Image Manipulation (in collaboration with NVIDIA)

CVPR 2022

Authors: Seung Hyun Lee, Wonseok Roh, Wonmin Byeon, Sang Ho Yoon, Chanyoung Kim, Jinkyu Kim*, Sangpil Kim*

Sound-Guided Semantic Video Generation (in collaboration with NVIDIA)

ECCV 2022

Authors: Seung Hyun Lee, Gyeongrok Oh, Wonmin Byeon, Chanyoung Kim, Won Jeong Ryoo, Sang Ho Yoon, Jihyun Bae, Jinkyu Kim*, Sangpil Kim*

Robust Sound-Guided Image Manipulation (in collaboration with NVIDIA)

Neural Networks 2024

Authors: Seung Hyun Lee*, Hyung-gun Chi*, Gyeongrok Oh, Wonmin Byeon, Sang Ho Yoon, Hyunje Park, Wonjun Cho, Jinkyu Kim*, Sangpil Kim*

Audio-guided implicit neural representation for local image stylization (in collaboration with NVIDIA)

Computational Visual Media 2024

Authors: Seung Hyun Lee, Chanyoung Kim, Wonmin Byeon, Sang Ho Yoon, Jinkyu Kim*, Sangpil Kim*

Soundini: Sound-Guided Diffusion for Natural Video Editing

arXiv

Authors: Seung Hyun Lee, Sieun Kim, Innfarn Yoo, Feng Yang, Donghyeon Cho, Youngseo Kim, Huiwen Chang, Jinkyu Kim*, Sangpil Kim*

Awards & Honors

Travel Grant for CVPR 2025 / ECCV 2024 from Google Research
PyTorch Open Source Contribution, PyTorch Tutorial Translation (lead mentee) 2021
Google Developer Student Clubs – University of Seoul, Korea 2021
Software Maestro 11th – Best Software talent discovery program, Korea

Academic Activities

Teaching Assistant

[Spring 23] Machine Learning at Korea University