Jaehun Jung
I'm a Ph.D. student in computer science at the University of Washington, advised by Yejin Choi. I am also a part-time student researcher at Nvidia Research.
My research focuses on how to train and evaluate a model with a model, with minimal human supervision. I am particularly excited about:
Data Synthesis and Data Selection with Language Models: How do we define good data? Can we use this insight to generate synthetic data that are diverse, correct, and helpful for model generalization?
Science of Automated Evaluation: Can we use a model to reliably evaluate other models? How can we guarantee that automated rubrics align with human judgment?
Previously, I was an undergraduate at Seoul National University, advised by Professors U Kang and Jinwook Seo. I was also a part-time researcher at Kakao Enterprise, where I worked on knowledge-grounded dialogue agents.
Email /
CV /
Scholar /
Twitter /
Github
Prismatic Synthesis & G-Vendi Score: How Data Diversification makes R1-32B a Better Teacher than R1-671B
Jaehun Jung,
Seungju Han*,
Ximing Lu*,
Skyler Hallinan*,
Shrimai Prabhumoye,
Mostafa Patwary,
Mohammad Shoeybi,
Bryan Catanzaro,
Yejin Choi
preprint, 2025
blog
We show that data diversity, measured by our proposed metric G-Vendi, strongly predicts how a model generalizes after training. We leverage this finding to strategically diversify synthetic reasoning data. The resulting datasets, despite being generated by a 32B LLM, lead to better out-of-distribution performance than datasets generated by R1-671B and verified by humans.
Retro-Search: Exploring Untaken Paths for Deeper and Efficient Reasoning
Ximing Lu*,
Seungju Han*,
David Acuna Marrero*,
Hyunwoo Kim*,
Jaehun Jung*,
Shrimai Prabhumoye,
Niklas Muennighoff,
Mostafa Patwary,
Mohammad Shoeybi,
Bryan Catanzaro,
Yejin Choi
preprint, 2025
paper
/
bibtex
Search-guided distillation mitigates both under-thinking and over-thinking in reasoning models, while simultaneously improving accuracy.
Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
Jaehun Jung,
Faeze Brahman,
Yejin Choi
ICLR, 2025 (Oral, Top 1.8%)
paper
/
bibtex
We enhance LLM judges with a statistically rigorous guarantee of human agreement. We further extend this guarantee to propose Cascaded Selective Evaluation, where we start from a small cost-effective model as a judge, and escalate to a stronger model only when necessary—all while guaranteeing high agreement with humans.
Information-Theoretic Distillation for Reference-less Summarization
Jaehun Jung,
Ximing Lu,
Liwei Jiang,
Faeze Brahman,
Peter West,
Pang Wei Koh,
Yejin Choi
COLM, 2024
paper
/
bibtex
Can small models excel at summarization without imitating LLM- or human-written references? We present InfoSumm, a framework for distilling a powerful summarizer that outperforms LLM summarizers an order of magnitude larger, based solely on an information-theoretic objective for summarization.
Impossible Distillation for Paraphrasing and Summarization: How to Make High-quality Lemonade out of Small, Low-quality Models
Jaehun Jung,
Peter West,
Liwei Jiang,
Faeze Brahman,
Ximing Lu,
Jillian Fisher,
Taylor Sorensen,
Yejin Choi
NAACL, 2024
paper
/
data
/
bibtex
It is possible to generate a high-quality dataset for sentential paraphrasing and summarization directly from an off-the-shelf LM, even when it is impossible for the LM itself to reliably perform these tasks.
Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models
Jillian Fisher,
Ximing Lu,
Jaehun Jung,
Liwei Jiang,
Zaid Harchaoui,
Yejin Choi
NAACL, 2024 (Oral Presentation)
paper
/
github
/
bibtex
We introduce JamDec, an inference-time algorithm for authorship obfuscation that is domain-agnostic and controllable, yet requires no human supervision.
Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
Ximing Lu,
Faeze Brahman,
Peter West,
Jaehun Jung,
...,
Xiang Ren,
Sean Welleck,
Yejin Choi
EMNLP, 2023
paper
/
github
/
bibtex
Can we adapt LLMs without fine-tuning? We propose using a lightweight adapter (e.g., GPT-2) at decoding time, efficiently tailoring even the strongest proprietary LLMs toward a user-defined reward.
STEER: Unified Style Transfer with Expert Reinforcement
Skyler Hallinan,
Faeze Brahman,
Ximing Lu,
Jaehun Jung,
Sean Welleck,
Yejin Choi
Findings of EMNLP, 2023
paper
/
github
/
bibtex
We propose a text style transfer framework that maps an arbitrary source style to many target styles via large-scale data generation with expert-guided decoding and offline RL.
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Jaehun Jung,
Lianhui Qin,
Sean Welleck,
Faeze Brahman,
Chandra Bhagavatula,
Ronan Le Bras,
Yejin Choi
EMNLP, 2022 (Oral Presentation)
paper
/
github
/
bibtex
We improve LM reasoning by generating abductive and recursive explanations from language models, then formulating inference as a satisfiability problem over these generations.
Learning to Walk across Time for Interpretable Temporal Knowledge Graph Completion
Jaehun Jung,
Jinhong Jung,
U Kang
KDD, 2021
paper
/
github
/
bibtex
We propose a novel GNN for temporal knowledge graphs that encodes an interpretable graph substructure for knowledge graph completion.
AttnIO: Knowledge Graph Exploration with In-and-Out Attention Flow for Knowledge-Grounded Dialogue
Jaehun Jung,
Bokyung Son,
Sungwon Lyu
EMNLP, 2020
paper
/
video
/
bibtex
We present a novel decoder model based on attention flow that learns to explore a knowledge graph and retrieve a relevant knowledge path to ground a dialogue agent.
DataHalo: A Customizable Notification Visualization System for Personalized and Longitudinal Interactions
Guhyun Han,
Jaehun Jung,
Youngho Kim,
Jinwook Seo
CHI, 2023
paper
/
bibtex
DataHalo is a customizable notification visualization system for mobile devices that provides prolonged ambient visualizations based on a time-varying importance model, enabling longitudinal interaction with notifications.