Taehyun Cho

I am a PhD student at the Cognitive Machine Learning Laboratory, part of the Department of Electrical and Computer Engineering at Seoul National University. My PhD advisor is Jungwoo Lee. I have a BS in Mathematics from Korea University.

E-mail: talium@cml.snu.ac.kr

GitHub  /  Google Scholar  /  LinkedIn  /  CV

Research Interest

My academic research centers on sequential decision-making under uncertainty, particularly in the context of human feedback. I have extensively studied distributional reinforcement learning (distRL), reinforcement learning from human feedback (RLHF), and regret analysis, aiming to bridge theory and practice. Inspired by how humans make decisions, I seek to mathematically model and optimize human-in-the-loop systems, uncovering both theoretical insights and practical algorithms for robust decision-making. I’m currently interested in reasoning LLM agents and regret-based decision theory.
I’m seeking postdoctoral opportunities in theoretical foundations of reinforcement learning or reasoning-oriented LLM research.

Preprints

An Axiomatization of Process Score Model: Your Process-level Feedback is Not a Reward


Taehyun Cho, Suhwan Kim, Seungyub Han, Seokhun Ju, Dohyeong Kim, Kyungjae Lee, Jungwoo Lee
Work In Progress

Off-policy Direct Preference Optimization with Monotonic Improvement Guarantee


Seungyub Han*, Taehyun Cho*, Seokhun Ju, Dohyeong Kim, Kyungjae Lee, Jungwoo Lee
Work In Progress

Policy Optimization with Process Regret Model


Suhwan Kim*, Taehyun Cho*, Seungyub Han, Seokhun Ju, Dohyeong Kim, Kyungjae Lee, Youngsoo Jang, Geonhyeong Kim, Yujin Kim, Moontae Lee, Jungwoo Lee
Work In Progress



International Conference

Policy-labeled Preference Learning: Is Preference Enough for RLHF?


Taehyun Cho*, Seokhun Ju*, Seungyub Han, Dohyeong Kim, Kyungjae Lee, Jungwoo Lee
ICML 2025 spotlight (Top 2.6%)
paper / arxiv

Bellman Unbiasedness: Toward Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation


Taehyun Cho, Seungyub Han, Kyungjae Lee, Seokhun Ju, Dohyeong Kim, Jungwoo Lee
ICML 2025
paper / arxiv

Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees


Dohyeong Kim, Taehyun Cho, Seungyub Han, Hojun Chung, Kyungjae Lee, Songhwai Oh
NeurIPS 2024
paper / arxiv

Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion


Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee
NeurIPS 2023
paper / arxiv

SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning


Dohyeok Lee, Seungyub Han, Taehyun Cho, Jungwoo Lee
NeurIPS 2023
paper / arxiv / code

On the Convergence of Continual Learning with Adaptive Methods


Seungyub Han, Yeongmo Kim, Taehyun Cho, Jungwoo Lee
UAI 2023
paper / arxiv

Adaptive Methods for Nonconvex Continual Learning


Seungyub Han, Yeongmo Kim, Taehyun Cho, Jungwoo Lee
NeurIPS 2022 Optimization for Machine Learning Workshop
paper

Perturbed Quantile Regression for Distributional Reinforcement Learning


Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee
NeurIPS 2022 Deep RL Workshop
paper

Chebyshev polynomial codes: Task entanglement-based coding for distributed matrix multiplication


Sangwoo Hong, Heecheol Yang, Youngseok Yoon, Taehyun Cho, Jungwoo Lee
ICML 2021
paper / arxiv



International Journal

Optimized shallow neural networks for sum-rate maximization in energy harvesting downlink multiuser NOMA systems


Haesung Kim, Taehyun Cho, Jungwoo Lee, Wonjae Shin, H Vincent Poor
IEEE Journal on Selected Areas in Communications
paper / arxiv

An Efficient Neural Network Architecture for Rate Maximization in Energy Harvesting Downlink Channels


Haesung Kim, Taehyun Cho, Jungwoo Lee, Wonjae Shin, H Vincent Poor
2020 IEEE International Symposium on Information Theory (ISIT)
paper / arxiv



Design and source code from Jon Barron's website