Research Interests
My research centers on sequential decision-making under uncertainty, particularly in settings with human feedback.
I have worked extensively on distributional reinforcement learning (distRL), reinforcement learning from human feedback (RLHF), and regret analysis,
aiming to bridge theory and practice. Inspired by how humans make decisions, I seek to mathematically model and optimize human-in-the-loop systems,
developing both theoretical insights and practical algorithms for robust decision-making.
I’m currently interested in reasoning LLM agents and regret-based decision theory.
I’m seeking postdoctoral opportunities in the theoretical foundations of reinforcement learning or in reasoning-oriented LLM research.
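In standard textbook notation, these three threads revolve around the following objects (a generic rendering for orientation, not the specific formulations of the papers below):

$(\mathcal{T}^{\pi} Z)(s,a) \overset{D}{=} R(s,a) + \gamma\, Z(S', A'), \quad S' \sim P(\cdot \mid s,a),\; A' \sim \pi(\cdot \mid S')$  (the distributional Bellman operator)

$\Pr(\tau^{1} \succ \tau^{0}) = \sigma\big(r(\tau^{1}) - r(\tau^{0})\big)$  (the Bradley–Terry preference model over trajectories, with $\sigma$ the logistic function)

$\mathrm{Regret}(T) = \sum_{t=1}^{T} \big( V^{*} - V^{\pi_t} \big)$  (cumulative regret of the policy sequence $\pi_1, \dots, \pi_T$)

Roughly, distRL characterizes the fixed point $Z^{\pi}$ of the first operator, RLHF fits $r$ from preference data and optimizes a policy against it, and regret analysis bounds the third quantity for a given algorithm.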
An Axiomatization of Process Score Model: Your Process-level Feedback is Not a Reward
Taehyun Cho, Suhwan Kim, Seungyub Han, Seokhun Ju, Dohyeong Kim, Kyungjae Lee, Jungwoo Lee
Work In Progress
Off-policy Direct Preference Optimization with Monotonic Improvement Guarantee
Seungyub Han*, Taehyun Cho*, Seokhun Ju, Dohyeong Kim, Kyungjae Lee, Jungwoo Lee
Work In Progress
Policy Optimization with Process Regret Model
Suhwan Kim*, Taehyun Cho*, Seungyub Han, Seokhun Ju, Dohyeong Kim, Kyungjae Lee, Youngsoo Jang, Geonhyeong Kim, Yujin Kim, Moontae Lee, Jungwoo Lee
Work In Progress
Policy-labeled Preference Learning: Is Preference Enough for RLHF?
Taehyun Cho*, Seokhun Ju*, Seungyub Han, Dohyeong Kim, Kyungjae Lee, Jungwoo Lee
ICML 2025 Spotlight (top 2.6%)
paper / arxiv
Bellman Unbiasedness: Toward Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation
Taehyun Cho, Seungyub Han, Kyungjae Lee, Seokhun Ju, Dohyeong Kim, Jungwoo Lee
ICML 2025
paper / arxiv
Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees
Dohyeong Kim, Taehyun Cho, Seungyub Han, Hojun Chung, Kyungjae Lee, Songhwai Oh
NeurIPS 2024
paper / arxiv
Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion
Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee
NeurIPS 2023
paper / arxiv
SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning
Dohyeok Lee, Seungyub Han, Taehyun Cho, Jungwoo Lee
NeurIPS 2023
paper / arxiv / code
On the Convergence of Continual Learning with Adaptive Methods
Seungyub Han, Yeongmo Kim, Taehyun Cho, Jungwoo Lee
UAI 2023
paper / arxiv
Adaptive Methods for Nonconvex Continual Learning
Seungyub Han, Yeongmo Kim, Taehyun Cho, Jungwoo Lee
NeurIPS 2022 Optimization for Machine Learning Workshop
paper
Perturbed Quantile Regression for Distributional Reinforcement Learning
Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee
NeurIPS 2022 Deep RL Workshop
paper
Chebyshev Polynomial Codes: Task Entanglement-based Coding for Distributed Matrix Multiplication
Sangwoo Hong, Heecheol Yang, Youngseok Yoon, Taehyun Cho, Jungwoo Lee
ICML 2021
paper / arxiv
Optimized Shallow Neural Networks for Sum-Rate Maximization in Energy Harvesting Downlink Multiuser NOMA Systems
Haesung Kim, Taehyun Cho, Jungwoo Lee, Wonjae Shin, H. Vincent Poor
IEEE Journal on Selected Areas in Communications
paper / arxiv
An Efficient Neural Network Architecture for Rate Maximization in Energy Harvesting Downlink Channels
Haesung Kim, Taehyun Cho, Jungwoo Lee, Wonjae Shin, H. Vincent Poor
IEEE International Symposium on Information Theory (ISIT) 2020
paper / arxiv