Personal Interests
Strategic Games: I am a high-level Teamfight Tactics (TFT) player, having ranked among the top 9 players during a competitive season. I treat strategic games as experimental laboratories for studying decision-making under uncertainty. Competitive multi-agent environments such as TFT require continuous adaptation, opponent modeling, and risk-sensitive planning. I often reflect on my own gameplay to analyze how strategies are formed, refined, and adjusted in response to stochastic dynamics and strategic interaction. I am particularly interested in formalizing how such strategies emerge in competitive settings and in translating these principles into reinforcement learning agents that operate under uncertainty.
Behavioral Economics and Cognitive Science: Behavioral economics and cognitive science play a central role in how I think about human decision-making. Many reinforcement learning formulations focus primarily on maximizing expected return, abstracting away inference-time constraints and bounded rationality. I am interested in understanding how human biases, risk perception, and cognitive limitations shape actual decisions. This perspective motivates my research on modeling human-aligned decision processes that go beyond expectation maximization and instead account for uncertainty, regret, and subjective evaluation of outcomes.
Philosophy: Since 2020, I have participated in regular philosophical discussions through a group called "Sunday Salon". Our discussions began with themes inspired by the French "Baccalauréat" examination topics and have gradually expanded toward questions related to artificial intelligence and its societal implications. These discussions have shaped how I think about human uniqueness, responsibility, and counterfactual reflection. I am particularly interested in understanding what aspects of human reasoning—such as regret, moral evaluation, and narrative self-reflection—remain difficult to formalize, and how these elements might inform the development of human-aligned AI systems.
|
Research Interests
My academic research focuses on sequential decision-making under uncertainty, particularly in the context of human feedback.
I have extensively studied distributional reinforcement learning (distRL), reinforcement learning from human feedback (RLHF), and regret analysis,
aiming to bridge theory and practice.
Drawing inspiration from how humans make decisions, I aim to develop mathematical models of human-in-the-loop systems and to optimize them,
uncovering both theoretical insights and practical algorithms for robust decision-making.
Currently, I’m interested in reasoning LLM agents and regret-based decision theory.
I’m actively looking for research scientist or postdoctoral positions in the theoretical foundations of reinforcement learning or in research on reasoning LLMs.
|
|
Feb 2026 - I was selected for the Sejong Science Fellowship for the project "Distributional Regret Analysis for Human-Aligned Interactive Agentic AI under High Uncertainty."
|
|
Feb 2026 - I received my Ph.D. in Electrical and Computer Engineering from Seoul National University and was honored with the Distinguished Dissertation Award.
|
|
A Distributional Perspective on Human-Aligned Decision Making under Uncertainty
Taehyun Cho
Department of Electrical and Computer Engineering, Seoul National University
Distinguished Dissertation Award
paper /
|
|
A Regret Minimization Framework on Preference Learning in Large Language Models
Suhwan Kim*, Taehyun Cho*, Youngsoo Jang, Geonhyeong Kim, Yujin Kim, Moontae Lee, Jungwoo Lee
Submitted to ICML 2026
|
|
An Axiomatization of Process Score Model: Your Process-level Feedback is Not a Reward
Taehyun Cho, Suhwan Kim, Seungyub Han, Seokhun Ju, Dohyeong Kim, Kyungjae Lee, Jungwoo Lee
Work In Progress
|
|
Off-policy Direct Preference Optimization with Monotonic Improvement Guarantee
Seungyub Han*, Taehyun Cho*, Seokhun Ju, Dohyeong Kim, Kyungjae Lee, Jungwoo Lee
Work In Progress
|
|
Policy-labeled Preference Learning: Is Preference Enough for RLHF?
Taehyun Cho*, Seokhun Ju*, Seungyub Han, Dohyeong Kim, Kyungjae Lee, Jungwoo Lee
ICML 2025 Spotlight (Top 2.6%)
paper /
arxiv /
|
|
Bellman Unbiasedness: Toward Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation
Taehyun Cho, Seungyub Han, Kyungjae Lee, Seokhun Ju, Dohyeong Kim, Jungwoo Lee
ICML 2025
paper /
arxiv /
|
|
Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees
Dohyeong Kim, Taehyun Cho, Seungyub Han, Hojun Chung, Kyungjae Lee, Songhwai Oh
NeurIPS 2024
paper /
arxiv /
|
|
Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion
Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee
NeurIPS 2023
paper /
arxiv /
|
|
SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning
Dohyeok Lee, Seungyub Han, Taehyun Cho, Jungwoo Lee
NeurIPS 2023
paper /
arxiv /
code /
|
|
On the Convergence of Continual Learning with Adaptive Methods
Seungyub Han, Yeongmo Kim, Taehyun Cho, Jungwoo Lee
UAI 2023
paper /
arxiv /
|
|
Adaptive Methods for Nonconvex Continual Learning
Seungyub Han, Yeongmo Kim, Taehyun Cho, Jungwoo Lee
NeurIPS 2022 Optimization for Machine Learning Workshop
paper /
|
|
Perturbed Quantile Regression for Distributional Reinforcement Learning
Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee
NeurIPS 2022 Deep RL Workshop
paper /
|
|
Chebyshev polynomial codes: Task entanglement-based coding for distributed matrix multiplication
Sangwoo Hong, Heecheol Yang, Youngseok Yoon, Taehyun Cho, Jungwoo Lee
ICML 2021
paper /
arxiv /
|
|
Optimized shallow neural networks for sum-rate maximization in energy harvesting downlink multiuser NOMA systems
Haesung Kim, Taehyun Cho, Jungwoo Lee, Wonjae Shin, H. Vincent Poor
IEEE Journal on Selected Areas in Communications
paper /
arxiv /
|
|
An Efficient Neural Network Architecture for Rate Maximization in Energy Harvesting Downlink Channels
Haesung Kim, Taehyun Cho, Jungwoo Lee, Wonjae Shin, H. Vincent Poor
2020 IEEE International Symposium on Information Theory (ISIT)
paper /
arxiv /
|
|
Sejong Science Fellowship
Distributional Regret Analysis for Human-Aligned Interactive Agentic AI under High Uncertainty
Funded by National Research Foundation of Korea
(5-year program)
|
Education & Research Experience
|
LG AI Research
Research Intern, Superintelligence Lab
2024.12 - 2025.05
|
|
Seoul National University
Ph.D./M.S. in Electrical and Computer Engineering
2020.03 - 2026.02
|
|
Korea University
B.S. in Mathematics
2013.03 - 2020.02
|
|
Conference Reviewer: ICML, NeurIPS, ICLR, AAAI
|
|