Senior AI Research Engineer · Nota Inc.

Hancheol Park

I work on efficient foundation models, MoE-aware quantization, model compression, on-device AI systems, and reliable NLP.

News

2026 only, in chronological order.

  1. Published an AWS Technical Blog post on LLM quantization techniques for AWS Inferentia.
  2. Released NotaMoEQuant versions of Solar-Open-100B (INT4 and NVFP4) for efficient MoE-based LLM deployment.
  3. Won 1st Place in Track C and the Overall Grand Prize at the NVIDIA Nemotron Hackathon Seoul.
  4. Released the arXiv preprint "Value-and-Structure Alignment for Routing-Consistent Quantization of Mixture-of-Experts Models".
  5. Two MoE quantization papers were accepted to AdaptFM @ ICML 2026.

About

I am a Senior AI Research Engineer at Nota Inc. My work spans efficient LLM/VLM systems, quantization, pruning, knowledge distillation, model porting, NPU/GPU-aware optimization, vLLM-based serving, and uncertainty-aware NLP. I received my Ph.D. in Computer Science from KAIST, advised by Jong C. Park.

Efficient LLMs/VLMs · MoE Quantization · Model Compression · On-device AI · Reliable NLP · Human-centric CV

Publications

Selected papers.

DREAM-MoE: Downstream Routing Error-Aware Margin-Preserving Quantization for Mixture-of-Experts Large Language Models

Hancheol Park, Geonho Lee, Tae-Ho Kim. AdaptFM @ ICML, 2026.

SRA-MoE: Output-Aware Selective Router Alignment for MoE Quantization

Geonho Lee, Hancheol Park, Seunghyun Lee, Jungwook Choi, Tae-Ho Kim. AdaptFM @ ICML, 2026.

Value-and-Structure Alignment for Routing-Consistent Quantization of Mixture-of-Experts Models

Hancheol Park, Geonho Lee, Tairen Piao, Tae-Ho Kim. arXiv preprint, 2026.

Nota AI at GenAI Detection Task 1: Unseen Language-Aware Detection System for Multilingual Machine-Generated Text

Hancheol Park, Jaeyeon Kim, Geonmin Kim, Tae-Ho Kim. GenAIDetect @ COLING, 2025.

Where do LLMs Encode the Knowledge to Assess the Ambiguity?

Hancheol Park, Geonmin Kim. COLING Industry Track, 2025.

Assessing the Answerability of Queries in Retrieval-Augmented Code Generation

Geonmin Kim, Jaeyeon Kim, Hancheol Park, Wooksu Shin, Tae-Ho Kim. arXiv preprint, 2024.

Self-Knowledge Distillation for Learning Ambiguity

Hancheol Park, Soyeong Jeong, Sukmin Cho, Jong C. Park. arXiv preprint, 2024.

Cluster Self-Refinement for Enhanced Online Multi-Camera People Tracking

Jeongho Kim, Wooksu Shin, Hancheol Park, Donghyuk Choi. AI City Challenge Workshop @ CVPR, 2024.

Road Object Detection Robust to Distorted Objects at the Edge Regions of Images

Wooksu Shin, Donghyuk Choi, Hancheol Park, Jeongho Kim. AI City Challenge Workshop @ CVPR, 2024.

Deep Model Compression Also Helps Models Capture Ambiguity

Hancheol Park, Jong C. Park. ACL Long Paper, 2023.

Question-Answering in a Low-resourced Language: Benchmark Dataset and Models for Tigrinya

Fitsum Gaim, Wonsuk Yang, Hancheol Park, Jong C. Park. ACL Long Paper, 2023. (Outstanding Paper Award)

Addressing the Occlusion Problem in Multi-Camera People Tracking with Human Pose Estimation

Jeongho Kim*, Wooksu Shin*, Hancheol Park*, Jongwon Baek. AI City Challenge Workshop @ CVPR, 2023. (* Equal contribution)

Earlier publications
  • Hancheol Park, Kyo-Joong Oh, Ho-Jin Choi, Gahgene Gweon. Constructing a Paraphrase Database for Agglutinative Languages. Data & Knowledge Engineering, 2019.
  • Huije Lee, Hancheol Park, Wonsuk Yang, Jong C. Park. Detection of Non-Standard Meaning Usage with Word Embedding. HCIK, 2018.
  • Wonsuk Yang, Hancheol Park, Jong C. Park. Neural Theorem Prover with Word Embedding for Efficient Automatic Annotation. Journal of KIISE, 2017.
  • Hancheol Park, Jung-Ho Kim, Jong C. Park. Addressing Low-Resource Problems in Statistical Machine Translation of Manual Signals in Sign Language. Journal of KIISE, 2017.
  • Hancheol Park, Gahgene Gweon, Jeong Heo. Affix Modification-Based Bilingual Pivoting Method for Paraphrase Extraction in Agglutinative Languages. BigComp, 2016. (AFNLP Best Asian Paper Award)
  • Hancheol Park, Gahgene Gweon. Initiating Moderation in Problematic Smartphone Usage Patterns. CHI Extended Abstracts, 2015.

Selected Projects

Sovereign AI Foundation Model Project

Technical owner and lead developer for MoE-specific compression, INT4/NVFP4 quantization, and expert pruning for Solar-Open models.

2025 - Present

LLM Porting and Optimization for Qualcomm NPUs

Optimization and porting workflows for Llama, Qwen, and EXAONE targeting Qualcomm NPU execution environments.

2025

Hybrid LLM System for SK Telecom

Hybrid routing system between mobile SLMs and server-side LLMs based on query difficulty, showcased at MWC 2025.

2024

Efficient VLMs for On-device Industrial Safety

Lightweight VLMs under 4B parameters deployed on Snapdragon-based mobile and QRB5165 industrial platforms.

2024

Awards & Honors

Experience

Nota Inc.

Senior AI Research Engineer

Sep. 2020 - Present

Education

Korea Advanced Institute of Science and Technology (KAIST)

Ph.D. in Computer Science
Thesis: Capturing Ambiguity in Natural Language Understanding Tasks with Information from Internal Layers.
Advisor: Jong C. Park

Feb. 2024