Lvfang Tao 陶履方

Lvfang Tao is a Senior Researcher at Tencent HY, focusing on Multimodal Reinforcement Learning. Prior to this role, he was a core contributor to the reinforcement-learning platform AI Arena, where he led algorithm exploration. He joined Tencent in 2022 after receiving his Master of Science degree from the School of Electronic and Computer Engineering (SECE) at Peking University. His research interests include mid-training and post-training for large language models (including synthetic data, on-policy distillation, RLVR, and agentic reinforcement learning), as well as the development of RL environments and agents. Since 2019, he has led projects on from-scratch and continual pretraining across text, speech, and image modalities.

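Lvfang Tao (陶履方) graduated from the School of Electronic and Computer Engineering at Peking University in 2022 with a Master of Science degree and joined Tencent the same year. He is currently a Senior Researcher in Multimodal Reinforcement Learning at Tencent HY, and was previously a core contributor to Tencent's AI Arena platform, where he was responsible for algorithm exploration. His research interests include mid-training and post-training of large models (covering topics such as synthetic data, on-policy distillation, RLVR, and Tool-Use / Agentic RL), as well as the development of reinforcement learning environments, agents, and platform systems. Since 2019, he has built up experience in large-scale from-scratch and continual pretraining across text, speech, and image modalities.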

news

Dec 15, 2024	Presenting Tinytron’s two first-place solutions (model compression and pretraining tracks) as Project Lead at the NeurIPS 2024 competition workshop “Edge-Device Large Language Model Competition”. Workshop link.
Jul 21, 2023	“AdaNIC: Towards Practical Neural Image Compression via Dynamic Transform Routing” accepted to ICCV 2023 (first author). Paper link.
Dec 08, 2022	Presenting Team TEG-AutoML’s 2nd-place solution as Project Lead at the NeurIPS 2022 competition workshop “AutoML Decathlon: Diverse Tasks, Modern Methods, and Efficiency at Scale”. Workshop link.

selected publications

Tinytron Technical Report

Lvfang Tao, Renjie Mao, Yongguang Lin, and 3 more authors

Dec 2024

Blog
AutoML Decathlon: Diverse Tasks, Modern Methods, and Efficiency at Scale

Nicholas Roberts, Samuel Guo, Cong Xu, and 8 more authors

In NeurIPS 2022 Competition Track, Dec 2023

PDF
AdaNIC: Towards Practical Neural Image Compression via Dynamic Transform Routing

Lvfang Tao, Wei Gao, Ge Li, and 1 more author

In Proceedings of the IEEE/CVF International Conference on Computer Vision, Dec 2023

PDF
A hardware implementation of entropy encoder for 8K video coding

Lvfang Tao and Wei Gao

In 2022 IEEE International Conference on Multimedia and Expo (ICME), Dec 2022

PDF
Efficient channel pruning based on architecture alignment and probability model bypassing

Lvfang Tao and Wei Gao

In 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Dec 2021

PDF
A fast view synthesis implementation method for light field applications

Wei Gao, Linjie Zhou, and Lvfang Tao

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Dec 2021

PDF