Lvfang Tao 陶履方
Reinforcement Learning · Multimodal LLMs & Agents · AutoML · Neural Compression
Lvfang Tao is a Senior Researcher at Tencent HY, focusing on Multimodal Reinforcement Learning. Prior to this role, he was a core contributor to the reinforcement-learning platform AI Arena, where he led algorithm exploration. He joined Tencent in 2022 after receiving his Master of Science degree from the School of Electronic and Computer Engineering (SECE) at Peking University. His work focuses on mid-training and post-training for large language models—including synthetic data, on-policy distillation, RLVR, and agentic reinforcement learning—as well as the development of RL environments and agents. Since 2019, he has led projects on from-scratch and continual pretraining across text, speech, and image modalities.
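Lvfang Tao (陶履方) graduated from the School of Electronic and Computer Engineering at Peking University in 2022 with a Master of Science degree and joined Tencent the same year. He is currently a Senior Researcher in Multimodal Reinforcement Learning at Tencent HY (腾讯混元). He was previously a core contributor to the Tencent AI Arena platform (腾讯开悟平台), where he was responsible for algorithm exploration. His research interests include mid-training and post-training of large models (covering synthetic data, on-policy distillation, RLVR, and Tool-Use / Agentic RL), as well as the development of reinforcement learning environments, agents, and platform systems. Since 2019, he has built up experience in large-scale from-scratch and continual pretraining across text, speech, and image modalities.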
news
| Date | News |
|---|---|
| Dec 15, 2024 | Presenting Tinytron’s two 1st-place solutions (model compression and pretraining tracks) as Project Lead at the NeurIPS 2024 competition workshop “Edge-Device Large Language Model Competition”. Workshop link. |
| Jul 21, 2023 | “AdaNIC: Towards Practical Neural Image Compression via Dynamic Transform Routing” accepted to ICCV 2023 (first author). Paper link. |
| Dec 08, 2022 | Presenting Team TEG-AutoML’s 2nd-place solution as Project Lead at the NeurIPS 2022 competition workshop “AutoML Decathlon: Diverse Tasks, Modern Methods, and Efficiency at Scale”. Workshop link. |
selected publications
- AutoML Decathlon: Diverse tasks, modern methods, and efficiency at scale. In NeurIPS 2022 Competition Track, Dec 2023
- AdaNIC: Towards practical neural image compression via dynamic transform routing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Dec 2023
- A hardware implementation of entropy encoder for 8K video coding. In 2022 IEEE International Conference on Multimedia and Expo (ICME), Dec 2022
- Efficient channel pruning based on architecture alignment and probability model bypassing. In 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Dec 2021
- A fast view synthesis implementation method for light field applications. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Dec 2021