Joya Chen (陈卓)

Hi! This is Joya, a second-year Ph.D. student at Show Lab @ NUS, supervised by Prof. Mike Shou. Currently, I'm interning at FAIR, Meta AI, working with Huiyu Wang. Previously, I also worked with Zhaoyang Lv from Reality Labs Research, Meta. My research interest is in large multimodal models, particularly in learning them for real-time, streaming video.

Education

I obtained my bachelor's degree from the School of Automotive Engineering, WUT. To chase my AI dream, I took the National Postgraduate Entrance Examination and ranked 1st in the School of Computer Science and Technology, USTC, where I obtained my master's degree under the supervision of Prof. Enhong Chen, Prof. Tong Xu, and Prof. Dong Liu. I also worked as a research assistant in the CVML@NUS group, collaborating closely with Prof. Angela Yao.

Nice to meet you: joyachen@u.nus.edu :)

Google Scholar  /  GitHub  /  Zhihu

Most Recent
Core organizer of LOVEU: LOng-form VidEo Understanding Towards Multimodal AI Assistant and Copilot Workshop @ CVPR'24.

We have uploaded the recorded videos:

Excellent talks given by Prof. Dima Damen, Prof. Marc Pollefeys, and Dr. Chunyuan Li.

Great winner talks for Track 1: Long-Term Video Question Answering, Track 2A: Text-Guided Video Editing, and Track 2B: Text-to-Video Generation.
Research
VideoLLM-online: Online Video Large Language Model for Streaming Video
Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou
CVPR, 2024
Homepage / Paper / Code / Data / Demo / Checkpoints
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, ..., Mike Zheng Shou, Michael Wray
CVPR (Oral), 2024
Project Page: https://ego-exo4d-data.org/
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn
Difei Gao, Lei Ji, Luowei Zhou, Kevin Qinghong Lin, Joya Chen, Zihan Fan, Mike Zheng Shou
arXiv, 2023
Paper / Page
UniVTG: Towards Unified Video-Language Temporal Grounding
Kevin Qinghong Lin, Pengchuan Zhang, Joya Chen, Shraman Pramanick, Difei Gao, Alex Jinpeng Wang, Rui Yan, Mike Zheng Shou
ICCV, 2023
Paper / Code / Demo
Affordance Grounding from Demonstration Video to Target Image
Joya Chen, Difei Gao, Kevin Qinghong Lin, Mike Zheng Shou
CVPR, 2023
Paper / Code
DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training
Joya Chen*, Kai Xu*, Yuhui Wang, Yifei Cheng, Angela Yao (*Equal contribution)
ICLR, 2023
OpenReview / arXiv / Code
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant
Benita Wong*, Joya Chen*, You Wu*, Stan Weixian Lei, Dongxing Mao, Difei Gao, Mike Zheng Shou (*Equal contribution)
ECCV, 2022
Paper / Page / Code / Challenge@CVPR'22
Is Heuristic Sampling Necessary in Training Deep Object Detectors?
Joya Chen, Dong Liu, Tong Xu, Shiwei Wu, Yifei Chen, Enhong Chen
IEEE Transactions on Image Processing, 2021
Paper / Code
Linking the Characters: Video-oriented Social Graph Generation via Hierarchical-cumulative GCN
Shiwei Wu, Joya Chen, Tong Xu, Liyi Chen, Lingfei Wu, Yao Hu, Enhong Chen
ACM MM (Oral), 2021
Paper
Engineering
Ranked 1st on the HO-3D leaderboard in the Mesh Error/AUC and F@15mm metrics, Dec. 2020
Ranked 1st on the PASCAL VOC Object Detection Competition 3 leaderboard, Sep. 2018
Internships
Worked as an AI research scientist intern at FAIR, Meta AI from Dec. 2023 to May 2024
Worked as a computer vision research intern at Tencent from Jun. 2018 to Nov. 2019

Thanks to Jon Barron for the website template!