Joya Chen (陈卓)

Hi! This is Joya, a second-year Ph.D. student at Show Lab @ NUS, supervised by Prof. Mike Shou. Currently, I'm interning at FAIR, Meta AI, working with Huiyu Wang. Previously, I also worked with Zhaoyang Lv from Reality Labs Research, Meta. My research interest is in large multimodal models, particularly in learning them for real-time, streaming video.

Education

I obtained my bachelor's degree from the School of Automotive Engineering, WUT. To chase my AI dream, I took the National Postgraduate Entrance Examination and ranked 1st in the School of Computer Science and Technology, USTC, where I obtained my master's degree under the supervision of Prof. Enhong Chen, Prof. Tong Xu, and Prof. Dong Liu. I also worked as a research assistant in the CVML@NUS group, collaborating closely with Prof. Angela Yao.

Nice to meet you: joyachen@u.nus.edu :)

Google Scholar  /  GitHub  /  Zhihu

Most Recent
Core organizer of LOVEU: LOng-form VidEo Understanding Towards Multimodal AI Assistant and Copilot Workshop @ CVPR'24.

We have uploaded the recorded videos:

Excellent talks given by Prof. Dima Damen, Prof. Marc Pollefeys, and Dr. Chunyuan Li.

Great winner talks for Track 1: Long-Term Video Question Answering, Track 2A: Text-Guided Video Editing, and Track 2B: Text-to-Video Generation.
Research
VideoLLM-online: Online Video Large Language Model for Streaming Video
Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou
CVPR, 2024
Homepage / Paper / Code / Data / Demo / Checkpoints
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, ..., Mike Zheng Shou, Michael Wray
CVPR (Oral), 2024
Project Page: https://ego-exo4d-data.org/
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn
Difei Gao, Lei Ji, Luowei Zhou, Kevin Qinghong Lin, Joya Chen, Zihan Fan, Mike Zheng Shou
arXiv, 2023
Paper / Page
UniVTG: Towards Unified Video-Language Temporal Grounding
Kevin Qinghong Lin, Pengchuan Zhang, Joya Chen, Shraman Pramanick, Difei Gao, Alex Jinpeng Wang, Rui Yan, Mike Zheng Shou
ICCV, 2023
Paper / Code / Demo
Affordance Grounding from Demonstration Video to Target Image
Joya Chen, Difei Gao, Kevin Qinghong Lin, Mike Zheng Shou
CVPR, 2023
Paper / Code
DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training
Joya Chen*, Kai Xu*, Yuhui Wang, Yifei Cheng, Angela Yao (*Equal contribution)
ICLR, 2023
OpenReview / arXiv / Code
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant
Benita Wong*, Joya Chen*, You Wu*, Stan Weixian Lei, Dongxing Mao, Difei Gao, Mike Zheng Shou (*Equal contribution)
ECCV, 2022
Paper / Page / Code / Challenge@CVPR'22
Is Heuristic Sampling Necessary in Training Deep Object Detectors?
Joya Chen, Dong Liu, Tong Xu, Shiwei Wu, Yifei Chen, Enhong Chen
IEEE Transactions on Image Processing, 2021
Paper / Code
Linking the Characters: Video-oriented Social Graph Generation via Hierarchical-cumulative GCN
Shiwei Wu, Joya Chen, Tong Xu, Liyi Chen, Lingfei Wu, Yao Hu, Enhong Chen
ACM MM (Oral), 2021
Paper
Engineering
Ranked 1st on the HO-3D leaderboard in the Mesh Error/AUC and F@15mm metrics, Dec. 2020
Ranked 1st on the PASCAL VOC Object Detection Competition 3 leaderboard, Sep. 2018
Internships
Worked as an AI research scientist intern at FAIR, Meta AI from Dec. 2023 to May 2024
Worked as a computer vision research intern at Tencent from Jun. 2018 to Nov. 2019

Thanks to Jon Barron for the website template!