Yuxiao Chen (陈宇骁)
PhD Candidate @ Rutgers

Brief Bio. I am a Ph.D. candidate at Rutgers University, supervised by Prof. Dimitris N. Metaxas. Before that, I obtained my M.S. degree in Computer Science from the University of Rochester in 2018, supervised by Prof. Jiebo Luo. I received my B.Eng. degree in Software Engineering from Sun Yat-sen University.

My current research interests lie primarily in (1) Vision-language model learning (e.g., large-scale multimodal pre-training, image-text matching), (2) Skeleton-based human action understanding (e.g., human pose modeling, estimation, and representation learning), and (3) Data mining in social media.

617 Bowser Rd
Piscataway, NJ 08854
Rutgers University
Email: yc984 [at] rutgers [dot] edu
[Curriculum Vitae]



Sept. 2018 - Present
Rutgers, The State University of New Jersey - New Brunswick. Piscataway, NJ, USA
Ph.D. in Computer Science
Sept. 2016 - May 2018
University of Rochester. Rochester, NY, USA
M.S. in Computer Science
Sept. 2012 - Jul. 2016
Sun Yat-sen University, Guangzhou, China
B.Eng. in Software Engineering


Sept. 2018 - Present
Rutgers, The State University of New Jersey - New Brunswick. Piscataway, NJ, USA
Research Assistant. Supervised by Prof. Dimitris N. Metaxas
  • Human/hand pose modeling, estimation and representation learning
  • Sign language video understanding
May 2023 - Aug. 2023
NEC Laboratories America Inc. Princeton, NJ, USA
Research Intern. Mentor: Dr. Kai Li
  • Open-set video action localization and grounding
May 2022 - Aug. 2022
AML Research, Bytedance. Seattle, WA, USA
Research Intern. Mentors: Dr. Jianbo Yuan, Dr. Yu Tian, and Dr. Xinyu Li
  • Large-scale multimodal (image-text) model pre-training
May 2020 - Aug. 2020
Softline Discovery, Amazon. Seattle, WA, USA
Applied Scientist Intern. Mentor: Dr. Jianbo Yuan
  • Image-text matching/retrieval
Sept. 2016 - May 2018
University of Rochester. Rochester, NY, USA
Research Assistant. Supervised by Prof. Jiebo Luo
  • Data mining in social media
May 2017 - Aug. 2017
Youtu Lab, Tencent. Shanghai, China
Research Intern. Mentor: Dr. Pai Peng
  • Medical image analysis

Selected Publications

* indicates equal contributions. Please check Google Scholar for the full list of my publications.

  • Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment
  • Yuxiao Chen, Kai Li, Wentao Bao, Deep Patel, Yu Kong, Martin Renqiang Min, Dimitris N. Metaxas.

    Under Review

  • Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
  • Wentao Bao, Kai Li, Yuxiao Chen, Deep Patel, Martin Renqiang Min, Yu Kong

    Under Review

  • Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens
  • Yuxiao Chen, Jianbo Yuan, Yu Tian, Shijie Geng, Xinyu Li, Ding Zhou, Dimitris N. Metaxas, Hongxia Yang

    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

  • HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention
  • Shijie Geng, Jianbo Yuan, Yu Tian, Yuxiao Chen, Yongfeng Zhang

    International Conference on Learning Representations (ICLR), 2023

  • More Than Just Attention: Improving Cross-Modal Attentions with Contrastive Constraints for Image-Text Matching
  • Yuxiao Chen, Jianbo Yuan, Long Zhao, Tianlang Chen, Rui Luo, Larry Davis, Dimitris N. Metaxas

    IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023

  • Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning
  • Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Dimitris N. Metaxas

    European Conference on Computer Vision (ECCV), 2022.

  • Knowledge as Priors: Cross-Modal Knowledge Generalization for Datasets without Superior Knowledge
  • Long Zhao, Xi Peng, Yuxiao Chen, Mubbasir Kapadia, Dimitris N. Metaxas.

    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

  • Construct dynamic graphs for hand gesture recognition via spatial-temporal attention
  • Yuxiao Chen, Long Zhao, Xi Peng, Jianbo Yuan, Dimitris N. Metaxas.

    British Machine Vision Conference (BMVC), 2019

  • Twitter Sentiment Analysis via Bi-sense Emoji Embedding and Attention-based LSTM
  • Yuxiao Chen*, Jianbo Yuan*, Quanzeng You, Jiebo Luo

    ACM Multimedia Conference (ACM MM), 2018

  • Mining the Relationship between Emoji Usage Patterns and Personality
  • Weijian Li, Yuxiao Chen*, Tianran Hu, Jiebo Luo

    AAAI International Conference on Web and Social Media (ICWSM), 2018

  • When E-commerce Meets Social Media: Identifying and Mining Business on WeChat Moment Using Bilateral-Attention Task-driven LSTM
  • Tianlang Chen, Yuxiao Chen, Jiebo Luo

    The World Wide Web Conference (WWW), 2018.

  • A Selfie is Worth a Thousand Words: Mining Personal Patterns behind User Selfie-posting Behaviours
  • Tianlang Chen, Yuxiao Chen, Jiebo Luo

    The World Wide Web Conference (WWW), 2017.