Sheng Shen

A Ph.D. student in BAIR, EECS, at the University of California, Berkeley.

Berkeley, California
Email: sheng.s@berkeley.edu
Google Scholar
GitHub
Twitter
LinkedIn

I study natural language processing.
At Berkeley, I am advised by Prof. Kurt Keutzer and Prof. Trevor Darrell. I also work closely with Prof. Dan Klein and Prof. Michael Mahoney. Prior to Berkeley, I received my bachelor's degree in computer science from Peking University, where I was advised by Prof. Xuanzhe Liu.

Publications

  • K-LITE: Learning Transferable Visual Models with External Knowledge (NeurIPS 2022)
  • Staged Training for Transformer Language Models (ICML 2022)
  • How Much Can CLIP Benefit Vision-and-Language Tasks? (ICLR 2022)
  • Multitask Prompted Training Enables Zero-Shot Task Generalization (ICLR 2022)
  • Learned Token Pruning for Transformers (KDD 2022)
  • Reservoir Transformers (ACL 2021)
  • ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning (AAAI 2021)
  • Noisy Self-Knowledge Distillation for Text Summarization (NAACL 2021)
  • PowerNorm: Rethinking Batch Normalization in Transformers (ICML 2020)
  • Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers (ICML 2020)
  • Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT (AAAI 2020)
  • Pragmatically Informative Text Generation (NAACL 2019, short)
  • Ermes: Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification (WWW 2019)

Experience

    Allen Institute for Artificial Intelligence, Research Intern
    Advised by Iz Beltagy, Matthew Peters and Jesse Dodge, May 2021 - Aug. 2021

    Facebook AI Research, Research Intern
    Advised by Douwe Kiela and Michael Auli, May 2020 - Dec. 2020

    Berkeley AI Research, Junior Specialist II
    Advised by Prof. Kurt Keutzer, Prof. Dan Klein and Prof. Michael Mahoney, Jun. 2019 - May 2020

    Tencent AI Lab, Research Intern
    Advised by Yaliang Li and Wei Fan, Apr. 2018 - Sept. 2018

    University of Illinois at Urbana-Champaign, Research Intern
    Advised by Prof. Aditya Parameswaran, Jun. 2017 - Sept. 2017