I am a second-year PhD student at the Language Technology Lab, University of Cambridge, supervised by Professor Nigel Collier. I am currently working on two things: (1) multi-modal NLP: connecting language with knowledge and perception; (2) self-supervised NLP: specialising language models for given tasks without labels. I am a member of Clare Hall and a Cambridge Trust Scholar funded by the Grace & Thomas C.H. Chan Cambridge Scholarship. I have also spent time doing research at Amazon, EPFL, and Waterloo.


  • [Aug 27th, 2021] Four papers accepted to the main conference of EMNLP 2021. See you in Punta Cana (hopefully?).
  • [Aug 27th, 2021] SapBERT has been integrated into NVIDIA’s deep learning toolkit NeMo as its entity linking module (thank you, NVIDIA!). They even wrote a tutorial – check out this Google Colab.
  • [May 6th, 2021] A cross-lingual extension of SapBERT will appear at ACL-IJCNLP 2021.
  • [April 16th, 2021] Happy to have given a talk about SapBERT (our recent NAACL paper) at AstraZeneca’s NLP seminar. Here are the slides.
  • [April 15th, 2021] We released Mirror-BERT, a fast, effective, and fully self-supervised approach for transforming masked language models into universal language encoders.


University of Cambridge
PhD student, Computation, Cognition and Language
Master of Philosophy (2020), (Computational) Linguistics
University of Waterloo
Bachelor of Mathematics (2019), Computer Science


senior PC member: IJCAI 2021
PC member/reviewer: IJCAI-ECAI 2022, ACL-IJCNLP 2021 (outstanding reviewer), AAAI 2021 (top 25% PC member), WACV 2021, ACL-IJCNLP 2021 SRW, ACL 2020 SRW, AACL-IJCNLP 2020 SRW
volunteer: ACL-IJCNLP 2021, NAACL 2021, AAAI 2021


fl399 [at] cam [dot] ac [dot] uk