I research large language models (LLMs) at Contextual AI &, starting in September, as a PhD student at Stanford. I did my bachelor’s at Peking University. I’ve worked in Chinese, English, Japanese, French & German, but I don’t have enough parameters so I forgot much of the last two 😅

My research focuses on advancing LLM capabilities, where I’ve had the chance to work with great collaborators on:

🧙🏻‍♂️ Scaling: Scaling up LLMs is critical for improving them. I study how it works (data-constrained scaling laws) & apply it to create open LLMs (OLMo, StarCoder, BLOOM).

🫡 Alignment/Instruction-following: Models need to follow instructions to be truly useful. I study how to improve it, e.g. for multilingual (BLOOMZ/mT0), coding (OctoPack) & embedding (GRIT) models.

🤔 Other: I’ve also had the chance to work on the largest text embedding benchmark (MTEB) & to build multimodal models that won 2nd place in Meta’s Hateful Memes Challenge (Blog).

I love this field & am very optimistic about the future of AI ❤️ Apart from AI, I am very interested in health (🏊, 🎾, 🏃, 🌸).

Feel free to reach out, I’m especially interested in possible research collaborations! I’m also happy to advise self-motivated people on a research project - I have many ideas that I think could be very impactful & that we could work on together :)