<aside> ✉️ E-Mail
</aside>
<aside> 🐦 Twitter
</aside>
<aside> 🐙 GitHub
</aside>
<aside> 🎓 Google Scholar
</aside>
<aside> 💫 I am a PhD student at the Language Technologies Institute at Carnegie Mellon University and a contributor at EleutherAI. My research aims to make language technologies more capable, interpretable, and ultimately safe and useful.
My work involves understanding how language models work and developing novel methods to expand their capabilities. To date, I have worked on inducing zero-shot capabilities through multitask finetuning, observing model training dynamics, and investigating methods to extend models to other languages.
Aside from building better models, I believe equitable and accessible language technologies hinge upon well-governed open-source artifacts. As such, I advocate for open-source initiatives like EleutherAI.
Some time ago, I co-founded an OCR and information extraction startup that was acquired by Datasaur, Inc. in 2022.
</aside>
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
Stella Biderman*, Hailey Schoelkopf*, and 11 others including Lintang Sutawika
Fortieth International Conference on Machine Learning (ICML), 2023.

Emergent and Predictable Memorization in Large Language Models
Stella Biderman, USVSN Sai Prashanth, Lintang Sutawika, Hailey Schoelkopf, Quentin Anthony, Shivanshu Purohit, Edward Raff
arXiv preprint arXiv:2304.11158, 2023.

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting
Zheng-Xin Yong, Hailey Schoelkopf, and 12 others including Lintang Sutawika
arXiv preprint arXiv:2212.09535, 2022.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, and 386 others including Lintang Sutawika
arXiv preprint arXiv:2211.05100, 2022.

Crosslingual Generalization through Multitask Finetuning
Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, and 15 others
arXiv preprint arXiv:2211.01786, 2022.
What Language Model to Train if You Have One Million GPU Hours?
Teven Le Scao*, Thomas Wang*, Daniel Hesslow*, Lucile Saulnier*, Stas Bekman*, and 13 others including Lintang Sutawika
Findings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.

Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh*, Albert Webson*, Colin Raffel*, Stephen H. Bach*, and 37 others including Lintang Sutawika
10th International Conference on Learning Representations (ICLR), 2022.