Jiawei Liu

I am a final-year Ph.D. student at the University of Illinois Urbana-Champaign, working with Lingming Zhang. My research explores how Software Engineering and Programming Languages can both benefit and benefit from Machine Learning and its systems, with a focus on improving software reliability and developer productivity.

Software Engineering with Language Models

Developing models and evaluators to generate high-quality code:

Code evaluation: correctness [EvalPlus], efficiency [EvalPerf], without hard verifiers [CodeFavor]
Training models [StarCoder2] to reason [Code-R1] and follow diverse instructions [Magicoder]
Code editing should be real-time and can be greatly accelerated by multi-layer speculation [Blazedit]

Software Engineering for ML Systems

Building automated tools to improve the reliability of ML systems and to simplify their deployment:

Automating test program synthesis [NeuRI][NNSmith][Tzer], which has found 300+ critical bugs in widely used ML frameworks and compilers
Engineering ML systems and compilers productively using top-down development [TapML] and pattern languages [Relax]

My research has been generously supported by the Amazon PhD Fellowship, the Illinois Innovation Award, the Yee Memorial Fellowship, and grants from Amazon and OpenAI. My work (i) automatically finds 300+ critical bugs in ML systems such as PyTorch, winning an ACM SIGSOFT Distinguished Paper Award and a Distinguished Artifact Award, and (ii) builds LLMs and evaluators for code with 1M+ downloads and wide industrial adoption.

📰 Some recent coding: R1 for Code Generation and Speculative Code Editing.

Research

ISSTA’25 / Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging

Siyuan Feng*, Jiawei Liu*, Ruihang Lai, Charlie Ruan, Yong Yu, Lingming Zhang, Tianqi Chen

Proc. ACM Softw. Eng. 2 (ISSTA). Jun 2025

PAPER Bib Slides Artifact

@article{feng2025productively,
  author = {Feng, Siyuan and Liu, Jiawei and Lai, Ruihang and Ruan, Charlie and Yu, Yong and Zhang, Lingming and Chen, Tianqi},
  title = {Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging},
  year = {2025},
  issue_date = {July 2025},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  volume = {2},
  number = {ISSTA},
  url = {https://doi.org/10.1145/3728957},
  doi = {10.1145/3728957},
  journal = {Proc. ACM Softw. Eng.},
  month = jun,
  articleno = {ISSTA080},
  numpages = {23},
  keywords = {Developer Productivity, Machine Learning Systems, Software Testing},
}

ICML’24 / Magicoder: Empowering Code Generation with OSS-Instruct

Yuxiang Wei, Zhe Wang, Jiawei Liu, Yifeng Ding, Lingming Zhang

Forty-first International Conference on Machine Learning. Jun 2024

Adopted by Meta Llama 3.1, Google CodeGemma, and IBM Granite

PAPER Bib Slides

@inproceedings{wei2023magic,
  title = {Magicoder: Empowering Code Generation with {OSS}-Instruct},
  author = {Wei, Yuxiang and Wang, Zhe and Liu, Jiawei and Ding, Yifeng and Zhang, Lingming},
  booktitle = {Forty-first International Conference on Machine Learning},
  year = {2024},
  url = {https://openreview.net/forum?id=XUeoOBid3x},
}

Pre-print / StarCoder 2 and The Stack v2: The Next Generation

Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei and 56 more authors

arXiv preprint arXiv:2402.19173. Feb 2024

PAPER Bib

@article{Lozhkov2024StarCoder2A,
  title = {StarCoder 2 and The Stack v2: The Next Generation},
  author = {Lozhkov, Anton and Li, Raymond and Allal, Loubna Ben and Cassano, Federico and Lamy-Poirier, Joel and Tazi, Nouamane and Tang, Ao and Pykhtar, Dmytro and Liu, Jiawei and Wei, Yuxiang and Liu, Tianyang and Tian, Max and Kocetkov, Denis and Zucker, Arthur and Belkada, Younes and Wang, Zijian and Liu, Qian and Abulkhanov, Dmitry and Paul, Indraneil and Li, Zhuang and Li, Wen-Ding and Risdal, Megan L. and Li, Jia and Zhu, Jian and Zhuo, Terry Yue and Zheltonozhskii, Evgenii and Dade, Nii Osae Osae and Yu, Wenhao and Krauss, Lucas and Jain, Naman and Su, Yixuan and He, Xuanli and Dey, Manan and Abati, Edoardo and Chai, Yekun and Muennighoff, Niklas and Tang, Xiangru and Oblokulov, Muhtasham and Akiki, Christopher and Marone, Marc and Mou, Chenghao and Mishra, Mayank and Gu, Alexander and Hui, Binyuan and Dao, Tri and Zebaze, Armel and Dehaene, Olivier and Patry, Nicolas and Xu, Canwen and McAuley, Julian and Hu, Han and Scholak, Torsten and Paquet, S{\'e}bastien and Robinson, Jennifer and Anderson, Carolyn Jane and Chapados, Nicolas and Patwary, Mostofa and Tajbakhsh, Nima and Jernite, Yacine and Ferrandis, Carlos Mu{\~n}oz and Zhang, Lingming and Hughes, Sean and Wolf, Thomas and Guha, Arjun and von Werra, Leandro and de Vries, Harm},
  journal = {arXiv preprint arXiv:2402.19173},
  year = {2024},
}

NeurIPS’23 / Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation

Jiawei Liu*, Chunqiu Steven Xia*, Yuyao Wang, Lingming Zhang

Thirty-seventh Conference on Neural Information Processing Systems. Jun 2023

1M dataset downloads; integrated by various major companies

PAPER Bib Slides Website 🤗 HF

@inproceedings{liu2023is,
  title = {Is Your Code Generated by Chat{GPT} Really Correct? Rigorous Evaluation of Large Language Models for Code Generation},
  author = {Liu, Jiawei and Xia, Chunqiu Steven and Wang, Yuyao and Zhang, Lingming},
  booktitle = {Thirty-seventh Conference on Neural Information Processing Systems},
  year = {2023},
  url = {https://openreview.net/forum?id=1qvx610Cu7},
}

FSE’23 / NeuRI: Diversifying DNN Generation via Inductive Rule Inference

Jiawei Liu, Jinjun Peng, Yuyao Wang, Lingming Zhang

Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. Jun 2023

🏆 ACM SIGSOFT Distinguished Paper Award

PAPER Bib Slides Artifact

@inproceedings{liu2023neuri,
  title = {NeuRI: Diversifying DNN Generation via Inductive Rule Inference},
  author = {Liu, Jiawei and Peng, Jinjun and Wang, Yuyao and Zhang, Lingming},
  year = {2023},
  isbn = {9798400703270},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3611643.3616337},
  doi = {10.1145/3611643.3616337},
  booktitle = {Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
  pages = {657--669},
  numpages = {13},
  location = {San Francisco, CA, USA},
  series = {ESEC/FSE 2023},
}

ASPLOS’23 / NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers

Jiawei Liu*, Jinkun Lin*, Fabian Ruffy, Cheng Tan, Jinyang Li, Aurojit Panda, Lingming Zhang

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2. Jun 2023

🏆 Distinguished Artifact Award

PAPER Bib Poster Slides Artifact

@inproceedings{liu2023nnsmith,
  title = {NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers},
  author = {Liu, Jiawei and Lin, Jinkun and Ruffy, Fabian and Tan, Cheng and Li, Jinyang and Panda, Aurojit and Zhang, Lingming},
  year = {2023},
  isbn = {9781450399166},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3575693.3575707},
  doi = {10.1145/3575693.3575707},
  booktitle = {Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2},
  pages = {530--543},
  numpages = {14},
  keywords = {Deep Learning Compilers, Compiler Testing, Fuzzing},
  location = {Vancouver, BC, Canada},
  series = {ASPLOS 2023},
}

OOPSLA’22 / Coverage-guided tensor compiler fuzzing with joint IR-pass mutation

Jiawei Liu, Yuxiang Wei, Sen Yang, Yinlin Deng, Lingming Zhang

Proceedings of the ACM on Programming Languages 6 (OOPSLA1). Apr 2022

PAPER Bib Slides Artifact

@article{liu2022coverage,
  title = {Coverage-guided tensor compiler fuzzing with joint IR-pass mutation},
  author = {Liu, Jiawei and Wei, Yuxiang and Yang, Sen and Deng, Yinlin and Zhang, Lingming},
  journal = {Proceedings of the ACM on Programming Languages},
  volume = {6},
  number = {OOPSLA1},
  pages = {1--26},
  year = {2022},
  publisher = {ACM New York, NY, USA},
  url = {https://doi.org/10.1145/3527317},
  doi = {10.1145/3527317},
  month = apr,
  articleno = {73},
}

Awards & Honors

Illinois Innovation Award ($20K) 2025

Amazon AICE Ph.D. Fellowship ($70K) 2025

Jane Street Fellowship Honorable Mention 2025

Proposal, Amazon Nova AI Challenge ($250K) 2024

OpenAI Researcher Access Program 2024

Machine Learning and Systems Rising Stars 2024

Warren W. Yee Memorial Fellowship 2024

ACM SIGSOFT Distinguished Paper Award (FSE'23) 2023

Distinguished Artifact Award (ASPLOS'23) 2023

Service

Organizing: LLM4Code@ICSE (Publicity Chair)

Program Committee/Reviewer: ASE, TSE, TOSEM, NeurIPS, ICLR

Artifact Evaluation Committee: PLDI, OSDI, ATC

Invited Talks

NLP+SE Seminar, UT Austin: Smelling the Quality of LLM-generated Code Mar 2025

Programming Systems, Uber: Evaluating LLMs for Correct & Efficient Code Generation Sep 2024

ARiSE Lab, Columbia University: Simplify the Making of Great Software in the ML Era Apr 2024

Snowflake GenAI: Rigorous Evaluation of LLMs for Code (Slides) Feb 2024

AST Lab, ETH Zürich: Generating Test-Cases for ML Compilers (Slides) Jan 2024

GAI4SE, NC State University: LLMs for Software Testing (Guest Lecture) Nov 2023

Apache TVM Conference: Automating DL Compiler Bug Finding with NNSmith Mar 2023

SAMPL, University of Washington: Coverage-Guided Tensor Compiler Fuzzing May 2022