MLC-LLM: Machine Learning Compilation for Large Language Models
[repo]
MLC-LLM is a universal solution that allows any language model to be deployed natively on a diverse set of hardware backends and in native applications, plus a productive framework for everyone to further optimize model performance for their own use cases.
Research (*indicates equal contribution)
Optimal Kernel Orchestration for Tensor Programs with Korch
[pdf]
Muyan Hu*, Ashwin Venkatram*, Shreyashri Biswas*, Balamurugan Marimuthu*, Bohan Hou, Gabriele Oliaro, Haojie Wang, Liyan Zheng, Xupeng Miao, Jidong Zhai, Zhihao Jia
The 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '24)
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
[pdf]
Ruihang Lai*, Junru Shao*, Siyuan Feng*, Steven S Lyubomirsky*, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared G Roesch, Todd C Mowry, Tianqi Chen
TensorIR: An Abstraction for Automatic Tensorized Program Optimization
[pdf]
Siyuan Feng*, Bohan Hou*, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu, Tianqi Chen
The 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '23)
Tensor Program Optimization with Probabilistic Programs
[pdf]
Junru Shao, Xiyou Zhou, Siyuan Feng, Bohan Hou, Ruihang Lai, Hongyi Jin, Wuwei Lin, Masahiro Masuda, Cody Hao Yu, Tianqi Chen
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 2022