MLC-LLM: Machine Learning Compilation for Large Language Models
[repo]
MLC-LLM is a universal solution that allows any language model to be deployed natively on a diverse set of hardware backends and in native applications, plus a productive framework for everyone to further optimize model performance for their own use cases.
Research (*indicates equal contribution)
TensorIR: An Abstraction for Automatic Tensorized Program Optimization
[pdf]
Siyuan Feng*, Bohan Hou*, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu, and Tianqi Chen
The 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '23)
Tensor Program Optimization with Probabilistic Programs
[pdf]
Junru Shao, Xiyou Zhou, Siyuan Feng, Bohan Hou, Ruihang Lai, Hongyi Jin, Wuwei Lin, Masahiro Masuda, Cody Hao Yu, and Tianqi Chen
The 36th Conference on Neural Information Processing Systems (NeurIPS 2022)