Notes on The Era of 1-Bit LLMs: All Large Language Models Are in 1.58 Bits

Disclaimer: This is part of my notes on AI research papers. I do this to learn and communicate what I understand. Feel free to comment if you have any suggestions; that would be very much appreciated. The following post is a comment on the paper The Era of 1-bit LLMs: All Large Language Models Are in 1.58 Bits by Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, and Furu Wei....

April 13 · 3 min · Àlex Pujol Vidal

Notes on BitNet: Scaling 1-bit Transformers for Large Language Models

Disclaimer: This is part of my notes on AI research papers. I do this to learn and communicate what I understand. Feel free to comment if you have any suggestions; that would be very much appreciated. The following post is a comment on the paper BitNet: Scaling 1-bit Transformers for Large Language Models by Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Huaijie Wang, Lingxiao Ma, Fang Yang, Ruiping Wang, Yi Wu, and Furu Wei....

April 12 · 5 min · Àlex Pujol Vidal