Blogs
[2024/09] Locret: Enabling Long-Context Inference on Personal Devices
[2024/09] Ouroboros: Speculative Decoding that is Relatively Fast and Absolutely Accurate
[2024/09] CA-LoRA: Personal Devices-Friendly Downstream Task Adapting

Non-Academic Stuff
[2024/10] A Guidebook of Exchange Studies for DST@THU Students (In Chinese) | 贵系交换指南 | Unfinished
[2024/04] Acceleration of LLM's Generation (In Chinese) | 大模型推理加速