Roofline on CctoctoFX

Recent content in Roofline on CctoctoFX (https://pillumina.github.io/tags/roofline/)

LLM System Analysis Methodology (Part 4): M3 Hands-On Walkthrough and the Roofline Model
https://pillumina.github.io/posts/aiinfra/llm-computation-methodology/part-4/
Mon, 22 Jun 2026 09:03:00 +0800
A complete walkthrough of MiniMax M3: the end-to-end calculation from config.json to parameter count, FLOPs, KV cache, and inference GPU memory. It uses the Roofline model to analyze inference latency and to explain the performance gains from FP8/INT4 quantization.