"A bigger problem: many third-party harnesses compress tool responses every 3 steps when approaching the context limit, leading to very low cache hit rates."
Fuli Luo
Xiaomi MiMo lead, ex-DeepSeek