Researchers from Hanyang University and Qualcomm AI Research developed InfiniPot, a framework enabling pre-trained Large Language Models to process arbitrarily long input contexts within strict, fixed memory constraints, without additional training. The method achieved state-of-the-art performance on long-context benchmarks like LongBench and Needle In A Haystack, extending effective context windows by over 30 times while maintaining memory and computational efficiency.
Qualcomm AI Research introduces Think Straight, Stop Smart (TSSS), a training-free framework designed for efficient multi-hop Retrieval-Augmented Generation (RAG) on on-device Large Language Models (LLMs). This approach achieves state-of-the-art accuracy in multi-hop question answering while substantially reducing inference time and token generation compared to existing RAG methods.