A comprehensive survey from an international research consortium led by Peking University examines Large Language Model (LLM) agents through a methodology-centered taxonomy, analyzing their construction, collaboration mechanisms, and evolution while providing a unified architectural framework for understanding agent systems across different application domains.
A comprehensive survey of vision foundational models categorizes them based on prompting mechanisms into textually prompted, visually prompted, heterogeneous modality-based, and embodied models. It details their architectures, training objectives, and diverse applications, highlighting capabilities like zero-shot learning and multimodal understanding.
TritonCast, a hierarchical deep learning framework, tackles the spectral bias problem in AI models to achieve unprecedented long-term stability and physical fidelity in Earth system forecasting. The model delivers stable year-long global weather predictions and multi-year climate simulations, while also demonstrating state-of-the-art accuracy in medium-range weather, high-fidelity ocean dynamics, and robust zero-shot generalization across spatial resolutions.
This paper provides a comprehensive review of 3D Gaussian Splatting (3DGS), an explicit 3D representation that achieves photorealistic novel view synthesis with real-time rendering and significantly faster training than previous methods like NeRF, enabling new applications in 3D content creation and interaction. The review consolidates advancements across 3D reconstruction, editing, and generative tasks.
Researchers at the Flatiron Institute introduced a representation-based framework using "manifold capacity" and "Geometry Linked to Untangling Efficiency" (GLUE) measures to characterize feature learning in neural networks. This approach offers a detailed understanding beyond the binary lazy-rich dichotomy, revealing distinct learning stages and strategies, and providing geometric explanations for out-of-distribution generalization failures.
Researchers from Shanghai AI Lab and UCLA propose a new, refined taxonomy for multi-modal sensor fusion methods in autonomous driving perception, classifying them into strong-fusion (early, deep, late, asymmetry) and weak-fusion categories. The work also identifies and categorizes existing challenges and outlines future research directions to improve perception system robustness.
A comprehensive survey categorizes and analyzes quantum algorithms for solving linear systems, detailing advancements since HHL and critically assessing their complexities and practical applications. The paper presents a novel taxonomy of quantum linear systems problem (QLSP) solvers and revisits key limitations, highlighting algorithms that achieve optimal scaling while emphasizing persistent challenges in state preparation and output.
Carney et al. construct quantum mechanical models where Newton's law of gravitation emerges from the thermodynamic properties of microscopic quantum systems, providing a framework to test if gravity is an emergent phenomenon. The models predict distinct experimental signatures, including anomalous force noise, enhanced spatial decoherence, and altered entanglement generation compared to standard quantum gravity.
The pathfinder framework provides a semantic search and synthesis tool for astronomical literature, complementing existing databases by addressing information overload. It processes natural language queries to identify relevant papers and synthesize information, achieving improved retrieval accuracy and user satisfaction on various benchmarks.