We present observations and analysis of the starburst, PACS-819, at $z=1.45$ ($M_*=10^{10.7}\,M_{\odot}$), using high-resolution ($0^{\prime\prime}.1$; 0.8 kpc) ALMA and multi-wavelength JWST images from the COSMOS-Web program. Dissimilar to HST/ACS images in the rest-frame UV, the redder NIRCam and MIRI images reveal a smooth central mass concentration and spiral-like features, atypical for such an intense starburst. Through dynamical modeling of the CO J=5--4 emission with ALMA, PACS-819 is rotation-dominated and thus disk-like in nature. However, kinematic anomalies in CO and asymmetric features in the bluer JWST bands (e.g., F150W) support a more disturbed nature, likely due to interactions. The JWST imaging further enables us to map the distribution of stellar mass and dust attenuation, thus clarifying the relationships between different structural components that were not discernible in the previous HST images. The CO J=5--4 and FIR dust continuum emission are co-spatial with a heavily obscured starbursting core (<1 kpc), which is partially surrounded by much less obscured star-forming structures, including a prominent arc, possibly a tidally distorted dwarf galaxy, and a clump, either a sign of an ongoing violent disk instability or a recently accreted low-mass satellite. With spatially resolved maps, we find a high molecular gas fraction in the central area, reaching $\sim3$ ($M_{\text{gas}}/M_*$), and short depletion times ($M_{\text{gas}}/\mathrm{SFR}\sim 120$ Myr) across the entire system. These observations provide insights into the complex nature of starbursts in the distant universe and underscore the wealth of complementary information from high-resolution observations with both ALMA and JWST.
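A quick back-of-the-envelope check of the quoted global numbers (our own arithmetic; it treats the central gas fraction as if it held system-wide, which the paper does not claim):

```latex
M_* \approx 10^{10.7}\,M_\odot \approx 5\times10^{10}\,M_\odot,
\qquad
M_{\rm gas} \sim 3\,M_* \approx 1.5\times10^{11}\,M_\odot,
\qquad
{\rm SFR} = \frac{M_{\rm gas}}{t_{\rm dep}}
          \approx \frac{1.5\times10^{11}\,M_\odot}{1.2\times10^{8}\,{\rm yr}}
          \approx 1.3\times10^{3}\,M_\odot\,{\rm yr}^{-1}.
```

A star-formation rate of order $10^{3}\,M_\odot\,{\rm yr}^{-1}$ is indeed characteristic of the most extreme dusty starbursts at these redshifts.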
This paper offers a comprehensive review of the fundamental architectural components—Perception, Reasoning, Memory, and Execution—necessary for building autonomous Large Language Model (LLM) agents. It synthesizes state-of-the-art techniques to enhance the execution of complex automation tasks, providing a structured framework for future development.
Researchers from Georgia Institute of Technology, Harbin Institute of Technology, Google DeepMind, and others compiled a comprehensive survey of humanoid locomotion and manipulation. It integrates traditional model-based methods with learning-based techniques and explores the emerging role of foundation models, highlighting the critical importance of whole-body tactile feedback.
Swin-Unet, a pure Transformer-based U-shaped encoder-decoder network from Technische Universität München, Fudan University, and Huawei Technologies, improves medical image segmentation by effectively capturing long-range dependencies. It achieved an average Dice-Similarity Coefficient of 79.13% and a Hausdorff Distance of 21.55 on the Synapse multi-organ CT dataset, showing more precise boundary predictions compared to prior methods.
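For reference, the Dice Similarity Coefficient reported above is the standard region-overlap metric; a minimal sketch of its computation for binary masks (our own helper, not code from the Swin-Unet repository):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice Similarity Coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    return 2.0 * intersection / denom if denom > 0 else 1.0
```

The reported 79.13% is the average of this quantity over organs and test cases; the Hausdorff Distance complements it by measuring the worst-case boundary error.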
In recent years, diffusion models trained on equilibrium molecular distributions have proven effective for sampling biomolecules. Beyond direct sampling, the score of such a model can also be used to derive the forces that act on molecular systems. However, while classical diffusion sampling usually recovers the training distribution, the corresponding energy-based interpretation of the learned score is often inconsistent with this distribution, even for low-dimensional toy systems. We trace this inconsistency to inaccuracies of the learned score at very small diffusion timesteps, where the model must capture the correct evolution of the data distribution. In this regime, diffusion models fail to satisfy the Fokker--Planck equation, which governs the evolution of the score. We interpret this deviation as one source of the observed inconsistencies and propose an energy-based diffusion model with a Fokker--Planck-derived regularization term to enforce consistency. We demonstrate our approach by sampling and simulating multiple biomolecular systems, including fast-folding proteins, and by introducing a state-of-the-art transferable Boltzmann emulator for dipeptides that supports simulation and achieves improved consistency and efficient sampling. Our code, model weights, and self-contained JAX and PyTorch notebooks are available at this https URL.
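To make the Fokker--Planck consistency idea concrete, here is a minimal 1D sketch of such a regularization term for an energy-based diffusion model, assuming a variance-preserving forward SDE with a linear schedule; this is our own illustration, not the paper's released implementation:

```python
import torch

def beta(t):
    # Linear VP noise schedule (an assumption for this toy example).
    return 0.1 + 19.9 * t

class Energy(torch.nn.Module):
    """Toy 1D energy E_theta(x, t); the learned score is s = -dE/dx."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2, hidden), torch.nn.SiLU(),
            torch.nn.Linear(hidden, hidden), torch.nn.SiLU(),
            torch.nn.Linear(hidden, 1),
        )
    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=-1))

def fp_residual(model, x, t):
    """Residual of the score Fokker--Planck equation for the 1D VP SDE
    dx = -0.5*beta(t)*x dt + sqrt(beta(t)) dW, under which the true score
    satisfies  ds/dt = 0.5*beta(t) * (s_xx + 2*s*s_x + x*s_x + s)."""
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    E = model(x, t).sum()
    s = -torch.autograd.grad(E, x, create_graph=True)[0]      # score
    s_x = torch.autograd.grad(s.sum(), x, create_graph=True)[0]
    s_xx = torch.autograd.grad(s_x.sum(), x, create_graph=True)[0]
    s_t = torch.autograd.grad(s.sum(), t, create_graph=True)[0]
    return s_t - 0.5 * beta(t) * (s_xx + 2 * s * s_x + x * s_x + s)

# The consistency penalty would be added to the usual denoising
# score-matching loss, e.g. loss = dsm + lam * residual**2, with t
# sampled near zero, where the abstract locates the inconsistency.
model = Energy()
penalty = fp_residual(model, torch.randn(128, 1), torch.rand(128, 1)).pow(2).mean()
```

The key point mirrored here is that the penalty is cheap to evaluate pointwise with automatic differentiation, so it can be weighted toward small diffusion times.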
Researchers at the Institute for Advanced Study clarify that pure quantum states in holographic theories, particularly those dual to "baby universes" connected by wormholes, generally do not converge in the large N limit. This finding resolves a paradox concerning the purity of boundary states versus the mixed nature of their bulk gravitational duals, while contrasting with black hole microstates which do converge to mixed states under specific conditions.
Researchers at Princeton University developed AutoCompressors, a method to efficiently extend the effective context window of pre-trained language models like OPT and Llama-2. The approach involves adapting these models to generate compact 'summary vectors' that distill information from long texts, allowing them to process documents tens of thousands of tokens long, improve in-context learning efficiency, and enable more efficient retrieval-augmented generation.
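The recursion behind the summary vectors can be sketched as follows (a simplified stand-in for the actual OPT/Llama-2 adaptation; the `model` interface and tensor shapes are our assumptions):

```python
import torch

def compress(model, summary_tokens, segments):
    """Recursively compress a long input into k summary vectors.

    model:          maps (seq_len, d) -> (seq_len, d) hidden states
                    (a stand-in for the base LM; an assumption here)
    summary_tokens: (k, d) learned summary-token embeddings
    segments:       list of (seg_len, d) embedded input chunks
    """
    k = summary_tokens.shape[0]
    summary = None                               # no context for segment 1
    for seg in segments:
        parts = [seg, summary_tokens]            # append the summary prompt
        if summary is not None:
            parts.insert(0, summary)             # prepend previous vectors
        hidden = model(torch.cat(parts, dim=0))  # run the language model
        summary = hidden[-k:]                    # read off new summary vectors
    return summary                               # (k, d), conditions later calls

# Smoke test with a trivial stand-in "model":
d, k = 16, 4
vecs = compress(torch.nn.Identity(), torch.randn(k, d),
                [torch.randn(32, d) for _ in range(3)])
```

Because each pass carries only k vectors forward, the effective context grows with the number of segments while the per-step sequence length stays bounded.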
Researchers at TUM introduced "Dynamical Alignment," a principle demonstrating that Spiking Neural Networks (SNNs) can achieve performance comparable to Artificial Neural Networks (ANNs) by dynamically encoding input signals. This approach led to 96.98% accuracy on MNIST and reduced the SNN-ANN performance gap by an average of 72% on CIFAR-10, identifying bimodal computational modes that offer insights into neural efficiency and adaptability.
We study the realization of supergroup gauge theories using negative branes in string theory. We show that negative branes are intimately connected with the possibility of timelike compactification and exotic spacetime signatures previously studied by Hull. Isolated negative branes dynamically generate a change in spacetime signature near their worldvolumes, and are related by string dualities to a smooth M-theory geometry with closed timelike curves. Using negative D3 branes, we show that $SU(0|N)$ supergroup theories are holographically dual to an exotic variant of type IIB string theory on $dS_{3,2} \times \bar{S}^5$, for which the emergent dimensions are timelike. Using branes, mirror symmetry, and Nekrasov's instanton calculus, all of which agree, we derive the Seiberg-Witten curve for $\mathcal{N}=2$ $SU(N|M)$ gauge theories. Together with our exploration of holography and string dualities for negative branes, this suggests that supergroup gauge theories may be non-perturbatively well-defined objects, though several puzzles remain.
Edward Witten's 'Introduction to Black Hole Thermodynamics' (Institute for Advanced Study) provides a comprehensive synthesis of black hole physics, tracing its evolution from classical general relativity and quantum field theory in curved spacetime to modern holographic insights. It clarifies how quantum effects endow black holes with thermodynamic properties, including temperature and entropy, and how advancements like the Ryu-Takayanagi formula and the Page curve help reconcile black hole evaporation with quantum unitarity.
A theoretical analysis reaffirms the reality of the Strong CP problem, refuting recent arguments that it is illusory, and establishes that gauged discrete P or CP symmetries offer consistent solutions that arise naturally within quantum gravity frameworks. The work demonstrates that the observable combination $\bar{\theta} + \theta$ must be dynamically determined and calculable, rather than an arbitrary value from superselection sectors.
We extend field-level inference to jointly constrain the cosmological parameters $\{A,\omega_{\rm cdm},H_0\}$, in both real and redshift space. Our analyses are based on mock data generated using a perturbative forward model, with noise drawn from a Gaussian distribution with a constant power spectrum. This idealized setting, where the field-level likelihood is exactly Gaussian, allows us to precisely quantify the information content in the nonlinear field on large scales. We find that field-level inference accurately recovers all cosmological parameters in both real and redshift space, with uncertainties consistent with perturbation theory expectations. We show that these error bars are comparable to those obtained from a joint power spectrum and bispectrum analysis using the same perturbative model. Finally, we perform several tests using the Gaussian field-level likelihood to fit the mock data where the true noise model is non-Gaussian, and find significant biases in the inferred cosmological parameters. These results highlight that the success of field-level inference critically depends on using the correct likelihood, which may be the primary challenge for applying this method to smaller scales even in the perturbative regime.
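Concretely, with a constant noise power spectrum $P_\epsilon$, the exactly Gaussian field-level likelihood described above takes the form (our notation):

```latex
-2\ln\mathcal{L}\left(\theta,\delta_{\rm in}\right)
= \sum_{\mathbf{k}}
  \frac{\left|\delta_{\rm data}(\mathbf{k})
        - \delta_{\rm fwd}(\mathbf{k}\,|\,\theta,\delta_{\rm in})\right|^{2}}
       {P_\epsilon}
+ {\rm const},
```

where $\delta_{\rm fwd}$ is the perturbative forward model evaluated on the initial field $\delta_{\rm in}$ and $\theta=\{A,\omega_{\rm cdm},H_0\}$; the biases found in the final tests arise when this form is assumed but the true noise is not Gaussian.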
How well does a classic deep net architecture like AlexNet or VGG19 classify on a standard dataset such as CIFAR-10 when its width --- namely, number of channels in convolutional layers, and number of nodes in fully-connected internal layers --- is allowed to increase to infinity? Such questions have come to the forefront in the quest to theoretically understand deep learning and its mysteries about optimization and generalization. They also connect deep learning to notions such as Gaussian processes and kernels. A recent paper [Jacot et al., 2018] introduced the Neural Tangent Kernel (NTK) which captures the behavior of fully-connected deep nets in the infinite width limit trained by gradient descent; this object was implicit in some other recent papers. An attraction of such ideas is that a pure kernel-based method is used to capture the power of a fully-trained deep net of infinite width. The current paper gives the first efficient exact algorithm for computing the extension of NTK to convolutional neural nets, which we call Convolutional NTK (CNTK), as well as an efficient GPU implementation of this algorithm. This results in a significant new benchmark for the performance of a pure kernel-based method on CIFAR-10, being $10\%$ higher than the methods reported in [Novak et al., 2019], and only $6\%$ lower than the performance of the corresponding finite deep net architecture (once batch normalization, etc. are turned off). Theoretically, we also give the first non-asymptotic proof showing that a fully-trained sufficiently wide net is indeed equivalent to the kernel regression predictor using NTK.
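Once the CNTK Gram matrices are computed, the "pure kernel-based method" is ordinary kernel regression; a minimal sketch of the inference step under that assumption:

```python
import numpy as np

def ntk_predict(K_train, K_test, y_train, jitter=1e-8):
    """Kernel-regression predictor f(X_test) = K_test @ K_train^{-1} @ y.

    K_train: (n, n) CNTK Gram matrix between training inputs.
    K_test:  (m, n) CNTK values between test and training inputs.
    y_train: (n, c) label matrix (e.g. one-hot); jitter is a small
             diagonal term added for numerical stability (our choice).
    """
    n = K_train.shape[0]
    alpha = np.linalg.solve(K_train + jitter * np.eye(n), y_train)
    scores = K_test @ alpha          # (m, c) regression scores
    return scores.argmax(axis=1)     # predicted class per test point
```

All of the model's expressive power lives in the kernel itself; the paper's contribution is the exact, efficient computation of those CNTK entries.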
Large multimodal models (LMMs) have recently gained attention due to their effectiveness in understanding and generating descriptions of visual content. Most existing LMMs are English-only. While a few recent works explore multilingual image LMMs, to the best of our knowledge, moving beyond the English language for cultural and linguistic inclusivity is yet to be investigated in the context of video LMMs. In pursuit of more inclusive video LMMs, we introduce a multilingual Video LMM benchmark, named ViMUL-Bench, to evaluate Video LMMs across 14 languages, including both low- and high-resource languages: English, Chinese, Spanish, French, German, Hindi, Arabic, Russian, Bengali, Urdu, Sinhala, Tamil, Swedish, and Japanese. Our ViMUL-Bench is designed to rigorously test video LMMs across 15 categories, including eight culturally diverse categories ranging from lifestyles and festivals to foods and rituals, and from local landmarks to prominent cultural personalities. ViMUL-Bench comprises both open-ended (short and long-form) and multiple-choice questions spanning various video durations (short, medium, and long), with 8k samples that are manually verified by native language speakers. In addition, we also introduce a machine-translated multilingual video training set comprising 1.2 million samples and develop a simple multilingual video LMM, named ViMUL, which is shown to provide a better tradeoff between high- and low-resource languages for video understanding. We hope our ViMUL-Bench and multilingual video LMM, along with a large-scale multilingual video training set, will help ease future research in developing culturally and linguistically inclusive multilingual video LMMs. Our proposed benchmark, video LMM and training data will be publicly released at this https URL.
We establish the conditions under which a conservation law associated with a non-invertible operator may be realized as a symmetry in quantum mechanics. As established by Wigner, all quantum symmetries must be represented by either unitary or antiunitary transformations. Relinquishing an implicit assumption of invertibility, we demonstrate that the fundamental invariance of quantum transition probabilities under the application of symmetries mandates that all non-invertible symmetries may only correspond to {\it projective} unitary or antiunitary transformations, i.e., {\it partial isometries}. This extends the notion of physical states beyond conventional rays in Hilbert space to equivalence classes in an {\it extended, gauged Hilbert space}, thereby broadening the traditional understanding of symmetry transformations in quantum theory. We discuss consequences of this result and explicitly illustrate how, in simple model systems, whether symmetries are invertible or non-invertible may be inextricably tied to the particular boundary conditions in use.
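For concreteness, a partial isometry is an operator $V$ satisfying the standard conditions

```latex
V V^{\dagger} V = V,
\qquad
V^{\dagger} V = P_{i},
\qquad
V V^{\dagger} = P_{f},
```

where $P_i$ and $P_f$ are orthogonal projectors; $V$ acts isometrically on the support of $P_i$ and annihilates its orthogonal complement, which is exactly the non-invertible behavior the Wigner-type argument above permits.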
The sphere partition function is one of the simplest Euclidean gravity computations. It is usually interpreted as a count of states. However, the one-loop gravity correction contains a dimension-dependent phase factor, $i^{D+2}$, which seems confusing for such an interpretation. We show that, after including an observer, this phase gets mostly cancelled for the quantity that should correspond to a count of states. However, an overall minus sign remains.
Since the publication of the first International AI Safety Report, AI capabilities have continued to improve across key domains. These advances have been driven primarily by new training techniques that teach AI systems to reason step-by-step, and by inference-time enhancements, rather than simply by training larger models. As a result, general-purpose AI systems can solve more complex problems in a range of domains, from scientific research to software development. Their performance on benchmarks covering coding, mathematics, and expert-level science questions has continued to improve, though reliability challenges persist, with systems excelling on some tasks while failing completely on others. These capability improvements also have implications for multiple risks, including risks from biological weapons and cyber attacks. Finally, they pose new challenges for monitoring and controllability. This update examines how AI capabilities have improved since the first Report, then focuses on key risk areas where substantial new evidence warrants updated assessments.
These are notes on some entanglement properties of quantum field theory, aiming to make accessible a variety of ideas that are known in the literature. The main goal is to explain how to deal with entanglement when -- as in quantum field theory -- it is a property of the algebra of observables and not just of the states.
A generative diffusion model named DiffSyn predicts viable synthesis routes for crystalline materials, particularly zeolites, by accurately modeling complex, multi-modal parameter distributions. It establishes a new state-of-the-art in synthesis prediction, notably enabling the successful experimental synthesis of an unknown UFI zeolite with an Si/Al ratio of 19.0.