Universidad Industrial de Santander
Researchers from Universidad Industrial de Santander developed a privacy-preserving deep learning framework that integrates perceptual image transformation at the acquisition stage with deformable operators. This approach enables pre-trained models to process visually scrambled data with full task accuracy, matching non-private baselines (e.g., 94.1% on CIFAR-10 classification), while significantly reducing computational overhead and model parameters compared to prior methods.
Imaging inverse problems aim to recover high-dimensional signals from undersampled, noisy measurements, a fundamentally ill-posed task with infinitely many solutions differing only within the null-space of the sensing operator. To resolve this ambiguity, prior information is typically incorporated through handcrafted regularizers or learned models that constrain the solution space. However, such priors ignore the task-specific structure of that null-space. In this work, we propose \textit{Non-Linear Projections of the Null-Space} (NPN), a novel class of regularization that, instead of enforcing structural constraints in the image domain, promotes solutions whose null-space component lies in a low-dimensional projection learned by a neural network. Our approach has two key advantages: (1) Interpretability: by focusing on the structure of the null-space, we design sensing-matrix-specific priors that capture precisely the signal components to which the sensing process is blind. (2) Flexibility: NPN is adaptable to various inverse problems, compatible with existing reconstruction frameworks, and complementary to conventional image-domain priors. We provide theoretical guarantees on convergence and reconstruction accuracy when NPN is used within plug-and-play methods. Empirical results across diverse sensing matrices demonstrate that NPN priors consistently enhance reconstruction fidelity in imaging inverse problems such as compressive sensing, deblurring, super-resolution, computed tomography, and magnetic resonance imaging, with plug-and-play methods, unrolling networks, deep image prior, and diffusion models.
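The core object in NPN is the null-space of the sensing matrix. A minimal numpy sketch of the idea, with a fixed random feature map standing in for the learned non-linear projection (the network, its architecture, and the exact penalty are assumptions here, not the paper's implementation):

```python
import numpy as np

# Toy undersampled sensing matrix A (m < n): its null-space is non-trivial.
rng = np.random.default_rng(0)
m, n = 4, 10
A = rng.standard_normal((m, n))

# Orthogonal projector onto the null-space of A: P = I - A^+ A.
P_null = np.eye(n) - np.linalg.pinv(A) @ A

x = rng.standard_normal(n)
x_null = P_null @ x          # component of x invisible to the sensing process

# Sanity check: A annihilates the null-space component.
assert np.allclose(A @ x_null, 0.0, atol=1e-10)

# Hypothetical non-linear projection phi (a fixed random feature map standing
# in for the learned network) maps the null-space component to a low
# dimension; an NPN-style regularizer could then penalize ||phi(P_null x)||^2.
W = rng.standard_normal((3, n))
def phi(v):
    return np.tanh(W @ v)    # stand-in for the learned non-linear projection

r = float(np.linalg.norm(phi(x_null)) ** 2)
```

The key point is that `P_null x` isolates exactly the directions the measurements cannot see, which is where a learned prior must do all the work.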
DeepInverse is an open-source PyTorch-based library for solving imaging inverse problems. The library covers all crucial steps in image reconstruction from the efficient implementation of forward operators (e.g., optics, MRI, tomography), to the definition and resolution of variational problems and the design and training of advanced neural network architectures. In this paper, we describe the main functionality of the library and discuss the main design choices.
Neural Radiance Field (NeRF)-based segmentation methods focus on object semantics and rely solely on RGB data, lacking intrinsic material properties. This limitation restricts accurate material perception, which is crucial for robotics, augmented reality, simulation, and other applications. We introduce UnMix-NeRF, a framework that integrates spectral unmixing into NeRF, enabling joint hyperspectral novel view synthesis and unsupervised material segmentation. Our method models spectral reflectance via diffuse and specular components, where a learned dictionary of global endmembers represents pure material signatures, and per-point abundances capture their distribution. For material segmentation, we use spectral signature predictions along learned endmembers, allowing unsupervised material clustering. Additionally, UnMix-NeRF enables scene editing by modifying learned endmember dictionaries for flexible material-based appearance manipulation. Extensive experiments validate our approach, demonstrating superior spectral reconstruction and material segmentation compared to existing methods. Project page: this https URL.
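The endmember/abundance decomposition at the heart of spectral unmixing can be sketched in a few lines of numpy. This is the generic linear mixing model, not UnMix-NeRF's full diffuse/specular formulation; the dimensions and clustering rule are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n_bands, n_endmembers, n_points = 31, 4, 100

# Dictionary of global endmembers: one pure material spectrum per row.
E = np.abs(rng.standard_normal((n_endmembers, n_bands)))

# Per-point abundances: non-negative and summing to one across materials.
A = rng.random((n_points, n_endmembers))
A /= A.sum(axis=1, keepdims=True)

# Linear mixing model for the (diffuse) reflectance at each point.
spectra = A @ E

# Unsupervised material labels: the dominant endmember per point.
labels = A.argmax(axis=1)
```

Editing the scene then amounts to replacing rows of `E` and re-synthesizing `A @ E`, which is why a learned endmember dictionary supports material-based appearance manipulation.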
The SoccerNet 2025 Challenges mark the fifth annual edition of the SoccerNet open benchmarking effort, dedicated to advancing computer vision research in football video understanding. This year's challenges span four vision-based tasks: (1) Team Ball Action Spotting, focused on detecting ball-related actions in football broadcasts and assigning actions to teams; (2) Monocular Depth Estimation, targeting the recovery of scene geometry from single-camera broadcast clips through relative depth estimation for each pixel; (3) Multi-View Foul Recognition, requiring the analysis of multiple synchronized camera views to classify fouls and their severity; and (4) Game State Reconstruction, aimed at localizing and identifying all players from a broadcast video to reconstruct the game state on a 2D top-view of the field. Across all tasks, participants were provided with large-scale annotated datasets, unified evaluation protocols, and strong baselines as starting points. This report presents the results of each challenge, highlights the top-performing solutions, and provides insights into the progress made by the community. The SoccerNet Challenges continue to serve as a driving force for reproducible, open research at the intersection of computer vision, artificial intelligence, and sports. Detailed information about the tasks, challenges, and leaderboards can be found at this https URL, with baselines and development kits available at this https URL.
Sparse-view computed tomography (CT) reconstruction is fundamentally challenging due to undersampling, leading to an ill-posed inverse problem. Traditional iterative methods incorporate handcrafted or learned priors to regularize the solution but struggle to capture the complex structures present in medical images. In contrast, diffusion models (DMs) have recently emerged as powerful generative priors that can accurately model complex image distributions. In this work, we introduce Diffusion Consensus Equilibrium (DICE), a framework that integrates a two-agent consensus equilibrium into the sampling process of a DM. DICE alternates between: (i) a data-consistency agent, implemented through a proximal operator enforcing measurement consistency, and (ii) a prior agent, realized by a DM performing a clean image estimation at each sampling step. By balancing these two complementary agents iteratively, DICE effectively combines strong generative prior capabilities with measurement consistency. Experimental results show that DICE significantly outperforms state-of-the-art baselines in reconstructing high-quality CT images under uniform and non-uniform sparse-view settings of 15, 30, and 60 views (out of a total of 180), demonstrating both its effectiveness and robustness.
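The two-agent alternation in DICE can be sketched on a toy linear problem. Here soft-thresholding stands in for the diffusion model's clean-image estimate, and the operator sizes are arbitrary; only the alternating structure reflects the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 64, 16
A = rng.standard_normal((m, n)) / np.sqrt(m)     # toy sensing operator
x_true = np.zeros(n)
x_true[rng.choice(n, 5, replace=False)] = 1.0
y = A @ x_true                                   # undersampled measurements

rho = 1.0
# Data-consistency agent: proximal operator of 0.5||Ax - y||^2,
# i.e. argmin_x 0.5||Ax - y||^2 + (rho/2)||x - v||^2 (closed form).
M = np.linalg.inv(A.T @ A + rho * np.eye(n))
def data_agent(v):
    return M @ (A.T @ y + rho * v)

# Stand-in "prior agent": soft-thresholding as a crude clean-image
# estimator in place of the diffusion model's denoising step.
def prior_agent(v, tau=0.05):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

# Alternate the two agents, DICE-style, until they agree.
x = np.zeros(n)
for _ in range(200):
    x = prior_agent(data_agent(x))

# The balance point improves measurement consistency over the zero image.
assert np.linalg.norm(A @ x - y) < np.linalg.norm(y)
```

In the actual framework the prior agent is a pretrained diffusion model queried at each sampling step, so the equilibrium is struck between generative plausibility and data fidelity rather than sparsity and data fidelity.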
Imaging inverse problems aim to reconstruct an underlying image from undersampled, coded, and noisy observations. Among the wide range of reconstruction frameworks, unrolling algorithms are among the most popular due to their synergistic integration of traditional model-based reconstruction methods and modern neural networks, providing interpretable and highly accurate reconstructions. However, when the sensing operator is highly ill-posed, gradient steps on the data-fidelity term can hinder convergence and degrade reconstruction quality. To address this issue, we propose UTOPY, a homotopy continuation formulation for training the unrolling algorithm. The method starts the unrolling network's optimization with a well-posed (synthetic) sensing matrix and defines a continuation path to transition smoothly from the synthetic fidelity term to the desired ill-posed problem. This strategy enables the network to progressively move from a simpler, well-posed inverse problem to the more challenging target scenario. We theoretically show that, for projected gradient descent-like unrolling models, the proposed continuation strategy generates a smooth path of unrolling solutions. Experiments on compressive sensing and image deblurring demonstrate that our method consistently surpasses conventional unrolled training, achieving up to 2.5 dB PSNR improvement in reconstruction performance. Source code at
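One plausible reading of the continuation path is a convex interpolation between a well-conditioned surrogate and the ill-posed target operator; the linear schedule and matrix construction below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 32, 64
A_target = rng.standard_normal((m, n))
A_target[m // 2:] *= 1e-4                  # nearly dependent rows: ill-conditioned
A_synth = rng.standard_normal((m, n))      # well-conditioned surrogate

def A_path(t):
    """Continuation path from the synthetic to the target fidelity, t in [0, 1]."""
    return (1.0 - t) * A_synth + t * A_target

# A simple schedule: training epoch e uses A_path(t_e), with t_e ramping
# from 0 to 1 (linear here; the paper's schedule may differ).
epochs = 5
ts = np.linspace(0.0, 1.0, epochs)
conds = [np.linalg.cond(A_path(t)) for t in ts]
```

Because `A_path(t)` is affine in `t`, the fidelity term changes smoothly along the schedule, which is what lets the unrolled network track a smooth path of solutions instead of confronting the ill-posed operator from the first gradient step.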
Passive hyperspectral longwave infrared measurements are remarkably informative about the surroundings. Remote object material and temperature determine the spectrum of thermal radiance, and range, air temperature, and gas concentrations determine how this spectrum is modified by propagation to the sensor. We introduce a passive range imaging method based on computationally separating these phenomena. Previous methods assume hot and highly emitting objects; ranging is more challenging when objects' temperatures do not deviate greatly from air temperature. Our method jointly estimates range and intrinsic object properties, with explicit consideration of air emission, though reflected light is assumed negligible. The underdetermined nature of the inversion is mitigated by using a parametric model of atmospheric absorption and by regularizing for smooth emissivity estimates. To assess where our estimate is likely accurate, we introduce a technique to detect which scene pixels are significantly influenced by reflected downwelling. Monte Carlo simulations demonstrate the importance of regularization, temperature differentials, and availability of many spectral bands. We apply our method to longwave infrared (8–13 μm) hyperspectral image data acquired from natural scenes with no active illumination. Range features from 15 m to 150 m are recovered, with good qualitative match to lidar data for pixels classified as having negligible reflected downwelling.
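The physics being inverted can be sketched with a single-band, single-gas Beer–Lambert model: attenuated object emission plus air emission accumulated along the path. This is a drastic simplification of the paper's parametric atmospheric model (one absorption coefficient, one wavelength), intended only to show the range cue:

```python
import numpy as np

def planck(T, lam):
    """Blackbody spectral radiance (W·sr^-1·m^-3) at temperature T (K),
    wavelength lam (m)."""
    h, c, k = 6.626e-34, 2.998e8, 1.381e-23
    return (2 * h * c**2 / lam**5) / np.expm1(h * c / (lam * k * T))

def sensor_radiance(eps, T_obj, T_air, alpha, d, lam):
    """Simplified single-band model: object emission attenuated over range d,
    plus air emission along the path (reflected light neglected, as in the
    paper's assumptions). alpha is an assumed absorption coefficient (1/m)."""
    tau = np.exp(-alpha * d)                    # path transmittance
    return eps * planck(T_obj, lam) * tau + planck(T_air, lam) * (1 - tau)

lam = 10e-6                                     # 10 μm, inside the 8–13 μm band
L_near = sensor_radiance(0.95, 300.0, 290.0, 0.02, 15.0, lam)
L_far  = sensor_radiance(0.95, 300.0, 290.0, 0.02, 150.0, lam)

# Longer paths pull the measured radiance toward the air's blackbody curve;
# this drift with range is the cue the method inverts.
assert abs(L_far - planck(290.0, lam)) < abs(L_near - planck(290.0, lam))
```

With a mere 10 K object/air temperature differential the near/far radiance gap is small, which is exactly why the paper needs many spectral bands and regularization to make the inversion well behaved.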
A framework called Generalized Recorrupted-to-Recorrupted (GR2R) extends self-supervised image denoising to handle diverse noise distributions, including non-Gaussian additive noise and the Natural Exponential Family. The method achieves denoising performance comparable to fully supervised approaches across various real-world noise types, such as Poisson and Gamma, without requiring clean training data.
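For the additive Gaussian case (which GR2R generalizes beyond), the recorrupted-to-recorrupted construction produces two noisy views of the same signal with statistically independent noise, so one can serve as the training target for the other. A numpy sketch of that pair construction (signal, noise level, and α are arbitrary choices here):

```python
import numpy as np

rng = np.random.default_rng(4)
sigma, alpha = 0.1, 0.5

x = rng.random(1000)                           # (unseen) clean signal
y = x + sigma * rng.standard_normal(1000)      # the only data we observe

# Recorrupted pair: y1 = y + alpha*z and y2 = y - z/alpha, z ~ N(0, sigma^2 I).
# Their noises are uncorrelated, so training a network to map y1 -> y2
# teaches it to denoise without ever seeing clean images.
z = sigma * rng.standard_normal(1000)
y1 = y + alpha * z
y2 = y - z / alpha

# Statistical sanity checks on the construction.
noise1, noise2 = y1 - x, y2 - x
assert abs(np.mean(noise1 * noise2)) < 5e-3                    # ~uncorrelated
assert abs(np.var(noise1) - sigma**2 * (1 + alpha**2)) < 5e-3  # inflated noise
```

GR2R's contribution is extending this pairing trick from Gaussian noise to the natural exponential family (Poisson, Gamma, ...), where the recorruption rule must respect the noise distribution rather than simply adding Gaussian perturbations.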
Deep-learning (DL)-based image deconvolution (ID) has exhibited remarkable recovery performance, surpassing traditional linear methods. However, unlike traditional ID approaches that rely on analytical properties of the point spread function (PSF) to achieve high recovery performance - such as specific spectrum properties or small condition numbers in the convolution matrix - DL techniques lack quantifiable metrics for evaluating PSF suitability for DL-assisted recovery. Aiming to enhance deconvolution quality, we propose a metric that employs a non-linear approach to learn the invertibility of an arbitrary PSF using a neural network by mapping it to a unit impulse. A lower discrepancy between the mapped PSF and a unit impulse indicates a higher likelihood of successful inversion by a DL network. Our findings reveal that this metric correlates with high recovery performance in both DL and traditional methods, thereby serving as an effective regularizer in deconvolution tasks. This approach has lower computational complexity than conventional condition-number assessments and is fully differentiable. These properties allow its application in designing diffractive optical elements through end-to-end (E2E) optimization, achieving invertible PSFs and outperforming the E2E baseline framework.
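A linear analogue of the "map the PSF to a unit impulse" idea can be written in closed form: fit a regularized inverse filter in the Fourier domain and score the PSF by how far the result falls from an impulse. The paper learns this mapping with a neural network; the Wiener-style proxy below is only an illustration of the metric's logic:

```python
import numpy as np

def invertibility_score(psf, eps=1e-3):
    """Residual of the best regularized linear inverse filter k with
    k * psf ≈ delta. Lower should indicate an easier-to-invert PSF.
    (Illustrative linear proxy; the paper uses a learned non-linear map.)"""
    H = np.fft.fft2(psf)
    K = np.conj(H) / (np.abs(H)**2 + eps)        # regularized inverse filter
    recon = np.real(np.fft.ifft2(K * H))         # approximation of the impulse
    delta = np.zeros_like(psf)
    delta[0, 0] = 1.0
    return float(np.linalg.norm(recon - delta))

n = 32
delta_psf = np.zeros((n, n))
delta_psf[0, 0] = 1.0                            # trivially invertible PSF

xx, yy = np.meshgrid(np.arange(n), np.arange(n))
blur = np.exp(-((xx - n // 2)**2 + (yy - n // 2)**2) / 20.0)
blur /= blur.sum()                               # wide Gaussian blur

# The impulse PSF scores (much) better than the heavy blur.
assert invertibility_score(delta_psf) < invertibility_score(blur)
```

Because the score is a smooth function of the PSF, a learned version of it can sit inside an E2E optical-design loss, which is the differentiability property the abstract highlights over condition-number computations.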
Supervised learning techniques have proven their efficacy in many applications with abundant data. However, applying these methods to medical imaging is challenging due to the scarcity of data, given the high acquisition costs and intricate characteristics of those images, thereby limiting the full potential of deep neural networks. To address the lack of data, augmentation techniques leverage geometry, color, and the synthesis ability of generative models (GMs). Despite previous efforts, gaps in the generation process limit the impact of data augmentation on understanding medical images; e.g., the highly structured nature of some domains, such as X-ray images, is ignored. Current GMs rely solely on the network's capacity to blindly synthesize augmentations that preserve the semantic relationships of chest X-ray images, such as anatomical restrictions, representative structures, or structural similarities consistent across datasets. In this paper, we introduce a novel GM that leverages the structural resemblance of medical images by learning a latent graph representation (LGR). We design an end-to-end model to learn (i) an LGR that captures the intrinsic structure of X-ray images and (ii) a graph convolutional network (GCN) that reconstructs the X-ray image from the LGR. We employ adversarial training to guide the generator and discriminator models in learning the distribution of the learned LGR. Using the learned GCN, our approach generates structure-preserving synthetic images by mapping generated LGRs to X-rays. Additionally, we evaluate the learned graph representation on other tasks, such as X-ray image classification and segmentation. Numerical experiments demonstrate the efficacy of our approach, improving performance by up to 3% and 2% for classification and segmentation, respectively.
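The GCN decoder rests on the standard graph-convolution propagation rule. A minimal numpy sketch of one such layer (Kipf–Welling-style normalization; the paper's exact generator architecture is not specified in the abstract, so treat this as a generic building block):

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph-convolution step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W).
    A generator mapping latent graph representations to images would stack
    layers like this before a final image-space projection."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(8)
n_nodes, f_in, f_out = 6, 4, 3
A = (rng.random((n_nodes, n_nodes)) > 0.5).astype(float)
A = np.maximum(A, A.T)                           # symmetric adjacency
H = rng.standard_normal((n_nodes, f_in))         # node features (the LGR)
W = rng.standard_normal((f_in, f_out))           # learnable layer weights

H_next = gcn_layer(H, A, W)
```

The normalized adjacency is what lets anatomical structure (encoded as graph connectivity) constrain every synthesized feature, rather than leaving structure preservation to the network's capacity alone.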
MambaStyle introduces an efficient StyleGAN inversion framework that leverages Vision State-Space Models (VSSMs) to achieve superior reconstruction fidelity and editability for real images. The method significantly reduces computational costs and inference time, delivering high-quality results for both face and car image domains.

Accurate determination of the geothermal gradient is critical for assessing the geothermal energy potential of a given region. Of particular interest is the case of Colombia, a country with abundant geothermal resources. A history of active oil and gas exploration and production has left drilled boreholes in different geological settings, providing direct measurements of the geothermal gradient. Unfortunately, large regions of the country where geothermal resources might exist lack such measurements. Indirect geophysical measurements are costly and difficult to perform at regional scales. Computational thermal models could be constructed, but they require very detailed knowledge of the underlying geology and uniform sampling of subsurface temperatures to be well-constrained. We present an alternative approach that leverages recent advances in supervised machine learning and available direct measurements to predict the geothermal gradient in regions where only global-scale geophysical datasets and coarse geological knowledge are available. We find that a Gradient Boosted Regression Tree algorithm yields optimal predictions and extensively validate the trained model. We show that predictions of our model are accurate to within 12% and that independent measurements performed by other authors agree well with our model. Finally, we present a geothermal gradient map for Colombia that highlights regions where further exploration and data collection should be performed.
We present new frontiers in the modelling of the spectral energy distributions (SED) of active galaxies by introducing the radio-to-X-ray fitting capabilities of the publicly available Bayesian code AGNfitter. The new code release, called AGNfitter-rx, models the broad-band photometry covering the radio, infrared (IR), optical, ultraviolet (UV) and X-ray bands consistently, using a combination of theoretical and semi-empirical models of the AGN and host galaxy emission. This framework enables the detailed characterization of four physical components of the active nuclei: the accretion disk, the hot dusty torus, the relativistic jets/core radio emission, and the hot corona; alongside modeling three components within the host galaxy: stellar populations, cold dust, and the radio emission from the star-forming regions. Applying AGNfitter-rx to a diverse sample of 36 AGN SEDs at z < 0.7 from the AGN SED ATLAS, we investigate and compare the performance of state-of-the-art torus and accretion disk emission models on fit quality and inferred physical parameters. We find that clumpy torus models that include polar winds and semi-empirical accretion disk templates including emission line features significantly increase the fit quality in 67% of the sources, by effectively reducing fit residuals by 2σ in the 1.5–5 μm and 0.7 μm regimes. We demonstrate that, by applying AGNfitter-rx to photometric data, we are able to estimate inclination and opening angles of the torus, consistent with spectroscopic classifications within the AGN unified model, as well as black hole mass estimates in agreement with virial estimates based on Hα. The wavelength coverage and the flexibility for the inclusion of state-of-the-art theoretical models make AGNfitter-rx a unique tool for the further development of SED modelling for AGNs in present and future radio-to-X-ray galaxy surveys.
Spanning applications from photography to depth and spectral estimation, diverse computational imaging (CI) applications benefit from the versatile modulation of coded apertures (CAs). Light wave fields in space, time, or spectrum can be modulated to obtain projected, encoded information at the sensor, which is then decoded by efficient methods such as modern deep learning decoders. Although a CA can be fabricated to produce analog modulation, a binary CA is usually preferred since it offers easier calibration, higher speed, and lower storage. As the performance of the decoder mainly depends on the structure of the CA, several works optimize CA ensembles by customizing regularizers for a particular application without considering critical physical constraints of the CAs. This work presents an end-to-end (E2E) deep learning-based optimization of CAs for CI tasks. The CA design method aims to cover a wide range of CI problems by simply changing the loss function of the deep approach. The designed loss function includes regularizers to fulfill the widely used sensing requirements of CI applications. Specifically, the regularizers can be selected to optimize the transmittance, the compression ratio, and the correlation between measurements, while a binary CA solution is encouraged and the performance of the CI task is maximized in applications such as restoration, classification, and semantic segmentation.
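The kinds of physically motivated penalties described (binarization, transmittance, low inter-measurement correlation) are easy to sketch on a relaxed, real-valued aperture. The exact formulations below are plausible textbook choices, not necessarily the paper's:

```python
import numpy as np

rng = np.random.default_rng(5)
W = rng.random((8, 8))   # relaxed (real-valued) coded aperture being optimized

def ca_regularizers(W, target_transmittance=0.5):
    """Sketches of CA design penalties to be summed into the E2E loss
    (illustrative forms; the paper's exact regularizers may differ)."""
    # Binarization: zero iff every entry is exactly 0 or 1.
    r_binary = float(np.sum((W**2) * ((1 - W)**2)))
    # Transmittance: mean openness close to a target fraction.
    r_trans = float((W.mean() - target_transmittance)**2)
    # Correlation: discourage redundant rows (proxy for correlated snapshots).
    G = W @ W.T
    r_corr = float(np.sum((G - np.diag(np.diag(G)))**2))
    return r_binary, r_trans, r_corr

r_b, r_t, r_c = ca_regularizers(W)
# A perfectly binary CA pays no binarization penalty.
r_b_bin, _, _ = ca_regularizers((W > 0.5).astype(float))
assert r_b_bin == 0.0 and r_b > 0.0
```

Because each term is differentiable in `W`, they can be weighted and added to the task loss and optimized jointly with the decoder, which is the essence of the E2E design the abstract describes.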
SDO-FM is a foundation model using data from NASA's Solar Dynamics Observatory (SDO) spacecraft; integrating three separate instruments to encapsulate the Sun's complex physical interactions into a multi-modal embedding space. This model can be used to streamline scientific investigations involving SDO by making the enormous datasets more computationally accessible for heliophysics research and enable investigations that require instrument fusion. We discuss four key components: an ingestion pipeline to create machine learning ready datasets, the model architecture and training approach, resultant embeddings and fine-tunable models, and finally downstream fine-tuned applications. A key component of this effort has been to include subject matter specialists at each stage of development; reviewing the scientific value and providing guidance for model architecture, dataset, and training paradigm decisions. This paper marks release of our pretrained models and embedding datasets, available to the community on Hugging Face and this http URL.
This paper proposes a non-data-driven deep neural network for spectral image recovery problems such as denoising, single hyperspectral image super-resolution, and compressive spectral imaging reconstruction. Unlike previous methods, the proposed approach, dubbed Mixture-Net, implicitly learns the prior information through the network. Mixture-Net consists of a deep generative model whose layers are inspired by linear and non-linear low-rank mixture models, where the recovered image is composed of a weighted sum of the linear and non-linear decompositions. Mixture-Net also provides a low-rank decomposition that can be interpreted as the spectral image's abundances and endmembers, useful for remote sensing tasks without running additional routines. The experiments show Mixture-Net's effectiveness, outperforming state-of-the-art methods in recovery quality with the added advantage of architecture interpretability.
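The weighted linear/non-linear mixture at the output can be sketched as follows; the `tanh` non-linearity and the fixed weight stand in for components Mixture-Net learns, so this is only a structural illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
n_pixels, n_bands, rank = 200, 31, 4

# Low-rank factors: per-pixel abundances and per-material endmembers.
Ab = rng.random((n_pixels, rank))
Ab /= Ab.sum(axis=1, keepdims=True)              # abundances sum to one
E = np.abs(rng.standard_normal((rank, n_bands))) # endmember spectra

linear = Ab @ E                      # classic linear mixture model
nonlinear = np.tanh(linear)          # stand-in for the learned non-linear term
w = 0.8                              # mixing weight (learned in Mixture-Net)
recovered = w * linear + (1 - w) * nonlinear
```

Reading `Ab` and `E` off the trained model is what gives the "free" abundances/endmembers the abstract mentions: the unmixing falls out of the architecture rather than a separate post-processing routine.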
The Pierre Auger Observatory presents the most comprehensive measurement of the Ultra-High Energy Cosmic Ray (UHECR) energy spectrum, combining different detection methods to cover declinations from −90° to +44.8°. This study confirms the "instep" feature at 10 EeV with 5.5σ significance and shows that the UHECR energy spectrum is consistent across different sky regions.
Colorectal cancer is the third most aggressive cancer worldwide. Polyps, the main biomarker of the disease, are detected, localized, and characterized through colonoscopy procedures. Nonetheless, during the examination, up to 25% of polyps are missed because of challenging conditions (camera movements, lighting changes) and the close similarity between polyps and intestinal folds. Moreover, observing and detecting abnormal regions along the intestinal tract remains markedly subjective and expert-dependent. Publicly available polyp datasets have enabled significant advances in computational strategies dedicated to characterizing non-parametric polyp shapes, achieving remarkable scores of up to 90% in segmentation tasks. Nonetheless, these strategies operate on cropped, expert-selected frames that always contain polyps. Consequently, these computational approaches are far from clinical scenarios and real applications, where colonoscopies are dominated by intestinal background with high textural variability. In fact, polyps typically represent less than 1% of total observations in a complete colonoscopy record. This work introduces COLON: the largest COlonoscopy LONg sequence dataset, with around 30 thousand polyp-labeled frames and 400 thousand background frames. The dataset was collected from 30 complete colonoscopies with polyps at different stages, variations in preparation procedures, and, in some cases, the observation of surgical instrumentation. Additionally, 10 full intestinal-background control colonoscopy videos were integrated to achieve robust polyp-background frame differentiation. The COLON dataset is open to the scientific community to bring new scenarios for computational tools dedicated to polyp detection and segmentation over long sequences, closer to real colonoscopy scenarios.
The detection of clinically significant prostate cancer lesions (csPCa) from biparametric magnetic resonance imaging (bp-MRI) has emerged as a noninvasive imaging technique for improving accurate diagnosis. Nevertheless, the analysis of such images remains highly dependent on subjective expert interpretation. Deep learning approaches have been proposed for csPCa lesion detection and segmentation, but they remain limited due to their reliance on extensively annotated datasets. Moreover, the high lesion variability across prostate zones poses additional challenges, even for expert radiologists. This work introduces a second-order geometric attention (SOGA) mechanism that guides a dedicated segmentation network, through skip connections, to detect csPCa lesions. The proposed attention is modeled on the Riemannian manifold, learning from symmetric positive definite (SPD) representations. The mechanism was integrated into standard U-Net and nnU-Net backbones and validated on the publicly available PI-CAI dataset, achieving an Average Precision (AP) of 0.37 and an Area Under the ROC Curve (AUC-ROC) of 0.83, outperforming baseline networks and attention-based methods. Furthermore, the approach was evaluated on the Prostate158 dataset as an independent test cohort, achieving an AP of 0.37 and an AUC-ROC of 0.75, confirming robust generalization and suggesting discriminative learned representations.
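SPD representations typically arise from second-order (covariance) pooling of feature maps, with the matrix logarithm mapping the manifold to a flat tangent space. The sketch below shows that generic construction; SOGA's attention mechanism built on top of such representations is more elaborate than this:

```python
import numpy as np

def second_order_descriptor(feats, eps=1e-5):
    """Covariance pooling of a feature map plus a log-Euclidean mapping --
    one common way to obtain and linearize SPD representations."""
    feats = feats - feats.mean(axis=0, keepdims=True)
    cov = feats.T @ feats / feats.shape[0] + eps * np.eye(feats.shape[1])
    w, V = np.linalg.eigh(cov)                 # SPD: real, positive spectrum
    log_cov = (V * np.log(w)) @ V.T            # matrix logarithm via eigh
    return cov, log_cov

rng = np.random.default_rng(7)
feats = rng.standard_normal((256, 8))          # 256 spatial positions, 8 channels
cov, log_cov = second_order_descriptor(feats)
```

Working in the tangent space via `log_cov` is what lets ordinary (Euclidean) layers consume geometry-aware second-order statistics, which is the usual motivation for Riemannian-manifold attention blocks.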