alphaXiv

Institut de Robòtica i Informàtica Industrial

28 Jul 2020

ai-for-health computer-science computer-vision-security

Large Scale Image Segmentation with Structured Loss based Deep Learning for Connectome Reconstruction

ETH Zurich, University of Zurich, Institut de Robòtica i Informàtica Industrial, UPC/CSIC Barcelona, Janelia Research Campus, Howard Hughes Medical Institute

We present a method combining affinity prediction with region agglomeration, which significantly improves upon the state of the art of neuron segmentation from electron microscopy (EM) in accuracy and scalability. Our method consists of a 3D U-Net, trained to predict affinities between voxels, followed by iterative region agglomeration. We train using a structured loss based on MALIS, encouraging topologically correct segmentations obtained from affinity thresholding. Our extension consists of two parts: First, we present a quasi-linear method to compute the loss gradient, improving over the original quadratic algorithm. Second, we compute the gradient in two separate passes to avoid spurious gradient contributions in early training stages. Our predictions are accurate enough that simple learning-free percentile-based agglomeration outperforms more involved methods used earlier on inferior predictions. We present results on three diverse EM datasets, achieving relative improvements over previous results of 27%, 15%, and 250%. Our findings suggest that a single method can be applied to both nearly isotropic block-face EM data and anisotropic serial-section EM data. The runtime of our method scales linearly with the size of the volume and achieves a throughput of about 2.6 seconds per megavoxel, qualifying our method for the processing of very large datasets.
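As a rough illustration of what "segmentations obtained from affinity thresholding" means, the sketch below merges neighboring voxels whose affinity exceeds a threshold, using a union-find structure. It is not the MALIS loss and not the authors' agglomeration step; the function names and the convention that `affinities[axis][idx]` links voxel `idx` to its predecessor along `axis` are assumptions made for this example.

```python
import numpy as np

def find(parent, i):
    # Union-find lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def segment_by_threshold(affinities, threshold):
    """Label connected components of the graph whose edges have affinity > threshold.

    affinities: array of shape (ndim, *volume_shape). By assumption,
    affinities[axis][idx] is the affinity between voxel idx and the voxel
    one step before it along `axis`.
    """
    shape = affinities.shape[1:]
    n = int(np.prod(shape))
    parent = np.arange(n)
    ids = np.arange(n).reshape(shape)
    for axis in range(len(shape)):
        later = np.arange(1, shape[axis])
        cur = np.take(ids, later, axis=axis).ravel()
        prev = np.take(ids, later - 1, axis=axis).ravel()
        aff = np.take(affinities[axis], later, axis=axis).ravel()
        for a, b, w in zip(cur, prev, aff):
            if w > threshold:
                ra, rb = find(parent, a), find(parent, b)
                if ra != rb:
                    # Keep the smaller voxel id as the component root.
                    parent[max(ra, rb)] = min(ra, rb)
    labels = np.array([find(parent, i) for i in range(n)])
    return labels.reshape(shape)

# Tiny example: a 1x4 "volume" with a weak link between voxels 1 and 2.
example_affinities = np.zeros((2, 1, 4))
example_affinities[1, 0] = [0.0, 0.9, 0.1, 0.8]
example_labels = segment_by_threshold(example_affinities, threshold=0.5)
```

In this toy case the weak affinity (0.1) splits the row into two segments. A single wrong affinity can merge or split whole regions, which is why a loss that targets the thresholded segmentation is attractive.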

04 Apr 2022

computer-science computer-vision-and-pattern-recognition generative-models

LISA: Learning Implicit Shape and Appearance of Hands

Meta, Institut de Robòtica i Informàtica Industrial, CSIC-UPC

This paper proposes a do-it-all neural model of human hands, named LISA. The model can capture accurate hand shape and appearance, generalize to arbitrary hand subjects, provide dense surface correspondences, be reconstructed from images in the wild, and be easily animated. We train LISA by minimizing the shape and appearance losses on a large set of multi-view RGB image sequences annotated with coarse 3D poses of the hand skeleton. For a 3D point in the hand local coordinate system, our model predicts the color and the signed distance with respect to each hand bone independently, and then combines the per-bone predictions using predicted skinning weights. The shape, color and pose representations are disentangled by design, so that selected parameters can be estimated or animated on their own. We experimentally demonstrate that LISA can accurately reconstruct a dynamic hand from monocular or multi-view sequences, achieving a noticeably higher quality of reconstructed hand shapes compared to baseline approaches.
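The per-bone combination step can be pictured with a small NumPy sketch. The abstract only states that per-bone predictions are combined using predicted skinning weights; the weighted sum, the softmax normalization, and the name `combine_per_bone` are assumptions made here for illustration, not the published formulation.

```python
import numpy as np

def combine_per_bone(sdf_per_bone, rgb_per_bone, skin_logits):
    """Blend per-bone predictions for a batch of 3D query points.

    sdf_per_bone: (num_points, num_bones) signed distances, one per bone.
    rgb_per_bone: (num_points, num_bones, 3) colors, one per bone.
    skin_logits:  (num_points, num_bones) unnormalized skinning scores.
    """
    # Softmax over bones so the weights are positive and sum to one (assumption).
    shifted = skin_logits - skin_logits.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=-1, keepdims=True)
    sdf = (weights * sdf_per_bone).sum(axis=-1)
    rgb = (weights[..., None] * rgb_per_bone).sum(axis=-2)
    return sdf, rgb, weights

# One query point, two bones with equal skinning scores.
sdf, rgb, weights = combine_per_bone(
    np.array([[0.2, 0.4]]),
    np.array([[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]]),
    np.array([[0.0, 0.0]]),
)
```

With equal scores, the blended signed distance is the mean of the two per-bone values and the color is an even mix of the two per-bone colors.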

26 Jun 2023

computer-science computer-vision-and-pattern-recognition machine-learning

Robust Wind Turbine Blade Segmentation from RGB Images in the Wild

Institut de Robòtica i Informàtica Industrial, Wind Power LAB

With the relentless growth of the wind industry, there is a pressing need to design automatic data-driven solutions for wind turbine maintenance. As structural health monitoring mainly relies on visual inspections, the first stage in any automatic solution is to identify the blade region in the image. We therefore propose a novel segmentation algorithm that strengthens the U-Net results with a tailored loss, which combines the focal loss with a contiguity regularization term. To attain top-performing results, a set of additional steps is proposed to ensure a reliable, generic, robust and efficient algorithm. First, we leverage our prior knowledge of the images by filling the holes enclosed by provisionally classified blade pixels and by the image boundaries. Subsequently, misclassified pixels are corrected by a random forest trained on the fly. Our algorithm demonstrates its effectiveness by reaching an accuracy of 97.39%.
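A loss of this shape can be sketched as follows. The focal loss follows its standard binary definition, but the abstract does not specify the contiguity regularizer, so the total-variation-style penalty, the `weight` parameter, and all function names below are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def focal_loss(prob, target, gamma=2.0, eps=1e-7):
    # Standard binary focal loss: down-weights pixels that are already well classified.
    prob = np.clip(prob, eps, 1.0 - eps)
    p_t = np.where(target == 1, prob, 1.0 - prob)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))

def contiguity_penalty(prob):
    # Hypothetical regularizer: mean absolute difference between neighboring pixels,
    # which is small when the predicted mask is spatially smooth.
    # Expects a 2D map with at least two rows and two columns.
    dy = np.abs(np.diff(prob, axis=0)).mean()
    dx = np.abs(np.diff(prob, axis=1)).mean()
    return float(dx + dy)

def total_loss(prob, target, weight=0.1, gamma=2.0):
    return focal_loss(prob, target, gamma) + weight * contiguity_penalty(prob)

# A 4x4 image that is entirely blade, with a confident and a poor prediction.
target = np.ones((4, 4))
good = np.full((4, 4), 0.9)
bad = np.full((4, 4), 0.1)
noisy = good.copy()
noisy[::2, ::2] = 0.1  # isolated low-probability pixels break contiguity
```

Here `total_loss(good, target)` is lower than both `total_loss(bad, target)` and `total_loss(noisy, target)`; in the noisy case the contiguity term adds to the focal term.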

10 Jun 2024

computer-science artificial-intelligence computer-vision-and-pattern-recognition

Generalized Nested Latent Variable Models for Lossy Coding applied to Wind Turbine Scenarios

UPC, CSIC, Institut de Robòtica i Informàtica Industrial, Wind Power LAB, Copenhagen

Rate-distortion optimization through neural networks has achieved competitive results in compression efficiency and image quality. This learning-based approach seeks the best trade-off between compression rate and reconstructed image quality by automatically extracting and retaining crucial information while discarding less critical details. A successful technique consists of introducing a deep hyperprior that operates within a 2-level nested latent variable model, enhancing compression by capturing complex data dependencies. This paper extends this concept by designing a generalized L-level nested generative model with a Markov chain structure. We demonstrate that, as L increases, a trainable prior is detrimental, and we explore a common dimensionality across the distinct latent variables to boost compression performance. As this structured framework can represent autoregressive coders, we outperform the hyperprior model and achieve state-of-the-art performance while substantially reducing the computational cost. Our experimental evaluation is performed on wind turbine scenarios to study its application to visual inspections.
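The rate-distortion trade-off with several latent levels can be illustrated with a small numeric sketch. It assumes the common objective "rate plus lambda times distortion", with the rate summed over all latent levels and measured in bits per pixel, and mean squared error as distortion; the function names, the `lam` parameter, and the toy likelihood values are hypothetical and not taken from the paper.

```python
import numpy as np

def rate_bits(likelihoods):
    # Ideal code length in bits for symbols with the given model likelihoods.
    return float(-np.sum(np.log2(likelihoods)))

def rd_loss(x, x_hat, likelihoods_per_level, lam=0.01):
    """Rate-distortion objective for an L-level latent model (sketch).

    likelihoods_per_level: list of L arrays, one per latent level, holding the
    likelihood the entropy model assigns to each coded latent symbol.
    """
    bits_per_pixel = sum(rate_bits(p) for p in likelihoods_per_level) / x.size
    distortion = float(np.mean((x - x_hat) ** 2))
    return bits_per_pixel + lam * distortion

# Toy case: a 2x2 image, perfectly reconstructed, with two latent levels.
x = np.array([[0.0, 1.0], [1.0, 0.0]])
levels = [np.array([0.5, 0.5]), np.array([0.25])]
loss = rd_loss(x, x.copy(), levels)
```

The two levels cost 2 bits each, so the toy image is coded at 1 bit per pixel with zero distortion. Adding a level only pays off if the bits it costs are recovered by sharper likelihoods at the levels below it.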
