Muroran Institute of Technology
The field of computer vision applied to videos of minimally invasive surgery is ever-growing. Workflow recognition pertains to the automated recognition of various aspects of a surgery: including which surgical steps are performed; and which surgical instruments are used. This information can later be used to assist clinicians when learning the surgery; during live surgery; and when writing operation notes. The Pituitary Vision (PitVis) 2023 Challenge tasks the community to step and instrument recognition in videos of endoscopic pituitary surgery. This is a unique task when compared to other minimally invasive surgeries due to the smaller working space, which limits and distorts vision; and higher frequency of instrument and step switching, which requires more precise model predictions. Participants were provided with 25-videos, with results presented at the MICCAI-2023 conference as part of the Endoscopic Vision 2023 Challenge in Vancouver, Canada, on 08-Oct-2023. There were 18-submissions from 9-teams across 6-countries, using a variety of deep learning models. A commonality between the top performing models was incorporating spatio-temporal and multi-task methods, with greater than 50% and 10% macro-F1-score improvement over purely spacial single-task models in step and instrument recognition respectively. The PitVis-2023 Challenge therefore demonstrates state-of-the-art computer vision models in minimally invasive surgery are transferable to a new dataset, with surgery specific techniques used to enhance performance, progressing the field further. Benchmark results are provided in the paper, and the dataset is publicly available at: this https URL.
We propose a novel medical image classification method that integrates dual-model weight selection with self-knowledge distillation (SKD). In real-world medical settings, deploying large-scale models is often limited by computational resource constraints, which pose significant challenges for their practical implementation. Thus, developing lightweight models that achieve comparable performance to large-scale models while maintaining computational efficiency is crucial. To address this, we employ a dual-model weight selection strategy that initializes two lightweight models with weights derived from a large pretrained model, enabling effective knowledge transfer. Next, SKD is applied to these selected models, allowing the use of a broad range of initial weight configurations without imposing additional excessive computational cost, followed by fine-tuning for the target classification tasks. By combining dual-model weight selection with self-knowledge distillation, our method overcomes the limitations of conventional approaches, which often fail to retain critical information in compact models. Extensive experiments on publicly available datasets-chest X-ray images, lung computed tomography scans, and brain magnetic resonance imaging scans-demonstrate the superior performance and robustness of our approach compared to existing methods.
Reliable recognition and localization of surgical instruments in endoscopic video recordings are foundational for a wide range of applications in computer- and robot-assisted minimally invasive surgery (RAMIS), including surgical training, skill assessment, and autonomous assistance. However, robust performance under real-world conditions remains a significant challenge. Incorporating surgical context - such as the current procedural phase - has emerged as a promising strategy to improve robustness and interpretability. To address these challenges, we organized the Surgical Procedure Phase, Keypoint, and Instrument Recognition (PhaKIR) sub-challenge as part of the Endoscopic Vision (EndoVis) challenge at MICCAI 2024. We introduced a novel, multi-center dataset comprising thirteen full-length laparoscopic cholecystectomy videos collected from three distinct medical institutions, with unified annotations for three interrelated tasks: surgical phase recognition, instrument keypoint estimation, and instrument instance segmentation. Unlike existing datasets, ours enables joint investigation of instrument localization and procedural context within the same data while supporting the integration of temporal information across entire procedures. We report results and findings in accordance with the BIAS guidelines for biomedical image analysis challenges. The PhaKIR sub-challenge advances the field by providing a unique benchmark for developing temporally aware, context-driven methods in RAMIS and offers a high-quality resource to support future research in surgical scene understanding.
This paper considers an active reconfigurable intelligent surface (RIS)-aided integrated sensing and communication (ISAC) system. We aim to maximize radar signal-to-interference-plus-noise-ratio (SINR) by jointly optimizing the beamforming matrix at the dual-function radar-communication (DFRC) base station (BS) and the reflecting coefficients at the active RIS subject to the quality of service (QoS) constraints of communication users (UE) and the transmit power constraints of active RIS and DFRC BS. To tackle the optimization problem, the majorization-minimization (MM) algorithm is applied to address the nonconvex radar SINR objective function, and the resulting quartic problem is solved by developing an semidefinite relaxation (SDR)-based approach. Moreover, we derive the scaling order of the radar SINR with a large number of reflecting elements. Next, the transmit power allocation problem and the deployment strategy of the active RIS are studied with a moderate number of reflecting elements. Finally, we validate the potential of the active RIS in ISAC systems compared to passive RIS. Additionally, we deliberate on several open problems that remain for future research.
Radiation therapy plays a crucial role in cancer treatment, necessitating precise delivery of radiation to tumors while sparing healthy tissues over multiple days. Computed tomography (CT) is integral for treatment planning, offering electron density data crucial for accurate dose calculations. However, accurately representing patient anatomy is challenging, especially in adaptive radiotherapy, where CT is not acquired daily. Magnetic resonance imaging (MRI) provides superior soft-tissue contrast. Still, it lacks electron density information while cone beam CT (CBCT) lacks direct electron density calibration and is mainly used for patient positioning. Adopting MRI-only or CBCT-based adaptive radiotherapy eliminates the need for CT planning but presents challenges. Synthetic CT (sCT) generation techniques aim to address these challenges by using image synthesis to bridge the gap between MRI, CBCT, and CT. The SynthRAD2023 challenge was organized to compare synthetic CT generation methods using multi-center ground truth data from 1080 patients, divided into two tasks: 1) MRI-to-CT and 2) CBCT-to-CT. The evaluation included image similarity and dose-based metrics from proton and photon plans. The challenge attracted significant participation, with 617 registrations and 22/17 valid submissions for tasks 1/2. Top-performing teams achieved high structural similarity indices (>0.87/0.90) and gamma pass rates for photon (>98.1%/99.0%) and proton (>97.3%/97.0%) plans. However, no significant correlation was found between image similarity metrics and dose accuracy, emphasizing the need for dose evaluation when assessing the clinical applicability of sCT. SynthRAD2023 facilitated the investigation and benchmarking of sCT generation techniques, providing insights for developing MRI-only and CBCT-based adaptive radiotherapy.
Formalizing surgical activities as triplets of the used instruments, actions performed, and target anatomies is becoming a gold standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction which can be used to develop better Artificial Intelligence assistance for image-guided surgery. Earlier efforts and the CholecTriplet challenge introduced in 2021 have put together techniques aimed at recognizing these triplets from surgical footage. Estimating also the spatial locations of the triplets would offer a more precise intraoperative context-aware decision support for computer-assisted intervention. This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. It includes weakly-supervised bounding box localization of every visible surgical instrument (or tool), as the key actors, and the modeling of each tool-activity in the form of triplet. The paper describes a baseline method and 10 new deep learning algorithms presented at the challenge to solve the task. It also provides thorough methodological comparisons of the methods, an in-depth analysis of the obtained results across multiple metrics, visual and procedural challenges; their significance, and useful insights for future research directions and applications in surgery.
Multi-stain whole-slide-image (WSI) registration is an active field of research. It is unclear, however, how the current WSI registration methods would perform on a real-world data set. AutomatiC Registration Of Breast cAncer Tissue (ACROBAT) challenge is held to verify the performance of the current WSI registration methods by using a new dataset that originates from routine diagnostics to assess real-world applicability. In this report, we present our solution for the ACROBAT challenge. We employ a two-step approach including rigid and non-rigid transforms. The experimental results show that the median 90th percentile is 1,250 um for the validation dataset.
With the emergence of fluid antenna (FA) in wireless communications, the capability to dynamically adjust port positions offers substantial benefits in spatial diversity and spectrum efficiency, which are particularly valuable for mobile edge computing (MEC) systems. Therefore, we propose an FA-assisted MEC offloading framework to minimize system delay. This framework faces two severe challenges, which are the complexity of channel estimation due to dynamic port configuration and the inherent non-convexity of the joint optimization problem. Firstly, we propose Information Bottleneck Metric-enhanced Channel Compressed Sensing (IBM-CCS), which advances FA channel estimation by integrating information relevance into the sensing process and capturing key features of FA channels effectively. Secondly, to address the non-convex and high-dimensional optimization problem in FA-assisted MEC systems, which includes FA port selection, beamforming, power control, and resource allocation, we propose a game theory-assisted Hierarchical Twin-Dueling Multi-agent Algorithm (HiTDMA) based offloading scheme, where the hierarchical structure effectively decouples and coordinates the optimization tasks between the user side and the base station side. Crucially, the game theory effectively reduces the dimensionality of power control variables, allowing deep reinforcement learning (DRL) agents to achieve improved optimization efficiency. Numerical results confirm that the proposed scheme significantly reduces system delay and enhances offloading performance, outperforming benchmarks. Additionally, the IBM-CCS channel estimation demonstrates superior accuracy and robustness under varying port densities, contributing to efficient communication under imperfect CSI.
While data is distributed in multiple edge devices, Federated Learning (FL) is attracting more and more attention to collaboratively train a machine learning model without transferring raw data. FL generally exploits a parameter server and a large number of edge devices during the whole process of the model training, while several devices are selected in each round. However, straggler devices may slow down the training process or even make the system crash during training. Meanwhile, other idle edge devices remain unused. As the bandwidth between the devices and the server is relatively low, the communication of intermediate data becomes a bottleneck. In this paper, we propose Time-Efficient Asynchronous federated learning with Sparsification and Quantization, i.e., TEASQ-Fed. TEASQ-Fed can fully exploit edge devices to asynchronously participate in the training process by actively applying for tasks. We utilize control parameters to choose an appropriate number of parallel edge devices, which simultaneously execute the training tasks. In addition, we introduce a caching mechanism and weighted averaging with respect to model staleness to further improve the accuracy. Furthermore, we propose a sparsification and quantitation approach to compress the intermediate data to accelerate the training. The experimental results reveal that TEASQ-Fed improves the accuracy (up to 16.67% higher) while accelerating the convergence of model training (up to twice faster).
·
Robotic assisted (RA) surgery promises to transform surgical intervention. Intuitive Surgical is committed to fostering these changes and the machine learning models and algorithms that will enable them. With these goals in mind we have invited the surgical data science community to participate in a yearly competition hosted through the Medical Imaging Computing and Computer Assisted Interventions (MICCAI) conference. With varying changes from year to year, we have challenged the community to solve difficult machine learning problems in the context of advanced RA applications. Here we document the results of these challenges, focusing on surgical tool localization (SurgToolLoc). The publicly released dataset that accompanies these challenges is detailed in a separate paper arXiv:2501.09209 [1].
Deep learning techniques, particularly convolutional neural networks (CNNs), have gained traction for synthetic computed tomography (sCT) generation from Magnetic resonance imaging (MRI), Cone-beam computed tomography (CBCT) and PET. In this report, we introduce a method to syn-thesize CT from MRI or CBCT. Our method is based on multi-slice (2.5D) CNNs. 2.5D CNNs offer distinct advantages over 3D CNNs when dealing with volumetric data. In the experiments, we evaluate the performance of our method for two tasks, MRI-to-sCT and CBCT-to-sCT generation. Target organs for both tasks are brain and pelvis.
Federated learning (FL) is a promising paradigm that can enable collaborative model training between vehicles while protecting data privacy, thereby significantly improving the performance of intelligent transportation systems (ITSs). In vehicular networks, due to mobility, resource constraints, and the concurrent execution of multiple training tasks, how to allocate limited resources effectively to achieve optimal model training of multiple tasks is an extremely challenging issue. In this paper, we propose a mobility-aware multi-task decentralized federated learning (MMFL) framework for vehicular networks. By this framework, we address task scheduling, subcarrier allocation, and leader selection, as a joint optimization problem, termed as TSLP. For the case with a single FL task, we derive the convergence bound of model training. For general cases, we first model TSLP as a resource allocation game, and prove the existence of a Nash equilibrium (NE). Then, based on this proof, we reformulate the game as a decentralized partially observable Markov decision process (DEC-POMDP), and develop an algorithm based on heterogeneous-agent proximal policy optimization (HAPPO) to solve DEC-POMDP. Finally, numerical results are used to demonstrate the effectiveness of the proposed algorithm.
In our series of papers, we establish infinite concentration and oscillation estimates for supercritical semilinear elliptic equations in discs. Especially, we extend the previous result by the author (N. arXiv:2404.01634) to the general supercritical case. Our growth condition is related to the one introduced by Dupaigne-Farina (J. Eur. Math. Soc. 12: 855--882, 2010) and admits two types of supercritical nonlinearities, the Trudinger-Moser type growth eupe^{u^p} with p>2p>2 and the multiple exponential one exp((exp(um)))\exp{(\cdots(\exp{(u^m)})\cdots)} with m>0m>0. In this first part, we carry out the analysis of infinite concentration phenomena on any blow-up solutions. As a result, we classify all the infinite concentration behaviors into two types which in particular shows a new behavior for the multiple exponential case. More precisely, we detect an infinite sequence of concentrating parts on any blow-up solutions via the scaling and pointwise techniques. The precise description of the limit profile, energy, and position of each concentration is given via the Liouville equation with two types of the energy recurrence formulas. This leads us to observe two types of supercritical behaviors. The behavior for the latter growth is new and can be understood as the limit case of the former one. Our concentration estimates lead to the analysis of infinite oscillation phenomena, including infinite oscillations of bifurcation diagrams, discussed in the second part.
Timely and effective feedback within surgical training plays a critical role in developing the skills required to perform safe and efficient surgery. Feedback from expert surgeons, while especially valuable in this regard, is challenging to acquire due to their typically busy schedules, and may be subject to biases. Formal assessment procedures like OSATS and GEARS attempt to provide objective measures of skill, but remain time-consuming. With advances in machine learning there is an opportunity for fast and objective automated feedback on technical skills. The SimSurgSkill 2021 challenge (hosted as a sub-challenge of EndoVis at MICCAI 2021) aimed to promote and foster work in this endeavor. Using virtual reality (VR) surgical tasks, competitors were tasked with localizing instruments and predicting surgical skill. Here we summarize the winning approaches and how they performed. Using this publicly available dataset and results as a springboard, future work may enable more efficient training of surgeons with advances in surgical data science. The dataset can be accessed from this https URL.
This paper presents the design and results of the "PEg TRAnsfert Workflow recognition" (PETRAW) challenge whose objective was to develop surgical workflow recognition methods based on one or several modalities, among video, kinematic, and segmentation data, in order to study their added value. The PETRAW challenge provided a data set of 150 peg transfer sequences performed on a virtual simulator. This data set was composed of videos, kinematics, semantic segmentation, and workflow annotations which described the sequences at three different granularity levels: phase, step, and activity. Five tasks were proposed to the participants: three of them were related to the recognition of all granularities with one of the available modalities, while the others addressed the recognition with a combination of modalities. Average application-dependent balanced accuracy (AD-Accuracy) was used as evaluation metric to take unbalanced classes into account and because it is more clinically relevant than a frame-by-frame score. Seven teams participated in at least one task and four of them in all tasks. Best results are obtained with the use of the video and the kinematics data with an AD-Accuracy between 93% and 90% for the four teams who participated in all tasks. The improvement between video/kinematic-based methods and the uni-modality ones was significant for all of the teams. However, the difference in testing execution time between the video/kinematic-based and the kinematic-based methods has to be taken into consideration. Is it relevant to spend 20 to 200 times more computing time for less than 3% of improvement? The PETRAW data set is publicly available at www.synapse.org/PETRAW to encourage further research in surgical workflow recognition.
Strange metals challenge our understanding of charge transport in metals. Here, we investigate how the strange metal phase of La2x_{2-x}Srx_xCuO4_4 is impacted by a field-induced glassy antiferromagnetic state. Using magnetic fields above 80 T, we discover a strong enhancement of the normal state magnetoresistance when entering the spin glass phase. We demonstrate that the spin glass causes insulating-like upturns in the resistivity inside the pseudogap phase, which resolves the origin of the ''metal-insulator'' crossover. In addition the strange metal phase appears at low temperatures over an extended range of magnetic fields where magnetic moments fluctuate, and disappears when these moments freeze out. We conclude that the transport properties of the strange metal phase are controlled by magnetic fluctuations persisting at the lowest temperatures.
Data science and technology offer transformative tools and methods to science. This review article highlights latest development and progress in the interdisciplinary field of data-driven plasma science (DDPS). A large amount of data and machine learning algorithms go hand in hand. Most plasma data, whether experimental, observational or computational, are generated or collected by machines today. It is now becoming impractical for humans to analyze all the data manually. Therefore, it is imperative to train machines to analyze and interpret (eventually) such data as intelligently as humans but far more efficiently in quantity. Despite the recent impressive progress in applications of data science to plasma science and technology, the emerging field of DDPS is still in its infancy. Fueled by some of the most challenging problems such as fusion energy, plasma processing of materials, and fundamental understanding of the universe through observable plasma phenomena, it is expected that DDPS continues to benefit significantly from the interdisciplinary marriage between plasma science and data science into the foreseeable future.
We theoretically demonstrate for the first time that a single free electron in circular/spiral motion emits twisted photons carrying well defined orbital angular momentum along the axis of the electron circulation, in adding to spin angular momentum. We show that, when the electron velocity is relativistic, the radiation field contains harmonic components and the photons of l-th harmonic carry lhbar total angular momentum for each. This work indicates that twisted photons are naturally emitted by free electrons and more ubiquitous in laboratories and in nature than ever been thought.
We have performed X-ray diffraction experiments on a single crystalline CeCoSi to investigate the unresolved ordered phase below T012T_0 \sim 12 K. We have discovered that a triclinic lattice distortion takes place below T0T_0, which is further modified in the subsequent antiferromagnetic ordered phase. The structural domains can be selected by applying a magnetic field, indicating that some electronic ordering exists behind and affects the magnetic anisotropy in the hidden ordered phase below T0T_0. The transition at T0T_0, although the order parameter is still unknown, is associated with the maximum in the cc-axis lattice parameter. In magnetic fields along [1,0,0][1, 0, 0], the structural transition temperature, named as Ts1T_{\text{s1}}, deviates from T0T_0 and decreases with increasing the field, whereas T0T_0 increases. This shows that the hidden ordered phase without triclinic distortion exists between Ts1T_{\text{s1}} and T0T_0. The results for H[1,1,0]H \parallel [1, 1, 0] are also reported.
The properties of cuprate high-temperature superconductors are largely shaped by competing phases whose nature is often a mystery. Chiefly among them is the pseudogap phase, which sets in at a doping pp^* that is material-dependent. What determines pp^* is currently an open question. Here we show that the pseudogap cannot open on an electron-like Fermi surface, and can only exist below the doping pFSp_{FS} at which the large Fermi surface goes from hole-like to electron-like, so that pp^* \leq pFSp_{FS}. We derive this result from high-magnetic-field transport measurements in La1.6x_{1.6-x}Nd0.4_{0.4}Srx_xCuO4_4 under pressure, which reveal a large and unexpected shift of pp^* with pressure, driven by a corresponding shift in pFSp_{FS}. This necessary condition for pseudogap formation, imposed by details of the Fermi surface, is a strong constraint for theories of the pseudogap phase. Our finding that pp^* can be tuned with a modest pressure opens a new route for experimental studies of the pseudogap.
There are no more papers matching your filters at the moment.