University of Applied Sciences and Arts Northwestern Switzerland
This paper presents a new approach to fine-tuning OpenAI's Whisper model for low-resource languages by introducing a novel data generation method that converts sentence-level data into a long-form corpus, using Swiss German as a case study. Non-sentence-level data, which could improve performance on long-form audio, is difficult to obtain and often restricted by copyright laws. Our method bridges this gap by transforming more accessible sentence-level data into a format that preserves the model's ability to handle long-form audio and perform segmentation without requiring non-sentence-level data. Our data generation process improves performance in several real-world applications and leads to the development of a new state-of-the-art speech-to-text (STT) model for Swiss German. We compare our model with a non-fine-tuned Whisper and our previous state-of-the-art Swiss German STT models, where our new model achieves higher BLEU scores. Our results also indicate that the proposed method is adaptable to other low-resource languages, supported by written guidance and code for creating fine-tuned Whisper models that retain segmentation capabilities and transcribe longer audio files with high quality using only sentence-level data.
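The idea of turning sentence-level clips into long-form training samples can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Clip` type, the 30-second window, the greedy packing rule, and the `<|t|>` timestamp markup are all assumptions chosen for illustration, and the audio concatenation itself is omitted.

```python
from dataclasses import dataclass

WINDOW_S = 30.0  # assumed maximum window length in seconds


@dataclass
class Clip:
    """Hypothetical sentence-level sample: transcript plus audio duration."""
    text: str
    duration: float  # seconds


def pack_clips(clips, window_s=WINDOW_S):
    """Greedily group consecutive clips so that each group fits in one window.

    A clip longer than the window ends up alone in its own group.
    """
    groups, current, used = [], [], 0.0
    for clip in clips:
        if current and used + clip.duration > window_s:
            groups.append(current)
            current, used = [], 0.0
        current.append(clip)
        used += clip.duration
    if current:
        groups.append(current)
    return groups


def timestamped_transcript(group):
    """Join the texts of one group, marking each sentence with start and end times."""
    parts, t = [], 0.0
    for clip in group:
        start, end = t, t + clip.duration
        parts.append(f"<|{start:.2f}|>{clip.text}<|{end:.2f}|>")
        t = end
    return "".join(parts)
```

Each group then corresponds to one concatenated audio sample whose transcript carries per-sentence boundaries, which is the kind of supervision a model needs in order to keep producing segments.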
The field of visual localization has been researched for several decades and has meanwhile found many practical applications. Despite the strong progress in this field, there are still challenging situations in which established methods fail. We present an approach to significantly improve the accuracy and reliability of established visual localization methods by adding rendered images. In detail, we first use a modern visual SLAM approach that provides a 3D Gaussian Splatting (3DGS) based map to create reference data. We demonstrate that enriching reference data with images rendered from 3DGS at randomly sampled poses significantly improves the performance of both geometry-based visual localization and Scene Coordinate Regression (SCR) methods. Through comprehensive evaluation in a large industrial environment, we analyze the performance impact of incorporating these additional rendered views.
Bruni et al. systematically evaluated prompt engineering techniques to enhance the security of code generated by GPT models, finding that security-focused prefixes reduced vulnerability occurrence by up to 56% for newer models and Recursive Criticism and Improvement (RCI) fixed up to 64.7% of vulnerabilities.
Artificial Intelligence (AI) tools have become incredibly powerful in generating synthetic images. Of particular concern are generated images that resemble photographs as they aspire to represent real world events. Synthetic photographs may be used maliciously by a broad range of threat actors, from scammers to nation-state actors, to deceive, defraud, and mislead people. Mitigating this threat usually involves answering a basic analytic question: Is the photograph real or synthetic? To address this, we have examined the capabilities of recent generative diffusion models and have focused on their flaws: visible artifacts in generated images which reveal their synthetic origin to the trained eye. We categorize these artifacts, provide examples, discuss the challenges in detecting them, suggest practical applications of our work, and outline future research directions.
Solar flares are the most powerful magnetically driven explosions in the heliosphere. The nature of magnetic energy release in the solar corona that heats the plasma and accelerates particles in a flare, however, remains poorly understood. Here, we report high-resolution coronal observations of a flare (SOL2024-09-30T23:47) by the Solar Orbiter mission that reveal initially weaker but rapid reconnection events, on timescales of at most a few seconds, leading to a more prominent activity of similar nature that explosively causes a flare. Signatures of this process are further imprinted on the widespread raining plasma blobs with short lifetimes, giving rise to the characteristic ribbon-like emission pattern associated with the flare. Our novel observations unveil the central engine of a flare and emphasize the crucial role of an avalanche-like magnetic energy release mechanism at work.
Solar flares are the most explosive phenomena in the solar system and the main trigger of the chain of events that starts with coronal mass ejections and leads to geomagnetic storms with possible impacts on infrastructure on Earth. Data-driven solar flare forecasting relies on either deep learning approaches, which are operationally promising but offer a low degree of explainability, or machine learning algorithms, which can provide information on the physical descriptors that most affect the prediction. This paper describes a web-based technological platform for the execution of a computational pipeline of feature-based machine learning methods that provide predictions of flare occurrence, feature ranking information, and an assessment of prediction performance.
Nonthermal sources located above bright flare arcades, referred to as the "above-the-loop-top" sources, have often been suggested as the primary electron acceleration site in major solar flares. The X8.2 limb flare on 2017 September 10 features such an above-the-loop-top source, which was observed in both microwaves and hard X-rays (HXRs) by the Expanded Owens Valley Solar Array (EOVSA) and the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI), respectively. By combining the microwave and HXR imaging spectroscopy observations with multi-filter extreme ultraviolet and soft X-ray imaging data, we derive the energetic electron distribution of this source over a broad energy range from <10 keV up to ~MeV during the early impulsive phase of the flare. The best-fit electron distribution consists of a thermal "core" from ~25 MK plasma. Meanwhile, a nonthermal power-law "tail" joins the thermal core at ~16 keV with a spectral index of ~3.6, which breaks down at above ~160 keV to >6.0. In addition, temporally resolved analysis suggests that the electron distribution above the break energy rapidly hardens with the spectral index decreasing from >20 to ~6.0 within 20 s, or less than ~10 Alfvén crossing times in the source. These results provide strong support for the above-the-loop-top source as the primary site where an on-going bulk acceleration of energetic electrons is taking place very early in the flare energy release.
This paper addresses the incorporation of problem decomposition skills as an important component of computational thinking (CT) in K-12 computer science (CS) education. Despite the growing integration of CS in schools, there is a lack of consensus on the precise definition of CT in general and decomposition in particular. While decomposition is commonly referred to as the starting point of (computational) problem-solving, algorithmic solution formulation often receives more attention in the classroom, while decomposition remains rather unexplored. This study presents "CTSKills", a web-based skill assessment tool developed to measure students' problem decomposition skills. With the data collected from 75 students in grades 4-9, this research aims to contribute to a baseline of students' decomposition proficiency in compulsory education. Furthermore, a thorough understanding of a given problem is becoming increasingly important with the advancement of generative artificial intelligence (AI) tools that can effectively support the process of formulating algorithms. This study highlights the importance of problem decomposition as a key skill in K-12 CS education to foster more adept problem solvers.
Auroral radio emissions in planetary magnetospheres typically feature highly polarized, intense radio bursts, usually attributed to electron cyclotron maser (ECM) emission from energetic electrons in the planetary polar region that features a converging magnetic field. Similar bursts have been observed in magnetically active low-mass stars and brown dwarfs, often prompting analogous interpretations. Here we report observations of long-lasting solar radio bursts with high brightness temperature, wide bandwidth, and high circular polarization fraction akin to these auroral/exo-auroral radio emissions, albeit two to three orders of magnitude weaker than those on certain low-mass stars. Spatially, spectrally, and temporally resolved analysis suggests that the source is located above a sunspot where a strong, converging magnetic field is present. The source morphology and frequency dispersion are consistent with ECM emission due to precipitating energetic electrons produced by recurring flares nearby. Our findings offer new insights into the origin of such intense solar radio bursts and may provide an alternative explanation for auroral-like radio emissions on other flare stars with large starspots.
Researchers from FHNW, ZHAW, and SpinningBytes AG created SDS-200, a 200-hour multi-dialectal Swiss German speech-to-Standard German text corpus. This publicly available dataset, collected from nearly 4000 speakers via crowd-sourcing, addresses the data scarcity for Swiss German speech technology and achieves a baseline WER of 21.6 and BLEU of 64.0 for end-to-end speech translation using a fine-tuned XLS-R model.
This paper presents a new long-form release of the Swiss Parliaments Corpus, converting entire multi-hour Swiss German debate sessions (each aligned with the official session protocols) into high-quality speech-text pairs. Our pipeline starts by transcribing all session audio into Standard German using Whisper Large-v3 under high-compute settings. We then apply a two-step GPT-4o correction process: first, GPT-4o ingests the raw Whisper output alongside the official protocols to refine misrecognitions, mainly named entities. Second, a separate GPT-4o pass evaluates each refined segment for semantic completeness. We filter out any segments whose Predicted BLEU score (derived from Whisper's average token log-probability) and GPT-4o evaluation score fall below a certain threshold. The final corpus contains 801 hours of audio, of which 751 hours pass our quality control. Compared to the original sentence-level SPC release, our long-form dataset achieves a 6-point BLEU improvement, demonstrating the power of combining robust ASR, LLM-based correction, and data-driven filtering for low-resource, domain-specific speech corpora.
This paper presents a case study of a recommender system that can be used to save energy in smart homes without lowering the comfort of the inhabitants. We present an algorithm that uses consumer behavior data only and uses machine learning to suggest actions for inhabitants to reduce the energy consumption of their homes. The system mines for frequent and periodic patterns in the event data provided by the Digitalstrom home automation system. These patterns are converted into association rules, prioritized and compared with the current behavior of the inhabitants. If the system detects an opportunity to save energy without decreasing the comfort level, it sends a recommendation to the residents.
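The step from frequent patterns to association rules can be sketched for the simplest case, rules between pairs of events. This is an illustrative sketch only, not the system described above: the function name `pair_rules`, the thresholds, and the grouping of events into "transactions" (for example, one list per day) are assumptions, and the event names used in the usage note are hypothetical rather than taken from the Digitalstrom system.

```python
from collections import Counter
from itertools import permutations


def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Derive rules "a -> b" between pairs of events.

    support(a -> b)    = fraction of transactions containing both a and b
    confidence(a -> b) = transactions with a and b / transactions with a
    Returns (a, b, support, confidence) tuples, strongest rules first.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for events in transactions:
        items = set(events)  # count each event at most once per transaction
        item_counts.update(items)
        # Ordered pairs, so both a -> b and b -> a are considered.
        pair_counts.update(permutations(sorted(items), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    # Prioritize by confidence, then support; names break ties deterministically.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))
```

Given day-wise event lists such as `[["leave_home", "lights_off"], ["leave_home"]]`, a rule like `leave_home -> lights_off` that holds on most days but is violated today would be a candidate for a recommendation.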
We present observations of the occulted active region AR12222 during the third NuSTAR solar campaign on 2014 December 11, with concurrent SDO/AIA and FOXSI-2 sounding rocket observations. The active region produced a medium size solar flare one day before the observations, at ~18 UT on 2014 December 10, with the post-flare loops still visible at the time of NuSTAR observations. The time evolution of the source emission in the SDO/AIA 335 Å channel reveals the characteristics of an extreme-ultraviolet late phase event, caused by the continuous formation of new post-flare loops that arch higher and higher in the solar corona. The spectral fitting of NuSTAR observations yields an isothermal source, with temperature 3.8-4.6 MK, emission measure 0.3-1.8 × 10^{46} cm^{-3}, and density estimated at 2.5-6.0 × 10^{8} cm^{-3}. The observed AIA fluxes are consistent with the derived NuSTAR temperature range, favoring temperature values in the range 4.0-4.3 MK. By examining the post-flare loops' cooling times and energy content, we estimate that at least 12 sets of post-flare loops were formed and subsequently cooled between the onset of the flare and NuSTAR observations, with their total thermal energy content an order of magnitude larger than the energy content at flare peak time. This indicates that the standard approach of using only the flare peak time to derive the total thermal energy content of a flare can lead to a large underestimation of its value.
Context. The Spectrometer/Telescope for Imaging X-rays (STIX) is one of 6 remote sensing instruments on-board Solar Orbiter. It provides hard X-ray imaging spectroscopy of solar flares by sampling the Fourier transform of the incoming flux. Aims. To show that the visibility amplitude and phase calibration of 24 out of 30 STIX sub-collimators is well advanced and that a set of imaging methods is able to provide the first hard X-ray images of the flaring Sun from Solar Orbiter. Methods. We applied four visibility-based image reconstruction methods and a count-based one to calibrated STIX observations. The resulting reconstructions are compared to those provided by an optimization algorithm used for fitting the amplitudes of STIX visibilities. Results. When applied to six flares with GOES class between C4 and M4 which occurred in May 2021, the five imaging methods produce results morphologically consistent with the ones provided by the Atmospheric Imaging Assembly on-board the Solar Dynamics Observatory (SDO/AIA) in UV wavelengths. The χ² values and the parameters of the reconstructed sources are comparable between methods, thus confirming their robustness. Conclusions. This paper shows that the current calibration of the main part of STIX sub-collimators has reached a satisfactory level for scientific data exploitation, and that the imaging algorithms already available in the STIX data analysis software provide reliable and robust reconstructions of the morphology of solar flares.
The white-light continuum emissions in solar flares (i.e., white-light flares) are usually observed on the solar disk but, in a few cases, off the limb. Here we present on-disk as well as off-limb continuum emissions at 3600 Å (in the Balmer continuum) in an X2.1 flare (SOL2023-03-03T17:52) and an X1.5 flare (SOL2023-08-07T20:46), respectively, observed by the White-light Solar Telescope (WST) on the Advanced Space-based Solar Observatory (ASO-S). These continuum emissions are seen at the ribbons for the X2.1 flare and on loops during the X1.5 event, in which the latter also appears in the decay phase. These emissions also show up in the pseudo-continuum images at Fe I λ6173 from the Helioseismic and Magnetic Imager (HMI) on the Solar Dynamics Observatory (SDO). In addition, the ribbon sources in the X2.1 flare exhibit significant enhancements in the Fe I line at 6569.2 Å and the nearby continuum observed by the Chinese Hα Solar Explorer (CHASE). It is found that the on-disk continuum emissions in the X2.1 flare are related to a nonthermal electron-beam heating either directly or indirectly, while the off-limb emissions in the X1.5 flare are associated with thermal plasma cooling or due to Thomson scattering. These comprehensive continuum observations can provide good constraints on flare energy deposition models, which helps to better understand the physical mechanism of white-light flares.
We study the nature of energy release and transfer for two sub-A class solar microflares observed during the second flight of the Focusing Optics X-ray Solar Imager (FOXSI-2) sounding rocket experiment on 2014 December 11. FOXSI is the first solar-dedicated instrument to utilize focusing optics to image the Sun in the hard X-ray (HXR) regime, sensitive to the energy range 4-20 keV. Through spectral analysis of the two microflares using an optically thin isothermal plasma model, we find evidence for plasma heated to temperatures of ~10 MK and emission measures down to ~10^{44} cm^{-3}. Though nonthermal emission was not detected for the FOXSI-2 microflares, a study of the parameter space for possible hidden nonthermal components shows that there could be enough energy in nonthermal electrons to account for the thermal energy in microflare 1, indicating that this flare is plausibly consistent with the standard thick-target model. With a solar-optimized design and improvements in HXR focusing optics, FOXSI-2 offers approximately five times greater sensitivity at 10 keV than the Nuclear Spectroscopic Telescope Array (NuSTAR) for typical microflare observations and allows for the first direct imaging spectroscopy of solar HXRs with an angular resolution at scales relevant for microflares. Harnessing these improved capabilities to study the evolution of small-scale events, we find evidence for spatial and temporal complexity during a sub-A class flare. These studies in combination with contemporaneous observations by the Atmospheric Imaging Assembly onboard the Solar Dynamics Observatory (SDO/AIA) indicate that the evolution of these small microflares is more similar to that of large flares than to the single burst of energy expected for a nanoflare.
This article utilizes the projected gradient method (PG) for a non-negative matrix factorization problem (NMF), where one or both matrix factors must have orthonormal columns or rows. We penalise the orthonormality constraints and apply the PG method via a block coordinate descent approach. This means that at a certain time one matrix factor is fixed and the other is updated by moving along the steepest descent direction computed from the penalised objective function and projecting onto the space of non-negative matrices. Our method is tested on two sets of synthetic data for various values of penalty parameters. The performance is compared to the well-known multiplicative update (MU) method from Ding (2006), and with a modified global convergent variant of the MU algorithm recently proposed by Mirzal (2014). We provide extensive numerical results coupled with appropriate visualizations, which demonstrate that our method is very competitive and usually outperforms the other two methods.
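The alternating scheme described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the authors' algorithm: the penalised objective f(W, H) = 0.5 ||X - WH||_F^2 + (alpha/2) ||H H^T - I||_F^2 (orthonormal rows of H only), the random initialisation, the simple step-halving rule, and all function names are choices made here for the example.

```python
import random


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(r) for r in zip(*A)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def fro2(A):
    """Squared Frobenius norm."""
    return sum(a * a for row in A for a in row)


def gram_minus_identity(H):
    """H H^T - I, the violation of row orthonormality."""
    G = matmul(H, transpose(H))
    return [[g - (1.0 if i == j else 0.0) for j, g in enumerate(row)]
            for i, row in enumerate(G)]


def objective(X, W, H, alpha):
    return 0.5 * fro2(sub(matmul(W, H), X)) + 0.5 * alpha * fro2(gram_minus_identity(H))


def grad_W(X, W, H):
    return matmul(sub(matmul(W, H), X), transpose(H))


def grad_H(X, W, H, alpha):
    fit = matmul(transpose(W), sub(matmul(W, H), X))
    pen = matmul(gram_minus_identity(H), H)  # gradient of penalty is 2*alpha*pen
    return [[f + 2.0 * alpha * p for f, p in zip(rf, rp)] for rf, rp in zip(fit, pen)]


def projected_step(M, G, step):
    """Gradient step followed by projection onto the non-negative matrices."""
    return [[max(0.0, m - step * g) for m, g in zip(rm, rg)] for rm, rg in zip(M, G)]


def pg_onmf(X, k, alpha=1.0, iters=100, seed=0):
    rng = random.Random(seed)
    W = [[rng.random() for _ in range(k)] for _ in range(len(X))]
    H = [[rng.random() for _ in range(len(X[0]))] for _ in range(k)]
    history = [objective(X, W, H, alpha)]
    for _ in range(iters):
        for block in ("W", "H"):  # block coordinate descent: one factor at a time
            G = grad_W(X, W, H) if block == "W" else grad_H(X, W, H, alpha)
            f_old, step = objective(X, W, H, alpha), 1.0
            cand = W if block == "W" else H
            while step > 1e-12:  # halve the step until the objective does not increase
                trial = projected_step(W if block == "W" else H, G, step)
                f_new = (objective(X, trial, H, alpha) if block == "W"
                         else objective(X, W, trial, alpha))
                if f_new <= f_old:
                    cand = trial
                    break
                step *= 0.5
            if block == "W":
                W = cand
            else:
                H = cand
        history.append(objective(X, W, H, alpha))
    return W, H, history
```

Larger values of `alpha` push the rows of `H` closer to orthonormality at the expense of the fit, which is why results are usually reported for several penalty values.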
We present a new analytical technique, combining Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI) high-resolution imaging and spectroscopic observations, to visualize solar flare emission as a function of spectral component (e.g., isothermal temperature) rather than energy. This computationally inexpensive technique is applicable to all spatially-invariant spectral forms and is useful for visualizing spectroscopically-determined individual sources and placing them in context, e.g., comparing multiple isothermal sources with nonthermal emission locations. For example, while extreme ultraviolet images can usually be closely identified with narrow temperature ranges, due to the emission being primarily from spectral lines of specific ion species, X-ray images are dominated by continuum emission and therefore have a broad temperature response, making it difficult to identify sources of specific temperatures regardless of the energy band of the image. We combine RHESSI calibrated X-ray visibilities with spatially-integrated spectral models including multiple isothermal components to effectively isolate the individual thermal sources from the combined emission and image them separately. We apply this technique to the 2002 July 23 X4.8 event studied in prior works, and image for the first time the super-hot and cooler thermal sources independently. The super-hot source is farther from the footpoints and more elongated throughout the impulsive phase, consistent with an in situ heating mechanism for the super-hot plasma.
Symbolic Execution is a formal method that can be used to verify the behavior of computer programs and detect software vulnerabilities. Compared to other testing methods such as fuzzing, Symbolic Execution has the advantage of providing formal guarantees about the program. However, despite advances in performance in recent years, Symbolic Execution is too slow to be applied to real-world software. This is primarily caused by the path explosion problem as well as by the computational complexity of SMT solving. In this paper, we present a divide-and-conquer approach for symbolic execution by executing individual slices and later combining the side effects. This way, the overall problem size is kept small, reducing the impact of computational complexity on large problems.
Biometric authentication by means of handwritten signatures is a challenging pattern recognition task, which aims to infer a writer model from only a handful of genuine signatures. In order to make it more difficult for a forger to attack the verification system, a promising strategy is to combine different writer models. In this work, we propose to complement a recent structural approach to offline signature verification based on graph edit distance with a statistical approach based on metric learning with deep neural networks. On the MCYT and GPDS benchmark datasets, we demonstrate that combining the structural and statistical models leads to significant improvements in performance, profiting from their complementary properties.