Univ. Artois
This work focuses on learning non-canonical Hamiltonian dynamics from data, where long-term predictions require preserving structure both in the learned model and in the numerical scheme. Previous research addressed each facet separately, with a potential-based architecture on one side and degenerate variational integrators on the other, but new issues arise when the two are combined. In experiments, the learned model is sometimes numerically unstable due to the gauge dependency of the scheme, making long-time simulations impossible. In this paper, we identify this problem and propose two training strategies to address it: either directly learning the vector field or learning a time-discrete dynamics through the scheme. Several numerical test cases assess the ability of the methods to learn complex physical dynamics, such as the guiding-center dynamics from gyrokinetic plasma physics.
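The two training strategies can be contrasted in a few lines. The following is a minimal sketch, not the paper's method: the potential-based architecture and the degenerate variational integrator are replaced by a plain feed-forward network and an explicit midpoint step, and the trajectory data are synthetic placeholders.

```python
import torch

# Toy stand-in for the learned vector field; the paper's potential-based
# architecture and degenerate variational integrator are not reproduced here.
f_theta = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 2)
)

def loss_vector_field(z0, z1, dt):
    """Strategy 1: regress the vector field on finite differences."""
    return ((f_theta(z0) - (z1 - z0) / dt) ** 2).mean()

def loss_through_scheme(z0, z1, dt):
    """Strategy 2: match the data after one step of a numerical scheme
    (an explicit midpoint step here, as a simple placeholder)."""
    z_half = z0 + 0.5 * dt * f_theta(z0)
    z_pred = z0 + dt * f_theta(z_half)
    return ((z_pred - z1) ** 2).mean()

# Hypothetical trajectory data: pairs of consecutive states sampled at step dt.
dt = 0.01
z0 = torch.randn(128, 2)
z1 = z0 + dt * torch.randn(128, 2)

opt = torch.optim.Adam(f_theta.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = loss_through_scheme(z0, z1, dt)  # or loss_vector_field(z0, z1, dt)
    loss.backward()
    opt.step()
```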
Let $J \subseteq I$ be ideals in a commutative Noetherian ring $R$. We say that $J$ is a demotion of $I$ if $I^r J^s = I^{r+s} \cap J^s$ for all integers $r, s \geq 0$. In this paper, we mainly aim to explore this notion in polynomial rings. In particular, we investigate the relation between the demotion property and normal torsion-freeness. Furthermore, we compare reductions of ideals with demotions of ideals.
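For monomial ideals, the demotion identity can be checked mechanically, since products and intersections of monomial ideals are generated by pairwise exponent sums and componentwise maxima (lcms). The following sketch verifies, for small $r, s$, that $J = (x)$ is a demotion of $I = (x, y)$ in $K[x, y]$; the example is ours, not taken from the paper.

```python
from itertools import product

def minimalize(gens):
    """Keep the minimal monomial generators (drop multiples of others)."""
    gens = set(gens)
    return frozenset(
        g for g in gens
        if not any(h != g and all(hi <= gi for hi, gi in zip(h, g)) for h in gens)
    )

def mul(A, B):
    """Product of two monomial ideals: pairwise exponent sums."""
    return minimalize(tuple(a + b for a, b in zip(g, h)) for g, h in product(A, B))

def power(A, k):
    R = frozenset({(0,) * len(next(iter(A)))})  # unit ideal
    for _ in range(k):
        R = mul(R, A)
    return R

def intersect(A, B):
    """Intersection of monomial ideals: pairwise lcms (componentwise max)."""
    return minimalize(tuple(max(a, b) for a, b in zip(g, h)) for g, h in product(A, B))

# I = (x, y) and J = (x) in K[x, y], encoded as exponent vectors.
I = frozenset({(1, 0), (0, 1)})
J = frozenset({(1, 0)})

for r, s in product(range(1, 5), repeat=2):
    lhs = mul(power(I, r), power(J, s))            # I^r J^s
    rhs = intersect(power(I, r + s), power(J, s))  # I^(r+s) intersected with J^s
    assert lhs == rhs, (r, s)
print("J = (x) passes the demotion test for I = (x, y) up to r, s = 4")
```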
In this paper, we establish some criteria to detect the presence of the maximal ideal $(x_1, \ldots, x_n)$ in the set of associated primes of powers of monomial ideals in the polynomial ring $K[x_1, \ldots, x_n]$. Furthermore, for each of these criteria, we illustrate its applicability with corresponding examples.
Researchers from French institutions introduce Π-NeSy, a neuro-symbolic framework that combines neural networks with possibilistic rule-based systems. By relying on possibility theory rather than probability theory, it handles epistemic uncertainty efficiently while maintaining performance comparable to existing approaches on benchmark tasks such as MNIST addition and Sudoku.
This article presents a new approach based on MiniRocket, called SelF-Rocket, for fast time series classification (TSC). Unlike existing approaches based on random convolution kernels, it dynamically selects the best pair of input representation and pooling operator during the training process. SelF-Rocket achieves state-of-the-art accuracy on the University of California Riverside (UCR) TSC benchmark datasets.
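The selection idea can be illustrated with a toy stand-in. The sketch below is not SelF-Rocket itself: it uses generic random convolution kernels with the two classic ROCKET-family pooling operators (PPV and max), two hypothetical input representations (raw series and first differences), and picks the best pair by cross-validation on synthetic data.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def random_kernel_features(X, pool, n_kernels=200, klen=9):
    """Random 1-D convolutions followed by a pooling operator,
    in the spirit of the ROCKET family (PPV = proportion of positive values)."""
    kernels = rng.normal(size=(n_kernels, klen))
    feats = np.empty((len(X), n_kernels))
    for i, x in enumerate(X):
        for j, w in enumerate(kernels):
            c = np.convolve(x, w, mode="valid")
            feats[i, j] = (c > 0).mean() if pool == "ppv" else c.max()
    return feats

# Hypothetical dataset: two classes of noisy sinusoids.
t = np.linspace(0, 2 * np.pi, 128)
X = np.stack([np.sin(t * (1 + (i % 2))) + 0.3 * rng.normal(size=t.size)
              for i in range(60)])
y = np.arange(60) % 2

# Candidate (representation, pooling) pairs; pick the best by cross-validation.
reps = {"raw": lambda X: X, "diff": lambda X: np.diff(X, axis=1)}
best = max(
    ((r, p) for r in reps for p in ("ppv", "max")),
    key=lambda rp: cross_val_score(
        RidgeClassifierCV(), random_kernel_features(reps[rp[0]](X), rp[1]), y, cv=3
    ).mean(),
)
print("selected (representation, pooling):", best)
```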
In this article, we study the inconsistency of systems of $\min$-$\rightarrow$ fuzzy relational equations. We give analytical formulas for computing the Chebyshev distances $\nabla = \inf_{d \in \mathcal{D}} \Vert \beta - d \Vert$ associated with systems of $\min$-$\rightarrow$ fuzzy relational equations of the form $\Gamma \Box_{\rightarrow}^{\min} x = \beta$, where $\rightarrow$ is a residual implicator among the Gödel implication $\rightarrow_G$, the Goguen implication $\rightarrow_{GG}$ and the Łukasiewicz implication $\rightarrow_L$, and $\mathcal{D}$ is the set of second members of consistent systems defined with the same matrix $\Gamma$. The main preliminary result that allows us to obtain these formulas is that the Chebyshev distance $\nabla$ is the lower bound of the solutions of a vector inequality, whatever the residual implicator used. Finally, we show that, in the case of $\min$-$\rightarrow_G$ systems, the Chebyshev distance $\nabla$ may be an infimum that is not attained, while it is always a minimum for $\min$-$\rightarrow_{GG}$ and $\min$-$\rightarrow_L$ systems.
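The analytical formulas of the paper are not reproduced here, but the quantity they compute is easy to approximate numerically. Assuming the standard definitions of the three residual implicators, the sketch below estimates $\nabla$ by brute force, sampling vectors $x$ and taking the smallest Chebyshev distance between $\beta$ and the attainable second members; the matrix $\Gamma$ and vector $\beta$ are arbitrary illustrative values.

```python
import numpy as np

def impl_godel(a, b):
    return np.where(a <= b, 1.0, b)

def impl_goguen(a, b):
    return np.where(a <= b, 1.0, b / np.maximum(a, 1e-12))

def impl_lukasiewicz(a, b):
    return np.minimum(1.0, 1.0 - a + b)

def outputs(Gamma, X, impl):
    """(Gamma box x)_i = min_j impl(Gamma[i, j], x[j]), for each sampled x."""
    return impl(Gamma[None, :, :], X[:, None, :]).min(axis=2)

def chebyshev_estimate(Gamma, beta, impl, n_samples=100_000, seed=0):
    """Numerical estimate of nabla = inf over attainable second members d of
    ||beta - d||_inf; the paper derives exact analytical formulas, this
    brute force is only for illustration."""
    X = np.random.default_rng(seed).random((n_samples, Gamma.shape[1]))
    return np.abs(beta[None, :] - outputs(Gamma, X, impl)).max(axis=1).min()

Gamma = np.array([[0.7, 0.2], [0.4, 0.9]])
beta = np.array([0.5, 0.3])
for name, impl in [("Godel", impl_godel), ("Goguen", impl_goguen),
                   ("Lukasiewicz", impl_lukasiewicz)]:
    print(name, round(float(chebyshev_estimate(Gamma, beta, impl)), 4))
```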
Argumentation is a central subarea of Artificial Intelligence (AI) for modeling and reasoning about arguments. The semantics of abstract argumentation frameworks (AFs) is given by sets of arguments (extensions) and conditions on the relationship between them, such as stable or admissible. Today's solvers implement tasks such as finding extensions, deciding credulous or skeptical acceptance, counting, or enumerating extensions. While these tasks are well charted, the territory between decision and counting/enumeration has so far required expensive fine-grained reasoning. We introduce a novel concept (facets) for reasoning between decision and enumeration. Facets are arguments that belong to some extensions (credulous) but not to all extensions (skeptical). They are most natural when a user aims to navigate, filter, or comprehend the significance of specific arguments, according to their needs. We study the complexity and show that tasks involving facets are much easier than counting extensions. Finally, we provide an implementation and conduct experiments to demonstrate its feasibility.
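The facet definition is directly executable on small AFs. The sketch below enumerates stable extensions by brute force on a hypothetical four-argument framework and computes the facets as the credulously minus the skeptically accepted arguments; it illustrates the concept only, not the paper's implementation.

```python
from itertools import combinations

def stable_extensions(args, attacks):
    """Brute-force stable extensions: conflict-free sets attacking
    every argument outside the set."""
    exts = []
    for r in range(len(args) + 1):
        for S in map(set, combinations(args, r)):
            conflict_free = not any((a, b) in attacks for a in S for b in S)
            attacks_rest = all(
                any((a, b) in attacks for a in S) for b in set(args) - S
            )
            if conflict_free and attacks_rest:
                exts.append(S)
    return exts

# Hypothetical AF: a <-> b mutual attack, both attack c, c attacks d.
args = ["a", "b", "c", "d"]
attacks = {("a", "b"), ("b", "a"), ("a", "c"), ("b", "c"), ("c", "d")}

exts = stable_extensions(args, attacks)
credulous = set.union(*exts)            # in some extension
skeptical = set.intersection(*exts)     # in all extensions
facets = credulous - skeptical
print("extensions:", exts)   # [{'a','d'}, {'b','d'}]
print("facets:", facets)     # {'a','b'}  (d is skeptically accepted)
```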
In this paper, we introduce a syntactic framework for analyzing and handling inconsistencies in propositional bases. Our approach focuses on examining the relationships between variable occurrences within conflicts. We propose two dual concepts: Minimal Inconsistency Relation (MIR) and Maximal Consistency Relation (MCR). Each MIR is a minimal equivalence relation on variable occurrences that results in inconsistency, while each MCR is a maximal equivalence relation designed to prevent inconsistency. Notably, MIRs capture conflicts overlooked by minimal inconsistent subsets. Using MCRs, we develop a series of non-explosive inference relations. The main strategy involves restoring consistency by modifying the propositional base according to each MCR, followed by employing the classical inference relation to derive conclusions. Additionally, we propose an unusual semantics that assigns truth values to variable occurrences instead of the variables themselves. The associated inference relations are established through Boolean interpretations compatible with the occurrence-based models.
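A minimal illustration of the occurrence-based view, on the textbook base $\{x, \neg x\}$: merging the two occurrences of $x$ into one class reproduces the classical inconsistency, while keeping them in separate classes restores consistency. The encoding below is a toy brute-force check, not the paper's framework.

```python
from itertools import product

# The base {x, not x}: literals tagged with an occurrence id.
# A literal is (occurrence_id, polarity); each clause is a list of literals.
base = [[("x@1", True)], [("x@2", False)]]

def satisfiable(clauses, classes):
    """Brute-force SAT where truth values are assigned to equivalence
    classes of variable occurrences rather than to variables."""
    class_of = {occ: i for i, cls in enumerate(classes) for occ in cls}
    for assign in product([False, True], repeat=len(classes)):
        if all(any(assign[class_of[o]] == pol for o, pol in c) for c in clauses):
            return True
    return False

# Merging both occurrences of x reproduces the classical inconsistency (a MIR);
# keeping them apart is consistent, so the identity relation lies below some MCR.
print(satisfiable(base, [{"x@1", "x@2"}]))    # False: classical reading
print(satisfiable(base, [{"x@1"}, {"x@2"}]))  # True: occurrences split
```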
Motivated by recent connections to factorised databases, we analyse the efficiency of representations by context-free grammars (CFGs). Concretely, we prove a recent conjecture by Kimelfeld, Martens, and Niewerth (ICDT 2025) that, for finite languages, representations by general CFGs can be doubly-exponentially smaller than those by unambiguous CFGs. To do so, we show the first exponential lower bounds on the size of unambiguous CFGs representing a finite language that can be represented efficiently by general CFGs. Our proof first reduces the problem to proving a lower bound in a non-standard model of communication complexity. We then argue, similarly in spirit to a recent discrepancy argument, to establish the required communication complexity lower bound. Our result also implies that a finite language may admit an exponentially smaller representation as a nondeterministic finite automaton than as an unambiguous CFG.
The ever-increasing complexity of machine learning techniques, used more and more in practice, gives rise to the need to explain the predictions and decisions of these models, which are often used as black boxes. Explainable AI approaches are either numerical and feature-based, aiming to quantify the contribution of each feature to a prediction, or symbolic, providing certain forms of symbolic explanations such as counterfactuals. This paper proposes a generic model-agnostic approach named ASTERYX that generates both symbolic and score-based explanations. Our approach is declarative: it encodes the model to be explained in an equivalent symbolic representation, which then serves to generate two types of symbolic explanations, namely sufficient reasons and counterfactuals. We then associate scores reflecting the relevance of the explanations and the features w.r.t. some properties. Our experimental results show the feasibility of the proposed approach and its effectiveness in providing symbolic and score-based explanations.
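As a toy illustration of one explanation type, the sketch below computes sufficient reasons by exhaustive search over a three-feature Boolean black box: a subset of the instance's feature values is a sufficient reason if every completion of the remaining features keeps the prediction unchanged. The model and instance are hypothetical, and the brute force stands in for ASTERYX's symbolic encoding and solver.

```python
from itertools import combinations, product

def model(x):  # hypothetical black box over 3 binary features
    return int(x[0] and (x[1] or x[2]))

instance = (1, 1, 0)
pred = model(instance)
n = len(instance)

def is_sufficient(S):
    """S fixes the instance's values; the prediction must hold for
    every completion of the remaining features."""
    free = [i for i in range(n) if i not in S]
    for vals in product([0, 1], repeat=len(free)):
        x = list(instance)
        for i, v in zip(free, vals):
            x[i] = v
        if model(tuple(x)) != pred:
            return False
    return True

# Smallest sufficient reasons, found by increasing size.
for r in range(n + 1):
    reasons = [S for S in combinations(range(n), r) if is_sufficient(S)]
    if reasons:
        print("sufficient reasons of size", r, ":", reasons)
        break
```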
Pathologies systematically induce morphological changes, thus providing a major yet insufficiently quantified source of observables for diagnosis. This study develops a predictive model of pathological states based on morphological features (3D-morphomics) extracted from Computed Tomography (CT) volumes. A complete workflow for mesh extraction and simplification of an organ's surface is developed and coupled with an automatic extraction of morphological features given by the distribution of mean curvature and mesh energy. An XGBoost supervised classifier is then trained and tested on the 3D-morphomics to predict pathological states. This framework is applied to the prediction of the malignancy of lung nodules. On a subset of the NLST database with biopsy-confirmed malignancy, using 3D-morphomics only, the classification model of lung nodules into malignant vs. benign achieves an AUC of 0.964. Three other sets of classical features are trained and tested: (1) clinically relevant features give an AUC of 0.58, (2) 111 radiomics features give an AUC of 0.976, and (3) the radiologist ground truth (GT), containing qualitative annotations of nodule size, attenuation, and spiculation, gives an AUC of 0.979. We also test the Brock model and obtain an AUC of 0.826. Combining 3D-morphomics and radiomics features achieves state-of-the-art results with an AUC of 0.978, where the 3D-morphomics have some of the highest predictive powers. As a validation on a public independent cohort, the models are applied to the LIDC dataset: the 3D-morphomics achieve an AUC of 0.906 and the 3D-morphomics+radiomics achieve an AUC of 0.958, which ranks second in the challenge among deep models. This establishes curvature distributions as efficient features for predicting lung nodule malignancy, within a new method that can be applied directly to arbitrary computer-aided diagnosis tasks.
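The classification stage can be sketched independently of the mesh pipeline. The toy example below assumes per-vertex mean-curvature values have already been extracted, turns their distribution into histogram features, and trains an XGBoost classifier; the data are synthetic, and the AUCs quoted above come from the paper, not from this sketch.

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def curvature_histogram(curvatures, bins=32, lo=-1.0, hi=1.0):
    """3D-morphomics-style feature: normalized histogram of the
    per-vertex mean-curvature distribution of a nodule mesh."""
    h, _ = np.histogram(np.clip(curvatures, lo, hi), bins=bins, range=(lo, hi))
    return h / max(h.sum(), 1)

# Hypothetical data: benign nodules smoother (narrow curvature spread),
# malignant ones more spiculated (wider spread).
X, y = [], []
for label in (0, 1):
    for _ in range(100):
        spread = 0.15 if label == 0 else 0.45
        X.append(curvature_histogram(rng.normal(0.0, spread, size=2000)))
        y.append(label)
X, y = np.array(X), np.array(y)

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0,
                                      stratify=y)
clf = XGBClassifier(n_estimators=200, max_depth=3, eval_metric="logloss")
clf.fit(Xtr, ytr)
print("toy AUC:", roc_auc_score(yte, clf.predict_proba(Xte)[:, 1]))
```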
When extracting a relation of spans (intervals) from a text document, a common practice is to filter out tuples of the relation that are deemed dominated by others. The domination rule is defined as a partial order that varies across systems and tasks. For example, we may state that a tuple is dominated by tuples that extend it by assigning additional attributes or assigning larger intervals. The result of filtering the relation is then the skyline with respect to this partial order. As this filtering may remove most of the extracted tuples, we study whether we can improve the performance of the extraction by compiling the domination rule into the extractor. To this aim, we introduce the skyline operator for declarative information extraction tasks expressed as document spanners. We show that this operator can be expressed via regular operations when the domination partial order can itself be expressed as a regular spanner, which covers several natural domination rules. Yet, we show that the skyline operator incurs a computational cost (under combined complexity). First, there are cases where the operator requires an exponential blowup in the number of states needed to represent the spanner as a sequential variable-set automaton. Second, the evaluation may become computationally hard. Our analysis more precisely identifies classes of domination rules for which the combined complexity is tractable or intractable.
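The skyline filter itself is simple to state in code. The sketch below implements one natural domination rule mentioned above (a tuple is dominated by a tuple that extends it with additional attributes or assigns containing intervals) on hypothetical extracted tuples; it shows the post-filter that the paper proposes to compile into the extractor, not the spanner construction.

```python
# Span tuples map attribute names to (start, end) intervals; None = unassigned.

def dominates(t, u):
    """t dominates u if t assigns every attribute u assigns, with an
    interval containing u's, and t != u (one natural rule from the text)."""
    if t == u:
        return False
    for attr, span in u.items():
        if span is None:
            continue
        if t.get(attr) is None:
            return False
        (ts, te), (us, ue) = t[attr], span
        if not (ts <= us and ue <= te):
            return False
    return True

def skyline(tuples):
    return [u for u in tuples if not any(dominates(t, u) for t in tuples)]

extracted = [
    {"person": (0, 5), "org": None},
    {"person": (0, 5), "org": (10, 20)},   # extends the first tuple
    {"person": (0, 7), "org": (10, 20)},   # larger 'person' interval
]
print(skyline(extracted))  # only the last tuple survives
```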
Using first-principles calculations, we explore the magnetoelectric properties of the room-temperature multiferroic crystal BiCoO$_3$. We use both applied-magnetic-field and finite-difference techniques to show that BiCoO$_3$ is anti-magnetoelectric at the linear level. The calculation of the dynamical effective charges reveals that the total magnetoelectric response is zero because the non-zero magnetoelectric responses of the magnetic sublattices compensate each other. This calculation also highlights that the orbital contribution to the response is remarkably larger than the spin one and that each sublattice has a rather large total magnetoelectric response of 85 ps/m. Furthermore, we provide an intuitive recipe to visualize the dynamical magnetic effective charge, allowing one to examine its multipolar nature, which we confirm by means of ab initio calculations. Given the large value of the local response, we also investigate the ferromagnetic phase, which yields a giant magnetoelectric response of about 1000 ps/m, this time coming mainly from the spin contribution. Finally, we discuss the possible reasons for such a large magnetoelectric response in BiCoO$_3$ and propose possible strategies to unveil this potentially large response.
In this article, we introduce a method, based on systems of fuzzy relational equations, for learning a capacity underlying a Sugeno integral from training data. To the training data, we associate two systems of equations: a $\max$-$\min$ system and a $\min$-$\max$ system. By solving these two systems (when they are consistent) using Sanchez's results, we show that we can directly obtain the extremal capacities representing the training data. By reducing the $\max$-$\min$ (resp. $\min$-$\max$) system of equations to subsets of criteria of cardinality less than or equal to $q$ (resp. of cardinality greater than or equal to $n-q$), where $n$ is the number of criteria, we give a sufficient condition for deducing, from its potential greatest solution (resp. its potential lowest solution), a $q$-maxitive (resp. $q$-minitive) capacity. Finally, if these two reduced systems of equations are inconsistent, we show how to obtain the greatest approximate $q$-maxitive capacity and the lowest approximate $q$-minitive capacity, using recent results on handling the inconsistency of systems of fuzzy relational equations.
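For the $\max$-$\min$ side, Sanchez's classical result gives the potential greatest solution in closed form, $\hat{x}_k = \min_i (m_{ik} \rightarrow_G b_i)$ with the Gödel implication, and the system is consistent exactly when this candidate satisfies it. The sketch below builds a small system whose unknowns are capacity values indexed by subsets of criteria, with coefficients given by minima over subsets as in the Sugeno integral; boundary and normalization conditions on the capacity are glossed over, and the construction is our reading of the setup, not the paper's exact one.

```python
import numpy as np
from itertools import combinations

def godel_impl(a, b):
    return np.where(a <= b, 1.0, b)

def greatest_solution(M, b):
    """Sanchez's potential greatest solution of the max-min system
    max_k min(M[i, k], x[k]) = b[i]:  x[k] = min_i godel_impl(M[i, k], b[i])."""
    return godel_impl(M, b[:, None]).min(axis=0)

def maxmin(M, x):
    return np.minimum(M, x[None, :]).max(axis=1)

# Training data: rows = alternatives scored on n = 3 criteria, with a global
# Sugeno-integral evaluation b. Unknowns are capacity values mu(A) for the
# nonempty subsets A of criteria (boundary conditions glossed over here).
A = np.array([[0.8, 0.4, 0.6],
              [0.3, 0.9, 0.5]])
b = np.array([0.6, 0.5])

subsets = [s for r in range(1, 4) for s in combinations(range(3), r)]
M = np.array([[a[list(s)].min() for s in subsets] for a in A])

x_hat = greatest_solution(M, b)
consistent = np.allclose(maxmin(M, x_hat), b)
print("consistent:", consistent)
print({s: round(float(v), 3) for s, v in zip(subsets, x_hat)})
```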
Interpretable Machine Learning faces a recurring challenge of explaining the predictions made by opaque classifiers such as ensemble models, kernel methods, or neural networks in terms that are understandable to humans. When the model is viewed as a black box, the objective is to identify a small set of features that jointly determine the black box response with minimal error. However, finding such model-agnostic explanations is computationally demanding, as the problem is intractable even for binary classifiers. In this paper, the task is framed as a Constraint Optimization Problem, where the constraint solver seeks an explanation of minimum error and bounded size for an input data instance and a set of samples generated by the black box. From a theoretical perspective, this constraint programming approach offers PAC-style guarantees for the output explanation. We evaluate the approach empirically on various datasets and show that it statistically outperforms the state-of-the-art heuristic Anchors method.
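The optimization task can be imitated by exhaustive search on a tiny instance: draw samples around the input, then look for a feature subset of bounded size whose fixation minimizes disagreement with the black box's label. The sketch below does exactly that, with a hypothetical classifier, a closeness tolerance standing in for "matching" on continuous features, and brute force standing in for the constraint solver (so none of the paper's PAC machinery appears here).

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

def black_box(X):  # hypothetical opaque classifier
    return (X[:, 0] + 0.5 * X[:, 2] > 0.8).astype(int)

x = np.array([1.0, 0.2, 0.4])
label = black_box(x[None, :])[0]

# Samples drawn around the instance; the solver-based search of the paper
# is replaced by exhaustive search over subsets of bounded size k.
X = rng.random((5000, 3))

def error(S):
    """Fraction of samples that match x on S but receive a different label."""
    mask = (np.abs(X[:, S] - x[S]) <= 0.1).all(axis=1)
    if not mask.any():
        return 0.0
    return (black_box(X[mask]) != label).mean()

k = 2
best = min((S for r in range(1, k + 1) for S in combinations(range(3), r)),
           key=lambda S: (error(np.array(S)), len(S)))
print("explanation features:", best, "error:", error(np.array(best)))
```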
In this tutorial, we survey known results on the complexity of conjunctive query evaluation in different settings, ranging from Boolean queries and counting to more complex models such as enumeration and direct access. A particular focus is on showing how different, relatively recent hypotheses from complexity theory connect to query answering and allow one to show that, in several cases, known algorithms likely cannot be improved.
A systematic study of the electronic structure and optical properties of the thio-apatites Ba$_5$(VS$_\alpha$O$_\beta$)$_3$X (X = Cl, F, Br, I) is carried out through first-principles density functional theory simulations. The evolution of the band gap and related properties from fluorine to iodine at fixed O/S ratios, as well as upon substituting sulfur (S) for oxygen (O), is discussed. The reduction of the band gap through the raising of valence-band energy levels with increasing S/O ratio can be further modulated by the type of halide in the channels of the structure, thus enabling fine tuning of the band gap. Defect states also play a crucial role in band gap modulation. Furthermore, the examination of the band-edge properties of the Ba$_5$(VS$_\alpha$O$_\beta$)$_3$X compounds suggests that they are potential photocatalyst candidates for the water-splitting reaction, with reduced band gaps enabling efficient light-driven reactions, particularly in Ba$_5$(VS$_\alpha$O$_\beta$)$_3$I. Optical investigations reveal that sulfur doping induces optical anisotropy, enhancing light absorption and offering tailored optical behaviour. These results provide new insights for the design of functional materials in the broad family of apatites.
While the success of pre-trained language models has largely eliminated the need for high-quality static word vectors in many NLP applications, such vectors continue to play an important role in tasks where words need to be modelled in the absence of linguistic context. In this paper, we explore how the contextualised embeddings predicted by BERT can be used to produce high-quality word vectors for such domains, in particular related to knowledge base completion, where our focus is on capturing the semantic properties of nouns. We find that a simple strategy of averaging the contextualised embeddings of masked word mentions leads to vectors that outperform the static word vectors learned by BERT, as well as those from standard word embedding models, in property induction tasks. We notice in particular that masking target words is critical to achieve this strong performance, as the resulting vectors focus less on idiosyncratic properties and more on general semantic properties. Inspired by this view, we propose a filtering strategy which is aimed at removing the most idiosyncratic mention vectors, allowing us to obtain further performance gains in property induction.
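The averaging strategy is straightforward to reproduce with the HuggingFace transformers API. The sketch below replaces each mention of a target noun by the [MASK] token, takes the final-layer hidden state at the mask position, and averages over mentions; layer choice, mention corpus, and the paper's filtering step are omitted.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def masked_mention_vector(sentences, target):
    """Average the contextualised embedding of the [MASK] token that
    replaces `target` in each mention sentence."""
    vecs = []
    for sent in sentences:
        masked = sent.replace(target, tokenizer.mask_token)
        enc = tokenizer(masked, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]
        pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
        vecs.append(hidden[pos])
    return torch.stack(vecs).mean(dim=0)

# Hypothetical mention sentences for the noun "banana".
mentions = [
    "a banana is a long yellow fruit .",
    "she peeled the banana before eating it .",
]
v = masked_mention_vector(mentions, "banana")
print(v.shape)  # torch.Size([768])
```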
The optimization of yields in multi-reactor systems, which are advanced tools in heterogeneous catalysis research, presents a significant challenge due to hierarchical technical constraints. To this end, this work introduces a novel approach called process-constrained batch Bayesian optimization via Thompson sampling (pc-BO-TS) and its generalized hierarchical extension (hpc-BO-TS). This method, tailored to the efficiency demands of multi-reactor systems, integrates experimental constraints and balances exploration and exploitation in a sequential batch optimization strategy, improving on existing Bayesian optimization methods. The performance of pc-BO-TS and hpc-BO-TS is validated on synthetic cases as well as in a realistic scenario based on data obtained from high-throughput experiments performed on a multi-reactor system available on the REALCAT platform. The proposed methods often outperform other sequential Bayesian optimization approaches and existing process-constrained batch Bayesian optimization methods. This work thus offers a novel way to optimize the yield of a reaction in a multi-reactor system, marking a significant step forward in digital catalysis and, more generally, in optimization methods for chemical engineering.
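A minimal sketch of the process-constrained Thompson-sampling idea, not the authors' exact algorithm: a Gaussian-process surrogate is fitted to past experiments, and each batch must share one process-level variable (here, a common temperature across reactors) while reactor-level variables vary, with batch points chosen from posterior draws. The objective and the constraint are hypothetical.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def yield_fn(x):  # hypothetical reaction yield: x = (temperature, flow)
    return np.exp(-((x[:, 0] - 0.6) ** 2 + (x[:, 1] - 0.3) ** 2) / 0.05)

# Candidate grid; process constraint: all reactors in one batch must share
# the same temperature (first coordinate), while flows may differ.
temps = np.linspace(0, 1, 21)
flows = np.linspace(0, 1, 21)
cands = np.array([(t, f) for t in temps for f in flows])

X = rng.random((5, 2))                       # initial experiments
y = yield_fn(X) + 0.01 * rng.normal(size=5)
batch_size = 4

for it in range(10):
    gp = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-4).fit(X, y)
    # Thompson sampling: one posterior draw fixes the batch temperature,
    # then further draws fill the reactor slots at that temperature.
    draw = gp.sample_y(cands, n_samples=1, random_state=it).ravel()
    t_star = cands[draw.argmax(), 0]
    sub = cands[cands[:, 0] == t_star]
    draws = gp.sample_y(sub, n_samples=batch_size, random_state=100 + it)
    batch = sub[draws.argmax(axis=0)]
    X = np.vstack([X, batch])
    y = np.concatenate([y, yield_fn(batch) + 0.01 * rng.normal(size=batch_size)])

print("best yield found:", float(y.max().round(3)))
```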
We study the problem of evaluating a Monadic Second Order (MSO) query over strings under updates in the setting of direct access. We present an algorithm that, given an MSO query with first-order free variables represented by an unambiguous variable-set automaton $\mathcal{A}$ with state set $Q$ and variables $X$, and a string $s$, computes a data structure in time $\mathcal{O}(|Q|^\omega \cdot |X|^2 \cdot |s|)$ and then, given an index $i$, retrieves, using the data structure, the $i$-th output of the evaluation of $\mathcal{A}$ over $s$ in time $\mathcal{O}(|Q|^\omega \cdot |X|^3 \cdot \log(|s|)^2)$, where $\omega$ is the exponent for matrix multiplication. Ours is the first efficient direct access algorithm for MSO query evaluation over strings; such algorithms had so far only been studied for first-order queries and conjunctive queries over relational data. Our algorithm gives the answers in lexicographic order where, in contrast to the setting of conjunctive queries, the order between variables can be freely chosen by the user without degrading the runtime. Moreover, our data structure can be updated efficiently after changes to the input string, allowing more powerful updates than in the enumeration literature, e.g., efficient deletion of substrings, concatenation and splitting of strings, and cut-and-paste operations. Our approach combines a matrix representation of MSO queries with a novel data structure for dynamic word problems over semigroups, which yields an overall algorithm that is elegant and easy to formulate.
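One core ingredient, the combination of a matrix representation with a dynamic word problem, can be illustrated by a toy balanced tree over transition matrices: point updates and whole-product queries in $\mathcal{O}(\log n)$ semigroup operations. The paper's structure supports far richer updates (splits, concatenations, cut-and-paste); this sketch shows only the principle.

```python
import numpy as np

class SegmentProduct:
    """Maintain the ordered product m_1 * ... * m_n over an associative
    operation under point updates in O(log n): a toy dynamic word problem."""
    def __init__(self, items, op, identity):
        self.op = op
        self.n = 1
        while self.n < len(items):  # pad to a power of two with the identity
            self.n *= 2
        leaves = list(items) + [identity] * (self.n - len(items))
        self.tree = [identity] * self.n + leaves
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = op(self.tree[2 * i], self.tree[2 * i + 1])

    def update(self, i, item):
        i += self.n
        self.tree[i] = item
        while i > 1:  # recompute products along the path to the root
            i //= 2
            self.tree[i] = self.op(self.tree[2 * i], self.tree[2 * i + 1])

    def product(self):
        return self.tree[1]

# Boolean transition matrices of a 2-state automaton: reading the whole
# string amounts to the ordered product of per-letter matrices.
M = {"a": np.array([[0, 1], [0, 1]], bool), "b": np.array([[1, 0], [1, 0]], bool)}
boolmat = lambda A, B: (A.astype(int) @ B.astype(int)) > 0
s = "abab"
tree = SegmentProduct([M[c] for c in s], boolmat, np.eye(2, dtype=bool))
print(tree.product())        # reachability after reading "abab"
tree.update(3, M["a"])       # position 3 changes: the string becomes "abaa"
print(tree.product())
```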