Yokohama City University
In the field of emotion recognition, the development of high-performance models remains a challenge due to the scarcity of high-quality, diverse emotional datasets. Emotional expressions are inherently subjective, shaped by individual personality traits, socio-cultural backgrounds, and contextual factors, making large-scale, generalizable data collection both ethically and practically difficult. To address this issue, we introduce PersonaGen, a novel framework for generating emotionally rich text using a Large Language Model (LLM) through multi-stage persona-based conditioning. PersonaGen constructs layered virtual personas by combining demographic attributes, socio-cultural backgrounds, and detailed situational contexts, which are then used to guide emotion expression generation. We conduct comprehensive evaluations of the generated synthetic data, assessing semantic diversity through clustering and distributional metrics, human-likeness via LLM-based quality scoring, realism through comparison with real-world emotion corpora, and practical utility in downstream emotion classification tasks. Experimental results show that PersonaGen significantly outperforms baseline methods in generating diverse, coherent, and discriminative emotion expressions, demonstrating its potential as a robust alternative for augmenting or replacing real-world emotional datasets.
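A minimal sketch of the kind of multi-stage persona-based conditioning described above, assuming a hypothetical `build_persona_prompt` helper and a generic LLM client; the attribute pools and prompt wording are illustrative and not PersonaGen's actual implementation.

```python
import random

# Illustrative attribute pools; PersonaGen's real attribute sets are not specified here.
AGES = ["19", "34", "57", "72"]
OCCUPATIONS = ["nurse", "software engineer", "retired teacher", "student"]
CULTURES = ["urban Japan", "rural Brazil", "suburban Canada"]
SITUATIONS = [
    "just missed the last train home after a long shift",
    "received unexpected praise from a supervisor",
]
EMOTIONS = ["joy", "anger", "sadness", "fear"]

def build_persona_prompt(rng: random.Random, emotion: str) -> str:
    """Compose a layered persona (demographics + socio-cultural background
    + situational context) and ask the LLM to express the target emotion."""
    persona = (
        f"You are a {rng.choice(AGES)}-year-old {rng.choice(OCCUPATIONS)} "
        f"from {rng.choice(CULTURES)}. Context: you {rng.choice(SITUATIONS)}."
    )
    return (
        f"{persona}\nWrite 2-3 sentences, in first person, expressing {emotion} "
        f"as this persona would naturally express it."
    )

if __name__ == "__main__":
    rng = random.Random(0)
    for emotion in EMOTIONS:
        prompt = build_persona_prompt(rng, emotion)
        print(prompt, "\n---")
        # response = llm_client.generate(prompt)  # hypothetical LLM call
```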
Inverse probability (IP) weighting of marginal structural models (MSMs) can provide consistent estimators of time-varying treatment effects under correct model specification and identifiability assumptions, even in the presence of time-varying confounding. However, this method has two problems: (i) inefficiency due to IP weights that accumulate over all time points, and (ii) bias and inefficiency due to MSM misspecification. To address these problems, we propose (i) new IP weights for estimating the parameters of an MSM that depends on partial treatment history, and (ii) closed testing procedures for selecting the partial treatment history (how far back in time the MSM depends on past treatments). All theoretical results are provided under known IP weights. In simulation studies, our proposed methods outperformed existing methods both in estimating time-varying treatment effects and in selecting the partial treatment history. We also applied the proposed methods to real data on hemodialysis patients, with reasonable results.
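For reference, the textbook stabilized IP weights used to fit an MSM over the full treatment history take the form below (the paper's proposed weights for partial-history MSMs differ); here $A_{it}$ is treatment at time $t$, $\bar{A}_{i,t-1}$ the treatment history, and $\bar{L}_{it}$ the covariate history.

```latex
% Standard stabilized IP weight for subject i up to time T
% (textbook form; the paper's proposed partial-history weights differ).
SW_i \;=\; \prod_{t=0}^{T}
  \frac{f\!\left(A_{it} \mid \bar{A}_{i,t-1}\right)}
       {f\!\left(A_{it} \mid \bar{A}_{i,t-1}, \bar{L}_{it}\right)}
```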
The SELLM framework enables large language models to systematically generate cross-disciplinary solutions for complex scientific problems by creating comprehensive expert agents. This approach successfully produced high-quality, practical solutions for electronic device development challenges, outperforming standard LLM methods across various evaluation metrics.
Well-being in family settings involves subtle psychological dynamics that conventional metrics often overlook. In particular, unconscious parental expectations, termed ideal parent bias, can suppress children's emotional expression and autonomy. This suppression, referred to as suppressed emotion, often stems from well-meaning but value-driven communication, which is difficult to detect or address from outside the family. Focusing on these latent dynamics, this study explores Large Language Model (LLM)-based support for psychologically safe family communication. We constructed a Japanese parent-child dialogue corpus of 30 scenarios, each annotated with metadata on ideal parent bias and suppressed emotion. Based on this corpus, we developed a Role-Playing LLM-based multi-agent dialogue support framework that analyzes dialogue and generates feedback. Specialized agents detect suppressed emotion, describe implicit ideal parent bias in parental speech, and infer contextual attributes such as the child's age and background. A meta-agent compiles these outputs into a structured report, which is then passed to five selected expert agents. These agents collaboratively generate empathetic and actionable feedback through a structured four-step discussion process. Experiments show that the system can detect categories of suppressed emotion with moderate accuracy and produce feedback rated highly in empathy and practicality. Moreover, simulated follow-up dialogues incorporating this feedback exhibited signs of improved emotional expression and mutual understanding, suggesting the framework's potential in supporting positive transformation in family interactions.
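A schematic sketch of the multi-agent flow described above (specialized detector agents, a meta-agent compiling a report, and expert agents holding a structured discussion); every class, function, and prompt here is hypothetical, since the abstract does not specify the framework's interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class DialogueReport:
    """Structured report compiled by the meta-agent (fields are illustrative)."""
    suppressed_emotion: str = ""
    ideal_parent_bias: str = ""
    context: dict = field(default_factory=dict)

def analyze_dialogue(dialogue: str, llm) -> DialogueReport:
    # Specialized agents: each is a separate LLM call with its own role prompt.
    return DialogueReport(
        suppressed_emotion=llm("Detect suppressed emotion in this dialogue:\n" + dialogue),
        ideal_parent_bias=llm("Describe implicit ideal-parent bias in the parent's speech:\n" + dialogue),
        context={"inferred": llm("Infer the child's age and background:\n" + dialogue)},
    )

def generate_feedback(report: DialogueReport, experts: list[str], llm) -> str:
    # Selected expert agents discuss the report; the paper uses a structured
    # four-step discussion, so the step names below are generic placeholders.
    notes: list[str] = []
    for step in ["share impressions", "identify risks", "propose actions", "agree on feedback"]:
        for expert in experts:
            notes.append(llm(f"As a {expert}, {step}. Report: {report}. Notes so far: {notes}"))
    return llm("Compile empathetic, actionable feedback from these notes:\n" + "\n".join(notes))
```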
We revisit language bottleneck models as an approach to ensuring the explainability of deep learning models for image classification. Because of inevitable information loss incurred in the step of converting images into language, the accuracy of language bottleneck models is considered to be inferior to that of standard black-box models. Recent image captioners based on large-scale foundation models of Vision and Language, however, have the ability to accurately describe images in verbal detail to a degree that was previously believed to not be realistically possible. In a task of disaster image classification, we experimentally show that a language bottleneck model that combines a modern image captioner with a pre-trained language model can achieve image classification accuracy that exceeds that of black-box models. We also demonstrate that a language bottleneck model and a black-box model may be thought to extract different features from images and that fusing the two can create a synergistic effect, resulting in even higher classification accuracy.
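A minimal language-bottleneck sketch, assuming a Hugging Face captioner and a zero-shot text classifier as stand-ins; the model names and disaster labels are assumptions, not the paper's actual setup.

```python
# Classify an image only through its caption: image -> text -> label.
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

LABELS = ["flood", "fire", "collapsed building", "no visible disaster"]  # illustrative

def classify_via_language(image_path: str) -> dict:
    caption = captioner(image_path)[0]["generated_text"]   # image -> text bottleneck
    result = classifier(caption, candidate_labels=LABELS)  # text -> class
    return {"caption": caption, "label": result["labels"][0], "score": result["scores"][0]}
```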
Virtual try-on systems have significant potential in e-commerce, allowing customers to visualize garments on themselves. Existing image-based methods fall into two categories: those that directly warp garment-images onto person-images (explicit warping), and those using cross-attention to reconstruct given garments (implicit warping). Explicit warping preserves garment details but often produces unrealistic output, while implicit warping achieves natural reconstruction but struggles with fine details. We propose HYB-VITON, a novel approach that combines the advantages of each method and includes both a preprocessing pipeline for warped garments and a novel training option. These components allow us to utilize beneficial regions of explicitly warped garments while leveraging the natural reconstruction of implicit warping. A series of experiments demonstrates that HYB-VITON preserves garment details more faithfully than recent diffusion-based methods, while producing more realistic results than a state-of-the-art explicit warping method.
In this paper we revisit the isomorphism $SU(2)\otimes SU(2)\cong SO(4)$ and apply it to some subjects in Quantum Computation and Mathematical Physics. The unitary matrix $Q$ of Makhlin, which realizes the isomorphism as an adjoint action, is studied and generalized from a different point of view. Some problems are also presented. In particular, the homogeneous manifold $SU(2n)/SO(2n)$, which characterizes entanglements in the case $n=2$, is studied, and a clear-cut calculation of the universal Yang-Mills action in (hep-th/0602204) is given for the abelian case.
For a graph $G$, a subset $S$ of $V(G)$ is a {\it hop dominating set} of $G$ if every vertex not in $S$ has a $2$-step neighbor in $S$. The {\it hop domination number}, $\gamma_h(G)$, of $G$ is the minimum cardinality of a hop dominating set of $G$. In this paper, we show that for a connected triangle-free graph $G$ with $n\ge 15$ vertices, if $\delta(G)\ge 2$, then $\gamma_h(G)\le \frac{2n}{5}$, and the bound is tight. We also give some tight upper bounds on $\gamma_h(G)$ for triangle-free graphs $G$ that contain a Hamiltonian path or a Hamiltonian cycle.
Latent variable models provide a powerful framework for incorporating and inferring unobserved factors in observational data. In causal inference, they help account for hidden factors influencing treatment or outcome, thereby addressing challenges posed by missing or unmeasured covariates. This paper proposes a new framework that integrates latent variable modeling into the double machine learning (DML) paradigm to enable robust causal effect estimation in the presence of such hidden factors. We consider two scenarios: one where a latent variable affects only the outcome, and another where it may influence both treatment and outcome. To ensure tractability, we incorporate latent variables only in the second stage of DML, separating representation learning from latent inference. We demonstrate the robustness and effectiveness of our method through extensive experiments on both synthetic and real-world datasets.
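For orientation, the standard partially linear DML estimator without any latent variable regresses cross-fitted outcome residuals on treatment residuals, as below with $\ell(X)=E[Y\mid X]$ and $m(X)=E[D\mid X]$; the paper's contribution is to introduce latent variables into this second stage.

```latex
% Standard partialling-out DML estimator (no latent variable), shown for reference:
\hat{\theta} \;=\;
\frac{\frac{1}{n}\sum_{i=1}^{n}\bigl(Y_i - \hat{\ell}(X_i)\bigr)\bigl(D_i - \hat{m}(X_i)\bigr)}
     {\frac{1}{n}\sum_{i=1}^{n}\bigl(D_i - \hat{m}(X_i)\bigr)^{2}}
```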
For an edge-colored graph $G$, the minimum color degree of $G$ is the minimum, over all vertices $v$ of $G$, of the number of colors appearing on the edges incident to $v$. We prove that if $G$ is an edge-colored graph with minimum color degree at least $5$, then $V(G)$ can be partitioned into two parts such that each part induces a subgraph with minimum color degree at least $2$. We prove this theorem in a much stronger form. Moreover, we point out an important relationship between our theorem and the Bermond-Thomassen conjecture on digraphs.
In observational studies, the propensity score plays a central role in estimating causal effects. Since the propensity score is usually unknown, estimating it by an appropriate procedure is an indispensable step. Note that a causal effect estimator may be biased if the propensity score model is misspecified, so valid model construction is important. To overcome this problem, a variety of interesting methods have been proposed. In this paper, we review four methods: ordinary logistic regression; the covariate balancing propensity score (CBPS) proposed by Imai and Ratkovic; boosted CART proposed by McCaffrey and colleagues; and a semiparametric strategy proposed by Liu and colleagues. We also propose a novel robust two-step strategy: estimating each candidate model in the first step and integrating them in the second step. We assess the performance of these methods through simulation examples by estimating the ATE and the ATO proposed by Li and colleagues. The simulation results show that boosted CART and CBPS with a higher-order balancing condition have good properties; their estimates of both the ATE and the ATO have small variance and small absolute bias. Boosted CART and CBPS are useful for a variety of estimands and estimating procedures.
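As a point of reference for the first reviewed approach, a minimal sketch of an IPW estimate of the ATE with an ordinary logistic-regression propensity model; this is illustrative only and is not the proposed two-step strategy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_ate(X: np.ndarray, treatment: np.ndarray, y: np.ndarray) -> float:
    """Estimate the ATE by inverse probability weighting,
    using a plain logistic-regression propensity model (baseline method)."""
    propensity = LogisticRegression(max_iter=1000).fit(X, treatment).predict_proba(X)[:, 1]
    w1 = treatment / propensity                # weights for treated units
    w0 = (1 - treatment) / (1 - propensity)    # weights for control units
    # Hajek (normalized) IPW estimator of E[Y(1)] - E[Y(0)]
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)
```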
Let $k$ be a positive integer, and let $G$ be a $k$-connected graph. An edge-coloured path is \emph{rainbow} if all of its edges have distinct colours. The \emph{rainbow $k$-connection number} of $G$, denoted by $rc_k(G)$, is the minimum number of colours in an edge-colouring of $G$ such that any two vertices are connected by $k$ internally vertex-disjoint rainbow paths. The function $rc_k(G)$ was introduced by Chartrand, Johns, McKeon and Zhang in 2009, and has since attracted significant interest. Let $t_k(n,r)$ denote the minimum number of edges in a $k$-connected graph $G$ on $n$ vertices with $rc_k(G)\le r$. Let $s_k(n,r)$ denote the maximum number of edges in a $k$-connected graph $G$ on $n$ vertices with $rc_k(G)\ge r$. The functions $t_1(n,r)$ and $s_1(n,r)$ have previously been studied by various authors. In this paper, we study the functions $t_2(n,r)$ and $s_2(n,r)$. We determine bounds for $t_2(n,r)$ which imply that $t_2(n,2)=(1+o(1))n\log_2 n$, and that $t_2(n,r)$ is linear in $n$ for $r\ge 3$. We also provide some remarks about the function $s_2(n,r)$.
In 1966, T. Gallai asked whether every connected graph has a vertex that appears in all longest paths. Since then this question has attracted much attention and much work has been done on this topic. One important open question in this area asks whether any three longest paths in a connected graph share a common vertex, and it has been conjectured that the answer is positive. In this paper, we propose a new approach based on distances among longest paths in a connected graph, and make substantial progress towards the conjecture along these lines.
Classification problems are essential statistical tasks that form the foundation of decision-making across various fields, including patient prognosis and treatment strategies for critical conditions. Consequently, evaluating the performance of classification models is of significant importance, and numerous evaluation metrics have been proposed. Among these, the Matthews correlation coefficient (MCC), also known as the phi coefficient, is widely recognized as a reliable metric that provides balanced measurements even in the presence of class imbalance. However, with the increasing prevalence of multiclass classification problems involving three or more classes, macro-averaged and micro-averaged extensions of MCC have been employed, despite a lack of clear definitions or established references for these extensions. In the present study, we propose a formal framework for MCC tailored to multiclass classification problems using macro-averaged and micro-averaged approaches. Moreover, discussions on the use of these extended MCCs for multiclass problems often rely solely on point estimates, potentially overlooking the statistical significance and reliability of the results. To address this gap, we introduce several methods for constructing asymptotic confidence intervals for the proposed metrics. Furthermore, we extend these methods to include the construction of asymptotic confidence intervals for differences in the proposed metrics, specifically for paired study designs. The utility of our methods is evaluated through comprehensive simulations and real-world data analyses.
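One common reading of the macro-averaged extension, sketched below as a one-vs-rest average of binary MCCs; the paper's formal definitions may differ, so this is only an illustration.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef

def macro_averaged_mcc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """One plausible macro-averaged MCC: binarize each class one-vs-rest,
    compute the binary MCC, and average over classes (illustration only)."""
    classes = np.unique(y_true)
    per_class = [matthews_corrcoef(y_true == c, y_pred == c) for c in classes]
    return float(np.mean(per_class))

# Example:
# y_true = np.array([0, 1, 2, 2, 1, 0]); y_pred = np.array([0, 2, 2, 2, 1, 0])
# print(macro_averaged_mcc(y_true, y_pred))
```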
In this paper we discuss a master equation applied to the two-level system of an atom and derive an exact solution to it in an abstract manner. We also present a problem and a conjecture based on the three-level system. Our results may give a small hint toward understanding the huge transition from the Quantum World to the Classical World. To the best of our knowledge, this is the finest method available at present.
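For orientation, a standard Lindblad-form master equation for a two-level atom with decay rate $\gamma$ is shown below; the paper's specific equation and its exact solution are not reproduced here.

```latex
% A standard Lindblad-form master equation for a two-level atom with
% spontaneous decay rate \gamma (shown only for orientation):
\frac{d\rho}{dt} \;=\; -\frac{i}{\hbar}\,[H,\rho]
 \;+\; \gamma\!\left(\sigma_{-}\rho\,\sigma_{+}
 \;-\; \tfrac{1}{2}\{\sigma_{+}\sigma_{-},\,\rho\}\right)
```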
PHYSBO (optimization tools for PHYSics based on Bayesian Optimization) is a Python library for fast and scalable Bayesian optimization. It has been developed mainly for application in the basic sciences such as physics and materials science. Bayesian optimization is used to select an appropriate input for experiments/simulations from candidate inputs listed in advance in order to obtain better output values with the help of machine learning prediction. PHYSBO can be used to find better solutions for both single and multi-objective optimization problems. At each cycle in the Bayesian optimization, a single proposal or multiple proposals can be obtained for the next experiments/simulations. These proposals can be obtained interactively for use in experiments. PHYSBO is available at this https URL
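A generic sketch of the select-evaluate-update loop that Bayesian optimization automates, using scikit-learn's Gaussian process and an expected-improvement acquisition purely for illustration; this is not PHYSBO's API.

```python
# Generic Bayesian optimization over a fixed list of candidate inputs.
# NOT PHYSBO's interface; scikit-learn's GP is used only for illustration.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(mu, sigma, best):
    sigma = np.maximum(sigma, 1e-12)
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt(candidates, objective, n_init=5, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    picked = list(rng.choice(len(candidates), size=n_init, replace=False))
    values = [objective(candidates[i]) for i in picked]
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(normalize_y=True).fit(candidates[picked], values)
        mu, sigma = gp.predict(candidates, return_std=True)
        ei = expected_improvement(mu, sigma, max(values))
        ei[picked] = -np.inf                 # do not re-propose evaluated points
        nxt = int(np.argmax(ei))             # single proposal for the next experiment
        picked.append(nxt)
        values.append(objective(candidates[nxt]))
    best = picked[int(np.argmax(values))]
    return candidates[best], max(values)
```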
We propose a novel path sampling method based on the Onsager-Machlup (OM) action by generalizing the multiscale enhanced sampling (MSES) technique suggested by Moritsugu and coworkers (J. Chem. Phys. 133, 224105 (2010)). The basic idea of this method is that the system we want to study (for example, some molecular system described by molecular mechanics) is coupled to a coarse-grained (CG) system, which can move more quickly and be computed more efficiently than the original system. We simulate this combined system (original + CG system) using (underdamped) Langevin dynamics, where different heat baths are coupled to the two systems. When the coupling is strong enough, the original system is guided by the CG system and is able to sample the configuration and path space more efficiently. However, we need to correct the bias caused by the coupling, which we do by employing Hamiltonian replica exchange with many path replicas prepared at different coupling strengths. As a result, an unbiased path ensemble for the original system can be found in the weakest-coupling path ensemble. This strategy is easy to implement because the weight of a path calculated from the OM action is formally the same as the Boltzmann weight if we properly define the path "Hamiltonian". We apply this method to a model polymer with the Asakura-Oosawa interaction, and compare the results with the conventional transition path sampling method.
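For orientation, the overdamped form of the OM action for dynamics $\dot{x} = f(x) + \sqrt{2D}\,\xi(t)$ is shown below; the paper works with underdamped Langevin dynamics, so its action differs.

```latex
% Onsager-Machlup action for overdamped Langevin dynamics
% \dot{x} = f(x) + \sqrt{2D}\,\xi(t) (shown for orientation only):
S_{\mathrm{OM}}[x] \;=\; \int_{0}^{T}\!\left[
  \frac{\bigl(\dot{x}(t) - f(x(t))\bigr)^{2}}{4D}
  \;+\; \frac{1}{2}\,\nabla\!\cdot f(x(t))
\right] dt
```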
We derive a simple and precise approximation to probability density functions in sampling distributions based on the Fourier cosine series. After clarifying the required conditions, we illustrate the approximation on two examples: the distribution of the sum of uniformly distributed random variables, and the distribution of sample skewness drawn from a normal population. The probability density function of the first example can be explicitly expressed, but that of the second example has no explicit expression.
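A standard Fourier-cosine approximation of a density $f$ effectively supported on $[a,b]$, with coefficients computed from the characteristic function $\varphi$, is of the form below (the primed sum halves the $k=0$ term); the paper's required conditions and error analysis govern when such an approximation applies.

```latex
% Fourier-cosine approximation of a density on [a,b] from its
% characteristic function \varphi (prime: the k = 0 term is halved):
f(x) \;\approx\; {\sum_{k=0}^{K}}' A_k \cos\!\left(k\pi\,\frac{x-a}{b-a}\right),
\qquad
A_k \;=\; \frac{2}{b-a}\,
\operatorname{Re}\!\left[\varphi\!\left(\frac{k\pi}{b-a}\right)
 e^{-\,i k\pi a/(b-a)}\right]
```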
An edge-colored connected graph $G$ is properly connected if between every pair of distinct vertices there exists a path in which no two adjacent edges have the same color. Fujita (2019) introduced the optimal proper connection number $\mathrm{pc}_{\mathrm{opt}}(G)$ of a monochromatic connected graph $G$ to capture how efficiently $G$ can be made properly connected. More precisely, $\mathrm{pc}_{\mathrm{opt}}(G)$ is the smallest integer $p+q$ such that a given monochromatic graph $G$ can be converted into a properly connected graph by recoloring $p$ edges with $q$ colors. In this paper, we show that $\mathrm{pc}_{\mathrm{opt}}(G)$ has an upper bound in terms of the independence number $\alpha(G)$. Namely, we prove that for a connected graph $G$, $\mathrm{pc}_{\mathrm{opt}}(G)\le \frac{5\alpha(G)-1}{2}$. Moreover, for the case $\alpha(G)\leq 3$, we improve the upper bound to $4$, which is tight.
Let $G$ be an edge-colored graph. We use $e(G)$ and $c(G)$ to denote the number of edges of $G$ and the number of colors appearing on $E(G)$, respectively. For a vertex $v\in V(G)$, the \emph{color neighborhood} of $v$ is defined as the set of colors assigned to the edges incident to $v$. A subgraph of $G$ is \emph{rainbow} if all of its edges are assigned distinct colors. The well-known Mantel's theorem states that a graph $G$ on $n$ vertices contains a triangle if $e(G)\geq\lfloor\frac{n^2}{4}\rfloor+1$. Rademacher (1941) showed that $G$ contains at least $\lfloor\frac{n}{2}\rfloor$ triangles under the same condition. Li, Ning, Xu and Zhang (2014) proved a rainbow version of Mantel's theorem: an edge-colored graph $G$ has a rainbow triangle if $e(G)+c(G)\geq n(n+1)/2$. In this paper, we first characterize all graphs $G$ satisfying $e(G)+c(G)\geq n(n+1)/2-1$ but containing no rainbow triangle. Motivated by Rademacher's theorem, we then characterize all graphs $G$ which satisfy $e(G)+c(G)\geq n(n+1)/2$ but contain only one rainbow triangle. We further obtain two results on color neighborhood conditions for the existence of short rainbow cycles. Our results improve a previous theorem due to Broersma, Li, Woeginger, and Zhang (2005). Moreover, we provide a sufficient condition in terms of color neighborhoods for the existence of a specified number of vertex-disjoint rainbow cycles.