MIT World Peace University
Diffusion models have achieved remarkable success in data-driven learning and in sampling from complex, unnormalized target distributions. Building on this progress, we reinterpret Maximum Entropy Reinforcement Learning (MaxEntRL) as a diffusion model-based sampling problem. We tackle this problem by minimizing the reverse Kullback-Leibler (KL) divergence between the diffusion policy and the optimal policy distribution using a tractable upper bound. By applying the policy gradient theorem to this objective, we derive a modified surrogate objective for MaxEntRL that incorporates diffusion dynamics in a principled way. This leads to simple diffusion-based variants of Soft Actor-Critic (SAC), Proximal Policy Optimization (PPO) and Wasserstein Policy Optimization (WPO), termed DiffSAC, DiffPPO and DiffWPO. All of these methods require only minor implementation changes to their base algorithm. We find that on standard continuous control benchmarks, DiffSAC, DiffPPO and DiffWPO achieve better returns and higher sample efficiency than SAC and PPO.
Fleischner introduced the idea of splitting a vertex of degree at least three in a connected graph and used the operation to characterize Eulerian graphs. Raghunathan et al. extended the splitting operation from graphs to binary matroids. It is known that the splitting operation does not, in general, preserve the connectedness of a binary matroid. Interestingly, the splitting matroid of a disconnected matroid may be connected. In this paper, we characterize the disconnected binary matroids whose splitting matroid is connected.
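The claim that splitting can connect a disconnected matroid can be checked on a small case. The sketch below is illustrative only and is not taken from the paper. It assumes the usual matrix definition of the splitting operation from the binary-matroid literature: given a matrix representing the matroid over GF(2), append one row that is 1 in the columns of the two chosen elements and 0 elsewhere. The function names `split`, `circuits`, and `is_connected` are hypothetical helpers, and the brute-force circuit search is only practical for very small matroids.

```python
from itertools import combinations


def split(matrix, x, y):
    """Assumed splitting operation: append a GF(2) row with 1s in columns x and y."""
    n = len(matrix[0])
    extra = [1 if j in (x, y) else 0 for j in range(n)]
    return [row[:] for row in matrix] + [extra]


def circuits(matrix):
    """Brute-force the circuits (minimal dependent column sets) over GF(2)."""
    rows, n = len(matrix), len(matrix[0])
    # Encode each column as a bitmask so that GF(2) addition becomes XOR.
    cols = [sum(matrix[i][j] << i for i in range(rows)) for j in range(n)]
    found = []
    for size in range(1, n + 1):  # increasing size guarantees minimality
        for subset in combinations(range(n), size):
            acc = 0
            for j in subset:
                acc ^= cols[j]
            s = set(subset)
            # Keep a zero-sum set only if it contains no smaller circuit.
            if acc == 0 and not any(c < s for c in found):
                found.append(frozenset(subset))
    return found


def is_connected(matrix):
    """A matroid is connected iff its circuits link all elements together."""
    n = len(matrix[0])
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    for c in circuits(matrix):
        first, *rest = c
        for e in rest:
            parent[find(e)] = find(first)
    return len({find(e) for e in range(n)}) == 1


# Two parallel pairs: a disconnected binary matroid on four elements.
A = [[1, 1, 0, 0],
     [0, 0, 1, 1]]
print(is_connected(A))               # disconnected before splitting
print(is_connected(split(A, 0, 2)))  # a single 4-element circuit after splitting
```

In this example the two elements being split come from different components, which is what allows the new row to tie the components into one circuit.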
Several Artificial Intelligence-based heuristic and metaheuristic algorithms have been developed so far. These algorithms have shown their superiority in solving complex problems from different domains. However, it is necessary to critically validate these algorithms on real-world constrained optimization problems. The search behavior in those problems is different, as it involves a large number of linear, nonlinear, and non-convex equality and inequality constraints. In this work, a test suite of 57 real-world constrained optimization problems is solved using two constrained metaheuristic algorithms derived from the socio-based Cohort Intelligence (CI) algorithm. The first CI-based algorithm, CI-SAPF, incorporates a self-adaptive penalty function approach. The second algorithm, referred to as CI-SAPF-CBO, combines CI-SAPF with the intrinsic properties of the physics-based Colliding Bodies Optimization (CBO). The results obtained from CI-SAPF and CI-SAPF-CBO are compared with those of other constrained optimization algorithms. The superiority of the proposed algorithms is discussed in detail, followed by future directions for evolving constraint-handling techniques.
Artificial Intelligence (AI) is everywhere today. Unfortunately, agriculture has not received much attention from AI, and a lack of automation persists in the agriculture industry. For many years, farmers and crop field owners have faced the problem of wild animals trespassing on their fields, for which no feasible solution has been provided. Installing a fence or barrier-like structure is neither feasible nor efficient due to the large areas covered by the fields. Moreover, even if the landowner can afford to build a wall or barrier, government policies for building walls are often very burdensome. The paper intends to give a simple, intelligible solution to the problem with Automated Crop Field Surveillance using Computer Vision. The solution will significantly reduce the cost of crops destroyed annually and completely automate the security of the field.