MIT Sloan School of Management
This study from Microsoft Research, GitHub Inc., and MIT Sloan School of Management empirically measures the productivity impact of GitHub Copilot in software development, finding that developers with access to the AI tool completed programming tasks 55.8% faster than a control group without it. The research used a randomized controlled experiment with professional programmers to quantify these efficiency gains.
Large language models (LLMs) integrated into agent-driven workflows hold immense promise for healthcare, yet a significant gap exists between their potential and practical implementation within clinical settings. To address this, we present a practitioner-oriented field manual for deploying generative agents that use electronic health record (EHR) data. This guide is informed by our experience deploying the "irAE-Agent", an automated system to detect immune-related adverse events from clinical notes at Mass General Brigham, and by structured interviews with 20 clinicians, engineers, and informatics leaders involved in the project. Our analysis reveals a critical misalignment in clinical AI development: less than 20% of our effort was dedicated to prompt engineering and model development, while over 80% was consumed by the sociotechnical work of implementation. We distill this effort into five "heavy lifts": data integration, model validation, ensuring economic value, managing system drift, and governance. By providing actionable solutions for each of these challenges, this field manual shifts the focus from algorithmic development to the essential infrastructure and implementation work required to bridge the "valley of death" and successfully translate generative AI from pilot projects into routine clinical care.
Haiwen Li and Sinan Aral from MIT conducted a large-scale experiment revealing how generative AI search interface designs causally influence human trust. They found that merely presenting reference links, even if invalid, significantly boosts trust and reduces critical evaluation, while uncertainty highlighting paradoxically decreases trust in AI-generated information.
To learn how to behave, the current revolutionary generation of AIs must be trained on vast quantities of published images, written works, and sounds, many of which fall within the core subject matter of copyright law. To some, the use of copyrighted works as training sets for AI is merely a transitory and non-consumptive use that does not materially interfere with owners' content or copyrights protecting it. Companies that use such content to train their AI engines often believe such usage should be considered "fair use" under United States law (sometimes known as "fair dealing" in other countries). By contrast, many copyright owners, as well as their supporters, consider the incorporation of copyrighted works into training sets for AI to constitute misappropriation of owners' intellectual property, and, thus, decidedly not fair use under the law. This debate is vital to the future trajectory of AI and its applications. In this article, we analyze the arguments in favor of, and against, viewing the use of copyrighted works in training sets for AI as fair use. We call this form of fair use "fair training". We identify both strong and spurious arguments on both sides of this debate. In addition, we attempt to take a broader perspective, weighing the societal costs (e.g., replacement of certain forms of human employment) and benefits (e.g., the possibility of novel AI-based approaches to global issues such as environmental disruption) of allowing AI to make easy use of copyrighted works as training sets to facilitate the development, improvement, adoption, and diffusion of AI. Finally, we suggest that the debate over AI and copyrighted works may be a tempest in a teapot when placed in the wider context of massive societal challenges such as poverty, inequality, climate change, and loss of biodiversity, to which AI may be part of the solution.
Understanding land cover holds considerable potential for a myriad of practical applications, particularly as data accessibility transitions from being exclusive to governmental and commercial entities to now including the broader research community. Nevertheless, although the data is accessible to any community member interested in exploration, there exists a formidable learning curve and no standardized process for accessing, pre-processing, and leveraging the data for subsequent tasks. In this study, we democratize this data by presenting a flexible and efficient end-to-end pipeline for working with the Dynamic World dataset, a cutting-edge near-real-time land use/land cover (LULC) dataset. This includes a pre-processing and representation framework which tackles noise removal, efficient extraction of large amounts of data, and re-representation of LULC data in a format well suited for several downstream tasks. To demonstrate the power of our pipeline, we use it to extract data for an urbanization prediction problem and build a suite of machine learning models with excellent performance. This task is easily generalizable to the prediction of any type of land cover, and our pipeline is also compatible with a series of other downstream tasks.
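As a rough illustration of the kind of pre-processing and re-representation such a pipeline performs, the sketch below (plain NumPy, with array shapes and a nodata value chosen purely for illustration, not the authors' actual code) builds a temporal majority-vote composite from noisy per-timestep label rasters and turns it into per-tile class-fraction features for a downstream model.

```python
# Minimal sketch (not the paper's pipeline): given a stack of noisy per-timestep
# LULC label rasters, build a majority-vote composite and per-tile class-fraction
# features for a downstream model. Array shapes and nodata value are assumptions.
import numpy as np

N_CLASSES = 9  # Dynamic World distinguishes 9 LULC classes (labels 0..8)

def majority_composite(labels, nodata=255):
    """labels: (T, H, W) integer class labels per timestep; returns (H, W) mode."""
    T, H, W = labels.shape
    counts = np.zeros((N_CLASSES, H, W), dtype=np.int32)
    for c in range(N_CLASSES):
        counts[c] = (labels == c).sum(axis=0)
    composite = counts.argmax(axis=0)
    composite[counts.sum(axis=0) == 0] = nodata  # every observation missing/cloudy
    return composite

def class_fractions(composite, tile=32, nodata=255):
    """Re-represent the composite as per-tile class fractions (features for ML)."""
    H, W = composite.shape
    feats = []
    for i in range(0, H - tile + 1, tile):
        for j in range(0, W - tile + 1, tile):
            block = composite[i:i + tile, j:j + tile]
            valid = block[block != nodata]
            hist = np.bincount(valid, minlength=N_CLASSES)[:N_CLASSES]
            feats.append(hist / max(valid.size, 1))
    return np.array(feats)

# Toy usage with random labels standing in for extracted Dynamic World tiles.
rng = np.random.default_rng(0)
labels = rng.integers(0, N_CLASSES, size=(12, 64, 64)).astype(np.uint8)
comp = majority_composite(labels)
X = class_fractions(comp)
print(comp.shape, X.shape)
```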
We study the impact of non-pharmaceutical interventions (NPIs) on mortality and economic activity across U.S. cities during the 1918 Flu Pandemic. The combination of fast and stringent NPIs reduced peak mortality by 50% and cumulative excess mortality by 24% to 34%. However, while the pandemic itself was associated with short-run economic disruptions, we find that these disruptions were similar across cities with strict and lenient NPIs. NPIs also did not worsen medium-run economic outcomes. Our findings indicate that NPIs can reduce disease transmission without further depressing economic activity, a finding also reflected in discussions in contemporary newspapers.
Administrative burden has been growing in organizations despite many counterproductive effects. We develop a system dynamics model to explain why this phenomenon occurs and to explore potential remedies. Prior literature has identified behavioral mechanisms leading to process creation, obsolescence, and removal, but typically examines them individually. Here, we integrate these mechanisms in the context of an organization allocating limited resources to competing priorities. We show that their interaction -- via accumulation and feedback loops -- leads to two possible outcomes: a sustainable equilibrium, where administrative costs stabilize, and runaway administrative bloat, where administrative costs and waste accumulate in a self-reinforcing cycle. The two outcomes are separated by a critical threshold in management behavioral parameters -- the propensity to create processes in response to problems, and the propensity to prune obsolete processes in response to administrative burden. Rapid environmental change shifts the threshold unfavorably, making bloat more likely. We evaluate several intervention strategies using simulation and find that lasting reductions in administrative costs and waste require two key commitments: a permanent shift in organizational priorities, and investment in discerning obsolete processes from useful ones. In contrast, temporary shifts and indiscriminate process cuts offer only short-lived relief. Counterintuitively, we find that prioritizing direct production can increase administrative waste. Our findings suggest that while dynamic environments make administrative bloat more likely, administrative bloat is not inevitable -- managers play a critical role in preventing or reversing it.
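A minimal stock-and-flow sketch of the mechanism, with purely hypothetical parameters and functional forms rather than the authors' calibrated model, is below: burden grows as problems trigger new processes, shrinks as obsolete processes are pruned, and the creation and pruning propensities decide whether the stock settles or runs away.

```python
# Minimal stock-and-flow sketch (hypothetical parameters, not the authors' model):
# administrative burden B grows when problems trigger new processes and shrinks
# when obsolete processes are pruned; limited attention couples the two flows.
import numpy as np

def simulate(create_propensity, prune_propensity, T=400, dt=0.25):
    B = 1.0                      # stock of administrative processes/burden
    history = []
    for _ in range(int(T / dt)):
        attention_left = max(0.0, 1.0 - 0.02 * B)       # burden crowds out attention
        problems = 1.0 + 0.5 * (1.0 - attention_left)   # neglected work creates problems
        creation = create_propensity * problems         # new processes per unit time
        obsolete = 0.3 * B                              # share of processes now obsolete
        pruning = prune_propensity * attention_left * obsolete
        B = max(0.0, B + dt * (creation - pruning))
        history.append(B)
    return np.array(history)

# Below a critical creation/pruning balance the burden stabilizes; above it, it runs away.
for create, prune in [(0.5, 0.6), (0.9, 0.2)]:
    traj = simulate(create, prune)
    print(f"create={create}, prune={prune}: final burden ~ {traj[-1]:.1f}")
```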
In paired randomized experiments individuals in a given matched pair may differ on prognostically important covariates despite the best efforts of practitioners. We examine the use of regression adjustment as a way to correct for persistent covariate imbalances after randomization, and present two regression assisted estimators for the sample average treatment effect in paired experiments. Using the potential outcomes framework, we prove that these estimators are consistent for the sample average treatment effect under mild regularity conditions even if the regression model is improperly specified. Further, we describe how asymptotically conservative confidence intervals can be constructed. We demonstrate that the variances of the regression assisted estimators are at least as small as that of the standard difference-in-means estimator asymptotically. Through a simulation study, we illustrate the appropriateness of the proposed methods in small and moderate samples. The analysis does not require a superpopulation model, a constant treatment effect, or the truth of the regression model, and hence provides a mode of inference for the sample average treatment effect with the potential to yield improvements in the power of the resulting analysis over the classical analysis without imposing potentially unrealistic assumptions.
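The sketch below illustrates the general idea on synthetic data (it is an illustration of regression adjustment for pair differences, not the paper's exact estimators or variance calculations): the unadjusted estimator averages within-pair outcome differences, while the adjusted one also regresses those differences on within-pair covariate imbalances.

```python
# Minimal sketch (synthetic data; an illustration of the idea, not the paper's
# exact estimators): in a paired experiment, adjust the pair-difference estimator
# by regressing within-pair outcome differences on within-pair covariate differences.
import numpy as np

rng = np.random.default_rng(1)
n_pairs = 200
tau = 2.0                                      # true sample-average treatment effect

x_t = rng.normal(size=(n_pairs, 2))            # covariates of the treated unit in each pair
x_c = x_t + rng.normal(scale=0.8, size=x_t.shape)  # control unit: imperfectly matched
y_t = tau + x_t @ np.array([1.5, -1.0]) + rng.normal(size=n_pairs)
y_c = x_c @ np.array([1.5, -1.0]) + rng.normal(size=n_pairs)

d_y = y_t - y_c                                # within-pair outcome differences
d_x = x_t - x_c                                # within-pair covariate imbalances

# Unadjusted estimator: mean of pair differences.
tau_dm = d_y.mean()

# Regression-assisted estimator: regress pair differences on covariate imbalances
# (with intercept); the intercept estimates the treatment effect.
Z = np.column_stack([np.ones(n_pairs), d_x])
beta = np.linalg.lstsq(Z, d_y, rcond=None)[0]
tau_adj = beta[0]

print(f"difference-in-means: {tau_dm:.3f}, regression-adjusted: {tau_adj:.3f}")
```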
Social networks play a key role in studying various individual and social behaviors. To use social networks in a study, their structural properties must be measured. For offline social networks, the conventional procedure is surveying/interviewing a set of randomly-selected respondents. In many practical applications, inferring the network structure via sampling is prohibitively costly. There are also applications in which it simply fails. For example, for optimal vaccination or employing influential spreaders for public health interventions, we need to efficiently and quickly target well-connected individuals, which random sampling does not accomplish. In a few studies, an alternative sampling scheme (which we dub "alter sampling") has proven useful. This method simply targets randomly-chosen neighbors of the randomly-selected respondents. A natural question that arises is: to what extent does this method generalize? Is the method suitable for every social network or only the very few ones considered so far? In this paper, we demonstrate the robustness of this method across a wide range of networks with diverse structural properties. The method outperforms random sampling by a large margin for a vast majority of cases. We then propose an estimator to assess the advantage of choosing alter sampling over random sampling in practical scenarios, and demonstrate its accuracy via Monte Carlo simulations on diverse synthetic networks.
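As a quick illustration of the two schemes (not the paper's estimator or its Monte Carlo study), the sketch below compares the mean degree reached by random sampling and by alter sampling on a synthetic scale-free network, assuming networkx is available.

```python
# Minimal sketch of alter sampling versus random sampling on a synthetic
# scale-free network (networkx assumed available; an illustration of the
# sampling schemes, not the paper's estimator).
import random
import networkx as nx

random.seed(0)
G = nx.barabasi_albert_graph(n=10_000, m=3)   # heavy-tailed degree distribution
respondents = random.sample(list(G.nodes()), k=500)

# Random sampling: the respondents themselves.
mean_deg_random = sum(G.degree(v) for v in respondents) / len(respondents)

# Alter sampling: a uniformly chosen neighbor ("alter") of each respondent.
alters = [random.choice(list(G.neighbors(v))) for v in respondents]
mean_deg_alter = sum(G.degree(v) for v in alters) / len(alters)

print(f"mean degree of random sample: {mean_deg_random:.1f}")
print(f"mean degree of alter sample:  {mean_deg_alter:.1f}")
# Alters tend to be far better connected (the friendship paradox), which is what
# makes alter sampling attractive for targeting well-connected individuals.
```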
We investigate the structural organization of the point-to-point electric, diffusive or hydraulic transport in complex scale-free networks. The random choice of two nodes, a source and a drain, to which a potential difference is applied, selects two tree-like structures, one emerging from the source and the other converging to the drain. These trees merge into a large cluster of the remaining nodes that is found to be quasi-equipotential and thus presents almost no resistance to transport. Such a global "tree-cluster-tree" structure is universal and leads to a power law decay of the currents distribution. Its exponent, close to 2, is determined by the multiplicative decrease of currents at successive branching points of a tree and is found to be independent of the network connectivity degree and resistance distribution.
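A small numerical sketch of such point-to-point transport, assuming unit resistances on a single synthetic scale-free network rather than the ensembles studied in the paper, is below: it solves Kirchhoff's equations through the Laplacian pseudo-inverse and summarizes the resulting edge currents.

```python
# Minimal sketch: point-to-point transport on a synthetic scale-free network with
# unit resistances. We solve Kirchhoff's equations via the Laplacian pseudo-inverse
# and look at the resulting edge currents (an illustration, not the paper's analysis).
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
G = nx.barabasi_albert_graph(n=1000, m=3, seed=0)
L = nx.laplacian_matrix(G).toarray().astype(float)

source, drain = rng.choice(G.number_of_nodes(), size=2, replace=False)
b = np.zeros(G.number_of_nodes())
b[source], b[drain] = 1.0, -1.0             # inject/extract one unit of current

potentials = np.linalg.pinv(L) @ b          # node potentials up to a constant
currents = np.array([abs(potentials[u] - potentials[v]) for u, v in G.edges()])

# Most of the network is nearly equipotential: the bulk of edges carry tiny
# currents, while a tree-like backbone near source and drain carries most of it.
print("median edge current:", np.median(currents))
print("99th percentile    :", np.quantile(currents, 0.99))
```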
Researchers at MIT developed Quant-BnB, a scalable Branch-and-Bound method that efficiently learns globally optimal decision trees directly with continuous features. The method achieves orders of magnitude speedup over existing optimal tree algorithms for shallow trees (depth 2-3) and often yields better generalization performance than greedy heuristics like CART.
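To fix ideas about what "globally optimal shallow trees" means, the sketch below brute-forces the best depth-2 classification tree on a toy XOR-style dataset and compares it with the best single split; a greedy, split-by-split construction can miss such structure because no single root split shows any immediate gain. This is plain exhaustive search for illustration, not the Quant-BnB branch-and-bound algorithm.

```python
# Brute-force globally optimal depth-2 tree on a toy XOR dataset (illustration of
# the optimization problem only; Quant-BnB solves it far more efficiently).
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)      # XOR labels: no single split helps

def stump_error(X, y):
    """Misclassifications of the best depth-1 tree, found by exhaustive search."""
    best = min(y.sum(), len(y) - y.sum())             # a single majority leaf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            err = min(left.sum(), len(left) - left.sum()) + \
                  min(right.sum(), len(right) - right.sum())
            best = min(best, err)
    return best

def optimal_depth2_error(X, y):
    """Globally optimal depth-2 tree: try every root split, solve each child exactly."""
    best = stump_error(X, y)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            mask = X[:, j] <= t
            best = min(best, stump_error(X[mask], y[mask]) + stump_error(X[~mask], y[~mask]))
    return best

print("errors of best single split   :", stump_error(X, y), "/", len(y))
print("errors of optimal depth-2 tree:", optimal_depth2_error(X, y), "/", len(y))
```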
The Model-free Prediction Principle has been successfully applied to general regression problems, as well as problems involving stationary and locally stationary time series. In this paper we demonstrate how Model-Free Prediction can be applied to handle random fields that are only locally stationary, i.e., they can be assumed to be stationary only over a limited part of their region of definition. We construct one-step-ahead point predictors and compare the performance of Model-free to Model-based prediction using models that incorporate a trend and/or heteroscedasticity. Both aspects of the paper, Model-free and Model-based, are novel in the context of random fields that are locally (but not globally) stationary. We demonstrate the application of our Model-based and Model-free point prediction methods to synthetic data as well as images from the CIFAR-10 dataset and in the latter case show that our best Model-free point prediction results outperform those obtained using Model-based prediction.
We address the Least Quantile of Squares (LQS) (and in particular the Least Median of Squares) regression problem using modern optimization methods. We propose a Mixed Integer Optimization (MIO) formulation of the LQS problem which allows us to find a provably globally optimal solution for the LQS problem. Our MIO framework has the appealing characteristic that if we terminate the algorithm early, we obtain a solution with a guarantee on its sub-optimality. We also propose continuous optimization methods based on first-order subdifferential methods, sequential linear optimization and hybrid combinations of them to obtain near optimal solutions to the LQS problem. The MIO algorithm is found to benefit significantly from high quality solutions delivered by our continuous optimization based methods. We further show that the MIO approach leads to (a) an optimal solution for any dataset, where the data-points $(y_i,\mathbf{x}_i)$'s are not necessarily in general position, (b) a simple proof of the breakdown point of the LQS objective value that holds for any dataset and (c) an extension to situations where there are polyhedral constraints on the regression coefficient vector. We report computational results with both synthetic and real-world datasets showing that the MIO algorithm with warm starts from the continuous optimization methods solves small ($n=100$) and medium ($n=500$) size problems to provable optimality in under two hours, and outperforms all publicly available methods for large-scale ($n=10{,}000$) LQS problems.
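To fix ideas, the sketch below writes down the LQS objective on a small synthetic dataset with gross outliers and attacks it with a crude random elemental-subset heuristic; this is explicitly not the paper's MIO formulation or its first-order methods, just a baseline that shows why the objective is robust where least squares is not.

```python
# Sketch of the LQS/LMS objective and a crude random-subset heuristic for it
# (explicitly NOT the paper's MIO or first-order methods; just to fix ideas).
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 3.0])
y = X @ beta_true + 0.5 * rng.normal(size=n)
y[:30] += 20.0                          # 30% gross outliers ruin least squares

q = (n + p + 1) // 2                    # LMS: (roughly) the median squared residual

def lqs_objective(beta):
    r2 = np.sort((y - X @ beta) ** 2)
    return r2[q - 1]                    # q-th smallest squared residual

# Crude heuristic: fit exactly through many random elemental subsets of size p
# and keep the fit with the smallest LQS objective.
best_beta, best_val = None, np.inf
for _ in range(2000):
    idx = rng.choice(n, size=p, replace=False)
    try:
        beta = np.linalg.solve(X[idx], y[idx])
    except np.linalg.LinAlgError:
        continue
    val = lqs_objective(beta)
    if val < best_val:
        best_beta, best_val = beta, val

beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
print("least squares     :", np.round(beta_ls, 2))
print("LQS heuristic fit :", np.round(best_beta, 2), " objective:", round(float(best_val), 3))
```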
Artificial Intelligence is now recognized as a general-purpose technology with ample impact on human life. This work aims at understanding the evolution of AI and, in particular, Machine Learning, from the perspective of researchers' contributions to the field. In order to do so, we present several measures allowing the analyses of AI and machine learning researchers' impact, influence, and leadership over the last decades. This work also contributes, to a certain extent, to shedding new light on the history and evolution of AI by examining the dynamics of the field's evolution through papers published at the flagship AI and machine learning conferences since the first International Joint Conference on Artificial Intelligence (IJCAI) held in 1969. AI development and evolution have led to increasing research output, reflected in the number of articles published over the last sixty years. We construct comprehensive citation, collaboration, and paper-author datasets and compute corresponding centrality measures to carry out our analyses. These analyses allow a better understanding of how AI has reached its current state of affairs in research. Throughout the process, we correlate these datasets with the work of the ACM Turing Award winners and the so-called two AI winters the field has gone through. We also look at self-citation trends and new authors' behaviors. Finally, we present a novel way to infer the country of affiliation of a paper from its organization. Therefore, this work provides a deep analysis of Artificial Intelligence history from information gathered and analysed from large datasets of major technical venues and suggests novel insights that can contribute to understanding and measuring AI's evolution.
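The kind of centrality computation described can be sketched on a tiny hypothetical citation graph (networkx assumed available; the papers and edges below are made up, not drawn from the authors' datasets).

```python
# Minimal sketch of centrality analysis on a tiny hypothetical citation graph.
import networkx as nx

# Directed edge u -> v means "paper u cites paper v".
citations = [("P3", "P1"), ("P4", "P1"), ("P4", "P2"), ("P5", "P3"),
             ("P5", "P4"), ("P6", "P4"), ("P6", "P1")]
G = nx.DiGraph(citations)

in_deg = nx.in_degree_centrality(G)     # how often a paper is cited, normalized
pagerank = nx.pagerank(G, alpha=0.85)   # influence weighted by who cites you

for paper in sorted(G.nodes()):
    print(f"{paper}: in-degree {in_deg[paper]:.2f}, PageRank {pagerank[paper]:.3f}")
```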
Many applications in different domains produce large amounts of time series data. Making accurate forecasts is critical for many decision makers. Various time series forecasting methods exist that use linear and nonlinear models separately or a combination of both. Studies show that combining linear and nonlinear models can be effective in improving forecasting performance. However, some assumptions that those existing methods make might restrict their performance in certain situations. We provide a new Autoregressive Integrated Moving Average (ARIMA)-Artificial Neural Network (ANN) hybrid method that works in a more general framework. Experimental results show that strategies for decomposing the original data and for combining linear and nonlinear models throughout the hybridization process are key factors in the forecasting performance of the methods. By using appropriate strategies, our hybrid method can improve on the forecasting accuracy obtained by traditional hybrid methods and also by either of the individual methods used separately.
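A minimal sketch of the classic ARIMA-ANN hybrid that such methods build on is below; the ARIMA order, residual lag count, and toy series are assumptions for illustration, and the paper's contribution is a more general decomposition and combination strategy than this basic residual-modeling scheme.

```python
# Minimal sketch of a classic ARIMA + ANN hybrid (assumed order/lags; statsmodels
# and scikit-learn assumed available). The paper proposes a more general
# decomposition/combination scheme; this only illustrates the basic idea.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
t = np.arange(300)
y = 10 + 0.05 * t + 2 * np.sin(t / 6) + rng.normal(scale=0.5, size=t.size)  # toy series
y_train, y_test = y[:280], y[280:]

# 1) Linear part: ARIMA captures trend/autocorrelation.
arima = ARIMA(y_train, order=(2, 1, 1)).fit()
linear_fc = arima.forecast(steps=len(y_test))
resid = arima.resid                        # in-sample residuals = nonlinear remainder

# 2) Nonlinear part: ANN on lagged residuals.
L = 8                                      # number of residual lags (assumption)
X = np.column_stack([resid[i:len(resid) - L + i] for i in range(L)])
target = resid[L:]
ann = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, target)

# Roll the residual forecast forward one step at a time.
window = list(resid[-L:])
resid_fc = []
for _ in range(len(y_test)):
    nxt = ann.predict(np.array(window[-L:]).reshape(1, -1))[0]
    resid_fc.append(nxt)
    window.append(nxt)

hybrid_fc = linear_fc + np.array(resid_fc)
print("ARIMA-only MAE :", np.mean(np.abs(y_test - linear_fc)).round(3))
print("Hybrid MAE     :", np.mean(np.abs(y_test - hybrid_fc)).round(3))
```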
We present new results for the Frank-Wolfe method (also known as the conditional gradient method). We derive computational guarantees for arbitrary step-size sequences, which are then applied to various step-size rules, including simple averaging and constant step-sizes. We also develop step-size rules and computational guarantees that depend naturally on the warm-start quality of the initial (and subsequent) iterates. Our results include computational guarantees for both duality/bound gaps and the so-called FW gaps. Lastly, we present complexity bounds in the presence of approximate computation of gradients and/or linear optimization subproblem solutions.
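For concreteness, the sketch below runs the Frank-Wolfe method on a toy quadratic over the unit simplex with the standard 2/(k+2) step-size and tracks the FW gap that such guarantees refer to; the problem and settings are illustrative assumptions, not the paper's experiments.

```python
# Minimal Frank-Wolfe sketch on a toy problem: minimize f(x) = 0.5*||Ax - b||^2
# over the unit simplex, with the classic 2/(k+2) step-size and the FW gap.
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 20
A = rng.normal(size=(m, n))
b = rng.normal(size=m)

def grad(x):
    return A.T @ (A @ x - b)

x = np.ones(n) / n                        # start at the simplex barycenter
for k in range(200):
    g = grad(x)
    v = np.zeros(n)
    v[np.argmin(g)] = 1.0                 # linear minimization over the simplex = a vertex
    fw_gap = g @ (x - v)                  # certifies f(x) - f* <= fw_gap
    step = 2.0 / (k + 2.0)                # standard step-size rule
    x = x + step * (v - x)

print(f"final objective: {0.5 * np.sum((A @ x - b) ** 2):.4f}")
print(f"final FW gap   : {fw_gap:.6f}")
```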
Why do banks fail? We create a panel covering most commercial banks from 1863 through 2024 to study the history of failing banks in the United States. Failing banks are characterized by rising asset losses, deteriorating solvency, and an increasing reliance on expensive noncore funding. These commonalities imply that bank failures are highly predictable using simple accounting metrics from publicly available financial statements. Failures with runs were common before deposit insurance, but these failures are strongly related to weak fundamentals, casting doubt on the importance of non-fundamental runs. Furthermore, low recovery rates on failed banks' assets suggest that most failed banks were fundamentally insolvent, barring strong assumptions about the value destruction of receiverships. Altogether, our evidence suggests that the primary cause of bank failures and banking crises is almost always and everywhere a deterioration of bank fundamentals.
A data intermediary acquires signals from individual consumers regarding their preferences. The intermediary resells the information in a product market wherein firms and consumers tailor their choices to the demand data. The social dimension of the individual data -- whereby a consumer's data are predictive of others' behavior -- generates a data externality that can reduce the intermediary's cost of acquiring the information. The intermediary optimally preserves the privacy of consumers' identities if and only if doing so increases social surplus. This policy enables the intermediary to capture the total value of the information as the number of consumers becomes large.
In recent years, there has been growing interest in solving linear optimization problems - or more simply "LP" - using first-order methods. The restarted primal-dual hybrid gradient method (PDHG) - together with some heuristic techniques - has emerged as a powerful tool for solving huge-scale LPs. However, the theoretical understanding of it and the validation of various heuristic implementation techniques are still very limited. Existing complexity analyses have relied on the Hoffman constant of the LP KKT system, which is known to be overly conservative, difficult to compute (and hence difficult to empirically validate), and fails to offer insight into instance-specific characteristics of the LP problems. These limitations have restricted our ability to discern which characteristics of LP instances lead to easy versus difficult LPs. With the goal of overcoming these limitations, in this paper we introduce and develop two purely geometry-based condition measures for LP instances: the "limiting error ratio" and the LP sharpness. We provide new computational guarantees for the restarted PDHG based on these two condition measures. For the limiting error ratio, we provide a computable upper bound and show its relationship with the data instance's proximity to infeasibility under perturbation. For the LP sharpness, we prove its equivalence to the stability of the LP optimal solution set under perturbation of the objective function. We validate our computational guarantees in terms of these condition measures via specially constructed instances. Conversely, our computational guarantees validate the practical efficacy of certain heuristic techniques (row preconditioners and step-size tuning) that improve computational performance in practice. Finally, we present computational experiments on LP relaxations from the MIPLIB dataset that demonstrate the promise of various implementation strategies.
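For readers new to PDHG, the sketch below runs the vanilla method on a tiny random standard-form LP, without the restarts, preconditioning, or step-size tuning that the paper's analysis addresses; the instance and iteration count are illustrative assumptions.

```python
# Minimal sketch of vanilla PDHG for a standard-form LP: min c'x s.t. Ax = b, x >= 0
# (no restarts, preconditioning, or step-size tuning; toy random instance only).
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6
A = rng.normal(size=(m, n))
x_feas = rng.uniform(0.5, 1.5, size=n)
b = A @ x_feas                            # guarantees feasibility
c = rng.uniform(0.1, 1.0, size=n)         # c >= 0 keeps the LP bounded below

op_norm = np.linalg.norm(A, 2)
tau = sigma = 0.9 / op_norm               # needs tau * sigma * ||A||^2 <= 1

x, y = np.zeros(n), np.zeros(m)
for _ in range(20000):
    x_new = np.maximum(0.0, x - tau * (c - A.T @ y))     # projected primal step
    y = y + sigma * (b - A @ (2 * x_new - x))            # extrapolated dual step
    x = x_new

print(f"objective     : {c @ x:.4f}")
print(f"primal infeas.: {np.linalg.norm(A @ x - b):.2e}")
print(f"dual infeas.  : {np.linalg.norm(np.minimum(0.0, c - A.T @ y)):.2e}")
```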