A Bug-Inducing Commit (BIC) is a code change that introduces a bug into the
codebase. Although the abnormal or unexpected behavior caused by the bug may
not manifest immediately, it will eventually lead to program failures further
down the line. When such a program failure is observed, identifying the
relevant BIC can aid in the bug resolution process, because knowing the
original intent and context behind the code change, as well as having a link to
the author of that change, can facilitate bug triaging and debugging. However,
existing BIC identification techniques have limitations. Bisection can be
computationally expensive because it requires executing failing tests against
previous versions of the codebase. Other techniques rely on the availability of
specific post hoc artifacts, such as bug reports or bug fixes. In this paper,
we propose Fonte, a technique for identifying the BIC. Its core idea is that a
commit is more likely to be the BIC if it has more recently modified code
elements that are highly suspected of containing the bug. To
realise this idea, Fonte leverages two fundamental relationships in software:
the failure-to-code relationship, which can be quantified through fault
localisation techniques, and the code-to-commit relationship, which can be
obtained from version control systems. Our empirical evaluation using 206
real-world BICs from open-source Java projects shows that Fonte significantly
outperforms state-of-the-art BIC identification techniques, achieving up to
45.8% higher MRR. We also report that the ranking scores produced by Fonte can
be used to perform weighted bisection. Finally, we apply Fonte to a large-scale
industry project with over 10M lines of code, and show that it can rank the
actual BIC within the top five commits for 87% of the studied real
batch-testing failures, and reduce the BIC inspection cost by 32% on average.
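
To make the core idea concrete, the following is a minimal illustrative sketch, not Fonte's actual scoring scheme. It assumes that fault localisation has already assigned a suspiciousness score to each code element and that the version control history lists, per element, the commits that modified it (most recent first). The function names, the geometric `decay` weight, and the pivot rule for weighted bisection are all hypothetical choices made for illustration only.

```python
from collections import defaultdict


def rank_commits(suspiciousness, history, decay=0.5):
    """Rank commits by aggregating votes from suspicious code elements.

    suspiciousness: {element: fault-localisation score} (failure-to-code)
    history: {element: [commit ids, most recent first]} (code-to-commit)
    decay: hypothetical weight in (0, 1]; older modifications count less.
    """
    scores = defaultdict(float)
    for element, commits in history.items():
        susp = suspiciousness.get(element, 0.0)
        for depth, commit in enumerate(commits):
            # A more recent modification (smaller depth) receives a larger vote.
            scores[commit] += susp * (decay ** depth)
    # Highest score first; ties are broken by commit id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def weighted_pivot(commits_oldest_first, scores):
    """Pick a bisection pivot that splits the total score mass roughly in half.

    Falls back to the plain midpoint when no commit has a positive score.
    """
    total = sum(scores.get(c, 0.0) for c in commits_oldest_first)
    if total <= 0:
        return len(commits_oldest_first) // 2
    cumulative = 0.0
    for index, commit in enumerate(commits_oldest_first):
        cumulative += scores.get(commit, 0.0)
        if cumulative >= total / 2:
            return index
    return len(commits_oldest_first) - 1


if __name__ == "__main__":
    # Toy data: element names and commit ids are invented for this example.
    susp = {"Parser.parse": 0.9, "Lexer.next": 0.4, "Util.log": 0.0}
    hist = {
        "Parser.parse": ["c3", "c1"],
        "Lexer.next": ["c2", "c1"],
        "Util.log": ["c4"],
    }
    ranking = rank_commits(susp, hist)
    print([commit for commit, _ in ranking])
    print(weighted_pivot(["c1", "c2", "c3", "c4"], dict(ranking)))
```

In this toy setting, a commit that most recently touched a highly suspicious element is ranked first, and the bisection pivot shifts towards the commits that carry most of the score mass instead of the midpoint of the history.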