Researchers from Smartesting and Université de Bordeaux evaluated how well autonomous web agents execute natural language test cases, introducing a dedicated benchmark and two open-source agent implementations. Their advanced agent, PinATA, achieved a 50% higher True Accuracy than a baseline, though a qualitative analysis revealed five categories of persistent limitations.