The assessment of safety performance plays a pivotal role in the development
and deployment of connected and automated vehicles (CAVs). A common approach
involves designing testing scenarios based on prior knowledge of CAVs (e.g.,
surrogate models), conducting tests in these scenarios, and subsequently
evaluating the CAVs' safety performance. However, substantial differences between
the CAVs under test and this prior knowledge can significantly reduce evaluation
efficiency. In response to this issue, existing studies predominantly
concentrate on the adaptive design of testing scenarios during the CAV testing
process. Yet these methods have limited applicability to
high-dimensional scenarios. To overcome this challenge, we develop an adaptive
testing environment that improves evaluation robustness by incorporating
multiple surrogate models and improves evaluation efficiency by optimizing the
combination coefficients of these models. We formulate the
optimization problem as a regression task utilizing quadratic programming. To
efficiently obtain the regression target via reinforcement learning, we propose
the dense reinforcement learning method and devise a new adaptive policy with
high sample efficiency. Essentially, our approach centers on learning the
values of critical scenes displaying substantial surrogate-to-real gaps. We
validate our method in high-dimensional overtaking scenarios, where it achieves
notable evaluation efficiency.
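
To illustrate how choosing combination coefficients can be posed as a
regression task solved by quadratic programming, the following is a minimal
sketch and not the formulation used in this work. It assumes a least-squares
objective and assumes the coefficients are non-negative and sum to one; the
names `project_to_simplex` and `fit_combination_coefficients`, the
projected-gradient solver, and the toy numbers are all hypothetical. In the
approach described above the regression target would be obtained via
reinforcement learning; here it is simply synthesized.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:  # holds exactly for the first rho sorted entries
            theta = t
    return [max(x - theta, 0.0) for x in v]


def fit_combination_coefficients(surrogate_values, target, steps=2000):
    """Projected gradient descent on the (assumed) quadratic program
        min_w  sum_i (sum_k w[k] * surrogate_values[i][k] - target[i])**2
        s.t.   w >= 0, sum(w) = 1.
    surrogate_values[i][k] is surrogate model k's value for scene i.
    """
    n_models = len(surrogate_values[0])
    w = [1.0 / n_models] * n_models  # start from equal coefficients
    # The gradient is 2 * A^T (A w - y); 2 * ||A||_F^2 bounds its Lipschitz
    # constant, so 1 / that bound is a safe step size.
    lipschitz = 2.0 * sum(a * a for row in surrogate_values for a in row)
    if lipschitz == 0.0:
        return w
    step = 1.0 / lipschitz
    for _ in range(steps):
        residuals = [
            sum(w_k * a for w_k, a in zip(w, row)) - y
            for row, y in zip(surrogate_values, target)
        ]
        grad = [
            2.0 * sum(r * row[k] for r, row in zip(residuals, surrogate_values))
            for k in range(n_models)
        ]
        w = project_to_simplex([w_k - step * g for w_k, g in zip(w, grad)])
    return w


# Toy data (hypothetical): 4 scenes, 3 surrogate models.
surrogate_values = [
    [0.10, 0.50, 0.90],
    [0.20, 0.10, 0.40],
    [0.80, 0.30, 0.20],
    [0.05, 0.60, 0.70],
]
# Synthetic regression target: an exact 0.7 / 0.3 / 0.0 mix of the models.
target = [0.7 * row[0] + 0.3 * row[1] for row in surrogate_values]

weights = fit_combination_coefficients(surrogate_values, target)
print([round(w_k, 3) for w_k in weights])
```

Because the synthetic target is an exact convex combination of the first two
surrogate models, the fitted coefficients should recover that mix and assign
essentially zero weight to the third model. For larger problems, a dedicated
QP solver would typically replace this hand-written loop.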