Scientific hypothesis generation is a challenging research task that requires
synthesizing novel, empirically grounded insights.
Traditional approaches rely on human intuition and domain expertise, while
methods based purely on large language models (LLMs) often struggle to produce
hypotheses that are both innovative and reliable. To address these limitations,
we propose the Monte Carlo Nash Equilibrium Self-Refine Tree (MC-NEST), a novel
framework that integrates Monte Carlo Tree Search with Nash Equilibrium
strategies to iteratively refine and validate hypotheses. MC-NEST dynamically
balances exploration and exploitation through adaptive sampling strategies,
which prioritize high-potential hypotheses while maintaining diversity in the
search space. We demonstrate the effectiveness of MC-NEST through comprehensive
experiments across multiple domains, including biomedicine, social science, and
computer science. Averaged over the novelty, clarity, significance, and
verifiability metrics (each scored on a 1-3 scale), MC-NEST achieves 2.65, 2.74,
and 2.80 on the social science, computer science, and biomedicine datasets,
respectively, outperforming state-of-the-art prompt-based methods, which achieve
2.36, 2.51, and 2.52 on the same datasets. These results underscore MC-NEST's ability to
generate high-quality, empirically grounded hypotheses across diverse domains.
Furthermore, MC-NEST facilitates structured human-AI collaboration, ensuring
that LLMs augment human creativity rather than replace it. By addressing key
challenges such as iterative refinement and the exploration-exploitation
balance, MC-NEST sets a new benchmark in automated hypothesis generation.
MC-NEST is also designed for responsible AI use, emphasizing transparency and
human oversight in hypothesis generation.
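
To make the exploration-exploitation balance concrete, the following is a
minimal sketch of the standard UCB1-style selection rule commonly used in Monte
Carlo Tree Search. It is an illustration only, not the MC-NEST selection policy:
the Nash Equilibrium component is not modeled here, and the names `uct_score`
and `select_hypothesis`, the constant `c`, and the candidate data are all
hypothetical.

```python
import math


def uct_score(total_reward, visits, parent_visits, c=1.4):
    """Mean reward plus an exploration bonus (standard UCB1 form)."""
    if visits == 0:
        return math.inf  # unvisited candidates are always tried first
    mean_reward = total_reward / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean_reward + bonus


def select_hypothesis(candidates, parent_visits, c=1.4):
    """Return the candidate with the highest UCB1-style score."""
    return max(
        candidates,
        key=lambda h: uct_score(h["reward"], h["visits"], parent_visits, c),
    )


# Hypothetical candidates: h1 is well explored, h2 has been scored only once.
candidates = [
    {"name": "h1", "reward": 27.0, "visits": 10},  # mean reward 2.7
    {"name": "h2", "reward": 2.5, "visits": 1},    # mean reward 2.5
]

greedy = select_hypothesis(candidates, parent_visits=11, c=0.0)
balanced = select_hypothesis(candidates, parent_visits=11, c=1.4)
print(greedy["name"], balanced["name"])
```

With `c=0.0` the rule is purely exploitative and picks the candidate with the
best mean reward so far; with a positive `c`, the rarely visited candidate
receives a larger bonus and is selected instead, which is how diversity in the
search space is maintained under this kind of rule.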