A Chinese Dataset for Evaluating the Safeguards in Large Language Models

04 Aug 2024

Tsinghua University, The University of Melbourne

Researchers developed an open-source Chinese dataset to evaluate Large Language Model safeguards. The work reveals that region-specific cultural and political sensitivities are critical determinants of safety performance, particularly for models not aligned with these contexts, and demonstrates LLMs' vulnerability to subtly disguised harmful prompts.

#computer-science #computation-and-language
