This survey provides a comprehensive review of instruction tuning for Large Language Models, detailing methodologies, datasets, models, and applications. It highlights how instruction tuning aligns LLMs with human instructions and demonstrates its continued necessity as a foundational step in modern alignment pipelines, while also addressing challenges like superficial alignment.
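As a concrete illustration of the instruction-tuning setup such surveys describe, here is a minimal sketch of supervised fine-tuning on a single instruction–response pair, with the loss masked over the prompt tokens. The prompt template, model name, and example data are illustrative assumptions, not taken from the survey.

```python
# Minimal sketch of supervised instruction tuning (SFT): the model learns to
# generate the response conditioned on the instruction, with the loss ignored
# over prompt tokens. Model name, template, and data are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

example = {
    "instruction": "Summarize the following sentence in five words.",
    "input": "Instruction tuning aligns language models with human instructions.",
    "output": "Tuning aligns models with instructions.",
}

prompt = (f"### Instruction:\n{example['instruction']}\n\n"
          f"### Input:\n{example['input']}\n\n### Response:\n")
full_text = prompt + example["output"] + tokenizer.eos_token

prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
full_ids = tokenizer(full_text, return_tensors="pt").input_ids

# Labels: set the prompt portion to -100 so only response tokens contribute to the loss.
labels = full_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

loss = model(input_ids=full_ids, labels=labels).loss
loss.backward()  # an optimizer step would follow in a real training loop
```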
GPT-NER adapts Large Language Models for Named Entity Recognition by re-framing the task as text generation, complemented by targeted demonstration retrieval and a self-verification mechanism. This approach achieves performance comparable to supervised baselines on full datasets and significantly outperforms them in low-resource settings.
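The sketch below illustrates the prompt construction this summary refers to: one prompt per entity type, with retrieved demonstrations whose entities are wrapped in @@ ... ## markers, plus a follow-up self-verification question. The retrieval step is reduced to a toy word-overlap heuristic (the paper uses kNN over embeddings), and all strings are illustrative.

```python
# Minimal sketch of GPT-NER-style prompting: NER is cast as text generation in which
# the model rewrites the input sentence, marking entities of one target type with
# @@ ... ##. Demonstration retrieval is stubbed with a lexical-overlap heuristic.

def most_similar(query, pool, k=2):
    """Toy stand-in for embedding-based kNN demonstration retrieval."""
    overlap = lambda ex: len(set(query.split()) & set(ex["text"].split()))
    return sorted(pool, key=overlap, reverse=True)[:k]

def build_ner_prompt(query, demos, entity_type="location"):
    lines = [f"Extract {entity_type} entities by surrounding them with @@ and ##."]
    for d in demos:
        lines += [f"Input: {d['text']}", f"Output: {d['labeled']}"]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

def build_verification_prompt(sentence, span, entity_type="location"):
    # Self-verification: ask the model to confirm the type of an extracted span.
    return (f"Sentence: {sentence}\n"
            f"Is the word \"{span}\" in the sentence a {entity_type} entity? "
            f"Answer yes or no.")

demo_pool = [
    {"text": "Columbus sailed from Spain.", "labeled": "Columbus sailed from @@Spain##."},
    {"text": "The meeting is in Paris.", "labeled": "The meeting is in @@Paris##."},
]
query = "She moved from Lisbon to Berlin."
print(build_ner_prompt(query, most_similar(query, demo_pool)))
print(build_verification_prompt(query, "Berlin"))
```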
Researchers from Zhejiang University and Shannon.AI introduce Self-Adjusting Dice Loss (DSC) to address data imbalance in natural language processing tasks. This F1-oriented loss function consistently improves performance on tasks like NER, POS tagging, and Machine Reading Comprehension, achieving new state-of-the-art results on several benchmarks.
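A common binary formulation of the self-adjusting DSC loss is sketched below: a (1 - p)^alpha factor down-weights easy, well-classified examples and a gamma term smooths the Dice ratio. The hyperparameter values and the binary, per-token setup are illustrative assumptions; multi-class and task-specific details follow the original paper rather than this sketch.

```python
# Minimal PyTorch sketch of a self-adjusting Dice (DSC) loss for a binary
# token-level task. alpha and gamma values are illustrative.
import torch

def self_adjusting_dice_loss(logits, targets, alpha=1.0, gamma=1.0):
    """logits: (N,) raw scores for the positive class; targets: (N,) in {0, 1}."""
    probs = torch.sigmoid(logits)
    weighted = ((1.0 - probs) ** alpha) * probs            # self-adjusting factor
    dsc = (2.0 * weighted * targets + gamma) / (weighted + targets + gamma)
    return (1.0 - dsc).mean()

logits = torch.randn(8, requires_grad=True)
targets = torch.randint(0, 2, (8,)).float()
loss = self_adjusting_dice_loss(logits, targets)
loss.backward()
print(float(loss))
```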
Glyce develops Glyph-vectors for Chinese characters, effectively integrating their visual forms into NLP models by using an ensemble of historical scripts, a specialized Tianzige-CNN, and an auxiliary image classification objective. The method consistently achieves new state-of-the-art results across a comprehensive suite of Chinese NLP tasks when combined with BERT.
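To make the architecture concrete, the sketch below shows a Tianzige-style CNN: renderings of the same character in several historical scripts are stacked as input channels, convolved, pooled down to a 2x2 ("tianzige") grid, and flattened into a glyph vector, with an auxiliary classification head over the character vocabulary. All layer sizes and the image resolution are illustrative choices, not the exact configuration from the paper.

```python
# Minimal sketch of a Tianzige-style CNN producing glyph vectors with an
# auxiliary image-classification head, as described for Glyce. Layer sizes
# and glyph resolution are illustrative assumptions.
import torch
import torch.nn as nn

class TianzigeCNN(nn.Module):
    def __init__(self, num_scripts=8, embed_dim=64, vocab_size=21128):
        super().__init__()
        self.conv = nn.Conv2d(num_scripts, 32, kernel_size=5, padding=2)
        self.pool = nn.AdaptiveMaxPool2d(2)                # 2x2 tianzige grid
        self.proj = nn.Linear(32 * 2 * 2, embed_dim)        # glyph vector
        self.aux_head = nn.Linear(embed_dim, vocab_size)    # auxiliary image-classification objective

    def forward(self, glyph_images):
        # glyph_images: (batch, num_scripts, height, width)
        h = torch.relu(self.conv(glyph_images))
        h = self.pool(h).flatten(1)
        glyph_vec = self.proj(h)
        return glyph_vec, self.aux_head(glyph_vec)

model = TianzigeCNN()
images = torch.rand(4, 8, 12, 12)   # 4 characters, 8 historical-script renderings each
glyph_vec, char_logits = model(images)
print(glyph_vec.shape, char_logits.shape)
```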