Tag: llm-evaluation
All articles with the tag "llm-evaluation".
- 5.0
Principled Detection of Hallucinations in LLMs via Multiple Testing
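An LLM hallucination detection method grounded in the statistical principles of multiple testing, addressing the instability of existing rules.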
- 4.0
Do LLMs Capture Embodied Cognition and Cultural Variation? Cross-Linguistic Evidence from Demonstratives
Uses demonstratives (this/that) as a probe to study whether LLMs have truly acquired embodied cognition and cultural conventions.