Abstract:
Value priority recognition is a fundamental task in computational linguistics that aims to identify and categorize the implicit hierarchy of human values expressed in text: determining whether textual content reflects specific value types and what relative precedence those values are assigned. This capability matters in several domains. It supports the analysis of user language in behavioral profiling, serves as a metric for evaluating the ethical alignment and value consistency of content generated by large language models, and provides a methodological foundation for investigating how well large language models comprehend, interpret, and evaluate the complex hierarchies inherent in human value systems. Dialogue, a primary and natural mode of human communication, is a potent vehicle for expressing value-driven judgments, preferences, and priorities: its interactive turn-taking, argumentation, and negotiation frequently surface implicit value tradeoffs and hierarchical relationships, making conversation an exceptionally fertile domain for modeling human value prioritization. Despite this suitability, research that systematically models human value priorities in interactive conversational settings remains underdeveloped, largely because no dedicated, high-quality dataset exists for recognizing value priorities in authentic dialogue contexts. This lack of resources substantially hinders empirical investigation and the development of effective computational models, so constructing a carefully annotated, large-scale dataset for dialogue value priority recognition is an essential prerequisite for advancing the field. Annotating such a dataset, however, poses substantial intrinsic challenges rooted in the cognitive complexity and deep subjectivity of human values. Values are abstract, deeply held cognitive constructs that guide decision making and behavior; identifying and ordering them reliably in text requires more than superficial linguistic analysis, demanding interpretative insight into underlying motivations, ethical frameworks, and contextual nuance. Annotators must therefore possess substantial expertise in psychological theory, the cognitive science of moral reasoning, and sociolinguistics, which makes consistent, expert-level manual annotation prohibitively difficult. The result is persistent inter-annotator inconsistency, ambiguity in label application, and noise in the annotations, all of which can critically compromise dataset quality and the performance of models trained on it.
To address these challenges, this study leverages the capabilities of contemporary large language models. We propose a novel annotation method for dialogue value priority recognition that operates on existing textual dialogue corpora, capitalizing on the models' strong natural language understanding, sophisticated reasoning abilities, and extensive internalized knowledge of psychology, ethics, philosophy, and the social sciences. Using this method, we construct ValueCon, a large-scale, high-quality benchmark dataset specifically designed for value priority recognition in dialogue. Experimental results show that, compared with manual annotation, the proposed method alleviates the inconsistency and noise associated with human labeling, and that models trained on ValueCon recognize values in dialogue effectively, validating the practical value of the proposed annotation method.