Recent advances, key techniques and future challenges of knowledge graph

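  • Summary: This survey covers the full lifecycle of knowledge graph technologies at four levels: knowledge extraction, knowledge fusion, knowledge reasoning, and knowledge application, with emphasis on knowledge fusion and knowledge reasoning. Knowledge extraction obtains knowledge elements such as entities, relations, and attributes from existing structured, semi-structured, and unstructured sample sources as well as from open encyclopedia-style websites. Knowledge fusion eliminates the ambiguity between mentions of entities, relations, and attributes and the entity objects they refer to, yielding a set of basic factual statements. Ontology extraction, knowledge reasoning, and quality evaluation then produce the final knowledge graph base. Iterating over the three steps of knowledge extraction, knowledge fusion, and knowledge reasoning updates the knowledge graph and enables automatic extraction, automatic association and fusion, and automatic processing of fragmented Internet knowledge. This supports automatic entry linking and assisted entry editing, and ultimately achieves fully automated knowledge acquisition. Finally, future directions and possible challenges for knowledge graphs are discussed.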

     

    Abstract: The Google knowledge graph is a knowledge base used by Google and its services to enhance search engine results with information gathered from a variety of sources. Since Google introduced it to improve the quality of experience of search engine users, "knowledge graph" has become a ubiquitous term in medicine, education, finance, e-commerce, and other industries, where it is used to advance artificial intelligence (AI) as AI evolves from perceptual intelligence to cognitive intelligence. As a branch of knowledge engineering, the knowledge graph builds on the semantic networks of knowledge engineering and combines the latest advances in machine learning, natural language processing, knowledge representation, and inference. Both academia and industry are showing keen interest in AI, and several studies are in progress, driven by big data. With its powerful semantic processing and open interconnection capabilities, the knowledge graph can break down data isolation across scenarios and create application value in intelligent information services such as intelligent search and recommendation, intelligent question answering, and content distribution networks, thereby making information services more intelligent. The state of the art of knowledge graph technologies is outlined by walking through the process of building a knowledge graph. A knowledge graph is a structured representation of facts, consisting of entities, relations, and semantic descriptions. A comprehensive summary of the technologies across the full lifecycle of the knowledge graph is provided, covering knowledge extraction, knowledge fusion, knowledge reasoning, and knowledge application, with the focus on knowledge fusion and knowledge reasoning. Using knowledge extraction, entities, relations, attributes, and other knowledge elements can be extracted from existing structured, semi-structured, and unstructured data sources and from open encyclopedia-style websites. With knowledge fusion, the ambiguity between referential items such as entities, relations, and attributes and the entity objects they denote can be eliminated, and a series of basic facts can be obtained. The final knowledge base is formed through ontology extraction, knowledge reasoning, and quality evaluation. Iterating over the three steps of knowledge extraction, knowledge fusion, and knowledge reasoning updates the knowledge graph and automates the full knowledge acquisition process: fragmented Internet knowledge is automatically extracted, associated and fused, and processed, which in turn enables automatic linking of entries and assisted entry editing. Finally, the future directions and possible challenges of the knowledge graph are discussed.
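
The three-step loop described above (extraction, fusion, reasoning) can be illustrated with a toy sketch. It is not the method of the surveyed work: the `Triple` type, the `ALIASES` table, the sample facts, and the single transitivity rule are all hypothetical stand-ins chosen for brevity, and a real system would use learned extractors, entity linking, and far richer inference.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


# Hypothetical output of a knowledge-extraction step: raw triples whose
# entity mentions have not yet been disambiguated.
extracted = [
    Triple("Big Apple", "located_in", "New York State"),
    Triple("NYC", "located_in", "New York State"),
    Triple("New York State", "located_in", "USA"),
]

# Hypothetical alias table mapping surface mentions to canonical entities.
ALIASES = {"Big Apple": "New York City", "NYC": "New York City"}


def fuse(triples):
    """Toy knowledge fusion: canonicalize mentions, then deduplicate facts."""
    def canon(mention):
        return ALIASES.get(mention, mention)

    return {Triple(canon(t.head), t.relation, canon(t.tail)) for t in triples}


def infer_transitive(facts, relation):
    """Toy knowledge reasoning: close one relation under transitivity."""
    facts = set(facts)
    while True:
        derived = {
            Triple(a.head, relation, b.tail)
            for a in facts
            for b in facts
            if a.relation == relation
            and b.relation == relation
            and a.tail == b.head
        }
        if derived <= facts:  # fixed point: nothing new was derived
            return facts
        facts |= derived


fused = fuse(extracted)
graph = infer_transitive(fused, "located_in")
```

Here fusion merges the two mentions "Big Apple" and "NYC" into one entity, and the reasoning step derives the fact that New York City is located in the USA, which appears in none of the extracted triples.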

     
