Abstract:
As internet coverage continues to expand, extracting valuable information from the vast amount of fragmented, semi-structured public text data has become a major challenge. Event trigger identification can effectively mine and refine text information so that users can quickly and accurately find what they need; it has therefore become an active research area in natural language processing. An event trigger is generally a word or phrase that marks the occurrence of an event. Trigger identification has been applied in many areas and plays an important role in knowledge base construction, intelligent search engines, automatic question answering, and automatic summarization. However, text data are characterized by high dimensionality and ambiguity. Most existing identification methods either rely on complex manual feature engineering or consider only the features within a fixed text window. Such methods require manual analysis and selection of a large number of features, and their heavy reliance on natural language processing tools makes the models difficult to apply at scale and exposes them to cascading error propagation. This paper proposes a fusion model based on bidirectional long short-term memory (BiLSTM) and feed-forward neural networks for trigger identification in public security events. First, high-level features of the entire text are extracted by the BiLSTM, which avoids the manual feature extraction required by existing machine learning methods. Then, the concatenated features are fed into a feed-forward neural network to identify event triggers. Experimental results show that the proposed method performs well on the Chinese Emergency Corpus (CEC), achieving a Micro-F1 of 78.47%.
In addition, the contribution of the different concatenated features to trigger identification is discussed, and the importance of three types of features, namely part-of-speech, syntactic, and entity features, is analyzed. The analysis concludes that syntactic features are the most helpful for event trigger identification.
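
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a micro-averaged F1 score can be computed for trigger identification. It is an illustration under stated assumptions, not the evaluation script used in this work: it assumes token-level labels, a hypothetical `"O"` label for non-trigger tokens, and a hypothetical `"TRIGGER"` label in the example. The function name `micro_f1` is likewise illustrative.

```python
from typing import Sequence


def micro_f1(gold: Sequence[str], pred: Sequence[str], negative: str = "O") -> float:
    """Micro-averaged F1: pool TP/FP/FN over all non-negative labels.

    Assumption: `negative` marks non-trigger tokens and is excluded
    from the counts, as is common for sparse tagging tasks.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != negative and p == g:
            tp += 1          # correctly predicted trigger label
        else:
            if p != negative:
                fp += 1      # predicted a trigger that is spurious or mislabeled
            if g != negative:
                fn += 1      # gold trigger that was missed or mislabeled

    if tp == 0:
        return 0.0           # also avoids division by zero below
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with hypothetical labels (not data from the CEC corpus).
gold = ["O", "TRIGGER", "O", "TRIGGER", "TRIGGER", "O"]
pred = ["O", "TRIGGER", "TRIGGER", "O", "TRIGGER", "O"]
score = micro_f1(gold, pred)
```

In the toy example there are two correct triggers, one spurious prediction, and one missed trigger, so precision and recall are both 2/3. Because the counts are pooled before the ratios are taken, frequent labels weigh more heavily than rare ones, which is the defining property of micro-averaging.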