An outlier detection algorithm based on a soft hyper-sphere for high-dimensional nonlinear data

  • Abstract: In process industries such as metallurgy and chemical engineering, process control parameters often have a high-dimensional, nonlinear structure. To address outlier detection in such complex high-dimensional data, this paper introduces the concept of a soft hyper-sphere. A nonlinear kernel function maps the original data into a high-dimensional feature space, and the boundary of the soft hyper-sphere is determined in that feature space. The position of a test sample's image in the feature space is then used to decide whether a parameter setting is an outlier, which helps avoid batch product quality problems. As an application example, actual production data from a type of automotive steel were tested. The results show that the proposed soft hyper-sphere outlier detection algorithm detects outliers in high-dimensional nonlinear data better than traditional methods.
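The idea summarized in the abstract (kernel mapping, a sphere boundary in feature space, a distance test for new samples) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract gives no formulas, so the sketch assumes a Gaussian (RBF) kernel, takes the sphere center to be the feature-space mean of the training data instead of solving an optimization problem, and makes the boundary "soft" by setting the squared radius to a quantile of the training distances. The names SoftHypersphere, make_ring, gamma and nu are hypothetical and chosen only for illustration.

```python
import math
import random


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


class SoftHypersphere:
    """Simplified kernel hypersphere (an assumption, not the paper's method).

    Center: mean of the training samples in feature space.
    Radius: quantile of training distances, so that roughly a fraction
    `nu` of the training samples may lie outside the boundary.
    """

    def __init__(self, gamma=1.0, nu=0.05):
        self.gamma = gamma
        self.nu = nu

    def distance_sq(self, z):
        """Squared feature-space distance from phi(z) to the center.

        ||phi(z) - c||^2 = k(z, z) - (2/n) sum_i k(z, x_i)
                           + (1/n^2) sum_ij k(x_i, x_j)
        """
        n = len(self.X)
        cross = sum(rbf_kernel(z, x, self.gamma) for x in self.X) / n
        return rbf_kernel(z, z, self.gamma) - 2.0 * cross + self.mean_kk

    def fit(self, X):
        self.X = [list(x) for x in X]
        n = len(self.X)
        # Constant term of the distance formula, computed once.
        self.mean_kk = sum(
            rbf_kernel(a, b, self.gamma) for a in self.X for b in self.X
        ) / (n * n)
        dists = sorted(self.distance_sq(x) for x in self.X)
        # Soft boundary: the (1 - nu) quantile of the training distances.
        idx = min(n - 1, max(0, math.ceil((1.0 - self.nu) * n) - 1))
        self.radius_sq = dists[idx]
        return self

    def is_outlier(self, z):
        """A sample is an outlier if its image lies outside the sphere."""
        return self.distance_sq(z) > self.radius_sq


def make_ring(n, noise=0.05, seed=0):
    """Synthetic nonlinear data: points near the unit circle."""
    rng = random.Random(seed)
    points = []
    for i in range(n):
        angle = 2.0 * math.pi * i / n
        r = 1.0 + rng.gauss(0.0, noise)
        points.append([r * math.cos(angle), r * math.sin(angle)])
    return points


ring = make_ring(200)
model = SoftHypersphere(gamma=5.0, nu=0.05).fit(ring)
for sample in ([1.0, 0.0], [0.0, 0.0], [5.0, 5.0]):
    print(sample, "outlier" if model.is_outlier(sample) else "normal")
```

The ring-shaped data stands in for a nonlinear structure that a sphere in the original input space cannot separate: the ring's center is inside any such sphere, yet it is far from the data in feature space. A full soft hyper-sphere formulation would normally optimize the center and radius with slack variables; the mean-and-quantile shortcut here only keeps the sketch short.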

     
