Discretization algorithm based on super-cube and information entropy

    Abstract: This paper studies the discretization of continuous attributes in rough set theory. Super-cubes are constructed according to the discernibility principle of data objects, and the continuous attributes of the information table are discretized globally in the data space. Based on the consistency relation between condition attributes and decision attributes, the importance of each condition attribute is determined by its classifying ability in the boundary region of the rough set. On this basis, important cut points are selected to discretize the continuous attributes of the information table locally, with information entropy serving as the iterative constraint. A numerical example and experimental results indicate that this combination of global and local discretization is effective and feasible.
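    The abstract does not state the exact entropy formula used as the iterative constraint. As a rough illustration only, the sketch below assumes the standard conditional entropy H(D | C) of the decision attribute given the blocks induced by the discretized condition attributes. The function names, the sample table, and the cut point 0.65 are hypothetical and are not taken from the paper.

```python
import math
from bisect import bisect_right
from collections import Counter, defaultdict


def discretize(value, cuts):
    """Return the index of the interval that `value` falls into, given sorted cut points."""
    return bisect_right(cuts, value)


def conditional_entropy(rows, decisions, cuts_per_attr):
    """H(D | C) in bits after discretizing each condition attribute with its cut points.

    Assumption: this is the textbook conditional entropy, not necessarily
    the exact measure used in the paper.
    """
    # Group objects that become indiscernible after discretization.
    blocks = defaultdict(Counter)
    for row, d in zip(rows, decisions):
        key = tuple(discretize(v, cuts) for v, cuts in zip(row, cuts_per_attr))
        blocks[key][d] += 1

    n = len(rows)
    h = 0.0
    for counts in blocks.values():
        size = sum(counts.values())
        for c in counts.values():
            p = c / size
            h -= (size / n) * p * math.log2(p)
    return h


# Hypothetical information table: two continuous condition attributes, one decision.
rows = [(0.2, 1.5), (0.4, 1.7), (0.9, 1.6), (1.1, 0.3)]
decisions = ["a", "a", "b", "b"]

coarse = conditional_entropy(rows, decisions, [[], []])     # no cut points: one block
fine = conditional_entropy(rows, decisions, [[0.65], []])   # one cut on the first attribute
```

    In this toy table, a single cut on the first attribute already separates the two decision classes, so the conditional entropy falls to zero. An iterative scheme of the kind the abstract describes could stop adding cut points once this value no longer decreases.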

     
