Distributed Web usage clustering based on multiple mirror sites
Abstract: A local mining algorithm, LUC (local Web usage clustering), and a global mining algorithm, GUC (global Web usage clustering), are proposed for distributed Web usage clustering in a multi-mirror-site environment. The algorithms deal reasonably well with the difficulties that remotely stored Web access information and the communication volume of distributed algorithms pose for pattern analysis. Both algorithms were implemented in Java and their performance was studied. The results show that the algorithms are effective and can efficiently and accurately discover Web user group patterns in a multi-mirror-site environment.