﻿ 多粒度粗糙集粒度约简的高效算法
 计算机应用   2017, Vol. 37 Issue (12): 3391-3396  DOI: 10.11772/j.issn.1001-9081.2017.12.3391 0

### 引用本文

HU Shanzhong, XU Yi, HE Minghui, WANG Ran. Effective algorithm for granulation reduction of multi-granulation rough set[J]. Journal of Computer Applications, 2017, 37(12): 3391-3396. DOI: 10.11772/j.issn.1001-9081.2017.12.3391.

### 文章历史

1. 安徽大学 计算机科学与技术学院, 合肥 230601;
2. 计算智能与信号处理教育部重点实验室(安徽大学), 合肥 230039

Effective algorithm for granulation reduction of multi-granulation rough set
HU Shanzhong1, XU Yi1,2, HE Minghui1, WANG Ran1
1. College of Computer Science and Technology, Anhui University, Hefei Anhui 230601, China;
2. Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education(Anhui University), Hefei Anhui 230039, China
Abstract: Aiming at the low efficiency problem of the existing granulation reduction algorithms for multi-granulation rough set, an Effective Algorithm for Granulation Reduction of Multi-granulation Rough Set (EAGRMRS) was proposed. Firstly, the lower approximation Boolean matrix of decision classes was defined by using the decision information system as the object. The defined matrix could be used for converting redundant and repeated set operations into Boolean operations in the process of granular reduction. Based on this matrix, the algorithm for computing lower approximation of decision classes and the algorithm for computing the important measure of granularity were presented. Then, focusing on the problem of redundancy calculation when computing the important measure of granularity, a fast algorithm for computing the important measure of granularity with dynamic increasing of granularity was presented. On the basis, the EAGRMRS was proposed. The time complexity of the proposed algorithm is O(|A|·|U|2+|A|2·|U|), in which|A|is the size of granulation set, |U|is the number of instances in decision information system. The experimental results on UCI datasets show that, the proposed algorithm is effective and efficient, the efficiency advantage of EAGRMRS is more obvious over Heuristic Approach to Granular Structure Selection (HAGSS) for multi-granulation rough set when the dataset increases.
Key words: multi-granulation rough set    boolean matrix    information system    important measure    granulation reduction
0 引言

1 相关概念

 $\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{B} \left( X \right) = \left\{ {x \in U|{{\left[x \right]}_B} \subseteq X} \right\}$ (1)
 $\overline B \left( X \right) = \left\{ {x \in U\left| {{{\left[x \right]}_B} \cap X \ne \emptyset } \right.} \right\}$ (2)
 $BN{D_B}\left( X \right) = \overline B \left( X \right)-\underline B \left( X \right)$ (3)

$BN{D_B}\left( X \right) = \emptyset$，即B(X)=B(X)时，称X在信息系统中是可以定义的，否则称X是粗糙的。X的正域POSB(X)=B(X)，X的负域NEGB(X)=U-B(X)。

 $\begin{gathered} {\underline {\sum\limits_{i = 1}^m {{A_i}} } ^{\rm{O}}}\left( X \right) = \left\{ {x:{{\left[x \right]}_{{A_1}}} \subseteq X \vee } \right. \hfill \\ \;\;\;\;\;\;\;\;{\left[x \right]_{{A_2}}} \subseteq X \vee \cdots \left. { \vee {{\left[x \right]}_{{A_m}}} \subseteq X, x \in U} \right\} \hfill \\ \end{gathered}$ (4)
 ${\overline {\sum\limits_{i = 1}^m {{A_i}} } ^{\text{O}}}\left( X \right) = \sim \left( {{{\underline {\sum\limits_{i = 1}^m {{A_i}} } }^{\text{O}}}\left( { \sim X} \right)} \right)$ (5)

 $\begin{gathered} {\underline {\sum\limits_{i = 1}^m {{A_i}} } ^{\rm{P}}}\left( X \right) = \left\{ {x:{{\left[x \right]}_{{A_1}}} \subseteq X \wedge {{\left[x \right]}_{{A_2}}} \subseteq X \wedge \cdots \wedge } \right. \hfill \\ \left. {{{\left[x \right]}_{{A_m}}} \subseteq X, x \in U} \right\} \hfill \\ \end{gathered}$ (6)
 ${\overline {\sum\limits_{i = 1}^m {{A_i}} } ^{\text{P}}}\left( X \right) = \sim \left( {{{\underline {\sum\limits_{i = 1}^m {{A_i}} } }^{\text{P}}}\left( { \sim X} \right)} \right)$ (7)

 $\gamma \left( {A,D} \right){\text{ = }}\sum\limits_{j = 1}^r {\left\{ {\left| {{{\underline {\sum\limits_{i = 1}^m {{A_i}} } }^{\text{O}}}\left( {{X_j}} \right)} \right|:{X_j} \in U/D} \right\}\left| U \right|}$ (8)
2 多粒度粗糙集的粒度约简

 $\underline D _A^{\rm{O}} = \left( {{{\underline {\sum\limits_{i = 1}^m {{A_i}} } }^{\rm{O}}}\left( {{X_1}} \right), {{\underline {\sum\limits_{i = 1}^m {{A_i}} } }^{\rm{O}}}({X_2}), \cdots, {{\underline {\sum\limits_{i = 1}^m {{A_i}} } }^{\rm{O}}}({X_r})} \right)$ (9)

DAO=DCO，则称AC的粒度乐观下近似分布一致集。粒度下近似分布一致集保持每个目标决策的下近似不变，即可保持分类的不变。

${\underline {\sum\limits_{{A_i} \in A} {{A_i}} } ^{\rm{O}}}\left( X \right){\rm{ = }}{\underline {\sum\limits_{{A_i} \in A, {A_i} \notin A'} {{A_i}} } ^{\rm{O}}}\left( X \right)$，则称A′在A中关于X是不重要的；

${\underline {\sum\limits_{{A_i} \in A} {{A_i}} } ^{\rm{O}}}\left( X \right) \ne {\underline {\sum\limits_{{A_i} \in A, {A_i} \notin A'} {{A_i}} } ^{\rm{O}}}\left( X \right)$，则称A′在A中关于X是重要的。

 $\underline {{A_i}} \left( X \right) \subseteq {\underline {\sum\limits_{{A_i} \in A'} {{A_i}} } ^{\rm{O}}}\left( X \right) \subseteq {\underline {\sum\limits_{{A_i} \in A} {{A_i}} } ^{\rm{O}}}\left( X \right)$

1) A′⊆A，∀AiA′，定义在粒度集A′上，Ai关于D的粒度重要度为：

 $Si{g_{{\rm{in}}}}({A_i}, A', D) = {\rm{abs}}(\gamma (A', D)-\gamma (A'-{A_i}, D))$ (10)

2) A′⊆A，∀AiAA′，定义在粒度集A′上，Ai关于D的粒度重要度为：

 $Si{g_{{\rm{out}}}}({A_i}, A', D){\rm{ = abs}}(\gamma ({A_i} \cup A', D)-\gamma (A'{\rm{, }}D))$ (11)

3 多粒度粗糙集粒度约简的高效算法 3.1 决策类下近似布尔矩阵

 $M\_m\left( {i, j} \right) = \left\{ \begin{gathered} 0, \;\;\;\;{x_i} \notin \underline {{A_j}} \left( {{X_k}} \right) \hfill \\ 1, \;\;\;\;{x_i} \in \underline {{A_j}} \left( {{X_k}} \right) \hfill \\ \end{gathered} \right.$ (12)

1)  设置$\mathit{\boldsymbol{M}} \leftarrow \emptyset$i←1，j←1；

2)  计算U/D={X1, X2, …, Xr}；

3)  For i=1 to |U|

For j=1 to m

If [xi]AjXk

M_m(i, j)=1；

Else

M_m(i, j)=0；

Endif

Endfor

Endfor

4)  输出M

 ${M_{\left| U \right| \times \left| A \right|}} = \left[{\begin{array}{*{20}{c}} 0&0&0&0&0&1 \\ 0&0&0&1&1&1 \\ 0&0&0&1&1&1 \\ 0&0&0&0&0&1 \\ 0&0&0&0&0&1 \\ 0&0&0&0&0&1 \end{array}} \right]$

1)  设置$L \leftarrow \emptyset$j←1；

2)  ∀xiXk

For j=1 to m

If M_m(i, j)=1

LL∪{xi}，break；

Endif

Endfor

3)  输出L

1)  设置$D' \leftarrow \emptyset$$D'' \leftarrow \emptyset$k←1；

2)  For k=1 to r

利用算法2计算${\underline {\sum\limits_{{A_j} \in A'} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right)$

$D' = D' \cup {\underline {\sum\limits_{{A_j} \in A'} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right)$

Endfor

3)  If AiA

A″=A′－Ai

Else

A″=A′∪Ai

For k=1 to r

利用算法2计算${\underline {\sum\limits_{{A_j} \in A''} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right)$

$D'' = D'' \cup {\underline {\sum\limits_{{A_j} \in A''} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right)$

Endfor

4)  基于D′和D″，根据定义9计算SigO

5)  输出SigO

3.2 快速计算粒度重要度算法

 ${\underline {\sum\limits_{j = 1}^m {{A_j}} } ^{\rm{O}}}\left( X \right) = {\underline {\sum\limits_{j = 1}^{m- 1} {{A_j}} } ^{\rm{O}}}\left( X \right) \cup \left\{ {x \notin {{\underline {\sum\limits_{j = 1}^{m- 1} {{A_j}} } }^{\rm{O}}}\left( X \right):{{\left[x \right]}_{{A_m}}} \subseteq X} \right\}$ (13)
 ${\underline {\sum\limits_{j = 1}^m {{A_j}} } ^{\rm{P}}}\left( X \right) = {\underline {\sum\limits_{j = 1}^{m- 1} {{A_j}} } ^{\rm{P}}}\left( X \right)- \left\{ {x \in {{\underline {\sum\limits_{j = 1}^{m- 1} {{A_j}} } }^{\rm{P}}}\left( X \right):{{\left[x \right]}_{{A_m}}} \not\subset X} \right\}$ (14)

1)  ${\underline {\sum\limits_{{A_j} \in A' \cup {A_i}} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right) = {\underline {\sum\limits_{{A_j} \in A'} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right)$

2)  对于$\forall {x_t} \notin {\underline {\sum\limits_{{A_j} \in A'} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right)$xtXk

If M_m(t, i)=1

${\underline {\sum\limits_{{A_j} \in A' \cup {A_i}} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right) = {\underline {\sum\limits_{{A_j} \in A' \cup {A_i}} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right) + \left\{ {{x_t}} \right\}$

Endif

3)  输出${\underline {\sum\limits_{{A_j} \in A' \cup {A_i}} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right)$

1)  设置$A'' \leftarrow \emptyset, D_{A''}^{\rm{O}} \leftarrow \emptyset$

2)  A″=A′∪Ai

For k=1 to r

基于DAO，利用算法4计算${\underline {\sum\limits_{{A_j} \in A''} {{A_j}} } ^{\rm{O}}}\left( {{X_k}} \right)$

$\underline D _{A''}^{\rm{O}} = \underline D _{A''}^{\rm{O}} \cup \left\{ {{{\underline {\sum\limits_{{A_j} \in A''} {{A_j}} } }^{\rm{O}}}\left( {{X_k}} \right)} \right\}$

Endfor

3)  基于DAODAO，根据定义9计算SigO

4)  输出SigO

3.3 EAGRMRS

1)  设置$RE{D^{\rm{O}}} \leftarrow \emptyset$

2)  利用算法1建立决策类下近似布尔矩阵M

3)  对于∀AiA，利用算法3计算Sigin(Ai, A, D)，

If Sigin(Ai, A, D)≠0

REDO=REDOAi

Endif

令COREO=REDO

4)  利用算法2计算粒度集合REDO上的决策类下近似构成的集合$\underline {D_{RE{D^{\rm{O}}}}^{\rm{O}}}$

Do

∀AiAREDO，基于$\underline {D_{RE{D^{\rm{O}}}}^{\rm{O}}}$，利用算法5计算Sigout(Ai, REDO, D)；

If Sigout(Ak, REDO, D)=max{Sigout(Ai, REDO, D):AiAREDO}

REDO=REDOAk

更新$\underline {D_{RE{D^{\rm{O}}}}^{\rm{O}}}$

Endif

Until γ(REDO, D)=γ(A, D)；

5)  ∀AiREDOCOREO

If γ(REDOAi, D)=γ(A, D)

REDO=REDOAi

Endif

6)  输出REDO

4 实验结果与分析

 图 1 不同数据集上不同算法粒度约简的运行时间 Figure 1 Running time of granulation reduction for different algorithms on different datasets

 图 2 不同算法在不同数据集上的运行时间 Figure 2 Running time of different algorithms on different datasets

5 结语

 [1] PAWLAK Z. Rough sets[J]. International Journal of Computer & Information Sciences, 1982, 11(5): 341-356. [2] HU Q H, CHE X J, ZHANG L, et al. Rank entropy based decision trees for monotonic classification[J]. IEEE Transactions on Knowledge & Data Engineering, 2012, 24(11): 2052-2064. [3] SRINIVASA K G, VENUGOPAL K R, PATNAIK L M. A soft computing approach for data mining based query processing using rough sets and genetic algorithms[J]. International Journal of Hybrid Intelligent Systems, 2008, 5(1): 1-17. DOI:10.3233/HIS-2008-5101 [4] 白鹤翔, 王健, 李德玉, 等. 基于粗糙集的非监督快速属性选择算法[J]. 计算机应用, 2015, 35(8): 2355-2359. (BAI H X, WANG J, LI D Y, et al. Fast unsupervised feature selection algorithm based on rough set theory[J]. Journal of Computer Applications, 2015, 35(8): 2355-2359. DOI:10.11772/j.issn.1001-9081.2015.08.2355) [5] QIAN Y H, LIANG J Y, YAO Y Y, et al. MGRS:a multi-granulation rough set[J]. Information Sciences, 2010, 180(6): 949-970. DOI:10.1016/j.ins.2009.11.023 [6] XU W H, SUN W X, ZHANG X Y, et al. Multiple granulation rough set approach to ordered information systems[J]. International Journal of General Systems, 2012, 41(5): 475-501. DOI:10.1080/03081079.2012.673598 [7] YANG X B, SONG X N, DOU H L, et al. Multi-granulation rough set:from crisp to fuzzy case[J]. Annals of Fuzzy Mathematics and Informatics, 2011, 1(1): 55-70. [8] LIN G P, QIAN Y H, LI J J. NMGRS:Neighborhood-based multigranulation rough sets[J]. International Journal of Approximate Reasoning, 2012, 53(7): 1080-1093. DOI:10.1016/j.ijar.2012.05.004 [9] QIAN Y H, ZHANG H, SANG Y L, et al. Multigranulation decision-theoretic rough sets[J]. International Journal of Approximate Reasoning, 2014, 55(1): 225-237. DOI:10.1016/j.ijar.2013.03.004 [10] 张明, 唐振民, 徐维艳, 等. 可变多粒度粗糙集模型[J]. 模式识别与人工智能, 2012, 25(4): 709-720. (ZHANG M, TANG Z M, XU W Y, et al. Variable multigranulation rough set model[J]. Pattern Recognition and Artificial Intelligence, 2012, 25(4): 709-720.) [11] 桑妍丽, 钱宇华. 一种悲观多粒度粗糙集中的粒度约简算法[J]. 模式识别与人工智能, 2012, 25(3): 361-366. (SANG Y L, QIAN Y H. A granular space reduction approach to pessimistic multi-granulation rough sets[J]. Pattern Recognition and Artificial Intelligence, 2012, 25(3): 361-366.) [12] YANG X B, QI Y S, SONG X N, et al. Test cost sensitive multigranulation rough set:model and minimal cost selection[J]. Information Sciences, 2013, 250(11): 184-199. [13] 张明, 程科, 杨习贝, 等. 基于加权粒度的多粒度粗糙集[J]. 控制与决策, 2015, 30(2): 222-228. (ZHANG M, CHENG K, YANG X B, et al. Multigranulation rough set based on weighted granulations[J]. Control and Decision, 2015, 30(2): 222-228.) [14] DAI J H, WANG W T, TIAN H W, et al. Attribute selection based on a new conditional entropy for incomplete decision systems[J]. Knowledge-Based Systems, 2013, 39(2): 207-213. [15] MIAO D Q, GAO C, ZHANG N, et al. Diverse reduct subspaces based co-training for partially labeled data[J]. International Journal of Approximate Reasoning, 2011, 52(8): 1103-1117. DOI:10.1016/j.ijar.2011.05.006 [16] QIAN J, MIAO D Q, ZHANG Z H, et al. Hybrid approaches to attribute reduction based on indiscernibility and discernibility relation[J]. International Journal of Approximate Reasoning, 2011, 52(2): 212-230. DOI:10.1016/j.ijar.2010.07.011 [17] 鞠恒荣, 周献中, 杨佩, 等. 测试代价敏感的粗糙集方法[J]. 系统工程理论与实践, 2017, 37(1): 228-240. (JU H R, ZHOU X Z, YANG P, et al. Test-cost-sensitive based rough set approach[J]. System Engineering-Theory and Practice, 2017, 37(1): 228-240. DOI:10.12011/1000-6788(2017)01-0228-13) [18] YANG X B, QI Y, YU H L, et al. Updating multigranulation rough approximations with increasing of granular structures[J]. Knowledge-Based Systems, 2014, 64(1): 59-69.