Improved Outlier Detection Method of Power Consumer Data Based on Gaussian Kernel Function
Sun Yi; Li Shihao; Cui Can; Li Bin; Chen Songsong; Cui Gaoying
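Abstract:
Existing outlier detection methods for electricity consumption data are of limited applicability in the context of smart power distribution and utilization big data, and abnormal consumption samples are costly to obtain from real data sets. To address both problems, this paper proposes an improved outlier detection method for power consumer data based on the Gaussian kernel function. First, users are classified by fuzzy clustering. Then, consumption-behavior features are extracted for each class of users, and principal component analysis is used to reduce the dimension of the feature set. Finally, the local outlier factor algorithm is improved with the Gaussian kernel function, yielding the Gaussian kernel density-based local outlier factor (GKLOF) algorithm, whose properties are analyzed through a combination of theoretical derivation and simulation experiments. Real consumption data from 5000 users are used for the experimental analysis. The results show that the method achieves high detection accuracy and a relatively stable decision threshold, and that it is less affected by the local data distribution. It is therefore better suited to outlier detection when user consumption behavior is complex and diverse and the consumption-behavior types of all users in the real data set are unknown.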
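The abstract does not give the GKLOF formula, so the following is only a minimal sketch of the general idea of combining a Gaussian kernel with a local-outlier-factor style score. It assumes a density defined as the mean Gaussian kernel value over a point's k nearest neighbours and a score defined as the ratio of the neighbours' mean density to the point's own density. The function name `gaussian_kernel_outlier_scores`, the parameters `k` and `bandwidth`, and the sample points are hypothetical and are not taken from the paper.

```python
import math


def gaussian_kernel_outlier_scores(points, k=3, bandwidth=1.0):
    """Return one score per point; a larger score means a more isolated point.

    Illustrative assumption, not the paper's GKLOF definition.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # k nearest neighbours of every point, stored as (distance, index) pairs.
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append(dists[:k])

    # Local density: mean Gaussian kernel value over the k nearest neighbours.
    density = [
        sum(math.exp(-(d * d) / (2.0 * bandwidth ** 2)) for d, _ in nb) / k
        for nb in neighbours
    ]

    # LOF-style ratio: mean density of the neighbours over the point's own
    # density. The small floor guards against division by zero on underflow.
    tiny = 1e-300
    return [
        (sum(density[j] for _, j in nb) / k) / max(density[i], tiny)
        for i, nb in enumerate(neighbours)
    ]


# Hypothetical 2-D feature vectors: four clustered users and one isolated user.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0)]
scores = gaussian_kernel_outlier_scores(samples, k=3, bandwidth=1.0)
```

In this toy data the four clustered points receive scores close to 1, while the isolated point receives a much larger score. In a pipeline like the one described in the abstract, such a score would be computed per user class on the PCA-reduced features, with the threshold chosen separately.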
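Keywords: power big data; data mining; outlier detection; Gaussian kernel density-based local outlier factor; electricity consumption behavior analysis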
Foundation: National Key Research and Development Program of China (2016YFB0901104)
DOI: 10.13335/j.1000-3673.pst.2017.1586