[1]CAI Yan-jing,CHENG Xiao-hong,CHENG Xian-yi.Research on Algorithm of Network Sensitive Information Features Extracting[J].Journal of Changzhou University(Natural Science Edition),2014,(04):80-85.[doi:10.3969/j.issn.2095-0411.2014.04.017]

A Method for Extracting Dynamic Features of Network Sensitive Information

Journal of Changzhou University (Natural Science Edition) [ISSN:2095-0411/CN:32-1822/N]

Volume:
Issue:
No. 04, 2014
Pages:
80-85
Section:
Computer and Information Engineering
Publication date:
2014-08-31

Article Info

Title:
Research on Algorithm of Network Sensitive Information Features Extracting
Author(s):
CAI Yan-jing1, CHENG Xiao-hong2, CHENG Xian-yi2
1.Jiangsu Vocational College of Business,Nantong 226001,China; 2.School of Computer Science and Technology,Nantong University,Nantong 226019,China
Keywords:
sensitive information; information filtering; natural language processing; opinion mining
CLC number:
TP 309
DOI:
10.3969/j.issn.2095-0411.2014.04.017
Document code:
A
Abstract:
The anonymity, openness, equality, and interactivity of the Internet inevitably give rise to some disharmonious "noise", and how to absorb the essence and discard the dregs has become an urgent problem in network information security. Traditional text feature extraction methods suffer from time lag, low accuracy, and poor adaptability when applied to sensitive information filtering. To address these problems, this paper takes opinion texts from online public discourse as its research object and, taking the characteristics of sensitive information into account, proposes a dynamic feature extraction method for sensitive information that combines opinion mining with natural language processing techniques. Experimental results show that the method has clear advantages for sensitive information filtering and achieves dynamic maintenance of the dictionary.


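Memo

Funding: Supported by the National Natural Science Foundation of China (61340037). About the author: CAI Yan-jing (b. 1985), female, from Nantong, Jiangsu, master's student. Corresponding author: CHENG Xian-yi (b. 1956), E-mail: xycheng@ntu.edu.cn
Last Update: 2014-12-20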