Text Mining Research in Digital Humanities
Guo Jinlong (郭金龙); Xu Xin (许鑫)
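Abstract:
This paper outlines the concept, research scope, and current state of digital humanities, and identifies text mining as a research hotspot and trend within the field. Building on a survey of how text mining is applied across the research areas of digital humanities, it focuses on frontier practices in applying text mining to digital humanities research in developed countries in Europe and North America, with the aim of informing the transformation of research methods and paradigms in the humanities in China.
Keywords: text mining; digital humanities; humanities research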
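Foundation: A research outcome of the 2011 National Social Science Fund of China Youth Project "Research on the Knowledge Base of Collaborative Virtual Reference Systems" (11CTQ003)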
Authors: Guo Jinlong (郭金龙); Xu Xin (许鑫)
References:
- 1 Wang Xiaoguang.The Emergence,Development and Frontiers of the "Digital Humanities" Discipline.[2011-11-24].http://blog.sciencenet.cn/home.php?mod=space&uid=67855&do=blog&id=275758
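- 2 Chinese Academy of Social Sciences.Integrated Geographic Information Service Platform for Chinese Social Sciences.[2011-11-19].http://gis.cass.cn/
- 3 Key Laboratory of Virtual Geographic Environment (Ministry of Education),Nanjing Normal University.Huaxia Genealogy GIS Website (华夏家谱GIS网).[2011-11-19].http://www.hxjiapu.com.cn/Index.aspx?&ThreadID=
- 4 Chinese National Academy of Arts.Western China Humanities Resources Network (西部人文资源网).[2011-11-19].http://www.xbchina.com/
- 5 Tsinghua University Library.Full-Text Online Edition of the "Database of Chinese Classic Ancient Books" (中国基本古籍库).[2011-11-19].http://www.lib.tsinghua.edu.cn/database/jibenguji.html
- 6 Capital Normal University.Definitive Electronic Editions of Ancient Books Project (古籍电子定本工程).[2011-11-19].http://www.guoxue.com/zt/dzdb/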
- 7 Tan A H.Text Mining:The State of the Art and the Challenges//Tsinghua University.Third Pacific-Asia Conference on Knowledge Discovery and Data Mining,Beijing,1999:65-70
- 8 Diederich J,Kindermann J,Leopold E,et al.Authorship Attribution with Support Vector Machines.Applied Intelligence,2003,19(1/2):109-123
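- 9 Li Xianping.A New Theory on the Composition of "Dream of the Red Chamber" (红楼梦).Fudan Journal (Social Sciences Edition),1987(5):3-16
- 10 Shi Jianjun.A Study on the Reliability of Author Cluster Analysis Using the 120 Chapters of "Dream of the Red Chamber" as Samples.Hongloumeng Xuekan (红楼梦学刊),2010(5):318-335
- 11 Wu Xiaochun,et al.Research on Authorship Identification Methods Based on Semantic Analysis.Journal of Chinese Information Processing,2006(6):61-68
- 12 Nian Hongdong,et al.Research on Authorship Identification of Contemporary Literary Works.Computer Engineering and Applications,2010,46(4):226-229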
- 13 Gerritsen C M.Authorship Attribution Using Lexical Attraction[Master's thesis].Boston:Massachusetts Institute of Technology,2003
- 14 Peng F,Schuurmans D,Keselj V,et al.Language Independent Authorship Attribution using Character Level Language Models//Hungarian Academy of Science.10th Conference of the European Chapter of the Association for Computational Linguistics,Budapest,2003:267-274
- 15 Dik H,Whaling R.Mining Classical Greek Gender.[2011-11-19].http://cybergreek.uchicago.edu/MiningGender.pdf
- 16 Argamon S,Goulain J B,Horton R,et al.Text Mining Gender Difference in French Literature.[2011-11-19].http://www.digitalhumanities.org/dhq/vol/3/2/000042/000042.html
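- 17 Hu Junfeng.Computer-Aided In-Depth Research on Tang and Song Poetry Based on Lexical Semantic Analysis[Doctoral dissertation].Beijing:Institute of Computational Linguistics,Peking University,2001
- 18 Su Jinsong.Research on the Construction of a Complete Song Ci Corpus and Computational Methods for Its Style and Sentiment Analysis[Master's thesis].Xiamen:Xiamen University,2007
- 19 Wu Chunlong.Research on Computer-Aided Analysis of the Styles of Song Ci[Master's thesis].Xiamen:Xiamen University,2008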
- 20 Horton T,Taylor C,Yu B,et al.‘Quite Right,Dear and Interesting’:Seeking the Sentimental in Nineteenth Century American Fiction//Paris-Sorbonne.Digital Humanities.France,2006
- 21 Plaisant C,Rose J,Yu B,et al.Exploring Erotics in Emily Dickinson's Correspondence with Text Mining and Visual Interfaces//ACM Press.Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries.Chapel Hill,NC[C].New York,2006
- 22 Elson D K,Dames N,McKeown K R.Extracting Social Networks from Literary Fiction.[2011-11-19].http://www.cs.columbia.edu/~delson/pubs/ACL2010-Elson-DamesMcKeown.pdf
- 23 Elson D K,McKeown K R.Automatic attribution of quoted speech in literary narrative//AAAI,2010
- 24 Celikyilmaz A,et al.The Actor-Topic Model for Extracting Social Networks in Literary Narrative[C]//NIPS Workshop:Machine Learning for Social Computing,2010
- 25 Yu B.Toward Discovering Potential Data Mining Applications in Literary Criticism.[2011-11-19].http://www.csdl.tamu.edu/~furuta/courses/06c_689dh/dh06readings/DH06-237-239.pdf
- 26 Don A,Zheleva E,Gregory M,et al.Discovering Interesting Usage Patterns in Text Collections-Integrating Text Mining With Visualization//ACM.16th ACM Conference on Information and Knowledge Management,2007:213-222
- 27 Clement T E.‘A thing not beginning and not ending’:using digital tools to distant-read Gertrude Stein's The Making of Americans.Literary and Linguistic Computing,2008,23(3):361-381
- 28 NEH.Mapping Historical Texts:Combining Text-mining & Geo-visualization to Unlock the Research Potential of Historical Newspapers.[2011-11-19].https://securegrants.neh.gov/PublicQuery/main.aspx?f=1&gn=HD-51188-10
- 29 AHRC.Hestia.[2011-11-19].http://www.open.ac.uk/Arts/hestia/index.html
- 30 Dever J.The Philosophy Family Tree.[2011-11-19].https://webspace.utexas.edu/deverj/personal/philtree/philtree.html
- 31 Pasin M.PhiloSURFical project.[2011-11-19].http://philosurfical.open.ac.uk/index.html
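- 32 Dong Hui,et al.Research on an Ontology-Based Digital Library Retrieval Model (III):Construction of a Historical Domain Resource Ontology.Journal of the China Society for Scientific and Technical Information,2006(5):564-574
- 33 European Union.VICODI.[2011-11-19].http://www.vicodi.org/about.htm
- 34 Chinese Academy of Social Sciences.Chinese Social Sciences Newspapers and Periodicals Network (中国社会科学报刊网).[2011-11-19].http://sspress.cass.cn/news/13448.htm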
- 35 University of York.Archaeology Data Service.[2011-11-19].http://ads.ahds.ac.uk/
- 36 University of York.Archaeology Data Service.[2011-11-19].http://ads.ahds.ac.uk/project/archaeotools/
- 37 National Endowment for the Humanities.‘Text Mining’-Digging Through Digital Archives.[2011-11-19].http://www.neh.gov/news/archive/20101221.html
- 38 National Endowment for the Humanities.Digging into Data.[2011-11-19].http://www.diggingintodata.org/
- 39 University of Illinois Urbana-Champaign.SEASR.[2011-11-19].http://seasr.org/
- 40 University of Illinois Urbana-Champaign.NORA.[2011-11-19].http://www.noraproject.org/
- 41 University of Illinois Urbana-Champaign.MONK.[2011-11-19].http://www.monkproject.org/
- 42 University of Chicago.PhiloMine.[2011-11-19].http://code.google.com/p/philomine/
- 43 Michel J B,Shen Y K,Aiden A P,et al.Quantitative Analysis of Culture Using Millions of Digitized Books.[2011-11-19].http://mfi.uchicago.edu/publications/papers/Science_Culturomics.pdf
- 44 Blanke T.Text Analysis in the Arts and Humanities.[2011-11-19].http://cnx.org/content/m31502/latest/
- 45 Manchester Interdisciplinary Biocentre.The National Center for Text Mining.[2011-11-19].http://www.nactem.ac.uk
- 46 Lancaster University.Text-mining in the Digital Humanities:The Interface between Conceptual History,Critical Discourse Analysis and Corpus Linguistics.[2011-11-19].http://ucrel.lancs.ac.uk/events/chcdacl2010/
- 47 King's College London.Text Mining in the Digital Humanities.[2011-11-19].http://dh2010.cch.kcl.ac.uk/academic-programme/pre-conference-workshops/workshop-2.html
- 48 Universität Leipzig.eAQUA.[2011-11-19].http://www.eaqua.net/index.php
- 49 Buechler M,Gerhard H,Sabine G.eAQUA-Bringing modern text mining approaches to two thousand years old ancient texts.[2011-11-19].http://asv.informatik.uni-leipzig.de/publication/file/123/IEEE09.pdf
- 50 University of Virginia.TEI:Text Encoding Initiative.[2011-11-19].http://www.tei-c.org/