Machine Learning and Its Application in Libraries: Taking TensorFlow as an Example
Guo Limin (郭利敏); Liu Wei (刘炜); Wu Peijuan (吴佩娟); Zhang Lei (张磊)
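Abstract:
Machine learning is an important branch of artificial intelligence, and TensorFlow is Google's second-generation open-source machine learning platform. This paper introduces the basic principles of machine learning (mainly deep neural networks) and the basic methods of doing machine learning with TensorFlow, and explores possible applications and scenarios in the library domain. Taking the automatic classification problem of the National Index to Chinese Newspapers and Periodicals (《全国报刊索引》) as the experimental subject, we built a TensorFlow deep learning model on two graphics workstations and, by setting parameters and thresholds and tuning the system, worked through the complete process of applying TensorFlow and demonstrated its feasibility. The experiment trained and tested on more than 1.7 million bibliographic records. It overcame the mismatch between the sparseness of the index records and the very fine-grained classes of the Chinese Library Classification, reaching an accuracy of nearly 80% at the top-level class and nearly 70% overall at the fourth classification level (91% for class TP). We conclude that the model can largely replace the manual classification workflow and offers a practical tool for semi-automating classification of the index, which is expected to reduce labor costs substantially. As next steps, we will continue to use TensorFlow's optimization features, incorporate more field attributes, and tune the system further, aiming for an automatic classification accuracy above 90%.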
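The abstract mentions setting thresholds and a semi-automated workflow but does not say how the threshold is applied. The following is a minimal sketch of one common approach, under the assumption that the threshold is compared against the model's top predicted class probability: confident predictions are accepted automatically and the rest are sent to a human cataloguer. The function names, the threshold value of 0.9, and the example scores are hypothetical; the class codes other than TP are illustrative Chinese Library Classification codes, not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(scores, labels, threshold=0.9):
    """Pick the most probable class and decide who handles the record.

    Assumption (not stated in the paper): a record is classified
    automatically only when the top probability reaches `threshold`;
    otherwise it is queued for manual classification.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    decision = "auto" if probs[best] >= threshold else "manual"
    return labels[best], probs[best], decision


# Hypothetical class codes and model outputs, for illustration only.
labels = ["TP", "G2", "F"]
confident = route([5.0, 0.0, 0.0], labels)
uncertain = route([1.0, 0.9, 0.8], labels)
```

Raising the threshold sends more records to manual review but lowers the error rate among automatically accepted ones, which is the trade-off a semi-automated workflow has to balance.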
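Keywords: smart library; artificial intelligence; machine learning; TensorFlow; automatic classification; neural network
Foundation: One of the research outputs of the National Social Science Fund of China Major Project "Research on the Mechanism and Application of Mobile Visual Search in Digital Libraries for Big Data" (No. 15ZDB126)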
DOI: 10.16603/j.issn1002-1027.2017.06.004