Facebook fastText

chengweihuang 2017-11-03 17:11

Paper: Bag of Tricks for Efficient Text Classification

by Facebook AI Research (FAIR)

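The main application is text classification. For example, in a human-machine interaction setting, the system has to interpret the intent behind a user's natural-language input: whether the user wants to power the device on or off, bring up a particular menu, listen to favorite music, and so on. Text classification can map the input text to a category that corresponds to an option in the machine's function menu.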

Author names: Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov

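Mikolov's name looks familiar. These authors all work on NLP. FAIR has built many vision applications on convolutional networks, and it was recently reported that convolutional networks have also been applied with greater success to Seq2Seq tasks in NLP [1].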

 

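The paper's main idea is to run a fast, linear model on a CPU and compare it against deep networks running on a GPU, showing comparable accuracy and better speed. The underlying point is that although deep networks are very good tools for nonlinear modeling, some applications do not necessarily need that kind of nonlinear modeling.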

 

The fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and much faster for training and evaluation.

Linear classifiers are often considered strong baselines for text classification problems. If the right features are used, they often obtain state-of-the-art performance.

 

Representing a sentence as a bag of words (BoW) and feeding those features to a linear classifier can already give good results.
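To make the BoW-plus-linear-classifier idea concrete, here is a minimal sketch in plain Python. The vocabulary, the class names, and the hand-set weights are all hypothetical and chosen only for illustration; a real classifier would learn the weights from labeled data, and this is not the paper's implementation.

```python
from collections import Counter

def bag_of_words(text, vocab):
    """Map a sentence to a vector of word counts over a fixed vocabulary.

    Word order is discarded; words outside the vocabulary are ignored.
    """
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def linear_scores(features, weights, biases):
    """One score per class: dot(weights[c], features) + biases[c]."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]

# Hypothetical vocabulary, classes, and hand-set weights (illustration only).
VOCAB = ["turn", "on", "off", "play", "music"]
CLASSES = ["power_on", "power_off", "play_music"]
WEIGHTS = [
    [0.5, 1.0, -1.0, 0.0, 0.0],   # power_on
    [0.5, -1.0, 1.0, 0.0, 0.0],   # power_off
    [0.0, 0.0, 0.0, 1.0, 1.0],    # play_music
]
BIASES = [0.0, 0.0, 0.0]

def classify(text):
    """Return the class whose linear score is highest."""
    scores = linear_scores(bag_of_words(text, VOCAB), WEIGHTS, BIASES)
    return CLASSES[scores.index(max(scores))]
```

Calling `classify("turn off the device")` scores each intent with a single weighted sum over word counts, which is why this kind of model is cheap to run on a CPU.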

 

Figure 1 shows a simple linear model with a rank constraint. The first weight matrix A is a look-up table over the words. The word representations are then averaged into a text representation, which is in turn fed to a linear classifier. The text representation is a hidden variable which can potentially be reused. This architecture is similar to the cbow model of Mikolov et al. (2013).
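The forward pass described above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the fastText implementation: the matrix name `B` for the classifier weights, the softmax output, the tiny dimensions, and the random initialization are all choices made here, and training is omitted entirely.

```python
import math
import random

def make_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix as a list of rows (stand-in for learned weights)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]

def text_representation(word_ids, A):
    """Average the rows of look-up table A selected by word_ids.

    Assumes word_ids is non-empty. The result is the hidden text vector.
    """
    dim = len(A[0])
    hidden = [0.0] * dim
    for i in word_ids:
        for d in range(dim):
            hidden[d] += A[i][d]
    return [h / len(word_ids) for h in hidden]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(word_ids, A, B):
    """Look up, average, apply the linear classifier B, then softmax."""
    hidden = text_representation(word_ids, A)
    scores = [sum(b * h for b, h in zip(row, hidden)) for row in B]
    return softmax(scores)

# Hypothetical sizes: 6 words, 4 hidden dimensions, 3 classes.
rng = random.Random(0)
VOCAB_SIZE, HIDDEN_DIM, NUM_CLASSES = 6, 4, 3
A = make_matrix(VOCAB_SIZE, HIDDEN_DIM, rng)   # look-up table over words
B = make_matrix(NUM_CLASSES, HIDDEN_DIM, rng)  # linear classifier weights
probs = predict_proba([0, 2, 5], A, B)         # a 3-word "sentence"
```

Because every class score passes through the `HIDDEN_DIM`-dimensional average, the combined linear map from word counts to scores has rank at most `HIDDEN_DIM`, which is one way to read the "rank constraint". Averaging also means the prediction does not depend on word order.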

 

 

 

 

 

 

[1] Reference paper??

 
