TF-IDF implementations in Python
Original source: http://stackoverflow.com/questions/20140678/
Notice: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by scarecrow
What are the standard tf-idf implementations/APIs available in Python? I've come across the one in nltk. I want to know which other libraries provide this feature.
Answered by Nilani Algiriyage
Try the libraries that implement the TF-IDF algorithm in Python.
http://code.google.com/p/tfidf/
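For readers who want to see what such a library computes, here is a minimal pure-Python sketch. It assumes one common variant of the weighting (term frequency as count divided by document length, inverse document frequency as log(N / df)); real libraries often use different smoothing and normalization, so the numbers will not match theirs exactly. The function name `tf_idf` and the sample documents are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(documents)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample corpus, already split into tokens.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]
scores = tf_idf(docs)
```

With this variant, a term that occurs in every document gets a weight of zero, and a term that is rare across the corpus gets a higher weight than a common one with the same count.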
Answered by alko
Unfortunately, questions asking for a tool or library are off-topic on SO. There are a lot of machine learning libraries implementing tfidf. Besides the nltk already mentioned, the two most comprehensive of them in my view are sklearn and gensim.
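As a rough illustration of the sklearn route, the sketch below uses `TfidfVectorizer` from `sklearn.feature_extraction.text` with its default settings. The corpus is a made-up example, and it assumes scikit-learn is installed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample corpus: one string per document.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

vectorizer = TfidfVectorizer()
# Learn the vocabulary and idf weights, then return a sparse
# (n_documents x n_terms) matrix of tf-idf weights.
tfidf_matrix = vectorizer.fit_transform(corpus)

# vocabulary_ maps each term to its column index in the matrix.
cat_column = vectorizer.vocabulary_["cat"]
cat_weight_in_first_doc = tfidf_matrix[0, cat_column]
```

By default the vectorizer lowercases the text, keeps only tokens of two or more characters, and L2-normalizes each row, so the weights differ from a textbook tf * idf product.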
Answered by Gunjan
There is a package called scikit (scikit-learn) that calculates tf-idf scores.
You can refer to my answer to the question "Python: tf-idf-cosine: to find document similarity", and also see the code in that question. Thanks.
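The linked question is about using tf-idf with cosine similarity to compare documents. A minimal sketch of that idea, assuming scikit-learn is installed and using a made-up set of documents (this is not the code from the linked answer), could look like this:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical documents; the first two share several words.
documents = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock prices fell sharply today",
]

# Turn each document into a tf-idf vector.
tfidf_matrix = TfidfVectorizer().fit_transform(documents)

# Pairwise cosine similarity between all document vectors.
similarities = cosine_similarity(tfidf_matrix)

# Index of the document most similar to the first one (excluding itself).
most_similar_to_first = similarities[0, 1:].argmax() + 1
```

Here `similarities[i, j]` is the similarity between documents `i` and `j`; documents with no terms in common score 0, and each document scores 1 against itself.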

