TF-IDF implementations in Python

Note: this page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/20140678/

Time: 2020-08-18 19:39:56  Source: igfitidea

Tags: python, nltk, information-retrieval, tf-idf

Asked by scarecrow

What are the standard tf-idf implementations/APIs available in Python? I've come across the one in nltk. I want to know which other libraries provide this feature.

Answered by Nilani Algiriyage

Try these libraries, which implement the TF-IDF algorithm in Python:

http://code.google.com/p/tfidf/

https://github.com/hrs/python-tf-idf

Answered by alko

Unfortunately, questions asking for a tool or library are off-topic on SO. There are a lot of machine learning libraries that implement tf-idf. In my view, the two most comprehensive of them, besides the nltk you mentioned, are sklearn and gensim.
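
As a rough illustration of the sklearn option mentioned above, here is a minimal sketch using scikit-learn's TfidfVectorizer. The three-document corpus is made up for the example, and the default settings are assumed (built-in tokenizer, smoothed idf, L2-normalized rows); it requires scikit-learn to be installed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus, invented purely for illustration.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]

# Default settings: built-in tokenizer, smoothed idf, L2-normalized rows.
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)  # sparse matrix, one row per document

# vocabulary_ maps each term to its column index in the matrix.
first_row = tfidf.toarray()[0]
for term, column in sorted(vectorizer.vocabulary_.items()):
    if first_row[column] > 0:
        print(f"{term}: {first_row[column]:.3f}")
```

Each row of the result is the tf-idf vector of one document, so it can be passed directly to downstream steps such as similarity search or a classifier.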

Answered by Gunjan

There is a package called scikit (scikit-learn) which calculates tf-idf scores.

You can refer to my answer to this question:

Python: tf-idf-cosine: to find document similarity

Also see the code in that question. Thanks.
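
To show roughly what a tf-idf plus cosine-similarity pipeline computes, here is a standard-library-only sketch. It is not the code from the linked question. The weighting is an assumption (term frequency as count divided by document length, idf as log(N / df)), and libraries such as scikit-learn use smoothed variants, so the numbers will differ. The helper names tfidf_vectors and cosine are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus, invented purely for illustration.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # documents sharing "the" and "cat"
print(cosine(vecs[0], vecs[2]))  # documents with no terms in common
```

With whitespace tokenization, "cat" and "cats" count as different terms, which is why a real pipeline usually adds stemming or a library tokenizer.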
