Note: this page reproduces a Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, link to the original, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/30703485/
Data sets for emotion detection in text
Asked by ekka
I'm implementing a system that could detect the human emotion in text. Are there any manually annotated data sets available for supervised learning and testing?
Here are some interesting datasets: https://dataturks.com/projects/trending
Answered by buechel
The field of textual emotion detection is still very new, and the literature is fragmented across many journals in different fields. It's really hard to get a good overview of what's out there.
Note that there are several theories of emotion in psychology. Hence there are different ways of modeling and representing emotions in computing. Most of the time, "emotion" refers to a phenomenon such as anger, fear or joy. Other theories state that all emotions can be represented in a multi-dimensional space (so there is an infinite number of them).
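To make the two views concrete, here is a minimal Python sketch. It assumes a small categorical inventory and a three-dimensional Valence-Arousal-Dominance space on a 1 to 5 scale. The names `CATEGORIES`, `VADPoint`, `PROTOTYPES` and `nearest_category`, and all coordinates, are hypothetical and chosen only for illustration; they do not come from any particular theory or data set.

```python
from dataclasses import dataclass

# Categorical view: an emotion is one label from a fixed inventory.
# This particular inventory is an illustrative assumption.
CATEGORIES = ("anger", "fear", "joy", "sadness")


@dataclass(frozen=True)
class VADPoint:
    """Dimensional view: a point in Valence-Arousal-Dominance space.

    The 1-5 scale used below is an assumption, not a fixed standard.
    """
    valence: float
    arousal: float
    dominance: float


def squared_distance(a: VADPoint, b: VADPoint) -> float:
    """Squared Euclidean distance between two points in VAD space."""
    return ((a.valence - b.valence) ** 2
            + (a.arousal - b.arousal) ** 2
            + (a.dominance - b.dominance) ** 2)


# Hypothetical prototype coordinates, for illustration only.
PROTOTYPES = {
    "anger": VADPoint(1.5, 4.5, 4.0),
    "fear": VADPoint(1.5, 4.5, 1.5),
    "joy": VADPoint(4.5, 4.0, 4.0),
    "sadness": VADPoint(1.5, 1.5, 1.5),
}


def nearest_category(point: VADPoint, prototypes=PROTOTYPES) -> str:
    """Map a dimensional point to the label of the closest prototype."""
    return min(prototypes,
               key=lambda name: squared_distance(point, prototypes[name]))


if __name__ == "__main__":
    print(nearest_category(VADPoint(4.4, 3.8, 3.9)))
```

The practical consequence is that a categorical data set calls for a classifier, while a dimensional one calls for a regressor with real-valued targets. Mapping between the two, as `nearest_category` does here, loses information.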
Here are some (publicly available) data sets I know of (updated):
EmoBank. 10k sentences annotated with Valence, Arousal and Dominance values (disclosure: I am one of the authors). https://github.com/JULIELab/EmoBank
The "Emotion Intensity in Tweets" data set from the WASSA 2017 shared task. http://saifmohammad.com/WebPages/EmotionIntensity-SharedTask.html
The Valence and Arousal Facebook Posts by Preotiuc-Pietro and others: http://wwbp.org/downloads/public_data/dataset-fb-valence-arousal-anon.csv
The Affect data by Cecilia Ovesdotter Alm: http://people.rc.rit.edu/~coagla/affectdata/index.html
The Emotion in Text data set by CrowdFlower https://www.crowdflower.com/wp-content/uploads/2016/07/text_emotion.csv
ISEAR: http://emotion-research.net/toolbox/toolboxdatabase.2006-10-13.2581092615
Test Corpus of SemEval 2007 (Task on Affective Text) http://web.eecs.umich.edu/~mihalcea/downloads.html
A reannotation of the SemEval Stance data with emotions: http://www.ims.uni-stuttgart.de/data/ssec
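Several of the data sets above are distributed as CSV files. As a rough sketch of how such a file could be prepared for supervised learning and testing, the Python below reads (text, label) pairs and makes a reproducible train/test split. The column names `text` and `label`, the helper names, and the inline sample rows are all assumptions made for illustration; check the actual header of whichever file you download, since each data set uses its own layout.

```python
import csv
import io
import random

# Tiny inline sample standing in for a downloaded file (made-up rows).
SAMPLE_CSV = """\
text,label
"I can't believe they cancelled the show.",anger
"What a wonderful surprise!",joy
"I heard a noise downstairs, and I'm alone.",fear
"She moved away last week.",sadness
"We won the match!",joy
"""


def load_examples(fileobj, text_col="text", label_col="label"):
    """Read (text, label) pairs from a CSV file object with a header row."""
    reader = csv.DictReader(fileobj)
    return [(row[text_col], row[label_col]) for row in reader]


def split_examples(examples, test_fraction=0.2, seed=0):
    """Shuffle with a fixed seed and hold out a fraction for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    examples = load_examples(io.StringIO(SAMPLE_CSV))
    train, test = split_examples(examples)
    print(len(train), "training examples,", len(test), "test examples")
```

For a real file, replace the `io.StringIO(SAMPLE_CSV)` argument with `open(path, newline="", encoding="utf-8")`. For data sets annotated with numeric values instead of category labels, the label column would need converting to `float`.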
If you want to go deeper into the topic, here are some surveys I recommend (disclosure: I authored the first one).
Buechel, S., & Hahn, U. (2016). Emotion Analysis as a Regression Problem — Dimensional Models and Their Implications on Emotion Representation and Metrical Evaluation. In ECAI 2016: 22nd European Conference on Artificial Intelligence (pp. 1114–1122). The Hague, Netherlands (available: http://ebooks.iospress.nl/volumearticle/44864).
Canales, L., & Martínez-Barco, P. (n.d.). Emotion Detection from text: A Survey. Processing in the 5th Information Systems Research Working Days (JISIC 2014), 37 (available: http://www.aclweb.org/anthology/W14-6905).