Notice: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/30703485/

Data sets for emotion detection in text

Tags: database, dataset, nlp, text-mining, emotion

Asked by ekka

I'm implementing a system that detects human emotion in text. Are there any manually annotated data sets available for supervised learning and testing?

Here are some interesting datasets: https://dataturks.com/projects/trending

Answered by buechel

The field of textual emotion detection is still very new, and the literature is scattered across many journals in different fields. It's really hard to get a good overview of what's out there.

Note that there are several theories of emotion in psychology. Hence there are different ways of modeling/representing emotions in computing. Most of the time, "emotion" refers to a phenomenon such as anger, fear or joy. Other theories state that all emotions can be represented in a multi-dimensional space (so there is an infinite number of them).
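To make the difference concrete, here is a minimal Python sketch of the two styles of representation: a fixed set of categories versus a point in a Valence-Arousal-Dominance (VAD) space. The class names, the 1-to-5 scale and the prototype coordinates are all assumptions made up for illustration; they are not taken from any of the data sets or papers mentioned here.

```python
import math
from dataclasses import dataclass
from enum import Enum


class BasicEmotion(Enum):
    """Categorical view: an emotion is one of a small, fixed set of labels."""
    ANGER = "anger"
    FEAR = "fear"
    JOY = "joy"


@dataclass(frozen=True)
class VAD:
    """Dimensional view: an emotion is a point in a continuous space."""
    valence: float    # unpleasant .. pleasant
    arousal: float    # calm .. excited
    dominance: float  # feeling controlled .. feeling in control

    def distance(self, other: "VAD") -> float:
        """Euclidean distance between two points in VAD space."""
        return math.dist(
            (self.valence, self.arousal, self.dominance),
            (other.valence, other.arousal, other.dominance),
        )


# Hypothetical prototype coordinates on an assumed 1-to-5 scale.
# These numbers are invented for the example, not measured values.
PROTOTYPES = {
    BasicEmotion.ANGER: VAD(1.5, 4.5, 4.0),
    BasicEmotion.FEAR: VAD(1.5, 4.5, 1.5),
    BasicEmotion.JOY: VAD(4.5, 4.0, 4.0),
}


def nearest_category(point: VAD) -> BasicEmotion:
    """Map a dimensional annotation to the closest categorical prototype."""
    return min(PROTOTYPES, key=lambda emotion: PROTOTYPES[emotion].distance(point))


if __name__ == "__main__":
    print(nearest_category(VAD(4.6, 3.9, 3.8)))
```

The practical consequence is that categorical data sets usually lead to a classification setup, while dimensional ones lead to regression over the individual dimensions, so it is worth checking which kind a data set is before picking a model.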

Here are some (publicly available) data sets I know of (updated):

  1. EmoBank. 10k sentences annotated with Valence, Arousal and Dominance values (disclosure: I am one of the authors). https://github.com/JULIELab/EmoBank

  2. The "Emotion Intensity in Tweets" data set from the WASSA 2017 shared task. http://saifmohammad.com/WebPages/EmotionIntensity-SharedTask.html

  3. The Valence and Arousal Facebook Posts by Preotiuc-Pietro and others: http://wwbp.org/downloads/public_data/dataset-fb-valence-arousal-anon.csv

  4. The Affect data by Cecilia Ovesdotter Alm: http://people.rc.rit.edu/~coagla/affectdata/index.html

  5. The Emotion in Text data set by CrowdFlower: https://www.crowdflower.com/wp-content/uploads/2016/07/text_emotion.csv

  6. ISEAR: http://emotion-research.net/toolbox/toolboxdatabase.2006-10-13.2581092615

  7. Test Corpus of SemEval 2007 (Task on Affective Text) http://web.eecs.umich.edu/~mihalcea/downloads.html

  8. A reannotation of the SemEval Stance data with emotions: http://www.ims.uni-stuttgart.de/data/ssec
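Several of these are distributed as plain CSV files, so for supervised learning and testing the first step is usually to read the rows and hold out a test portion. Below is a rough sketch using only the Python standard library. The column names `text` and `label` and the four sample rows are made up for illustration; each real file has its own header, so check it and pass the right column names.

```python
import csv
import io
import random

# Made-up sample rows standing in for a downloaded file.
# Real data sets use their own column names and label sets.
SAMPLE_CSV = """\
text,label
"I can't believe they cancelled the show!",anger
"What a wonderful surprise, thank you!",joy
"I heard footsteps behind me in the dark.",fear
"We finally won the championship!",joy
"""


def load_examples(handle, text_col="text", label_col="label"):
    """Read (text, label) pairs from an open CSV file-like object."""
    reader = csv.DictReader(handle)
    return [(row[text_col], row[label_col]) for row in reader]


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    examples = load_examples(io.StringIO(SAMPLE_CSV))
    train, test = train_test_split(examples)
    print(len(train), "training examples,", len(test), "test examples")
```

For a real file, replace `io.StringIO(SAMPLE_CSV)` with `open(path, newline="", encoding="utf-8")`. Some of the data sets ship with a predefined train/test split; if so, prefer that over a random one so results stay comparable with published numbers.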

If you want to go deeper into the topic, here are some surveys I recommend (disclosure: I authored the first one).

  1. Buechel, S., & Hahn, U. (2016). Emotion Analysis as a Regression Problem — Dimensional Models and Their Implications on Emotion Representation and Metrical Evaluation. In ECAI 2016: 22nd European Conference on Artificial Intelligence (pp. 1114–1122). The Hague, Netherlands (available: http://ebooks.iospress.nl/volumearticle/44864).

  2. Canales, L., & Martínez-Barco, P. (n.d.). Emotion Detection from text: A Survey. Proceedings of the 5th Information Systems Research Working Days (JISIC 2014), 37 (available: http://www.aclweb.org/anthology/W14-6905).
