database: How to get an English word database?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original question: http://stackoverflow.com/questions/2213607/

Time: 2020-09-08 07:36:24  Source: igfitidea

How to get an English-language word database?

Tags: database, words

Asked by

I need a database of every single valid word in English. I checked the /usr/share/dict/words file, and it contains fewer than 100k words. Wikipedia says English has 475k words. Where do I get the complete list (American spelling)?

Also, is there a single website that gives out words for other languages too, including Asian and European ones?

Edit: Forgot to add, I do not need names etc., just valid English words.
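
For checking how many plain words a list such as /usr/share/dict/words really holds once names are excluded, a minimal Python sketch follows. It assumes a one-entry-per-line file and treats any capitalized entry as a probable name, which is only a heuristic; the function name filter_plain_words is made up for this illustration.

```python
import os


def filter_plain_words(lines):
    """Return the set of all-lowercase, purely alphabetic entries.

    Heuristic (an assumption, not a rule of the word-list format):
    capitalized entries are treated as names, and entries containing
    apostrophes, digits, hyphens, or spaces are dropped.
    """
    words = set()
    for line in lines:
        entry = line.strip()
        if entry.isalpha() and entry.islower():
            words.add(entry)
    return words


if __name__ == "__main__":
    path = "/usr/share/dict/words"  # location mentioned in the question
    if os.path.exists(path):
        with open(path, encoding="utf-8", errors="ignore") as handle:
            print(len(filter_plain_words(handle)))
```

The count printed depends entirely on the installed dictionary package, so it will differ between systems.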

Accepted answer by user266803

The WordNet database might be helpful. I once worked on a Firefox add-on that deals with words and all kinds of simple-to-complicated associations between them. It looks like WordNet will be very useful to you.

Here it is in MySQL format. And this one (web-archived link) uses WordNet v3.0 data rather than the older WordNet 2.0 data.

Answer by danben

You can find what you need on infochimps.org.

They have a list of 350,000 simple (i.e., non-compound) words available for free download.

Word List - 350,000+ Simple English Words

Regarding other languages, you might want to poke around on Wiktionary. Here is a link to all the database backups. The information isn't organized as neatly, but if they have a language, you can download the data in SQL format.

Answer by rdm

I do not see http://wordlist.sourceforge.net/ mentioned here, but that is where I would start if I were looking for something like this (and I was, when I stumbled over this question).

If you cannot find what you want there, and what you want is a list of English words, then you should probably spend some extra time describing how to recognize what it is that you want.

Answer by JW.

There's no such thing as a "complete" list. Different people have different ways of measuring -- for example, they might include slang, neologisms, multi-word phrases, offensive terms, foreign words, verb conjugations, and so on. Some people have even counted a million words! So you'll have to decide what you want in a word list.
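
To see how much two sources disagree before settling on one, a rough Python sketch like the following can help. It assumes each list is an iterable of one entry per line and compares them case-insensitively; compare_word_lists is a hypothetical helper written for this illustration, not part of any word-list project.

```python
def compare_word_lists(first, second):
    """Summarize the overlap between two word lists (case-insensitive)."""
    a = {entry.strip().lower() for entry in first if entry.strip()}
    b = {entry.strip().lower() for entry in second if entry.strip()}
    return {
        "only_first": len(a - b),    # entries only the first list has
        "only_second": len(b - a),   # entries only the second list has
        "shared": len(a & b),        # entries both lists agree on
        "combined": len(a | b),      # size of the merged list
    }


if __name__ == "__main__":
    # Tiny made-up lists; real input would be two opened word-list files.
    print(compare_word_lists(["cat", "dog", "Selfie"], ["dog", "cat", "ice cream"]))
```

A large "only" count on either side usually points to a difference in inclusion policy (slang, phrases, inflected forms) rather than to missing data.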

Answer by mloskot

You may check the *spell en-GB dictionary used by Mozilla, OpenOffice, and plenty of other software.

Answer by Benjamin Bannier

You didn't say what you need this list for. If a list used as a blacklist for password checks is enough, cracklib might be good for you. It contains over 1.5M words.
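
As a rough illustration of the blacklist use case, here is a Python sketch that checks a password against a plain word list. It does not use cracklib's own API or data format; it assumes the words have been exported to one entry per line, and build_blacklist and is_blacklisted are hypothetical names.

```python
def build_blacklist(lines):
    """Build a lowercase lookup set from one-entry-per-line input."""
    return {line.strip().lower() for line in lines if line.strip()}


def is_blacklisted(password, blacklist):
    """Return True if the password, or the password minus trailing digits,
    appears in the blacklist (case-insensitive)."""
    candidate = password.lower()
    stripped = candidate.rstrip("0123456789")  # "dragon123" -> "dragon"
    if candidate in blacklist:
        return True
    return stripped != "" and stripped in blacklist


if __name__ == "__main__":
    words = build_blacklist(["password", "Dragon", "letmein"])
    for attempt in ("Dragon123", "PASSWORD", "x9$kLq"):
        print(attempt, is_blacklisted(attempt, words))
```

A set keeps each lookup fast even with well over a million entries, at the cost of holding the whole list in memory.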
