Tag Cloud in C#

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute the content to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/354738/

Date: 2020-08-04 00:16:26  Source: igfitidea

Tags: c#, tag-cloud

Asked by Layla

I am making a small C# application and would like to extract a tag cloud from simple plain text. Is there a function that could do that for me?

Answered by Nathan W

I'm not sure if this is exactly what you're looking for, but it may help you get started:

LINQ that counts word frequency (in VB, but I'm converting it to C# now):

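' Requires Imports System.Linq.
Dim Words = "Hello World ))))) This is a test Hello World"
' Split on spaces, keep tokens that start with a letter, then count each distinct token.
Dim CountTheWords = From word In Words.Split(" "c) _
                    Where word.Length > 0 AndAlso Char.IsLetter(word(0)) _
                    Group By word Into Count()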
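
For reference, a rough C# counterpart of the VB query above might look like the sketch below. The class and method names (WordFrequency, Count) are made up for illustration, and the comparison is case-sensitive, so "Hello" and "hello" would be counted separately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordFrequency
{
    // Counts how often each space-separated token occurs, keeping only
    // tokens that begin with a letter (so ")))))" is skipped).
    public static Dictionary<string, int> Count(string text)
    {
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                   .Where(word => char.IsLetter(word[0]))
                   .GroupBy(word => word)
                   .ToDictionary(group => group.Key, group => group.Count());
    }
}
```

Calling WordFrequency.Count("Hello World ))))) This is a test Hello World") should give a count of 2 for "Hello" and "World" and 1 for each of the other words.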

Answered by Mitchel Sellers

Here is an ASP.NET Cloud Control that might at least help you get started; full source is included.

Answered by Ramiro Berrelleza

Building a tag cloud is, as I see it, a two-part process:

First, you need to split and count your tokens. Depending on how the document is structured, as well as the language it is written in, this could be as easy as counting the space-separated words. However, this is a very naive approach, as words like "the", "of", "a", etc. will have the highest counts and are not very useful as tags. I would suggest implementing some sort of word blacklist (a stop-word list) in order to exclude the most common and meaningless tags.
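
As a minimal sketch of this first step, the snippet below lowercases the text, pulls out runs of letters, drops stop words, and counts what is left. The TagCounter class, the CountTags method, and the tiny stop-word list are assumptions made for illustration; a real list would be much longer and language-specific.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagCounter
{
    // Hypothetical, deliberately tiny stop-word list.
    private static readonly HashSet<string> StopWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "the", "of", "a", "an", "and", "is", "in", "to" };

    // Returns (tag, count) pairs, most frequent first, ignoring stop words.
    public static List<KeyValuePair<string, int>> CountTags(string text)
    {
        return Regex.Matches(text.ToLowerInvariant(), @"\p{L}+")   // runs of letters
                    .Cast<Match>()
                    .Select(match => match.Value)
                    .Where(word => !StopWords.Contains(word))
                    .GroupBy(word => word)
                    .Select(group => new KeyValuePair<string, int>(group.Key, group.Count()))
                    .OrderByDescending(pair => pair.Value)
                    .ThenBy(pair => pair.Key, StringComparer.Ordinal)
                    .ToList();
    }
}
```

Splitting on letter runs rather than on spaces means punctuation is discarded, but it also breaks tokens such as "c#" apart, so the pattern would need adjusting for text where such tags matter.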

Once you have the result as (tag, count) pairs, you could use something similar to the following code:

(Searches is a list of SearchRecordEntity; SearchRecordEntity holds the tag and its count; SearchTagElement is a subclass of SearchRecordEntity that adds the TagCategory property; and processedTags is a List of SearchTagElement that holds the result.)

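// Requires: using System.Collections.Generic; using System.Linq;
// Every count is scaled against the largest one. Max throws if Searches is empty.
double max = Searches.Max(x => (double)x.Count);
List<SearchTagElement> processedTags = new List<SearchTagElement>();

foreach (SearchRecordEntity sd in Searches)
{
    // Copy the tag data across; the settable property names Tag and Count are assumed here.
    var element = new SearchTagElement { Tag = sd.Tag, Count = sd.Count };

    double count = (double)sd.Count;
    // Avoid dividing by zero when every count is 0.
    double percent = max > 0 ? (count / max) * 100 : 0;

    // Map the percentage onto one of five CSS size classes.
    if (percent < 20)
    {
        element.TagCategory = "smallestTag";
    }
    else if (percent < 40)
    {
        element.TagCategory = "smallTag";
    }
    else if (percent < 60)
    {
        element.TagCategory = "mediumTag";
    }
    else if (percent < 80)
    {
        element.TagCategory = "largeTag";
    }
    else
    {
        element.TagCategory = "largestTag";
    }

    processedTags.Add(element);
}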

Answered by GurdeepS

You could store a category and the amount of items it has in some sort of collection, or database table.

From that, you can get the count for a certain category and have certain bounds. So your parameter is the category, and your return value is a count.

So if the count is greater than 10 and less than 20, apply a CSS style of a certain size to the link.

You can store these counts as keys in a collection, and then get the value where the key matches your return value (as I mentioned above).
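
A minimal sketch of this idea is shown below: the bounds are stored as keys with a CSS class as the value, and each category is rendered as a link using the class whose bound its count reaches. The class names, the bounds, and the "/tags/" URL scheme are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

public static class TagCloudHtml
{
    // Hypothetical lower bounds (keys) and CSS class names (values), largest bound first.
    private static readonly KeyValuePair<int, string>[] SizeBounds =
    {
        new KeyValuePair<int, string>(20, "tag-large"),
        new KeyValuePair<int, string>(10, "tag-medium"),
        new KeyValuePair<int, string>(0, "tag-small"),
    };

    // Picks the class of the first (largest) bound that the count reaches.
    public static string CssClassFor(int count)
    {
        foreach (var bound in SizeBounds)
        {
            if (count >= bound.Key) return bound.Value;
        }
        return "tag-small"; // fallback for negative counts
    }

    // Renders one link per category; the "/tags/" URL scheme is an assumption.
    public static string Render(IDictionary<string, int> categoryCounts)
    {
        var html = new StringBuilder();
        foreach (var entry in categoryCounts)
        {
            html.AppendFormat("<a class=\"{0}\" href=\"/tags/{1}\">{2}</a> ",
                CssClassFor(entry.Value),
                Uri.EscapeDataString(entry.Key),
                WebUtility.HtmlEncode(entry.Key));
        }
        return html.ToString().TrimEnd();
    }
}
```

The stylesheet would then only need one font-size rule per class, so the thresholds can be tuned without touching the rendering code.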

I haven't got source code at hand for this process, but you won't find a simple function to do all this for you either. A control, yes (as above).

This is a very conventional approach: it is the standard way of doing it from what I've seen in magazine tutorials, etc., and the first approach I would think of (not necessarily the best).

Answered by ine

You may want to take a look at WordCloud, a project on CodeProject. It includes 430 stop words (like "the", "an", "a", etc.) and uses the Porter stemming algorithm, which reduces words to their root form so that "stemmed", "stemming", and "stem" are all counted as occurrences of the same word.

It's all in C#; the only thing you would have to do is modify it to output HTML instead of the visualization it creates.

Answered by user85742

I would really recommend using http://thetagcloud.codeplex.com/. It is a very clean implementation that takes care of grouping, counting and rendering of tags. It also provides filtering capabilities.

Answered by muVectors

The Zoomable TagCloud Generator extracts keywords from a given source (a text file or other sources) and displays the tag cloud as a Zooming User Interface (ZUI).

Answered by Sunil Raj

Have a look at this answer for an algorithm:

Algorithm to implement a word cloud like Wordle

The "DisOrganizer" mentioned in the answers could serve your purpose. With a little change, you can make this "DisOrganizer" serve an image the way you want. PS: The code is written in C#: https://github.com/chandru9279/zasz.me/blob/master/zasz.me/

Answered by Sanjay Panchal

Take a look at this. It worked for me. There is a project named WebExample under the Examples folder that will help you solve this. https://github.com/chrisdavies/Sparc.TagCloud
