Python Seaborn load_dataset

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/30336324/

Date: 2020-08-19 08:17:25  Source: igfitidea

Seaborn load_dataset

Tags: python, boxplot, seaborn

Asked by Arsibalt

I am trying to get a grouped boxplot working using Seaborn as per the example

I can get the above example working, however the line:

tips = sns.load_dataset("tips")

is not explained at all. I have located the tips.csv file, but I can't seem to find adequate documentation on what load_dataset specifically does. I tried to create my own csv and load this, but to no avail. I also renamed the tips file and it still worked...

My question is thus:

Where is load_dataset actually looking for files? Can I actually use this for my own boxplots?

EDIT: I managed to get my own boxplots working using my own DataFrame, but I am still wondering whether load_dataset is used for anything more than mysterious tutorial examples.

Accepted answer by selwyth

load_dataset looks for online csv files on https://github.com/mwaskom/seaborn-data. Here's the docstring:

Load a dataset from the online repository (requires internet).

Parameters
----------
name : str
    Name of the dataset (name.csv on https://github.com/mwaskom/seaborn-data). You can obtain list of available datasets using :func:`get_dataset_names`
kws : dict, optional
    Passed to pandas.read_csv

If you want to modify that online dataset or bring in your own data, you likely have to use pandas. load_dataset actually returns a pandas DataFrame object, which you can confirm with type(tips).
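
To make that relationship concrete, here is a minimal sketch of what load_dataset roughly amounts to: building a URL for name.csv and handing it to pandas.read_csv. The BASE_URL value and the helper names dataset_url and load_remote_dataset are assumptions made for illustration, not seaborn's actual implementation.

```python
import pandas as pd

# Assumed raw-file location for the seaborn-data repository (illustrative).
BASE_URL = "https://raw.githubusercontent.com/mwaskom/seaborn-data/master"


def dataset_url(name):
    """Build the URL of <name>.csv in the online repository."""
    return f"{BASE_URL}/{name}.csv"


def load_remote_dataset(name, **kws):
    """Hypothetical stand-in for load_dataset: fetch the CSV (requires
    internet) and pass extra keyword arguments on to pandas.read_csv."""
    return pd.read_csv(dataset_url(name), **kws)


# Building the URL needs no network access; only the read itself does.
print(dataset_url("tips"))
```

Under these assumptions, load_remote_dataset("tips") would hand back an ordinary pandas DataFrame, which is why regular pandas operations work on the result.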

If you already created your own data in a csv file called, say, tips2.csv, and saved it in the same location as your script, use this (after installing pandas) to load it in:

import pandas as pd

tips2 = pd.read_csv('tips2.csv')
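
If you do not have a csv file at hand yet, the round trip can be tried end to end with a few made-up rows. This is only a sketch: the column names imitate the tips example, the values are invented, and the file is written to a temporary directory rather than next to your script.

```python
import os
import tempfile

import pandas as pd

# Made-up rows standing in for your own data (values are illustrative).
own_data = pd.DataFrame(
    {
        "total_bill": [16.99, 10.34, 21.01, 23.68],
        "day": ["Sun", "Sun", "Sat", "Sat"],
        "smoker": ["No", "Yes", "No", "Yes"],
    }
)

# Write the CSV to a temporary directory so the example cleans up after itself.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "tips2.csv")
    own_data.to_csv(csv_path, index=False)

    # Reading it back gives the same kind of object that load_dataset returns.
    tips2 = pd.read_csv(csv_path)

print(type(tips2))
print(tips2.head())
```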

Answer by Sahil Nagpal

Just to add to selwyth's answer.

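import pandas as pd
import seaborn as sns  # needed for the sns.barplot call further down
# Raw string, so the backslashes in the Windows-style path are not treated as escapes
Data = pd.read_csv(r'Path\to\csv')
Data.head(10)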

Once you have completed these steps successfully, the plotting works like this.

Let's say you want to plot a bar plot.

sns.barplot(x=Data.Year, y=Data.Salary)  # Year and Salary columns were present in my dataset.

This works with every plotting function in seaborn.
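
As a sketch of how this carries over to the grouped boxplot from the question, the columns can also be referred to by name while the DataFrame is passed through the data parameter. The Year and Salary names follow the bar plot call above; the Department column and all values are made up for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import pandas as pd
import seaborn as sns

# Made-up data: Year and Salary follow the answer, Department is hypothetical.
Data = pd.DataFrame(
    {
        "Year": [2019] * 6 + [2020] * 6,
        "Department": ["A", "A", "A", "B", "B", "B"] * 2,
        "Salary": [50, 52, 55, 48, 51, 53, 54, 57, 60, 50, 55, 58],
    }
)

# x picks the groups, hue splits each group, y supplies the values.
ax = sns.boxplot(x="Year", y="Salary", hue="Department", data=Data)
print(ax.get_xlabel(), ax.get_ylabel())
```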

Moreover, we cannot add our own datasets to the Seaborn Git repository.

Answer by rahul deshmukh

Download all the csv files (zipped) to be used for your example from here.

Extract the zip file to a local directory and launch your jupyter notebook from the same directory. Run the following commands in jupyter notebook:

import pandas as pd
tips = pd.read_csv('seaborn-data-master/tips.csv')

You're good to work with your example now!
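
Because the relative path only resolves when the notebook is started from the directory that contains the extracted folder, a small wrapper can make a wrong working directory easier to spot. This is a sketch only: load_local_dataset is a hypothetical helper, and the default folder name simply mirrors the path used above.

```python
from pathlib import Path

import pandas as pd


def load_local_dataset(name, data_dir="seaborn-data-master"):
    """Read <data_dir>/<name>.csv, with a clearer error if it is missing."""
    csv_path = Path(data_dir) / f"{name}.csv"
    if not csv_path.is_file():
        # Report the working directory, the usual culprit for a bad relative path.
        raise FileNotFoundError(
            f"{csv_path} not found; current working directory is {Path.cwd()}"
        )
    return pd.read_csv(csv_path)
```

With the zip extracted as described, tips = load_local_dataset("tips") would be a drop-in replacement for the read_csv line.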
