Original question: http://stackoverflow.com/questions/44105375/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to create a dictionary of key : column_name and value : unique values in column in python from a dataframe
Asked by Shuvayan Das
I am trying to create a dictionary of key:value pairs where the key is the column name of a dataframe and the value is a list containing all the unique values in that column. Ultimately I want to be able to filter the key:value pairs out of the dict based on conditions. This is what I have been able to do so far:
for col in col_list[1:]:
    _list = []
    _list.append(footwear_data[col].unique())
    list_name = ''.join([str(col),'_list'])

product_list = ['shoe','footwear']
color_list = []
size_list = []
Here product, color and size are all column names, and the dict keys should be named accordingly, like color_list etc. Ultimately I will need to access each key:value_list pair in the dictionary. Expected output:
KEY VALUE
color_list : ["red","blue","black"]
size_list: ["9","XL","32","10 inches"]
Can someone please help me with this? A snapshot of the data is attached.
Accepted answer by Chiheb Nexus
With a DataFrame like this:
import pandas as pd
df = pd.DataFrame([["Women", "Slip on", 7, "Black", "Clarks"], ["Women", "Slip on", 8, "Brown", "Clarcks"], ["Women", "Slip on", 7, "Blue", "Clarks"]], columns= ["Category", "Sub Category", "Size", "Color", "Brand"])
print(df)
Output:
   Category  Sub Category  Size  Color    Brand
0     Women       Slip on     7  Black   Clarks
1     Women       Slip on     8  Brown  Clarcks
2     Women       Slip on     7   Blue   Clarks
You can build a new dict by mapping each key to the values of the corresponding DataFrame column, like this example:
new_dict = {"color_list": list(df["Color"]), "size_list": list(df["Size"])}
# OR:
#new_dict = {"color_list": [k for k in df["Color"]], "size_list": [k for k in df["Size"]]}
print(new_dict)
Output:
{'color_list': ['Black', 'Brown', 'Blue'], 'size_list': [7, 8, 7]}
To keep only the unique values, you can use set, like this example (note that a set does not preserve the original order):
new_dict = {"color_list": list(set(df["Color"])), "size_list": list(set(df["Size"]))}
print(new_dict)
Output:
{'color_list': ['Brown', 'Blue', 'Black'], 'size_list': [8, 7]}
Or, as @Ami Tavory said in his answer, to get the unique values for every column of your DataFrame, you can simply do this:
new_dict = {k:list(df[k].unique()) for k in df.columns}
print(new_dict)
Output:
{'Brand': ['Clarks', 'Clarcks'],
'Category': ['Women'],
'Color': ['Black', 'Brown', 'Blue'],
'Size': [7, 8],
'Sub Category': ['Slip on']}
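The question also asks for keys named like color_list and for filtering the pairs by a condition. A minimal sketch of both, under assumptions that are not in the original answer: the key naming scheme (lower-cased column name with spaces replaced, plus "_list") and the condition (keep columns with more than one distinct value) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [["Women", "Slip on", 7, "Black", "Clarks"],
     ["Women", "Slip on", 8, "Brown", "Clarcks"],
     ["Women", "Slip on", 7, "Blue", "Clarks"]],
    columns=["Category", "Sub Category", "Size", "Color", "Brand"],
)

# Assumed naming scheme: "Sub Category" -> "sub_category_list".
unique_values = {
    f"{col.lower().replace(' ', '_')}_list": df[col].drop_duplicates().tolist()
    for col in df.columns
}

# Hypothetical condition: keep only the columns with more than one distinct value.
multi_valued = {key: values for key, values in unique_values.items() if len(values) > 1}
print(multi_valued)
```

Any other predicate on the key or on the list of values can replace the len(values) > 1 test in the second comprehension.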
Answer by Ami Tavory
"I am trying to create a dictionary of key:value pairs where key is the column name of a dataframe and value will be a list containing all the unique values in that column."
You could use a simple dictionary comprehension for that.
Say you start with:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 1], 'b': [1, 4, 5]})
Then the following comprehension solves it:
>>> {c: list(df[c].unique()) for c in df.columns}
{'a': [1, 2], 'b': [1, 4, 5]}
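One caveat worth noting: for numeric columns, list(df[c].unique()) can hold NumPy scalars (such as numpy.int64) rather than plain Python ints, which may matter if the dict is later serialized, for example to JSON. A possible variant, swapping in drop_duplicates().tolist() (this is not part of the original answer), returns native Python values and keeps the order of first appearance:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1], 'b': [1, 4, 5]})

# drop_duplicates() keeps the first occurrence of each value;
# Series.tolist() converts the values to plain Python scalars.
unique_by_column = {c: df[c].drop_duplicates().tolist() for c in df.columns}
print(unique_by_column)
```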
Answer by arnold
If I understand your question correctly, you may need set instead of list. At this point in your code, you could add set to get the unique values of the given list.
for col in col_list[1:]:
    # set() keeps only the unique values of the column
    _list = set(footwear_data[col])
    list_name = ''.join([str(col),'_list'])
Sample usage:
>>> a_list = [7, 8, 7, 9, 10, 9]
>>> set(a_list)
{8, 9, 10, 7}
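As the sample shows, a set does not keep the original order of the values. If order matters, one common alternative (not part of the original answer) is dict.fromkeys, which drops duplicates while preserving the order of first appearance on Python 3.7 and later:

```python
a_list = [7, 8, 7, 9, 10, 9]

# dict keys are unique and keep insertion order, so this removes
# duplicates without reordering the remaining values.
unique_in_order = list(dict.fromkeys(a_list))
print(unique_in_order)  # [7, 8, 9, 10]
```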
Answer by Waqar
Here is how I did it; let me know if it helps.
import pandas as pd

df = pd.read_csv("/path/to/csv/file")
colList = list(df)  # column names
dic = {}
for x in colList:
    _list = []
    # unique values of column x, wrapped in an outer list
    _list.append(list(set(list(df[x]))))
    list_name = ''.join([str(x), '_list'])
    dic[list_name] = _list
print(dic)
Output:
{'Color_list': [['Blue', 'Orange', 'Black', 'Red']], 'Size_list': [['9', '8', '10 inches', 'XL', '7']], 'Brand_list': [['Clarks']], 'Sub_list': [['SO', 'FOR']], 'Category_list': [['M', 'W']]}
My CSV file:
Category,Sub,Size,Color,Brand
W,SO,7,Blue,Clarks
W,SO,7,Blue,Clarks
W,SO,7,Black,Clarks
W,SO,8,Orange,Clarks
W,FOR,8,Red,Clarks
M,FOR,9,Black,Clarks
M,FOR,10 inches,Blue,Clarks
M,FOR,XL,Blue,Clarks
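In the output of this answer each list of unique values is wrapped in an extra list, because the values are appended to an initially empty _list. A minimal sketch that produces flat lists instead, assuming the same CSV rows (inlined here through io.StringIO so the example is self-contained) and reading every column as text (dtype=str is a choice made for this sketch):

```python
import io

import pandas as pd

# Same rows as the CSV file shown above.
csv_text = """Category,Sub,Size,Color,Brand
W,SO,7,Blue,Clarks
W,SO,7,Blue,Clarks
W,SO,7,Black,Clarks
W,SO,8,Orange,Clarks
W,FOR,8,Red,Clarks
M,FOR,9,Black,Clarks
M,FOR,10 inches,Blue,Clarks
M,FOR,XL,Blue,Clarks
"""

df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# One flat list of unique values per column, in order of first appearance.
dic = {f"{col}_list": df[col].drop_duplicates().tolist() for col in df.columns}
print(dic)
```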