pandas MemoryError: Unable to allocate array with shape and data type object

Note: this page reproduces a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow. Original: http://stackoverflow.com/questions/57812453/

Time: 2020-09-14 06:24:36  Source: igfitidea

Tags: python, pandas, numpy

Asked by Stanislav Jirák

I want to perform an inner join between two datasets, which look like this:

theme_ids.head()
       id    Loan Theme ID          Loan Theme Type  Partner ID
0  638631  a1050000000skGl                  General         151
1  640322  a1050000000skGl                  General         151
2  641006  a1050000002X1ij         Higher Education         160
3  641019  a1050000002X1ij         Higher Education         160
4  641594  a1050000002VbsW  Subsistence Agriculture         336

and

theme_reg.head()
Partner ID  Field Partner Name  sector  Loan Theme ID   Loan Theme Type country forkiva region  geocode_old ISO ... amount  LocationName    geocode names   geo lat lon mpi_region  mpi_geo rural_pct
0   9   KREDIT Microfinance Institution General Financial Inclusion a1050000000slfi Higher Education    Cambodia    No  Banteay Meanchey    (13.75, 103.0)  KHM ... 450 Banteay Meanchey, Cambodia  [(13.6672596, 102.8975098)] Banteay Meanchey Province; Cambodia (13.6672596, 102.8975098)   13.667259   102.897507  Banteay Mean Chey, Cambodia (13.6672596, 102.8975098)   90
1   9   KREDIT Microfinance Institution General Financial Inclusion a10500000068jPe Vulnerable Populations  Cambodia    No  Battambang Province NaN KHM ... 20275   Battambang Province, Cambodia   [(13.0286971, 102.989615)]  Battambang Province; Cambodia   (13.0286971, 102.989615)    13.028697   102.989616  Banteay Mean Chey, Cambodia (13.6672596, 102.8975098)   90
2   9   KREDIT Microfinance Institution General Financial Inclusion a1050000000slfi Higher Education    Cambodia    No  Battambang Province NaN KHM ... 9150    Battambang Province, Cambodia   [(13.0286971, 102.989615)]  Battambang Province; Cambodia   (13.0286971, 102.989615)    13.028697   102.989616  Banteay Mean Chey, Cambodia (13.6672596, 102.8975098)   90
3   9   KREDIT Microfinance Institution General Financial Inclusion a10500000068jPe Vulnerable Populations  Cambodia    No  Kampong Cham Province   (12.0, 105.5)   KHM ... 604950  Kampong Cham Province, Cambodia [(12.0982918, 105.3131185)] Kampong Cham Province; Cambodia (12.0982918, 105.3131185)   12.098291   105.313118  Kampong Cham, Cambodia  (11.9924294, 105.4645408)   90
4   9   KREDIT Microfinance Institution General Financial Inclusion a1050000002X1Uu Sanitation  Cambodia    No  Kampong Cham Province   (12.0, 105.5)   KHM ... 275 Kampong Cham Province, Cambodia [(12.0982918, 105.3131185)] Kampong Cham Province; Cambodia (12.0982918, 105.3131185)   12.098291   105.313118  Kampong Cham, Cambodia  (11.9924294, 105.4645408)   90

I've tried:

data = pd.merge(theme_ids, theme_reg, on='Partner ID', how='inner') 

which raises:

MemoryError: Unable to allocate array with shape (15, 144356281) and data type object
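
The shape in this error appears to mean that the merge result would hold about 144 million rows across 15 object-typed columns, which usually happens when the join key ('Partner ID' here) is repeated many times in both tables. A minimal sketch, assuming that reading is right, for estimating the size of an inner merge before running it; the helper name estimate_inner_merge_rows and the small demo frames are hypothetical and not part of the original question:

```python
import pandas as pd


def estimate_inner_merge_rows(left: pd.DataFrame, right: pd.DataFrame, key: str) -> int:
    """Estimate the row count of an inner merge on `key` without performing it."""
    # Count how often each key value occurs on each side.
    # value_counts() ignores missing keys by default.
    left_counts = left[key].value_counts()
    right_counts = right[key].value_counts()
    # Each key contributes (left occurrences * right occurrences) rows;
    # keys present on only one side contribute 0.
    products = left_counts.mul(right_counts, fill_value=0)
    return int(products.sum())


# Hypothetical toy data that mimics repeated 'Partner ID' values.
ids_demo = pd.DataFrame({"Partner ID": [151, 151, 160], "id": [1, 2, 3]})
reg_demo = pd.DataFrame({"Partner ID": [151, 151, 151, 160, 9], "amount": [10, 20, 30, 40, 50]})

print(estimate_inner_merge_rows(ids_demo, reg_demo, "Partner ID"))
```

If the estimate for the real tables is far larger than either input, it may be worth checking whether 'Partner ID' alone is the intended key, since both tables also share 'Loan Theme ID'.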

Help would be appreciated. Thanks!

Answered by garciparedes

As @Iguananaut said in "Unable to allocate array with shape and data type", you can modify the overcommit behavior if you are running on a Linux system. To allow overcommit, you can write 1 to /proc/sys/vm/overcommit_memory, as he explained in that answer.
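
Before changing the setting, it can help to check the current overcommit mode. A minimal sketch, assuming a Linux system where /proc/sys/vm/overcommit_memory is readable; the helper name read_overcommit_mode is hypothetical. Writing a new value to that file requires root privileges, so this sketch only reads it:

```python
from pathlib import Path

OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")

# Meaning of the kernel's overcommit modes.
MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def read_overcommit_mode(path: Path = OVERCOMMIT_PATH):
    """Return the overcommit mode as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Not on Linux, file not readable, or unexpected content.
        return None


mode = read_overcommit_mode()
if mode is None:
    print("Overcommit setting not available on this system")
else:
    print(f"overcommit_memory = {mode}: {MODES.get(mode, 'unknown mode')}")
```

Note that allowing overcommit does not add memory: the allocation may succeed, but the process can still be killed later if it actually touches more memory than the machine has.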