Getting Data from Android Play Store
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/10272155/
Asked by Ahmad
I have seen some apps and websites that use data from the Android Play Store, e.g. apps or sites with a top-apps ranking. But how do you get the data? Where can I parse it from?
Accepted answer by lenik
There's an unofficial open-source API for the Android Market that you may try to use to get the information you need. Hope this helps.
Answer by Ivan Delchev
Disclaimer: I am from 42matters, which already provides this data at https://42matters.com/api; feel free to check it out or drop us a line.
As lenik mentioned, there are open-source libraries that already help with obtaining some data from GPlay. If you want to build one yourself, you can try to parse the Google Play app page, but you should pay attention to the following:
- Make sure the URL you are trying to parse is not blocked in robots.txt - e.g. https://play.google.com/robots.txt
- Make sure that you are not doing it too often; Google will throttle and potentially blacklist you if you are doing it too much.
- Send a correct User-Agent header to actually show you are a bot
- The page of an app is big - make sure you accept gzip and request the mobile version
- The GPlay website is not an API; it doesn't care that you parse it, so it will change over time. Make sure you handle changes - e.g. by having tests to make sure you get what you expected.
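As a rough illustration of the first, third and fourth points in the checklist above, here is a minimal sketch that uses only the JDK (Java 11+). The class name PoliteCrawl, its method names, and the bot name are hypothetical. The robots.txt check is deliberately simplified: it only reads Disallow prefixes in "User-agent: *" groups and ignores Allow rules and wildcards, so treat it as a starting point rather than a complete parser. Note that java.net.http does not decompress gzip responses for you; the body would still need to go through GZIPInputStream.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class PoliteCrawl {
    /** Collects the Disallow path prefixes that apply to "User-agent: *". */
    static List<String> disallowedPrefixes(String robotsTxt) {
        List<String> prefixes = new ArrayList<>();
        boolean inWildcardGroup = false;
        boolean lastWasUserAgent = false;
        for (String raw : robotsTxt.split("\\R")) {
            String line = raw.replaceFirst("#.*$", "").trim(); // drop comments
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                // Consecutive User-agent lines belong to the same group.
                inWildcardGroup = (lastWasUserAgent && inWildcardGroup) || value.equals("*");
                lastWasUserAgent = true;
            } else {
                lastWasUserAgent = false;
                if (field.equals("disallow") && inWildcardGroup && !value.isEmpty()) {
                    prefixes.add(value);
                }
            }
        }
        return prefixes;
    }

    /** Simplified check: a path is allowed if no Disallow prefix matches it. */
    static boolean isAllowed(String robotsTxt, String path) {
        for (String prefix : disallowedPrefixes(robotsTxt)) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    /** Builds a GET request that identifies the bot and asks for gzip. */
    static HttpRequest buildRequest(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", userAgent)
                .header("Accept-Encoding", "gzip")
                .GET()
                .build();
    }
}
```

In use, you would download the robots.txt body once, call isAllowed with the path you intend to fetch, and only then send the request built by buildRequest, with a pause between requests to avoid being throttled.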
With that in mind, getting the metadata of one page is a matter of fetching the page HTML and parsing it properly. With JSoup you can try:
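import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// crawlUrl is the URL of the app page you want to fetch
HttpClient httpClient = HttpClientBuilder.create().build();
HttpGet request = new HttpGet(crawlUrl);
HttpResponse rsp = httpClient.execute(request);
int statusCode = rsp.getStatusLine().getStatusCode();
if (statusCode == 200) {
    String content = EntityUtils.toString(rsp.getEntity());
    Document doc = Jsoup.parse(content);
    // parse content, whatever you need; first() returns null if nothing matches
    Element price = doc.select("[itemprop=price]").first();
}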
For a very simple use case, that should get you started. However, the moment you want to do more interesting stuff, things get complicated:
- Search is forbidden in robots.txt.
- Keeping app metadata up to date is hard. There are more than 2.2m apps; if you want to refresh their metadata daily, that is 2.2m requests/day, which will 1) get blocked immediately and 2) cost a lot of money: pessimistically 220 GB of data transfer per day if one app page is 100 KB
- How do you discover new apps?
- How do you get pricing in each country and translations for each language?
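To make the refresh-cost estimate in the list above concrete, the arithmetic can be sketched as follows. The class and method names are made up for illustration, and the inputs (2.2m apps, 100 KB per page, decimal gigabytes) are the figures assumed in the answer, not measurements.

```java
final class CrawlBudget {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    /** Average request rate needed to fetch every app page once per day. */
    static double requestsPerSecond(long appCount) {
        return (double) appCount / SECONDS_PER_DAY;
    }

    /** Daily transfer in decimal gigabytes (1 GB = 1,000,000,000 bytes). */
    static double gigabytesPerDay(long appCount, long bytesPerPage) {
        return (double) appCount * bytesPerPage / 1_000_000_000.0;
    }
}
```

With 2,200,000 apps and 100,000 bytes per page, requestsPerSecond gives roughly 25.5 requests every second, around the clock, and gigabytesPerDay gives 220 GB, which matches the pessimistic figure above.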
The list goes on. If you don't want to do all this by yourself, you can consider the 42matters API, which supports lookup and search, top Google charts, advanced queries and filters, all for 35 languages and more than 50 countries.
Answer by Facundo Olano
I've coded a small Node.js module to scrape app and list data from Google Play: google-play-scraper
// The package name is google-play-scraper (one "p")
var gplay = require('google-play-scraper');

gplay.list({
    category: gplay.category.GAME_ACTION,
    collection: gplay.collection.TOP_FREE,
    num: 2
}).then(console.log);
Results:
[ { url: 'https://play.google.com/store/apps/details?id=com.playappking.busrush',
appId: 'com.playappking.busrush',
title: 'Bus Rush',
developer: 'Play App King',
icon: 'https://lh3.googleusercontent.com/R6hmyJ6ls6wskk5hHFoW02yEyJpSG36il4JBkVf-Aojb1q4ZJ9nrGsx6lwsRtnTqfA=w340',
score: 3.9,
price: '0',
free: false },
{ url: 'https://play.google.com/store/apps/details?id=com.yodo1.crossyroad',
appId: 'com.yodo1.crossyroad',
title: 'Crossy Road',
developer: 'Yodo1 Games',
icon: 'https://lh3.googleusercontent.com/doHqbSPNekdR694M-4rAu9P2B3V6ivff76fqItheZGJiN4NBw6TrxhIxCEpqgO3jKVg=w340',
score: 4.5,
price: '0',
free: false } ]
Answer by Sparky
The Google Play Store doesn't provide this data, so the sites must just be scraping it.
Answer by Nirvana Tikku
Here's a Google Chrome extension that'll allow you to download your reviews: https://chrome.google.com/webstore/detail/my-play-store-reviews/ldggikfajgoedghjnflfafiiheagngoa?hl=en