PHP: How to efficiently find the closest locations nearby a given location
Original question: http://stackoverflow.com/questions/3922404/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Thomas Clayson
I'm making a script where a load of businesses are loaded into a MySQL database with a latitude and longitude. I then supply that script with a latitude and longitude (of the end user), and the script has to calculate the distance from the supplied lat/long to EACH of the entries it gets from the database and order them from nearest to furthest.
I only realistically need about 10 or 20 "nearest" results, but I can't think of any way to do this other than to get all the results from the database, run the function on each of them, and then sort the array.
This is what I have already:
<?php
function getDistance($point1, $point2){
    $radius = 3958; // Earth's radius (miles)
    $pi = 3.1415926;
    $deg_per_rad = 57.29578; // Number of degrees/radian (for conversion)
    $distance = ($radius * $pi * sqrt(
        ($point1['lat'] - $point2['lat'])
        * ($point1['lat'] - $point2['lat'])
        + cos($point1['lat'] / $deg_per_rad) // Convert these to
        * cos($point2['lat'] / $deg_per_rad) // radians for cos()
        * ($point1['long'] - $point2['long'])
        * ($point1['long'] - $point2['long'])
    ) / 180);
    $distance = round($distance,1);
    return $distance; // Returned using the units used for $radius.
}
include("../includes/application_top.php");
$lat = (is_numeric($_GET['lat'])) ? $_GET['lat'] : 0;
$long = (is_numeric($_GET['long'])) ? $_GET['long'] : 0;
$startPoint = array("lat"=>$lat,"long"=>$long);
$sql = "SELECT * FROM mellow_listings WHERE active=1";
$result = mysql_query($sql);
while($row = mysql_fetch_array($result)){
    $thedistance = getDistance($startPoint,array("lat"=>$row['lat'],"long"=>$row['long']));
    $data[] = array('id' => $row['id'],
        'name' => $row['name'],
        'description' => $row['description'],
        'lat' => $row['lat'],
        'long' => $row['long'],
        'address1' => $row['address1'],
        'address2' => $row['address2'],
        'county' => $row['county'],
        'postcode' => strtoupper($row['postcode']),
        'phone' => $row['phone'],
        'email' => $row['email'],
        'web' => $row['web'],
        'distance' => $thedistance);
}
// integrate google local search
$url = "http://ajax.googleapis.com/ajax/services/search/local?";
$url .= "q=Off+licence"; // query
$url .= "&v=1.0"; // version number
$url .= "&rsz=8"; // number of results
$url .= "&key=ABQIAAAAtG"
."Pcon1WB3b0oiqER"
."FZ-TRQgsWYVg721Z"
."IDPMPlc4-CwM9Xt"
."FBSTZxHDVqCffQ2"
."W6Lr4bm1_zXeYoQ"; // api key
$url .= "&sll=".$lat.",".$long;
// sendRequest
// note how referer is set manually
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, /* url */);
$body = curl_exec($ch);
curl_close($ch);
// now, process the JSON string
$json = json_decode($body, true);
foreach($json['responseData']['results'] as $array){
    $thedistance = getDistance($startPoint,array("lat"=>$array['lat'],"long"=>$array['lng']));
    $data[] = array('id' => '999',
        'name' => $array['title'],
        'description' => '',
        'lat' => $array['lat'],
        'long' => $array['lng'],
        'address1' => $array['streetAddress'],
        'address2' => $array['city'],
        'county' => $array['region'],
        'postcode' => '',
        'phone' => $array['phoneNumbers'][0],
        'email' => '',
        'web' => $array['url'],
        'distance' => $thedistance);
}
// sort the array
foreach ($data as $key => $row) {
    $id[$key] = $row['id'];
    $distance[$key] = $row['distance'];
}
array_multisort($distance, SORT_ASC, $data);
header("Content-type: text/xml");
echo '<?xml version="1.0" encoding="UTF-8"?>'."\n";
echo '<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">'."\n";
echo '<plist version="1.0">'."\n";
echo '<array>'."\n";
for($i = 0; isset($distance[$i]); $i++){
    //echo $data[$i]['id']." -> ".$distance[$i]."<br />";
    echo '<dict>'."\n";
    foreach($data[$i] as $key => $val){
        echo '<key><![CDATA['.$key.']]></key>'."\n";
        echo '<string><![CDATA['.htmlspecialchars_decode($val, ENT_QUOTES).']]></string>'."\n";
    }
    echo '</dict>'."\n";
}
echo '</array>'."\n";
echo '</plist>'."\n";
?>
Now, this runs fast enough with only 2 or 3 businesses in the database, but I'm currently loading 5k businesses into the database and I'm worried that it's going to be incredibly slow running this for EACH entry. What do you think?
It's not the kind of data I could cache either, as two users are very unlikely to have the same lat/long, so caching wouldn't help.
What can I do about this?
Thanks for any help and any suggestions. They're all much appreciated.
Accepted answer by Mark Baker
Option 1: Do the calculation on the database by switching to a database that supports geospatial (GIS) queries.
Option 2: Do the calculation on the database: you're using MySQL, so the following stored function should help:
DELIMITER //
CREATE FUNCTION distance (latA double, lonA double, latB double, LonB double)
RETURNS double DETERMINISTIC
BEGIN
    -- Convert all coordinates from degrees to radians
    SET @RlatA = radians(latA);
    SET @RlonA = radians(lonA);
    SET @RlatB = radians(latB);
    SET @RlonB = radians(LonB);
    SET @deltaLat = @RlatA - @RlatB;
    SET @deltaLon = @RlonA - @RlonB;
    -- Haversine formula
    SET @d = SIN(@deltaLat/2) * SIN(@deltaLat/2) +
        COS(@RlatA) * COS(@RlatB) * SIN(@deltaLon/2)*SIN(@deltaLon/2);
    -- 6371.01 is the Earth's mean radius in km, so the result is in km
    RETURN 2 * ASIN(SQRT(@d)) * 6371.01;
END//
DELIMITER ;
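To sanity-check the stored function, or to reuse the same formula outside the database, the calculation can be mirrored in PHP. What follows is only a sketch: the name `haversineKm` is made up for illustration, and it assumes the same Earth radius (6371.01 km) as the SQL above, so it returns kilometres.

```php
<?php
// Hypothetical helper (not part of the original answer): a PHP mirror of the
// SQL distance() function, using the haversine formula. Returns kilometres.
function haversineKm(float $latA, float $lonA, float $latB, float $lonB): float
{
    $earthRadiusKm = 6371.01; // same radius as the SQL function

    $rLatA = deg2rad($latA);
    $rLatB = deg2rad($latB);
    $deltaLat = $rLatA - $rLatB;
    $deltaLon = deg2rad($lonA) - deg2rad($lonB);

    $d = sin($deltaLat / 2) ** 2
        + cos($rLatA) * cos($rLatB) * sin($deltaLon / 2) ** 2;

    return 2 * asin(sqrt($d)) * $earthRadiusKm;
}

// One degree of longitude along the equator is roughly 111.2 km.
printf("%.1f km\n", haversineKm(0.0, 0.0, 0.0, 1.0));
```

Comparing a few known pairs of coordinates between this helper and `SELECT distance(...)` is a cheap way to confirm the stored function was created correctly.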
EDIT
If you have an index on latitude and longitude in your database, you can reduce the number of calculations that need to be performed by working out an initial bounding box in PHP ($minLat, $maxLat, $minLong and $maxLong), and limiting the rows to a subset of your entries based on that (WHERE latitude BETWEEN $minLat AND $maxLat AND longitude BETWEEN $minLong AND $maxLong). Then MySQL only needs to execute the distance calculation for that subset of rows.
FURTHER EDIT (as an explanation for the previous edit)
If you're simply using the SQL statement provided by Jonathon (or a stored procedure to calculate the distance) then SQL still has to look through every record in your database, and to calculate the distance for every record in your database before it can decide whether to return that row or discard it.
Because the calculation is relatively slow to execute, it would be better if you could reduce the set of rows that need to be calculated, eliminating rows that will clearly fall outside of the required distance, so that we're only executing the expensive calculation for a smaller number of rows.
Consider that what you're doing is basically drawing a circle on a map, centred on your initial point, with a radius equal to the distance. The formula simply identifies which rows fall within that circle... but it still has to check every single row.
Using a bounding box is like drawing a square on the map first with the left, right, top and bottom edges at the appropriate distance from our centre point. Our circle will then be drawn within that box, with the Northmost, Eastmost, Southmost and Westmost points on the circle touching the borders of the box. Some rows will fall outside that box, so SQL doesn't even bother trying to calculate the distance for those rows. It only calculates the distance for those rows that fall within the bounding box to see if they fall within the circle as well.
Within PHP, we can use a very simple calculation that works out the minimum and maximum latitude and longitude based on our distance, then set those values in the WHERE clause of your SQL statement. This is effectively our box, and anything that falls outside of that is automatically discarded without any need to actually calculate its distance.
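As a rough illustration of that calculation, the sketch below derives the four limits for a given centre and radius. The function name `boundingBox` and the array keys are made up here (the keys simply echo the `$minLat`, `$maxLat`, `$minLong` and `$maxLong` variables mentioned earlier). It assumes a spherical Earth of radius 6371.01 km and a search area that neither touches a pole nor crosses the 180° meridian.

```php
<?php
// Hypothetical helper: latitude/longitude limits of a box that encloses a
// circle of $distanceKm around ($lat, $lon). Assumes a spherical Earth and a
// search area away from the poles and the 180-degree meridian.
function boundingBox(float $lat, float $lon, float $distanceKm): array
{
    $earthRadiusKm = 6371.01;
    $angular = $distanceKm / $earthRadiusKm; // radius as an angle, in radians

    $latDelta = rad2deg($angular);
    // Meridians converge towards the poles, so the same distance spans more
    // degrees of longitude at higher latitudes.
    $lonDelta = rad2deg(asin(sin($angular) / cos(deg2rad($lat))));

    return [
        'minLat'  => $lat - $latDelta,
        'maxLat'  => $lat + $latDelta,
        'minLong' => $lon - $lonDelta,
        'maxLong' => $lon + $lonDelta,
    ];
}

// The four values would then be bound into the WHERE ... BETWEEN clause.
$box = boundingBox(40.0, -75.0, 100.0);
printf(
    "lat %.4f to %.4f, long %.4f to %.4f\n",
    $box['minLat'], $box['maxLat'], $box['minLong'], $box['maxLong']
);
```

Rows inside the box but outside the circle (the corners) still need the exact distance check, which is why the box only pre-filters and does not replace the distance formula.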
There's a good explanation of this (with PHP code) on the Movable Type website that should be essential reading for anybody planning to do any GeoPositioning work in PHP.
Answer by Jonathon Bolster
I think what you're trying to achieve could be done better using the Haversine formula in your SQL. Google has a tutorial on how to get the nearest locations in a MySQL database, but the general idea is this SQL:
SELECT id, ( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) )
* cos( radians( lng ) - radians(-122) ) + sin( radians(37) )
* sin( radians( lat ) ) ) ) AS distance
FROM markers
HAVING distance < 25
ORDER BY distance LIMIT 0 , 20;
Then all the work you need to do is done on the database, so you don't have to pull all the businesses into your PHP script before you even check the distance.
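From PHP, the user's coordinates would normally be passed in as bound parameters rather than written into the SQL as literals. The sketch below only builds the statement and its parameter list. The function name `nearestQuery` and the placeholder names are invented for illustration, the `markers` table with `lat` and `lng` columns is taken from the SQL above, and a PDO connection is assumed rather than shown.

```php
<?php
// Hypothetical helper: builds the Haversine query from the answer with
// placeholders instead of hard-coded coordinates. Distances are in miles
// because of the 3959 radius.
function nearestQuery(float $lat, float $lng, float $maxMiles, int $limit): array
{
    $sql = 'SELECT id, ( 3959 * acos( cos( radians(:lat1) ) * cos( radians( lat ) )
            * cos( radians( lng ) - radians(:lng) ) + sin( radians(:lat2) )
            * sin( radians( lat ) ) ) ) AS distance
        FROM markers
        HAVING distance < :maxMiles
        ORDER BY distance
        LIMIT ' . $limit; // $limit is typed as int, so concatenation is safe

    // :lat1 and :lat2 carry the same value; a named placeholder cannot be
    // reused when PDO uses native (non-emulated) prepared statements.
    $params = [
        ':lat1' => $lat,
        ':lng' => $lng,
        ':lat2' => $lat,
        ':maxMiles' => $maxMiles,
    ];

    return [$sql, $params];
}

[$sql, $params] = nearestQuery(37.0, -122.0, 25.0, 20);

// With a PDO connection in $pdo (assumed, not shown), this would run as:
//   $stmt = $pdo->prepare($sql);
//   $stmt->execute($params);
//   $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
echo $sql, "\n";
```

Because the `LIMIT` is applied in the database, only the 10 or 20 nearest rows ever reach the PHP script.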
Answer by jjrv
If you have a lot of points, queries with distance formulas in them will be very slow because it's not using an index for the search. For efficiency you'd either have to use a rectangular bounding box to make it faster, or you can use a database with GIS features built in. PostGIS is free and here's an article on doing nearest neighbor search:
http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgis_nearest_neighbor_generic
Answer by Liox
There is a much easier way to work this one out.
We know that a 0.1 difference in latitude at exactly the same longitude equals a distance of 11.12 km (1.0 in latitude makes that distance 111.2 km).
Likewise, at about 40° latitude, a 0.1 difference in longitude at the same latitude is about 8.52 km (1.0 in longitude makes that distance 85.18 km). To convert kilometers into miles, divide by 1.60934.
NOTE: Be aware that longitude goes from -180 to 180, so the difference between -180 and 179.9 is 0.1, which is about 8.52 km.
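The longitude figure is not a constant: a degree of longitude shrinks with the cosine of the latitude, and 85.18 km corresponds to a latitude of about 40°. The sketch below shows both that scaling and the wrap-around at ±180°. The helper names are hypothetical, and 111.2 km per degree of latitude is the rule-of-thumb value quoted above.

```php
<?php
// Hypothetical helpers illustrating the rule of thumb above.
const KM_PER_DEGREE_LAT = 111.2;

// Length of one degree of longitude at a given latitude, in km.
function kmPerDegreeLon(float $lat): float
{
    return KM_PER_DEGREE_LAT * cos(deg2rad($lat));
}

// Smallest difference between two longitudes, allowing for the wrap-around
// at +/-180 degrees.
function lonDifference(float $a, float $b): float
{
    $d = fmod(abs($a - $b), 360.0);
    return $d > 180.0 ? 360.0 - $d : $d;
}

printf("%.2f km per degree of longitude at 40 degrees latitude\n", kmPerDegreeLon(40.0));
printf("%.1f degrees between -180 and 179.9\n", lonDifference(-180.0, 179.9));
```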
All we need to know now is list of all zipcodes with lon and lat (you already have that)
So now, to narrow your search by 90%, you only need to cut out all results that will definitely not be within, for example, 100 kilometers. For our coordinates $lat and $lon and a 100-kilometer radius, a difference of 2 in both lat and lon will be more than enough at mid latitudes.
$lon=...;
$lat=...;
$dif=2;
SELECT zipcode FROM zipcode_table WHERE latitude>($lat-$dif) AND latitude<($lat+$dif) AND longitude>($lon-$dif) AND longitude<($lon+$dif)
Something like that. Of course if you need to cover smaller or larger area you will need to change $dif accordingly.
This way MySQL only has to look at a very limited set of rows, which saves computing resources.