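Downloading Current-Year Google Imagery with Python

In GIS projects and mapping applications, access to up-to-date imagery is important. This article shows how to use Python to automatically download current-year imagery from Google Maps and save it as a high-resolution TIFF file. The process involves geographic coordinate conversion, multithreaded downloading, and image processing.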

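Key Features

The script's core features are:

  • Coordinate conversion: converts between WGS-84 and the Web Mercator projection, and handles China's GCJ-02 offset.
  • Automated download: downloads map tiles with multiple threads for better throughput.
  • Image merging: stitches the tiles into a single georeferenced TIFF image.
  • Georeferencing: saves the merged image as a TIFF that carries its georeferencing information.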
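
The coordinate conversion in the first item can be tried on its own. The sketch below is a minimal, standard-library-only illustration of the spherical Web Mercator formulas. The function names `lonlat_to_mercator` and `mercator_to_lonlat` are illustrative, and the `HALF_WORLD` constant is the usual half-circumference value for Web Mercator, assumed here rather than taken from any particular library.

```python
from math import atan, exp, log, pi, tan

# Half the equatorial circumference of the Web Mercator sphere, in metres (assumed constant)
HALF_WORLD = 20037508.342789244
MAX_LAT = 85.0511287798  # latitude limit of the square Web Mercator world


def lonlat_to_mercator(lon, lat):
    """Project WGS-84 degrees to Web Mercator metres."""
    lat = max(-MAX_LAT, min(MAX_LAT, lat))  # clamp to the valid range
    x = lon / 180.0 * HALF_WORLD
    y = log(tan((90.0 + lat) * pi / 360.0)) / pi * HALF_WORLD
    return x, y


def mercator_to_lonlat(x, y):
    """Inverse projection: Web Mercator metres back to WGS-84 degrees."""
    lon = x / HALF_WORLD * 180.0
    lat = (2.0 * atan(exp(y / HALF_WORLD * pi)) - pi / 2.0) * 180.0 / pi
    return lon, lat


# Round trip for the example corner used in this article
x, y = lonlat_to_mercator(100.361, 38.866)
lon, lat = mercator_to_lonlat(x, y)
print(round(lon, 6), round(lat, 6))
```

Projecting a point and inverting it should return the original longitude and latitude up to floating-point error, which is a quick sanity check before working with tiles.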

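Input

  • Geographic coordinates: specify the longitude and latitude of the top-left and bottom-right corners of the download area. For example:

    • Top-left longitude and latitude (e.g., 100.361, 38.866)
    • Bottom-right longitude and latitude (e.g., 100.972, 38.456)
    • Areas can also be listed in a clusters_bounding_boxes.txt file, one per line, as a name followed by the left longitude, right longitude, top latitude, and bottom latitude: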
Cluster_0 -96.12334405 -96.10529700000001 40.74125305 40.70810245
Cluster_1 -71.96097335 -71.9594839 45.3810689 45.38053285
Cluster_2 -118.8591978 -118.8240935 34.1401192 34.13542375
Cluster_3 6.4987245 6.51150505 45.6492401 45.6264406
Cluster_4 8.67063735 8.68108305 45.67584345 45.66658435
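  • Zoom level: the map zoom level (0 to 22), which determines how much detail is downloaded.

  • Imagery style: the type of imagery, such as satellite (s) or satellite with labels (y).

  • Server: either Google or Google China.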
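
To get a feel for how the zoom level affects the amount of data, the sketch below parses one line in the bounding-box format and counts the tiles that cover it at several zoom levels. It is only an illustration using the standard library: `parse_bbox_line`, `lonlat_to_tile`, and `tile_count` are hypothetical helper names, and the count is taken before any squaring of the tile range.

```python
from math import floor, log, pi, tan


def parse_bbox_line(line):
    """Split 'name left right top bottom' into a name and four floats."""
    parts = line.split()
    left, right, top, bottom = map(float, parts[-4:])
    return " ".join(parts[:-4]), (left, right, top, bottom)


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the tile containing a WGS-84 point."""
    lat = max(-85.0511287798, min(85.0511287798, lat))
    n = 2 ** zoom
    x = floor((lon + 180.0) / 360.0 * n)
    merc = log(tan((90.0 + lat) * pi / 360.0)) / pi  # normalised to (-1, 1)
    y = floor((1.0 - merc) / 2.0 * n)  # row 0 is at the top of the map
    return x, y


def tile_count(left, right, top, bottom, zoom):
    """Number of tile columns and rows covering the box."""
    x1, y1 = lonlat_to_tile(left, top, zoom)
    x2, y2 = lonlat_to_tile(right, bottom, zoom)
    return x2 - x1 + 1, y2 - y1 + 1


name, box = parse_bbox_line("Cluster_3 6.4987245 6.51150505 45.6492401 45.6264406")
for zoom in (12, 15, 17):
    print(name, zoom, tile_count(*box, zoom))
```

Each additional zoom level doubles the number of tiles along each axis of the map, so the tile count for a fixed area grows roughly fourfold per level.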

Code Example

The following example calls the main function to download current-year Google satellite imagery for a given area:

left, top = 100.361, 38.866  # top-left corner
right, bottom = 100.972, 38.456  # bottom-right corner
zoom = 17  # zoom level
filePath = './path_to_save_image/satellite_image.tif'  # output path
style = 's'  # imagery style
server = "Google"  # use the Google server

main(left, top, right, bottom, zoom, filePath, style, server)

Output

  • A TIFF image file: contains the merged image with full georeferencing information, ready for use in GIS software.
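
The georeferencing of such a file is commonly expressed as a six-value affine geotransform in the order GDAL uses. The sketch below shows, without GDAL itself, how those six values relate pixel positions to coordinates for a north-up image. The helper names `make_geotransform` and `pixel_to_geo` and the extent values are made up for illustration.

```python
def make_geotransform(left, top, right, bottom, width_px, height_px):
    """Build a north-up geotransform in GDAL's six-value order."""
    pixel_w = (right - left) / width_px
    pixel_h = (bottom - top) / height_px  # negative: rows run southwards
    return (left, pixel_w, 0.0, top, 0.0, pixel_h)


def pixel_to_geo(gt, col, row):
    """Apply the affine transform to the top-left corner of a pixel."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


# Illustrative extent: a 512 x 512 image spanning 0.5 degrees each way
gt = make_geotransform(100.0, 39.0, 100.5, 38.5, 512, 512)
print(pixel_to_geo(gt, 0, 0))      # top-left corner of the image
print(pixel_to_geo(gt, 512, 512))  # bottom-right corner of the image
```

The two rotation terms are zero here because the image is assumed to be north-up, which is also the case for a mosaic of map tiles.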

Code

# -*- coding: utf-8 -*-
#%% function
'''
This code is used to download images from Google.
The output image has an aspect ratio of 1 (it is square).
'''


import io
import math
import multiprocessing
import os
import socket
import time
import urllib.error
from math import floor, pi, log, tan, atan, exp
from threading import Thread

import cv2
import numpy as np
import PIL.Image as pil
import urllib3
from osgeo import gdal, osr


# ------------------Interchange between WGS-84 and Web Mercator-------------------------
# WGS-84 to Web Mercator
def wgs_to_mercator(x, y):
    y = 85.0511287798 if y > 85.0511287798 else y
    y = -85.0511287798 if y < -85.0511287798 else y

    x2 = x * 20037508.34 / 180
    y2 = log(tan((90 + y) * pi / 360)) / (pi / 180)
    y2 = y2 * 20037508.34 / 180
    return x2, y2


# Web Mercator to WGS-84
def mercator_to_wgs(x, y):
    x2 = x / 20037508.34 * 180
    y2 = y / 20037508.34 * 180
    y2 = 180 / pi * (2 * atan(exp(y2 * pi / 180)) - pi / 2)
    return x2, y2


# --------------------------------------------------------------------------------------

# -----------------Interchange between GCJ-02 and WGS-84---------------------------
# Public map data in mainland China is required to use GCJ-02, which adds a nonlinear offset to WGS-84 coordinates
# This part of the code adds that offset or (approximately) removes it
def transformLat(x, y):
    ret = -100.0 + 2.0 * x + 3.0 * y + 0.2 * y * y + 0.1 * x * y + 0.2 * math.sqrt(abs(x))
    ret += (20.0 * math.sin(6.0 * x * math.pi) + 20.0 * math.sin(2.0 * x * math.pi)) * 2.0 / 3.0
    ret += (20.0 * math.sin(y * math.pi) + 40.0 * math.sin(y / 3.0 * math.pi)) * 2.0 / 3.0
    ret += (160.0 * math.sin(y / 12.0 * math.pi) + 320 * math.sin(y * math.pi / 30.0)) * 2.0 / 3.0
    return ret


def transformLon(x, y):
    ret = 300.0 + x + 2.0 * y + 0.1 * x * x + 0.1 * x * y + 0.1 * math.sqrt(abs(x))
    ret += (20.0 * math.sin(6.0 * x * math.pi) + 20.0 * math.sin(2.0 * x * math.pi)) * 2.0 / 3.0
    ret += (20.0 * math.sin(x * math.pi) + 40.0 * math.sin(x / 3.0 * math.pi)) * 2.0 / 3.0
    ret += (150.0 * math.sin(x / 12.0 * math.pi) + 300.0 * math.sin(x / 30.0 * math.pi)) * 2.0 / 3.0
    return ret


def delta(lat, lon):
    ''' 
    Krasovsky 1940
    //
    // a = 6378245.0, 1/f = 298.3
    // b = a * (1 - f)
    // ee = (a^2 - b^2) / a^2;
    '''
    a = 6378245.0  # a: semi-major axis of the Krasovsky 1940 ellipsoid, in metres
    ee = 0.00669342162296594323  # ee: squared eccentricity of the ellipsoid
    dLat = transformLat(lon - 105.0, lat - 35.0)
    dLon = transformLon(lon - 105.0, lat - 35.0)
    radLat = lat / 180.0 * math.pi
    magic = math.sin(radLat)
    magic = 1 - ee * magic * magic
    sqrtMagic = math.sqrt(magic)
    dLat = (dLat * 180.0) / ((a * (1 - ee)) / (magic * sqrtMagic) * math.pi)
    dLon = (dLon * 180.0) / (a / sqrtMagic * math.cos(radLat) * math.pi)
    return {'lat': dLat, 'lon': dLon}


def outOfChina(lat, lon):
    if (lon < 72.004 or lon > 137.8347):
        return True
    if (lat < 0.8293 or lat > 55.8271):
        return True
    return False


def gcj_to_wgs(gcjLon, gcjLat):
    if outOfChina(gcjLat, gcjLon):
        return (gcjLon, gcjLat)
    d = delta(gcjLat, gcjLon)
    return (gcjLon - d["lon"], gcjLat - d["lat"])


def wgs_to_gcj(wgsLon, wgsLat):
    if outOfChina(wgsLat, wgsLon):
        return wgsLon, wgsLat
    d = delta(wgsLat, wgsLon)
    return wgsLon + d["lon"], wgsLat + d["lat"]


# --------------------------------------------------------------

# ---------------------------------------------------------
# Get tile coordinates in Google Maps based on latitude and longitude of WGS-84
def wgs_to_tile(j, w, z):
    '''
    Get Google-style tile coordinates from geographic coordinates
    j : longitude
    w : latitude
    z : zoom
    '''
    isnum = lambda x: isinstance(x, int) or isinstance(x, float)
    if not (isnum(j) and isnum(w)):
        raise TypeError("j and w must be int or float!")

    if not isinstance(z, int) or z < 0 or z > 22:
        raise TypeError("z must be int and between 0 to 22.")

    j = (j + 180) / 360  # make j to (0,1)

    w = 85.0511287798 if w > 85.0511287798 else w
    w = -85.0511287798 if w < -85.0511287798 else w
    w = log(tan((90 + w) * pi / 360)) / (pi / 180)
    w /= 180  # make w to (-1,1)
    w = 1 - (w + 1) / 2  # make w to (0,1) and left top is 0-point

    num = 2 ** z
    x = floor(j * num)
    y = floor(w * num)
    return x, y


def pixls_to_mercator(zb):
    # Get the web Mercator projection coordinates of the four corners of the area according to the four corner coordinates of the tile
    inx, iny = zb["LT"]  # left top
    inx2, iny2 = zb["RB"]  # right bottom
    length = 20037508.3427892
    sum = 2 ** zb["z"]
    LTx = inx / sum * length * 2 - length
    LTy = -(iny / sum * length * 2) + length

    RBx = (inx2 + 1) / sum * length * 2 - length
    RBy = -((iny2 + 1) / sum * length * 2) + length

    # LT = left top, RB = right bottom
    # Returns the projected coordinates of the four corners
    res = {'LT': (LTx, LTy), 'RB': (RBx, RBy),
           'LB': (LTx, RBy), 'RT': (RBx, LTy)}
    return res


def tile_to_pixls(zb):
    # Tile coordinates are converted to pixel coordinates of the four corners
    out = {}
    width = (zb["RT"][0] - zb["LT"][0] + 1) * 256
    height = (zb["LB"][1] - zb["LT"][1] + 1) * 256
    out["LT"] = (0, 0)
    out["RT"] = (width, 0)
    out["LB"] = (0, -height)
    out["RB"] = (width, -height)
    return out


# -----------------------------------------------------------

# ---------------------------------------------------------
class Downloader(Thread):
    # Multithreaded downloader
    def __init__(self, index, count, urls, datas, proxy_address):
        super().__init__()
        self.urls = urls
        self.datas = datas
        self.index = index
        self.count = count
        self.total_urls = len(urls)
        self.proxy_address = proxy_address
        self.http = urllib3.ProxyManager(proxy_address, headers={
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/118.0"
        })

    def download(self, url, timeout=30):
        # Try up to 3 times; urllib3 raises its own exception types rather than urllib.error.URLError
        err = 0
        while err < 3:
            try:
                response = self.http.request('GET', url, timeout=timeout)
                if response.status == 200:
                    return response.data
                # A non-200 body is an error page, not a tile, so treat it as a failure
                print(f"Attempt {err + 1}: HTTP status {response.status} downloading data")
            except urllib3.exceptions.TimeoutError as e:
                print(f"Attempt {err + 1}: Timeout error downloading data - {e}")
            except urllib3.exceptions.HTTPError as e:
                print(f"Attempt {err + 1}: HTTP error downloading data - {e}")
            except Exception as e:
                print(f"Attempt {err + 1}: General error downloading data - {e}")
            err += 1
            time.sleep(5)  # wait 5 seconds before retrying
        raise Exception("Failed to download after 3 attempts.")

    def run(self):
        completed = 0
        failed_urls = []  # URLs that failed to download
        for i, url in enumerate(self.urls):
            if i % self.count != self.index:  # only handle the URLs assigned to this thread
                continue
            try:
                data = self.download(url)
                if data:
                    self.datas[i] = data
                    completed += 1
                    print(f"Thread {self.index} has downloaded {completed}/{self.total_urls} tiles.")
                else:
                    raise Exception("No data returned.")
            except Exception as e:
                print(f"Thread {self.index} failed to download {url}. Exception: {e}")
                failed_urls.append((i, url))  # record the failed URL and its index

        # Retry the URLs that failed
        for i, url in failed_urls:
            try:
                print(f"Thread {self.index} retrying download for {url}.")
                data = self.download(url)
                if data:
                    self.datas[i] = data
                    print(f"Thread {self.index} successfully downloaded {url} on retry.")
                else:
                    print(f"Thread {self.index} failed to download {url} on retry.")
            except Exception as e:
                print(f"Thread {self.index} failed to download {url} on retry. Exception: {e}")


# ---------------------------------------------------------

# ---------------------------------------------------------
def getExtent(x1, y1, x2, y2, z, source="Google China"):
    pos1x, pos1y = wgs_to_tile(x1, y1, z)
    pos2x, pos2y = wgs_to_tile(x2, y2, z)
    pos1x, pos1y, pos2x, pos2y = adjust_coordinates(pos1x, pos1y, pos2x, pos2y)
    Xframe = pixls_to_mercator(
        {"LT": (pos1x, pos1y), "RT": (pos2x, pos1y), "LB": (pos1x, pos2y), "RB": (pos2x, pos2y), "z": z})
    for i in ["LT", "LB", "RT", "RB"]:
        Xframe[i] = mercator_to_wgs(*Xframe[i])
    if source == "Google":
        pass
    elif source == "Google China":
        for i in ["LT", "LB", "RT", "RB"]:
            Xframe[i] = gcj_to_wgs(*Xframe[i])
    else:
        raise Exception("Invalid argument: source.")
    return Xframe


def saveTiff(r, g, b, gt, filePath):
    fname_out = filePath
    driver = gdal.GetDriverByName('GTiff')
    # Create a 3-band dataset
    dset_output = driver.Create(fname_out, r.shape[1], r.shape[0], 3, gdal.GDT_Byte)
    dset_output.SetGeoTransform(gt)
    try:
        proj = osr.SpatialReference()
        proj.ImportFromEPSG(4326)
        dset_output.SetSpatialRef(proj)
    except Exception as e:
        print(f"Error: Coordinate system setting failed - {e}")
    dset_output.GetRasterBand(1).WriteArray(r)
    dset_output.GetRasterBand(2).WriteArray(g)
    dset_output.GetRasterBand(3).WriteArray(b)
    dset_output.FlushCache()
    dset_output = None
    print("Image Saved")


def adjust_coordinates(x1, y1, x2, y2):
    # Expand the tile range around its center so that width and height are equal
    width = abs(x2 - x1)
    height = abs(y1 - y2)

    max_side = max(width, height)

    # Compute the center point
    center_x = (x1 + x2) / 2
    center_y = (y1 + y2) / 2

    # Update the bounds so the center is unchanged and both sides equal max_side
    x1 = center_x - max_side / 2
    x2 = center_x + max_side / 2
    y1 = center_y - max_side / 2
    y2 = center_y + max_side / 2

    # Safety check: if the sides still differ, pad the shorter one
    new_width = x2 - x1
    new_height = y2 - y1
    if new_width != new_height:
        if new_width > new_height:
            # Width is larger than height, so adjust the y coordinates
            diff = new_width - new_height
            y1 -= diff // 2
            y2 += diff // 2
        else:
            # Height is larger than width, so adjust the x coordinates
            diff = new_height - new_width
            x1 -= diff // 2
            x2 += diff // 2
    return int(x1), int(y1), int(x2), int(y2)


# ---------------------------------------------------------

# ---------------------------------------------------------
MAP_URLS = {
    "Google": "http://mts0.googleapis.com/vt?lyrs={style}&x={x}&y={y}&z={z}",
    "Google China": "http://mt2.google.cn/vt/lyrs={style}&hl=zh-CN&gl=CN&src=app&x={x}&y={y}&z={z}"}


def get_url(source, x, y, z, style):
    # Build the tile URL for the chosen map source
    if source not in MAP_URLS:
        raise Exception("Unknown Map Source ! ")
    return MAP_URLS[source].format(x=x, y=y, z=z, style=style)


def get_urls(x1, y1, x2, y2, z, source, style):
    pos1x, pos1y = wgs_to_tile(x1, y1, z)
    pos2x, pos2y = wgs_to_tile(x2, y2, z)
    pos1x, pos1y, pos2x, pos2y = adjust_coordinates(pos1x, pos1y, pos2x, pos2y)
    lenx = pos2x - pos1x + 1
    leny = pos2y - pos1y + 1
    print("Total tiles number:{x} X {y}".format(x=lenx, y=leny))
    urls = [get_url(source, i, j, z, style) for j in range(int(pos1y), int(pos1y) + int(leny)) for i in range(int(pos1x), int(pos1x) + int(lenx))]
    return urls


# ---------------------------------------------------------

# ---------------------------------------------------------
def merge_tiles(datas, x1, y1, x2, y2, z):
    pos1x, pos1y = wgs_to_tile(x1, y1, z)
    pos2x, pos2y = wgs_to_tile(x2, y2, z)
    pos1x, pos1y, pos2x, pos2y = adjust_coordinates(pos1x, pos1y, pos2x, pos2y)
    lenx = pos2x - pos1x + 1
    leny = pos2y - pos1y + 1
    outpic = pil.new('RGBA', (lenx * 256, leny * 256))
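    for i, data in enumerate(datas):
        if data is None:
            # A tile that failed every retry cannot be opened as an image
            raise ValueError(f"Tile {i} was not downloaded; cannot merge.")
        picio = io.BytesIO(data)
        small_pic = pil.open(picio)
        y, x = i // lenx, i % lenx  # row-major position of tile i
        outpic.paste(small_pic, (x * 256, y * 256))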
    print('Tiles merge completed')
    return outpic


def download_tiles(urls, multi=10):
    url_len = len(urls)
    datas = [None] * url_len
    if not isinstance(multi, int) or multi < 1 or multi > 20:
        raise Exception("multi of Downloader should be int and between 1 to 20.")
    # NOTE: the proxy address is hard-coded; change it to match your own local proxy
    tasks = [Downloader(i, multi, urls, datas, 'http://127.0.0.1:7890') for i in range(multi)]
    for i in tasks:
        i.start()
    for i in tasks:
        i.join()
    return datas

# ---------------------------------------------------------

# ---------------------------------------------------------
def main(left, top, right, bottom, zoom, filePath, style='s', server="Google China"):
    """
    Download images based on spatial extent.

    East longitude is positive and west longitude is negative.
    North latitude is positive, south latitude is negative.

    Parameters
    ----------
    left, top : left-top coordinate, for example (100.361,38.866)
        
    right, bottom : right-bottom coordinate
        
    zoom : zoom level (0 to 22)

    filePath : File path for storing results, TIFF format
        
    style : 
        m for map; 
        s for satellite; 
        y for satellite with label; 
        t for terrain; 
        p for terrain with label; 
        h for label;
    
    server : Google China (default) or Google
    """



    # The tile range is squared inside get_urls (via adjust_coordinates), so the merged image has equal width and height

    # ---------------------------------------------------------
    # Get the urls of all tiles in the extent
    urls = get_urls(left, top, right, bottom, zoom, server, style)
    # Group URLs based on the number of CPU cores to achieve roughly equal amounts of tasks
    urls_group = [urls[i:i + math.ceil(len(urls) / multiprocessing.cpu_count())] for i in
                  range(0, len(urls), math.ceil(len(urls) / multiprocessing.cpu_count()))]

    # Each set of URLs corresponds to a process for downloading tile maps
    print('Tiles downloading......')
    pool = multiprocessing.Pool(multiprocessing.cpu_count())
    print("begin")
    results = pool.map(download_tiles, urls_group)
    print("end")
    pool.close()
    pool.join()
    result = [x for j in results for x in j]
    print('Tiles download complete')

    # Combine downloaded tile maps into one map
    outpic = merge_tiles(result, left, top, right, bottom, zoom)
    r, g, b = cv2.split(np.array(outpic.convert('RGB')))

    # Get the spatial information of the four corners of the merged map and use it for outputting
    extent = getExtent(left, top, right, bottom, zoom, server)
    gt = (extent['LT'][0], (extent['RB'][0] - extent['LT'][0]) / r.shape[1], 0, extent['LT'][1], 0,
          (extent['RB'][1] - extent['LT'][1]) / r.shape[0])
    saveTiff(r, g, b, gt, filePath)



def isbet(str1):  # check whether the string contains a lowercase English letter
    for x in str1:
        if ('a' <= x <= 'z'):
            return 1
    return 0

def parse_line(line):
    """Split the line on whitespace and extract the point name and coordinates."""
    parts = line.split()
    coordinates = parts[-4:]
    jpg_name = ' '.join(parts[:-4])
    return jpg_name, coordinates


if __name__ == '__main__':

    max_images = 10000  # maximum number of images to download
    os.makedirs("./downloaded_images", exist_ok=True)  # make sure the output folder exists
    open("error.txt", "w", encoding="utf-8").close()  # start with an empty error log

    with open("clusters_bounding_boxes.txt", 'r', encoding="utf-8") as fp:
        lines = fp.readlines()

    for count, line in enumerate(lines, start=1):
        if count > max_images:  # stop once the limit has been reached
            print(f"Reached the limit of {max_images} images, stopping downloads.")
            break
        if not line.strip():  # skip blank lines
            continue

        jpg_name, coordinates = parse_line(line)
        left, right, top, below = map(float, coordinates)

        # Expand the box on each side by alpha times its width/height plus beta degrees
        # (alpha = 1 triples the extent in each direction)
        alpha = 1
        beta = 0.0
        left, right = left - abs(left - right) * alpha - beta, right + abs(left - right) * alpha + beta
        top, below = top + abs(top - below) * alpha + beta, below - abs(top - below) * alpha - beta

        # Make the extent square around its center, using the longer side
        max_side = max(abs(right - left), abs(top - below))
        center_w = (left + right) / 2
        center_h = (top + below) / 2
        left = center_w - max_side / 2
        right = center_w + max_side / 2
        top = center_h + max_side / 2
        below = center_h - max_side / 2

        # Build the output path from the point name, replacing characters that are unsafe in file names
        safe_name = jpg_name.replace(' ', '_').replace('/', '_').replace('\\', '_')
        image_path = f"./downloaded_images/{safe_name}.tif"

        try:
            # Call main to download the image
            main(left, top, right, below, 17, image_path, style='s', server="Google")
            print(f"Downloaded: {count}/{len(lines)}, Name: {jpg_name}")
        except Exception as e:
            with open("error.txt", "a", encoding="utf-8") as fp_error:
                fp_error.write(f"Error with name {jpg_name}: {e}\n")

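How It Runs

  1. Compute the tiles to download: derive the URLs of the required tiles from the input coordinates and zoom level.
  2. Multithreaded download: split the URLs across worker processes, each of which fetches its tiles in parallel with multiple threads.
  3. Merge tiles: stitch the downloaded tiles into one complete image.
  4. Save as TIFF: write the merged image to a TIFF file with its geospatial information.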
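
Steps 2 and 3 can be sketched without any network access. In the hedged example below, `fake_fetch` is a stand-in for a real HTTP request, and `Worker` and `fetch_all` are illustrative names. Each thread takes every URL whose index matches its own number modulo the thread count and writes into a shared list, so the results stay in row-major order and can be mapped straight to paste offsets.

```python
from threading import Thread


def fake_fetch(url):
    """Stand-in for a real HTTP request (assumption: no network is used here)."""
    return f"data for {url}".encode()


class Worker(Thread):
    """Handles every URL whose index satisfies i % count == index."""

    def __init__(self, index, count, urls, results):
        super().__init__()
        self.index, self.count = index, count
        self.urls, self.results = urls, results

    def run(self):
        for i in range(self.index, len(self.urls), self.count):
            self.results[i] = fake_fetch(self.urls[i])


def fetch_all(urls, workers=4):
    results = [None] * len(urls)  # one slot per tile, in row-major order
    threads = [Worker(k, workers, urls, results) for k in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


cols, rows = 3, 2
urls = [f"tile/{x}/{y}" for y in range(rows) for x in range(cols)]
tiles = fetch_all(urls)
# Slot i belongs at pixel offset (column * 256, row * 256) in the merged image
offsets = [((i % cols) * 256, (i // cols) * 256) for i in range(len(tiles))]
print(offsets)
```

Because every thread writes only to its own slots, no lock is needed around the shared list in this sketch.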

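Downloading and processing map tiles from Google Maps with Python is an efficient and flexible approach for applications that need large amounts of geographic imagery. With the method described here, you can readily access and use up-to-date imagery. I hope you find it useful!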
