
Note: the content below is taken from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/37079951/


Download all blobs within an Azure Storage container

Tags: python, bash, azure, containers

Asked by privateer35

I've managed to write a Python script to list all the blobs within a container.


from azure.storage.blob import BlobService

blob_service = BlobService(account_name='<ACCOUNT_NAME>', account_key='<ACCOUNT_KEY>')

# page through the container, following the continuation marker until every blob is listed
blobs = []
marker = None
while True:
    batch = blob_service.list_blobs('<CONTAINER>', marker=marker)
    blobs.extend(batch)
    if not batch.next_marker:
        break
    marker = batch.next_marker

for blob in blobs:
    print(blob.name)

Like I said, this only lists the blobs that I want to download. I've moved on to the Azure CLI to see if that could help with what I want to do. I'm able to download a single blob with


azure storage blob download [container]

It then prompts me to specify a blob, which I can grab from the Python script. The only way I'd be able to download all those blobs is to copy and paste each name into the prompt after the command above. Is there a way I can either:


A. Write a bash script that iterates through the list of blobs, executing the command above and supplying the next blob name at the prompt each time.


B. Download the whole container from either the Python script or the Azure CLI. Is there something I'm missing that downloads the entire container?

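Regarding option B: the legacy BlobService used in the listing script already exposes get_blob_to_path, so the loop above can be extended to save each blob as it is listed. The snippet below is only an untested sketch of that idea; the account/container placeholders and the local downloads directory are assumptions, and the answers below show the equivalent approach with the newer BlockBlobService.

import os
from azure.storage.blob import BlobService

blob_service = BlobService(account_name='<ACCOUNT_NAME>', account_key='<ACCOUNT_KEY>')

container = '<CONTAINER>'    # placeholder, as in the question
local_root = 'downloads'     # hypothetical local target directory

marker = None
while True:
    batch = blob_service.list_blobs(container, marker=marker)
    for blob in batch:
        target = os.path.join(local_root, blob.name)
        # blob names may contain '/' separators; recreate that folder structure locally
        os.makedirs(os.path.dirname(target), exist_ok=True)
        blob_service.get_blob_to_path(container, blob.name, target)
        print("downloaded {}".format(blob.name))
    if not batch.next_marker:
        break
    marker = batch.next_marker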

Answered by Brij Raj Singh - MSFT

@gary-liu-msft's solution is correct. I made some more changes to it: the code can now iterate through the container and the folder structure in it (PS - there are no folders in containers, just paths), check whether the same directory structure exists on the client, create it if it doesn't, and download the blobs into those paths. It supports long paths with nested sub-directories.


from azure.storage.blob import BlockBlobService
import os

# name of your storage account and the access key from Settings -> Access keys -> key1
block_blob_service = BlockBlobService(account_name='storageaccountname', account_key='accountkey')

# name of the container
generator = block_blob_service.list_blobs('testcontainer')

# list all the blobs in the container and download them one after another
for blob in generator:
    print(blob.name)
    # if the blob name contains a path, make sure that folder structure exists locally
    if "/" in blob.name:
        head, tail = os.path.split(blob.name)
        local_dir = os.path.join(os.getcwd(), head)
        if not os.path.isdir(local_dir):
            # create the directory (and any missing parents) before downloading into it
            print("directory doesn't exist, creating it now")
            os.makedirs(local_dir, exist_ok=True)
        block_blob_service.get_blob_to_path('testcontainer', blob.name, os.path.join(local_dir, tail))
    else:
        block_blob_service.get_blob_to_path('testcontainer', blob.name, blob.name)

The same code is also available here https://gist.github.com/brijrajsingh/35cd591c2ca90916b27742d52a3cf6ba


Answered by Gary Liu - MSFT

Currently, it seems we cannot download all the blobs in a container with a single API call. All the available blob operations are listed at https://msdn.microsoft.com/en-us/library/azure/dd179377.aspx.


So we can list the ListGenerator of blobs, then download the blobs in a loop, e.g.:


result = blob_service.list_blobs(container)
for b in result.items:
    # the local "folder" directory must already exist, otherwise get_blob_to_path cannot open the target file
    r = blob_service.get_blob_to_path(container, b.name, "folder/{}".format(b.name))

Update


Import BlockBlobService when using azure-storage-python:


from azure.storage.blob import BlockBlobService
