How to extract a filename from a URL & append a word to it?

Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors. Original Stack Overflow question: http://stackoverflow.com/questions/18727347/

Time: 2020-08-19 11:36:54  Source: igfitidea

python django

Asked by deadlock

I have the following url:

url = "http://photographs.500px.com/kyle/09-09-201315-47-571378756077.jpg"

I would like to extract the file name in this url: 09-09-201315-47-571378756077.jpg

Once I get this file name, I'm going to save it with this name to the Desktop.

filename = **extracted file name from the url**     
download_photo = urllib.urlretrieve(url, "/home/ubuntu/Desktop/%s.jpg" % (filename))

After this, I'm going to resize the photo. Once that is done, I'm going to save the resized version and append the word "_small" to the end of the filename.

downloadedphoto = Image.open("/home/ubuntu/Desktop/%s.jpg" % (filename))
resize_downloadedphoto = downloadedphoto.resize((300, 300), Image.ANTIALIAS)
resize_downloadedphoto.save("/home/ubuntu/Desktop/%s.jpg" % (filename + "_small"))

From this, what I am trying to achieve is to get two files, the original photo with the original name, then the resized photo with the modified name. Like so:

09-09-201315-47-571378756077.jpg

09-09-201315-47-571378756077_small.jpg

How can I go about doing this?

Accepted answer by Ofir Israel

You can use urllib.parse.urlparse with os.path.basename:

import os
from urllib.parse import urlparse

url = "http://photographs.500px.com/kyle/09-09-201315-47-571378756077.jpg"
a = urlparse(url)
print(a.path)                    # Output: /kyle/09-09-201315-47-571378756077.jpg
print(os.path.basename(a.path))  # Output: 09-09-201315-47-571378756077.jpg
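
To also cover the "_small" part of the question, the same two calls can be combined with os.path.splitext. This is only a minimal sketch: the helper name small_variant and its suffix parameter are made up for illustration and are not part of the original answer.

```python
import os
from urllib.parse import urlparse


def small_variant(url, suffix="_small"):
    """Return (original_name, renamed_name) for the file a URL points at."""
    # .path excludes the scheme, host, query string and fragment
    name = os.path.basename(urlparse(url).path)
    # splitext splits at the last dot only, so ext is ".jpg" here
    stem, ext = os.path.splitext(name)
    return name, stem + suffix + ext


url = "http://photographs.500px.com/kyle/09-09-201315-47-571378756077.jpg"
original, small = small_variant(url)
print(original)  # 09-09-201315-47-571378756077.jpg
print(small)     # 09-09-201315-47-571378756077_small.jpg
```

Because splitext only looks at the last dot, a name such as b.tar.gz would become b.tar_small.gz, which may or may not be what you want.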

Answer by RickyA

filename = url[url.rfind("/")+1:]
filename_small = filename.replace(".", "_small.")

Consider replacing ".jpg" rather than "." in the second line, since a "." can also appear elsewhere in the filename.
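
A small sketch of why the extra dots matter, and one possible way around them that does not depend on the extension being ".jpg". The filename below and the helper add_suffix are hypothetical, chosen only to illustrate the caveat.

```python
def add_suffix(filename, suffix="_small"):
    """Insert suffix before the final extension of filename."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:
        # No dot at all: nothing to split on, so just append the suffix
        return filename + suffix
    return stem + suffix + dot + ext


name = "holiday.2013.photo.jpg"  # hypothetical filename with extra dots
print(name.replace(".", "_small."))  # holiday_small.2013_small.photo_small.jpg
print(add_suffix(name))              # holiday.2013.photo_small.jpg
```

str.replace rewrites every dot, while rpartition splits only at the last one.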

Answer by Moj

See "Python split url to find image name and extension" for how to extract the image name. To append to the name:

imageName =  '09-09-201315-47-571378756077'

new_name = '{0}_small.jpg'.format(imageName) 

Answer by Bryan

You could just split the url by "/" and retrieve the last member of the list:

    url = "http://photographs.500px.com/kyle/09-09-201315-47-571378756077.jpg"
    filename = url.split("/")[-1] 
    #09-09-201315-47-571378756077.jpg

Then use replace to change the ending:

    small_jpg = filename.replace(".jpg", "_small.jpg")
    #09-09-201315-47-571378756077_small.jpg

Answer by P i

os.path.basename(url)

Why try harder?

In [1]: os.path.basename("https://foo.com/bar.html")
Out[1]: 'bar.html'

In [2]: os.path.basename("https://foo.com/bar")
Out[2]: 'bar'

In [3]: os.path.basename("https://foo.com/")
Out[3]: ''

In [4]: os.path.basename("https://foo.com")
Out[4]: 'foo.com'
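
One case the session above does not cover is a URL with a query string or fragment. The sketch below compares calling basename on the raw string with parsing the URL first. The URL is a made-up example, and posixpath (which is what os.path resolves to on Linux and macOS) is used so the result does not depend on the operating system.

```python
import posixpath
from urllib.parse import urlparse

url = "https://foo.com/bar.html?page=2#top"  # hypothetical URL with query and fragment

# basename on the raw string keeps everything after the last "/"
raw = posixpath.basename(url)
# parsing first isolates the path component before taking the basename
parsed = posixpath.basename(urlparse(url).path)

print(raw)     # bar.html?page=2#top
print(parsed)  # bar.html
```

So the one-liner is fine for plain URLs like the one in the question, but it appears to need the urlparse step once queries or anchors are involved.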

Answer by Tactopoda

Sometimes there is a query string:

filename = url.split("/")[-1].split("?")[0] 
new_filename = filename.replace(".jpg", "_small.jpg")

Answer by Boris

Use urllib.parse.urlparse to get just the path part of the URL, and then use pathlib.Path on that path to get the filename:

from urllib.parse import urlparse
from pathlib import Path


url = "http://example.com/some/long/path/a_filename.jpg?some_query_params=true&some_more=true#and-an-anchor"
a = urlparse(url)
a.path             # '/some/long/path/a_filename.jpg'
Path(a.path).name  # 'a_filename.jpg'
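
For the second half of the question, pathlib can also build the "_small" name from the stem and suffix. This is a sketch rather than part of the original answer. It uses PurePosixPath instead of Path as an assumption, on the grounds that URL paths always use forward slashes and no filesystem access is needed.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

url = "http://example.com/some/long/path/a_filename.jpg?some_query_params=true&some_more=true#and-an-anchor"
name = PurePosixPath(urlparse(url).path)

# stem is the name without its last suffix; suffix includes the leading dot
small = name.with_name(name.stem + "_small" + name.suffix)
print(small.name)  # a_filename_small.jpg
```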