Python: how to get a file extension correctly?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/37896386/
How to get file extension correctly?
Asked by Page David
I know this question has been asked many times on this website, but the existing answers miss an important point: they only consider file extensions with a single period, like *.png or *.mp3. How do I deal with filenames that have two periods, like .tar.gz?
The basic code is:
filename = '/home/lancaster/Downloads/a.ppt'
extention = filename.split('/')[-1]
Obviously, this code does not work for a file like a.tar.gz. How should I deal with it? Thanks.
Accepted answer by John Burger
The role of a file extension is to tell the viewer (and sometimes the computer) which application to use to handle the file.
Taking the worst-case example from your comments (a.ppt.tar.gz): this is a PowerPoint file that has been tar-balled and then gzipped. So you need a gzip-handling program to open it; using PowerPoint or a tarball-handling program wouldn't work. OK, a clever program that knew how to handle both .tar and .gz files could understand both operations and work with a .tar.gz file, but note that it would do that even if the extension were simply .gz.
The fact that both tar and gzip add their extensions to the original filename, rather than replacing it (as zip does), is a convenience. But the base name of the gzip file is still a.ppt.tar.
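To illustrate the layering described above, here is a minimal sketch that strips one extension at a time with os.path.splitext, so each step shows what the remaining base name is. The helper name peel_layers is made up for this example and is not part of the original answer.

```python
import os.path

def peel_layers(filename):
    """Return (remaining_name, extension) pairs, outermost extension first."""
    layers = []
    root, ext = os.path.splitext(filename)
    while ext:
        # Each pass removes only the last extension, like unwrapping one layer.
        layers.append((root, ext))
        root, ext = os.path.splitext(root)
    return layers

layers = peel_layers('a.ppt.tar.gz')
for root, ext in layers:
    print(ext, '->', root)
```

The first pair is ('a.ppt.tar', '.gz'), which matches the point that the gzip layer only knows about a.ppt.tar.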
Answer by Or Duan
Python 3.4+
You can now use Path from pathlib. It has many features; one of them is suffix:
>>> from pathlib import Path
>>> Path('my/library/setup.py').suffix
'.py'
>>> Path('my/library.tar.gz').suffix
'.gz'
>>> Path('my/library').suffix
''
If you want to get more than one suffix, use suffixes:
>>> from pathlib import Path
>>> Path('my/library.tar.gar').suffixes
['.tar', '.gar']
>>> Path('my/library.tar.gz').suffixes
['.tar', '.gz']
>>> Path('my/library').suffixes
[]
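Note that suffixes returns every dot-separated part, so a name like report.v2.pdf gives ['.v2', '.pdf']. One possible way to get '.tar.gz' without picking up such parts is to join the last two suffixes only when the result is in a list of known compound extensions. This is a sketch under that assumption: KNOWN_COMPOUND and full_suffix are hypothetical names, and the set should be extended to whatever you expect to see.

```python
from pathlib import Path

# Hypothetical allow-list of compound extensions; extend as needed.
KNOWN_COMPOUND = {'.tar.gz', '.tar.bz2', '.tar.xz'}

def full_suffix(path):
    """Return a known compound extension if present, else the last suffix."""
    p = Path(path)
    last_two = ''.join(p.suffixes[-2:])
    if last_two in KNOWN_COMPOUND:
        return last_two
    return p.suffix

print(full_suffix('my/library.tar.gz'))   # .tar.gz
print(full_suffix('my/report.v2.pdf'))    # .pdf
```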
Answer by Rahul K P
There is a built-in function for this in the os module. See the documentation for os.path.splitext for more.
In [1]: from os.path import splitext
In [2]: file_name,extension = splitext('/home/lancaster/Downloads/a.ppt')
In [3]: extension
Out[3]: '.ppt'
If you need to find compound extensions such as .tar.gz or .tar.bz2, you have to write a function like this:
from os.path import splitext
def splitext_(path):
    # Check the known compound extensions first.
    for ext in ['.tar.gz', '.tar.bz2']:
        if path.endswith(ext):
            return path[:-len(ext)], path[-len(ext):]
    # Otherwise fall back to the standard behaviour.
    return splitext(path)
Result
In [4]: file_name, ext = splitext_('/home/lancaster/Downloads/a.tar.gz')
In [5]: ext
Out[5]: '.tar.gz'
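Edit
More generally, you can use this function:
from os.path import basename, splitext
def splitext_(path):
    # Only look at the final path component, so dots in directory names are ignored.
    parts = basename(path).split('.')
    if len(parts) > 2:
        # Treat the last two dot-separated parts as the extension.
        ext = '.' + '.'.join(parts[-2:])
        return path[:-len(ext)], ext
    return splitext(path)
It works for any two-part extension, but note that it treats the last two parts of any name containing two or more periods as the extension, so my.photo.png yields .photo.png.
Working on all files:
In [6]: inputs = ['a.tar.gz', 'b.tar.lzma', 'a.tar.lz', 'a.tar.lzo', 'a.tar.xz', 'a.png']
In [7]: for file_ in inputs:
   ...:     file_name, extension = splitext_(file_)
   ...:     print(extension)
   ...:
.tar.gz
.tar.lzma
.tar.lz
.tar.lzo
.tar.xz
.png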
Answer by no11
One possible way is:
- Split at "." => tmp_ext = filename.split('.')[1:]
  Result is a list = ['tar', 'gz']
- Join them together => extention = ".".join(tmp_ext)
  Result is your extension as a string = 'tar.gz'
Update, example:
>>> test = "/test/test/test.tar.gz"
>>> t2 = test.split(".")[1:]
>>> t2
['tar', 'gz']
>>> ".".join(t2)
'tar.gz'
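One caveat with splitting the whole path at ".": a period in a directory name (for example /test/test.d/test.tar.gz) ends up in the result. A minimal sketch that avoids this, assuming you want everything after the first period of the file name only, is to take os.path.basename first and then use str.partition. The function name is made up for illustration, and unlike the steps above it keeps the leading period.

```python
import os.path

def everything_after_first_dot(path):
    """Return the part of the file name from its first period onward."""
    name = os.path.basename(path)
    # partition splits at the first period only; dot is '' if there is none.
    _, dot, rest = name.partition('.')
    return dot + rest

print(everything_after_first_dot('/test/test.d/test.tar.gz'))  # .tar.gz
```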
Answer by matt
>>> import os
>>> import re
>>> basename = os.path.basename('/home/lancaster/Downloads/a.ppt')
>>> re.findall(r'\.([^.]+)', basename)
['ppt']
>>> basename = os.path.basename('/home/lancaster/Downloads/a.ppt.tar.gz')
>>> re.findall(r'\.([^.]+)', basename)
['ppt', 'tar', 'gz']
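Answer by LetzerWille
With re.findall and Python 3.6 (f-strings):
import re
filename = '/home/Downloads/abc.ppt.tar.gz'
ext = r'\.\w{1,6}'
# Use a raw f-string so that \b stays a regex word boundary, not a backspace.
re.findall(rf'{ext}\b | {ext}$', filename, re.X)
# ['.ppt', '.tar', '.gz']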
Answer by Saket Mittal
The simplest one:
import os.path
print(os.path.splitext("/home/lancaster/Downloads/a.ppt")[1])
# .ppt
Answer by akshay.s.jagtap
filename = '/home/lancaster/Downloads/a.tar.gz'
# Take the last path component, then whatever follows its final period.
basename = filename.split('/')[-1]
extention = ''
if '.' in basename:
    extention = basename.split('.')[-1]
    if len(extention) > 0:
        extention = '.' + extention
print(extention)  # .gz