Original question: http://stackoverflow.com/questions/4348810/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
java library to find the mime type from file content
Asked by Ajith Jose
I am looking for a Java library that determines the MIME type by looking at the file content (a byte array). I found this project using jmimemagic, but it no longer supports newer file types (e.g. the MS Word docx format) because it has been inactive since 2006.
Accepted answer by Ajith Jose
Use Apache Tika for content detection; see http://tika.apache.org/0.8/detection.html. Tika pulls in many jar dependencies, which you can find when you build it with Maven.
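import java.io.ByteArrayInputStream;
import java.io.IOException;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

// The method name is illustrative; the original snippet showed only the body.
public static String detectMimeType(byte[] pByte) {
    ByteArrayInputStream bai = new ByteArrayInputStream(pByte);
    ContentHandler contenthandler = new BodyContentHandler();
    Metadata metadata = new Metadata();
    Parser parser = new AutoDetectParser();
    try {
        // ParseContext overload; the three-argument parse(...) used originally was deprecated.
        parser.parse(bai, contenthandler, metadata, new ParseContext());
    } catch (IOException | SAXException | TikaException e) {
        // Parsing failed; the content type may be missing (null) below.
        e.printStackTrace();
    }
    // The auto-detecting parser records the detected type in the metadata.
    System.out.println("Mime: " + metadata.get(Metadata.CONTENT_TYPE));
    return metadata.get(Metadata.CONTENT_TYPE);
}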
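To make the idea of content-based detection concrete without any third-party jars, here is a minimal sketch that uses the JDK's own URLConnection.guessContentTypeFromStream. The class name JdkContentSniffer and the method guessMimeType are made up for illustration. This is not a replacement for Tika: the JDK call only knows a small set of signatures (PNG and GIF among them), so I would not expect it to identify newer office formats such as docx.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;

// Hypothetical helper class; not part of Tika or of the original answer.
final class JdkContentSniffer {

    // Returns the MIME type guessed from the leading bytes,
    // or null when the JDK does not recognize the signature.
    static String guessMimeType(byte[] content) {
        // guessContentTypeFromStream needs a stream that supports mark/reset;
        // ByteArrayInputStream does.
        try (InputStream in = new ByteArrayInputStream(content)) {
            return URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] pngHeader = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
        byte[] gifHeader = {'G', 'I', 'F', '8', '9', 'a'};
        System.out.println("PNG header: " + guessMimeType(pngHeader));
        System.out.println("GIF header: " + guessMimeType(gifHeader));
    }
}
```

A null result here only means the JDK has no signature for those bytes, which is exactly the gap a library like Tika is meant to fill.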
Answer by fischermatte
This may be useful for anyone who also needs the most commonly used office formats and does not use Apache Tika:
import java.net.FileNameMap;
import java.net.URLConnection;
import java.util.HashMap;
import java.util.Map;
// Assumed sources of the two helpers used below: Apache Commons IO and Spring.
import org.apache.commons.io.FilenameUtils;
import org.springframework.util.StringUtils;

public class MimeTypeUtils {

    // Maps a file extension (without the dot) to its MIME type.
    private static final Map<String, String> fileExtensionMap;

    static {
        fileExtensionMap = new HashMap<String, String>();
        // MS Office
        fileExtensionMap.put("doc", "application/msword");
        fileExtensionMap.put("dot", "application/msword");
        fileExtensionMap.put("docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
        fileExtensionMap.put("dotx", "application/vnd.openxmlformats-officedocument.wordprocessingml.template");
        fileExtensionMap.put("docm", "application/vnd.ms-word.document.macroEnabled.12");
        fileExtensionMap.put("dotm", "application/vnd.ms-word.template.macroEnabled.12");
        fileExtensionMap.put("xls", "application/vnd.ms-excel");
        fileExtensionMap.put("xlt", "application/vnd.ms-excel");
        fileExtensionMap.put("xla", "application/vnd.ms-excel");
        fileExtensionMap.put("xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
        fileExtensionMap.put("xltx", "application/vnd.openxmlformats-officedocument.spreadsheetml.template");
        fileExtensionMap.put("xlsm", "application/vnd.ms-excel.sheet.macroEnabled.12");
        fileExtensionMap.put("xltm", "application/vnd.ms-excel.template.macroEnabled.12");
        fileExtensionMap.put("xlam", "application/vnd.ms-excel.addin.macroEnabled.12");
        fileExtensionMap.put("xlsb", "application/vnd.ms-excel.sheet.binary.macroEnabled.12");
        fileExtensionMap.put("ppt", "application/vnd.ms-powerpoint");
        fileExtensionMap.put("pot", "application/vnd.ms-powerpoint");
        fileExtensionMap.put("pps", "application/vnd.ms-powerpoint");
        fileExtensionMap.put("ppa", "application/vnd.ms-powerpoint");
        fileExtensionMap.put("pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation");
        fileExtensionMap.put("potx", "application/vnd.openxmlformats-officedocument.presentationml.template");
        fileExtensionMap.put("ppsx", "application/vnd.openxmlformats-officedocument.presentationml.slideshow");
        fileExtensionMap.put("ppam", "application/vnd.ms-powerpoint.addin.macroEnabled.12");
        fileExtensionMap.put("pptm", "application/vnd.ms-powerpoint.presentation.macroEnabled.12");
        // Fixed: potm is a macro-enabled template, not a presentation.
        fileExtensionMap.put("potm", "application/vnd.ms-powerpoint.template.macroEnabled.12");
        fileExtensionMap.put("ppsm", "application/vnd.ms-powerpoint.slideshow.macroEnabled.12");
        // Open Office
        fileExtensionMap.put("odt", "application/vnd.oasis.opendocument.text");
        fileExtensionMap.put("ott", "application/vnd.oasis.opendocument.text-template");
        fileExtensionMap.put("oth", "application/vnd.oasis.opendocument.text-web");
        fileExtensionMap.put("odm", "application/vnd.oasis.opendocument.text-master");
        fileExtensionMap.put("odg", "application/vnd.oasis.opendocument.graphics");
        fileExtensionMap.put("otg", "application/vnd.oasis.opendocument.graphics-template");
        fileExtensionMap.put("odp", "application/vnd.oasis.opendocument.presentation");
        fileExtensionMap.put("otp", "application/vnd.oasis.opendocument.presentation-template");
        fileExtensionMap.put("ods", "application/vnd.oasis.opendocument.spreadsheet");
        fileExtensionMap.put("ots", "application/vnd.oasis.opendocument.spreadsheet-template");
        fileExtensionMap.put("odc", "application/vnd.oasis.opendocument.chart");
        fileExtensionMap.put("odf", "application/vnd.oasis.opendocument.formula");
        fileExtensionMap.put("odb", "application/vnd.oasis.opendocument.database");
        fileExtensionMap.put("odi", "application/vnd.oasis.opendocument.image");
        fileExtensionMap.put("oxt", "application/vnd.openofficeorg.extension");
    }

    public static String getContentTypeByFileName(String fileName) {
        // 1. first use Java's built-in utils
        FileNameMap mimeTypes = URLConnection.getFileNameMap();
        String contentType = mimeTypes.getContentTypeFor(fileName);
        // 2. nothing found -> look up our extension map to find types like ".doc" or ".docx"
        if (!StringUtils.hasText(contentType)) {
            String extension = FilenameUtils.getExtension(fileName);
            contentType = fileExtensionMap.get(extension);
        }
        return contentType;
    }
}
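For anyone who wants to try the same two-step lookup without Spring or Commons IO on the classpath, the following is a rough, dependency-free sketch. The class ExtensionLookupSketch, the method contentTypeFor, and the trimmed-down map are all hypothetical and only stand in for the full class above. It also lower-cases the extension before the map lookup, which is my own addition and not something the original answer does.

```java
import java.net.URLConnection;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical, dependency-free variant of the lookup above (names are illustrative).
final class ExtensionLookupSketch {

    // Only two sample entries; the full list is in the answer above.
    private static final Map<String, String> EXTRA_TYPES = new HashMap<>();

    static {
        EXTRA_TYPES.put("docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
        EXTRA_TYPES.put("odt", "application/vnd.oasis.opendocument.text");
    }

    static String contentTypeFor(String fileName) {
        // 1. Ask the JDK's built-in file name map first.
        String contentType = URLConnection.getFileNameMap().getContentTypeFor(fileName);
        if (contentType != null && !contentType.trim().isEmpty()) {
            return contentType;
        }
        // 2. Fall back to our own map, keyed by the lower-cased extension.
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension to look up
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTRA_TYPES.get(extension);
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFor("report.DOCX"));
        System.out.println(contentTypeFor("notes.odt"));
    }
}
```

Keep in mind that this, like the class above, trusts the file name only; it never looks at the bytes, so a renamed file will be reported with the wrong type.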
Answer by Thad
I use javax.activation.MimetypesFileTypeMap. It starts with a small set ($JRE_HOME/lib/content-types.properties), but you can add your own. Create a file mime.types in the format shown in MimetypesFileTypeMap's javadoc (I started with a large list from the net, massaged it, and added types I found missing). You can then load it in your code by opening your mime.types file and adding its contents to your map. However, the easier solution is to add your mime.types file to the META-INF directory of your jar; java.activation will pick that up automagically.
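To show what a mime.types file looks like, here is a small sketch that reads the format: each non-comment line is a MIME type followed by one or more extensions, and lines starting with # are comments. This is only my illustration of the file format using plain JDK classes; it is not the javax.activation implementation, and the class MimeTypesFormatSketch and its parse method are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the mime.types line format; this is NOT the
// javax.activation code, only a sketch of how such a file can be read.
final class MimeTypesFormatSketch {

    // Parses lines of the form "type/subtype ext1 ext2 ..." into an
    // extension -> MIME type map. Extensions are kept exactly as written.
    static Map<String, String> parse(String mimeTypesText) {
        Map<String, String> byExtension = new HashMap<>();
        for (String line : mimeTypesText.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] tokens = trimmed.split("\\s+");
            // tokens[0] is the MIME type, the rest are its extensions.
            for (int i = 1; i < tokens.length; i++) {
                byExtension.put(tokens[i], tokens[0]);
            }
        }
        return byExtension;
    }

    public static void main(String[] args) {
        String sample =
                "# sample mime.types content\n"
              + "application/msword doc dot\n"
              + "application/vnd.oasis.opendocument.text odt\n";
        System.out.println(parse(sample));
    }
}
```

In practice you would not parse the file yourself; dropping it into META-INF as described above lets the activation framework do the work.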