How to get the magic number from a file in Java



I have a file that comes from an UploadedFile button, and I want to determine its file type (extension) from its magic number.

My code:

UploadedFile file = (UploadedFile)valueChangeEvent.getNewValue();
byte[] fileByteArray = IOUtils.toByteArray(file.getInputStream());

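Please note: the MIME type and content type (derived from the file and its file name) are not the same as the magic number (the magic number comes from the first bytes of the InputStream).

How can I do this?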

I know this is an old question; I'm just posting my answer here in the hope that someone searching for the same solution will find it useful.

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import javax.servlet.ServletContext;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.Part;
import javax.servlet.annotation.MultipartConfig;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
@MultipartConfig(
    fileSizeThreshold = 0,
    maxFileSize = 1024 * 1024 * 50,       // 50MB
    maxRequestSize = 1024 * 1024 * 100)   // 100MB
public class FileUpload extends HttpServlet {    
    private static final Logger logger = LogManager.getLogger(FileUpload.class);
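    public void doPost(HttpServletRequest request, HttpServletResponse response)
        throws IOException, ServletException {
        response.setContentType("text/plain");
        response.setCharacterEncoding("UTF-8");
        // Keep the buffer local: one servlet instance serves concurrent requests,
        // so an instance field would be shared between uploads.
        byte[] data = new byte[4];
        try {
            Part part = request.getPart("image_file");
            if (part != null) {
                try (InputStream is = part.getInputStream()) {
                    data = fileSignature(is);
                }
            }
        } catch (IOException ex) {
            logger.error(ex);
        }
        String fileType = getFileType(data);
        // return the recognized type 
        response.getWriter().write(fileType);
    }
    /**
     * Get the first 4 bytes of a file (its signature).
     * 
     * @param is Input stream of the uploaded part.
     * @return The first 4 bytes; trailing positions stay 0 if the stream is shorter.
     */
     private byte[] fileSignature(InputStream is) throws IOException {
         byte[] signature = new byte[4];
         int off = 0;
         int n;
         // read() may return fewer bytes than requested, so loop until full or EOF
         while (off < signature.length
                 && (n = is.read(signature, off, signature.length - off)) != -1) {
             off += n;
         }
         return signature;
     }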
     /**
      * Get the file type based on the file signature.
      * Here restricted to only recognized file type jpeg, jpg, png and
      * pdf where the signature of jpg and jpeg files are the same.
      *
      * @param fileData Byte array of the file.
      * @return String of the file type.
      */
     private String getFileType(byte[] fileData) {
         String type = "undefined";
         if(Byte.toUnsignedInt(fileData[0]) == 0x89 && Byte.toUnsignedInt(fileData[1]) == 0x50)
             type = "png";
         else if(Byte.toUnsignedInt(fileData[0]) == 0xFF && Byte.toUnsignedInt(fileData[1]) == 0xD8)
             type = "jpg";
         else if(Byte.toUnsignedInt(fileData[0]) == 0x25 && Byte.toUnsignedInt(fileData[1]) == 0x50)
             type = "pdf";
        return type;
    }
}

References for file magic numbers:

  • filesignatures.net
  • Wikipedia's list of file signatures
  • mimesniff
  • GCK's file signatures table
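The two-byte checks above are enough to tell these three formats apart, but the published signatures are longer, and comparing more bytes reduces false positives. The following is only a sketch of that idea, not part of the answer: the class name SignatureSniffer and its method names are made up for illustration, and it assumes Java 11+ for InputStream.readNBytes(int). The byte values are the commonly documented PNG (8 bytes), JPEG (FF D8 FF) and PDF ("%PDF-") signatures.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of either answer.
final class SignatureSniffer {
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] PDF = {'%', 'P', 'D', 'F', '-'};

    private SignatureSniffer() {}

    // Reads up to 8 bytes (the longest signature here); does not close the stream.
    static String detect(InputStream in) throws IOException {
        return detect(in.readNBytes(PNG.length));
    }

    // Classifies already-extracted head bytes.
    static String detect(byte[] head) {
        if (startsWith(head, PNG)) return "png";
        if (startsWith(head, JPG)) return "jpg";
        if (startsWith(head, PDF)) return "pdf";
        return "undefined";
    }

    // True if data begins with prefix; short inputs never match.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(data, 0, prefix.length, prefix, 0, prefix.length);
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(new ByteArrayInputStream(sample))); // prints "pdf"
    }
}
```

With the question's code, the same check could be applied to the already-loaded array, e.g. SignatureSniffer.detect(fileByteArray), since only the leading bytes are inspected.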

I'm posting my solution in case people want an alternative without the Java Servlet related code:

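import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public enum MagicBytes {
    PNG(0x89, 0x50),  // Same signatures as in the previous answer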
    JPG(0xFF, 0xD8),
    PDF(0x25, 0x50);
    
    private final int[] magicBytes;
    
    private MagicBytes(int...bytes) {
        magicBytes = bytes;
    }
    
    // Checks if bytes match a specific magic bytes sequence
    public boolean is(byte[] bytes) {
        if (bytes.length != magicBytes.length)
            throw new RuntimeException("I need the first "+magicBytes.length
                    + " bytes of an input stream.");
        for (int i=0; i<bytes.length; i++)
            if (Byte.toUnsignedInt(bytes[i]) != magicBytes[i])
                return false;
        return true;
    }
    
    // Extracts head bytes from any stream (requires Java 9+)
    public static byte[] extract(InputStream is, int length) throws IOException {
        try (is) {  // automatically close stream on return
            byte[] buffer = new byte[length];
            // read() may return early; readNBytes keeps reading until full or EOF
            is.readNBytes(buffer, 0, length);
            return buffer;
        }
    }
    
    /* Convenience methods */
    public boolean is(File file) throws IOException {
        return is(new FileInputStream(file));
    }
    
    public boolean is(InputStream is) throws IOException {
        return is(extract(is, magicBytes.length));
    }
}

Then call it like this, depending on whether you have a File or an InputStream:

MagicBytes.PNG.is(new File("picture.png"))
MagicBytes.PNG.is(new FileInputStream("picture.png"))

Being an enum also lets us loop over every known format using MagicBytes.values().
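As a rough illustration of that last point, here is a self-contained sketch that loops over values() to find the first matching format. It deliberately uses a separate, hypothetical enum named FileKind (with made-up methods matches and detect) so that it compiles on its own; the same loop should carry over to MagicBytes, provided the head bytes are extracted once and compared by prefix rather than by exact length.

```java
import java.util.Optional;

// Hypothetical stand-in for MagicBytes, reduced to what the loop needs.
enum FileKind {
    PNG(0x89, 0x50),
    JPG(0xFF, 0xD8),
    PDF(0x25, 0x50);

    private final int[] magic;

    FileKind(int... magic) {
        this.magic = magic;
    }

    // True if head starts with this format's magic bytes.
    boolean matches(byte[] head) {
        if (head.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (Byte.toUnsignedInt(head[i]) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // Tries every constant in declaration order; empty if none matches.
    static Optional<FileKind> detect(byte[] head) {
        for (FileKind kind : values()) {
            if (kind.matches(head)) {
                return Optional.of(kind);
            }
        }
        return Optional.empty();
    }
}
```

For example, FileKind.detect(new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF}) returns an Optional containing JPG, while an array that matches no constant yields Optional.empty().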

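Edit: The code I posted above is a simplified version of the actual enum I use in my own library, adapted from the previous answer to help people understand it faster. However, some file formats can have more than one valid header, so if that is a problem for your specific use case, this class is more suitable: GIST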
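The linked class is not reproduced here. Purely as a sketch of how the "several headers per format" case might be handled, and not the author's implementation, the enum below stores a list of alternative signatures per constant. The name MultiMagic and its methods are invented for this example; GIF is used because it has two well-known headers, "GIF87a" and "GIF89a".

```java
// Hypothetical sketch: each format may declare several alternative signatures.
enum MultiMagic {
    GIF(new int[] {'G', 'I', 'F', '8', '7', 'a'},
        new int[] {'G', 'I', 'F', '8', '9', 'a'}),
    PNG(new int[] {0x89, 'P', 'N', 'G'});

    private final int[][] signatures;

    MultiMagic(int[]... signatures) {
        this.signatures = signatures;
    }

    // True if head starts with any one of this format's signatures.
    boolean is(byte[] head) {
        for (int[] signature : signatures) {
            if (startsWith(head, signature)) {
                return true;
            }
        }
        return false;
    }

    // Number of head bytes needed to test every alternative.
    int headerLength() {
        int max = 0;
        for (int[] signature : signatures) {
            max = Math.max(max, signature.length);
        }
        return max;
    }

    private static boolean startsWith(byte[] head, int[] signature) {
        if (head.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if (Byte.toUnsignedInt(head[i]) != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would read headerLength() bytes from the stream once and pass them to is(...), so both GIF variants are checked against the same buffer.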
