Problem uploading an image to HDFS via the WebHDFS REST API



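I am doing an HttpPut with a MultipartEntity to write a file to HDFS through the WebHDFS REST API. The requests themselves go through and return the expected responses, 307 and 201. However, the multipart headers are written into the image as part of its content, as shown below, so the stored file is not a valid image that can be retrieved and opened.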

--8DkJ3RkUHahEaNE9Ktw8NC1TFOqegjfA9Ps
Content-Disposition: form-data; name="file"; filename="advert.jpg"
Content-Type: application/octet-stream

ÛC //rest of the image content
--8DkJ3RkUHahEaNE9Ktw8NC1TFOqegjfA9Ps

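Removing the multipart headers from the image file turns it into a valid image, but I am not sure how to avoid them in the first place. I am not even sure I can control this, since WebHDFS is responsible for actually writing the file.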

Here is my code. Is there anything else I should be doing?

final String LOCATION = "Location";
final String writeURI = "http://<ip>:50070/webhdfs/v1/user/hadoop/advert.jpg"; 
HttpPut put = new HttpPut(writeURI);
HttpClient client = HttpClientBuilder.create().build();        
HttpResponse response = client.execute(put);
put.releaseConnection();
String redirectUri = null;
Header[] headers = response.getAllHeaders();
for(Header header : headers)
{
    if(LOCATION.equalsIgnoreCase(header.getName()))
    {
         redirectUri = header.getValue();
    }                    
}
HttpPut realPut = new HttpPut(redirectUri);
realPut.setEntity(buildMultiPartEntity("advert.jpg"));
HttpResponse response2 = client.execute(realPut);

private HttpEntity buildMultiPartEntity(String fileName)
{
   MultipartEntityBuilder multipartEntity = MultipartEntityBuilder.create();
   multipartEntity.setMode(HttpMultipartMode.BROWSER_COMPATIBLE);
   multipartEntity.addPart("file", new FileBody(new File(fileName)));
   return multipartEntity.build();
}    

Thanks for any help.

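I ran into the same problem with Python requests. What finally solved it for me was reading the image into memory before sending it, and using the one-step call to the WebHDFS API instead of the two-step one. Hope this helps a little.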

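import requests
from flask import current_app  # assumption: current_app is Flask's application proxy

host_url = current_app.config.get('HDFS_URL', '')
adx_img_path = current_app.config.get('ADX_CUSTOMER_IMAGE', '')
real_path = adx_img_path + remotefile
hdfs_username = current_app.config.get('HDFS_USERNAME', 'xdisk')
# data=true: the file content is sent in this same request (the one-step call)
parameters = '?user.name=' + hdfs_username + '&op=CREATE&data=true'
# Read the raw bytes and close the file promptly
with open(localfile, 'rb') as f:
    img = f.read()
url = host_url + real_path + parameters
r = requests.put(url, data=img, headers={"Content-Type": "application/octet-stream"})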

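Reading the image as binary/bytes and sending it as the request body seems to keep the strange headers from being added to the start of the file. For the HttpClient you are using, I suggest you try InputStreamBody or ByteArrayBody.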
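The extra headers appear to come from the multipart framing of the request body rather than from how the file is read: WebHDFS stores whatever bytes arrive in the PUT body. The sketch below illustrates this with the Python standard library only. `multipart_body` is a hypothetical helper that hand-builds a simplified multipart/form-data body, and the boundary string and the sample bytes are made up for the illustration.

```python
def multipart_body(payload: bytes, filename: str, boundary: str) -> bytes:
    """Hand-built, simplified multipart/form-data body (illustration only)."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + payload + tail


# Stand-in for the image bytes (a JPEG file starts with FF D8 FF).
image = b"\xff\xd8\xff\xe0 fake jpeg data"

raw = image                                              # body of a raw upload
wrapped = multipart_body(image, "advert.jpg", "XyZ123")  # body of a multipart upload

# The body is written as-is, so a multipart upload would produce a file that
# starts with the boundary line instead of the JPEG magic bytes.
print(raw[:3] == b"\xff\xd8\xff")  # True
print(wrapped[:8])                 # b'--XyZ123'
```

The image bytes are still inside the multipart body, which is why deleting the surrounding lines by hand makes the file valid again.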

Add the image as a FileEntity, ByteArrayEntity, or InputStreamEntity with Content-Type "application/octet-stream".
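For reference, here is a minimal sketch of the same idea applied to the two-step flow from the question: an empty PUT to the namenode, then a PUT of the raw bytes to the URL from the Location header. It is written in Python with only the standard library, like the Python answer in this thread; with HttpClient the second request would carry a FileEntity or ByteArrayEntity instead. The function names `create_request_target` and `upload` are hypothetical, and the query parameters are the ones that appear in this thread (`op=CREATE`, `user.name`, `overwrite`). It has not been run against a real cluster.

```python
import http.client
from urllib.parse import quote, urlencode, urlsplit


def create_request_target(hdfs_path: str, user: str, overwrite: bool = False) -> str:
    """Build the path and query string for a WebHDFS CREATE request."""
    query = urlencode({
        "op": "CREATE",
        "user.name": user,
        "overwrite": "true" if overwrite else "false",
    })
    return "/webhdfs/v1" + quote(hdfs_path) + "?" + query


def upload(namenode_host: str, namenode_port: int, hdfs_path: str,
           data: bytes, user: str) -> int:
    """Two-step upload; returns the final HTTP status (201 is expected)."""
    # Step 1: PUT without a body to the namenode, which should answer
    # 307 with the datanode URL in the Location header.
    conn = http.client.HTTPConnection(namenode_host, namenode_port)
    try:
        conn.request("PUT", create_request_target(hdfs_path, user))
        resp = conn.getresponse()
        resp.read()
        location = resp.getheader("Location")
        if resp.status != 307 or not location:
            raise RuntimeError(f"expected a 307 redirect, got {resp.status}")
    finally:
        conn.close()

    # Step 2: PUT the raw bytes (no multipart wrapper) to the datanode URL.
    target = urlsplit(location)
    conn = http.client.HTTPConnection(target.hostname, target.port)
    try:
        conn.request(
            "PUT",
            target.path + "?" + target.query,
            body=data,
            headers={"Content-Type": "application/octet-stream"},
        )
        resp = conn.getresponse()
        resp.read()
        return resp.status
    finally:
        conn.close()
```

A call would look like `upload("<ip>", 50070, "/user/hadoop/advert.jpg", image_bytes, "hadoop")`, where `image_bytes` holds the file content read in binary mode.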

Here is the code that worked for me, based on the accepted answer:

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPut;
import org.apache.http.entity.FileEntity;
import org.apache.http.impl.client.HttpClientBuilder;
import java.io.File;
import java.io.IOException;
public class Test {
    public void upload() {
        try {
            // This URL targets a datanode directly (50075 is the default datanode HTTP port in Hadoop 2.x).
            final String writeURI = "http://<IP>:50075/webhdfs/v1/user/sample.xml?op=CREATE&user.name=istvan&namenoderpcaddress=quickstart.cloudera:8020&overwrite=true";
            HttpClient client = HttpClientBuilder.create().build();
            HttpPut put = new HttpPut(writeURI);
            // Send the file as the raw request body, not wrapped in a multipart entity.
            put.setEntity(buildFileEntity("C:\\sample.xml"));
            put.setHeader("Content-Type", "application/octet-stream");
            HttpResponse response = client.execute(put);
            System.out.println(response);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static FileEntity buildFileEntity(String fileName) {
        return new FileEntity(new File(fileName));
    }

    public static void main(String[] args) {
        new Test().upload();
    }
}

Maven:

        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.4</version>
        </dependency>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpmime</artifactId>
            <version>4.3.1</version>
        </dependency>
