Writing a CSV from a DSX Python 2.7 notebook to IBM Bluemix Object Storage



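I am trying to write a pandas DataFrame as a CSV to Bluemix Object Storage from a DSX Python notebook. I first save the DataFrame to a "local" CSV file. Then I have a routine that tries to write the file to Object Storage. I get a 413 response (object too large), even though the file is only about 3 MB. Here is my code, based on the JSON example I found here: http://datascience.ibm.com/blog/working-with-object-storage-in-data-science-experience-python-edition/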

import requests
def put_file(credentials, local_file_name):  
    """This function writes file content to Object Storage V3 """
    url1 = ''.join(['https://identity.open.softlayer.com', '/v3/auth/tokens'])
    data = {'auth': {'identity': {'methods': ['password'],
        'password': {'user': {'name': credentials['name'],'domain': {'id': credentials['domain']},
        'password': credentials['password']}}}}}
    headers = {'Content-Type': 'text/csv'}
    with open(local_file_name, 'rb') as f:
        resp1 = requests.post(url=url1, data=f, headers=headers)
    return resp1  

Any help or pointers would be much appreciated.

This code snippet from the tutorial worked fine for me (with a 12 MB file).

import json
import requests
import pandas as pd

def put_file(credentials, local_file_name):
    """Upload a local file to Bluemix Object Storage V3."""
    # Read the content of the file to upload.
    with open(local_file_name, 'rb') as f:
        my_data = f.read()
    # Step 1: request an auth token from the identity service.
    url1 = ''.join(['https://identity.open.softlayer.com', '/v3/auth/tokens'])
    data = {'auth': {'identity': {'methods': ['password'],
            'password': {'user': {'name': credentials['username'],'domain': {'id': credentials['domain_id']},
            'password': credentials['password']}}}}}
    # The auth request body is JSON, so declare it as such.
    headers1 = {'Content-Type': 'application/json'}
    resp1 = requests.post(url=url1, data=json.dumps(data), headers=headers1)
    resp1_body = resp1.json()
    # Step 2: find the public Dallas object-store endpoint in the catalog.
    for e1 in resp1_body['token']['catalog']:
        if e1['type'] == 'object-store':
            for e2 in e1['endpoints']:
                if e2['interface'] == 'public' and e2['region'] == 'dallas':
                    url2 = ''.join([e2['url'], '/', credentials['container'], '/', local_file_name])
    # Step 3: PUT the file content to the object URL, authenticated by the token.
    s_subject_token = resp1.headers['x-subject-token']
    headers2 = {'X-Auth-Token': s_subject_token, 'accept': 'application/json'}
    resp2 = requests.put(url=url2, headers=headers2, data=my_data)
    print(resp2)
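
The important difference from the code in the question is that the file is never sent to the identity URL. The token request carries only the small JSON credentials body, and the file content goes in a second request to the object-store endpoint taken from the token catalog. The 413 in the question most likely comes from posting the whole file to /v3/auth/tokens. The following is a minimal sketch of the two request-free pieces of that flow, so they can be checked without network access. The helper names and the sample catalog are hypothetical; only the dictionary layout is taken from the snippet above.

```python
import json


def build_auth_payload(credentials):
    """Return the JSON body for the password-auth token request."""
    return json.dumps({
        'auth': {'identity': {
            'methods': ['password'],
            'password': {'user': {
                'name': credentials['username'],
                'domain': {'id': credentials['domain_id']},
                'password': credentials['password']}}}}})


def find_object_store_url(catalog, region='dallas', interface='public'):
    """Return the matching object-store endpoint URL, or None if absent."""
    for service in catalog:
        if service.get('type') != 'object-store':
            continue
        for endpoint in service.get('endpoints', []):
            if (endpoint.get('interface') == interface
                    and endpoint.get('region') == region):
                return endpoint['url']
    return None


# Hypothetical fragment shaped like resp1.json()['token']['catalog'].
sample_catalog = [
    {'type': 'identity', 'endpoints': []},
    {'type': 'object-store', 'endpoints': [
        {'interface': 'internal', 'region': 'dallas',
         'url': 'https://internal.example/v1/AUTH_x'},
        {'interface': 'public', 'region': 'dallas',
         'url': 'https://public.example/v1/AUTH_x'},
    ]},
]
```

Returning None when no endpoint matches makes it possible to fail with a clear message instead of hitting an undefined upload URL later.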

I created a random pandas DataFrame with:

import numpy as np
df = pd.DataFrame(np.random.randint(0,100,size=(1000000, 4)), columns=list('ABCD'))

Saved it to CSV:

df.to_csv('myPandasData_1000000.csv',index=False)

Then put it into Object Storage:

put_file(credentials_1,'myPandasData_1000000.csv')

You can get the credentials_1 object by clicking Insert to code -> Insert credentials on any object in your Object Storage.
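
Note that put_file here reads the keys username, domain_id, password and container, whereas the code in the question reads name and domain. As a rough sketch, a small check like the one below can catch such a mismatch before any request is made. The helper name and all the values are hypothetical placeholders, and the real inserted credentials may contain additional fields.

```python
# Keys that put_file reads from the credentials dictionary.
REQUIRED_KEYS = ('username', 'domain_id', 'password', 'container')


def missing_credential_keys(credentials):
    """Return the required keys that are absent from credentials."""
    return [key for key in REQUIRED_KEYS if key not in credentials]


# Placeholder values only; use the dictionary inserted by the notebook.
credentials_example = {
    'username': 'my_user',
    'domain_id': 'my_domain_id',
    'password': 'my_password',
    'container': 'my_container',
}

# A dictionary using the key names from the question's code.
question_style = {'name': 'my_user', 'domain': 'my_domain_id',
                  'password': 'my_password'}

print(missing_credential_keys(credentials_example))
print(missing_credential_keys(question_style))
```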
