Extracting multiple zipped JSON files in memory and saving them to Azure Blob Storage with Python



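I have a function that talks to a SOAP API, receives binary files, and ultimately extracts a handful of JSON files that I want to save to an Azure Blob Storage container using Python.

Microsoft's official documentation and samples are useful for saving a single file, but when I try to do the same with multiple files I get this error:

TypeError: Blob data should be of type bytes.

See the code cell and the error output below.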

# Extract Pre Survey JSON responses from binaries and send to Azure Blob storage:
import os
import io, zipfile
from io import BytesIO
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, __version__
from functools import reduce

blob = BlobClient.from_connection_string(conn_str="connection string", container_name="container name", blob_name=name)
local_path = "./temp"

def temp_extract():
    for i in binaries:  # N.B. binaries comes from a previous cell
        with zipfile.ZipFile(io.BytesIO(i)) as zfile:
            for name in zfile.namelist():
                if name.endswith('.json'):
                    zfile.extract(name, local_path)

def upload_blobs():
    upload_file_path = os.path.join(local_path, name)
    onlyfiles = reduce(lambda x,y : x+y, [map(lambda x: root + "/" + x, files) for root, dirs, files in os.walk(local_path)])
    onlyfiles = [file for file in onlyfiles if file.endswith('.json')]
    for file in onlyfiles:
        print(os.path.getsize(file))
        with open(file, 'r') as f:
            blob.upload_blob(data = f, overwrite=True)

if __name__ == '__main__':
    temp_extract()
    upload_blobs()

I get the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-273-fb49f396fab5> in <module>
26 if __name__ == '__main__':
27     temp_extract()
---> 28     upload_blobs()
29 
<ipython-input-273-fb49f396fab5> in upload_blobs()
22       print(os.path.getsize(file))
23       with open(file, 'r') as f:
---> 24         blob.upload_blob(data = f, overwrite=True)
25 
26 if __name__ == '__main__':
~/Library/Python/3.7/lib/python/site-packages/azure/core/tracing/decorator.py in wrapper_use_tracer(*args, **kwargs)
81             span_impl_type = settings.tracing_implementation()
82             if span_impl_type is None:
---> 83                 return func(*args, **kwargs)
84 
85             # Merge span is parameter is set, but only if no explicit parent are passed
~/Library/Python/3.7/lib/python/site-packages/azure/storage/blob/_blob_client.py in upload_blob(self, data, blob_type, length, metadata, **kwargs)
683             **kwargs)
684         if blob_type == BlobType.BlockBlob:
--> 685             return upload_block_blob(**options)
686         if blob_type == BlobType.PageBlob:
687             return upload_page_blob(**options)
~/Library/Python/3.7/lib/python/site-packages/azure/storage/blob/_upload_helpers.py in upload_block_blob(client, data, stream, length, overwrite, headers, validate_content, max_concurrency, blob_settings, encryption_options, **kwargs)
86                 data = data.read(length)
87                 if not isinstance(data, six.binary_type):
---> 88                     raise TypeError('Blob data should be of type bytes.')
89             except AttributeError:
90                 pass
TypeError: Blob data should be of type bytes.

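You get this error because you pass the file object itself as data to the upload_blob method. The file is opened in text mode ('r'), so when the SDK calls data.read() on it (line 86 in the traceback) it gets a str rather than bytes, and the isinstance check on line 87 raises the TypeError.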
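The difference between the two open modes can be reproduced without Azure. This is a minimal sketch, assuming a throwaway JSON file with made-up content; it only shows what read() returns in each mode, not the SDK's own code.

```python
import os
import tempfile

# Write a small JSON file to a temporary location (illustrative content only).
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    tmp.write('{"answer": 42}')
    path = tmp.name

# Text mode: read() returns str, which is what fails the bytes check
# shown in the traceback.
with open(path, "r") as f:
    text_data = f.read()

# Binary mode: read() returns bytes.
with open(path, "rb") as f:
    binary_data = f.read()

os.remove(path)

print(type(text_data).__name__, type(binary_data).__name__)
```

Opening the file with 'rb' and passing f to upload_blob should therefore also avoid the error.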

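What you need to do is read the file's contents yourself and pass those contents to upload_blob. A str is accepted when passed directly; the SDK encodes it, as UTF-8 by default.

Something like: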

with open(file, 'r') as f:
    file_content = f.read()
    blob.upload_blob(data = file_content, overwrite=True)
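Since the archives are already in memory, the temporary directory may not be needed at all. The following is a minimal sketch, not the asker's method: it assumes `binaries` is the list of zip payloads from the question and that `container_client` is an azure.storage.blob ContainerClient. The helper names `iter_json_members` and `upload_json_members` are hypothetical. It also gives each JSON file its own blob name, because a single BlobClient is bound to one blob, so uploading every file through it with overwrite=True would leave only the last file.

```python
import io
import zipfile


def iter_json_members(binaries):
    """Yield (name, content_bytes) for every .json member of each in-memory zip."""
    for payload in binaries:
        with zipfile.ZipFile(io.BytesIO(payload)) as zfile:
            for name in zfile.namelist():
                if name.endswith(".json"):
                    # ZipFile.read returns the member's content as bytes.
                    yield name, zfile.read(name)


def upload_json_members(container_client, binaries):
    """Upload each JSON member as its own blob and return the blob names used."""
    uploaded = []
    for name, content in iter_json_members(binaries):
        # Assumption: container_client is an azure.storage.blob.ContainerClient,
        # e.g. ContainerClient.from_connection_string(conn_str, container_name).
        container_client.upload_blob(name=name, data=content, overwrite=True)
        uploaded.append(name)
    return uploaded
```

Because the member name is used as the blob name, two archives containing a file with the same name would overwrite each other; prefixing the name with an archive identifier is one way around that.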
