How do I convert an Azure Function into a regular Python script?



I'm new to Python and have run into a problem: I have to convert Azure Function code into a normal Python script. I haven't used Azure before, so I'm a bit lost. Here is the code:

It analyzes a document and returns key-value pairs, but I don't know how to convert this code into a regular Python script and run it locally.

import logging
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
from azure.storage.blob import BlockBlobService, PublicAccess
import json
import re
import uuid
logger = logging.getLogger(__name__)
import azure.functions as func
def upload_blob(account_name, container_name, account_key, blob_name):
    # Create the BlockBlobService that is used to call the Blob service for the storage account
    blob_service_client = BlockBlobService(
        account_name=account_name,
        account_key=account_key)
    # Set the permission so the blobs are public.
    blob_service_client.set_container_acl(container_name, public_access=PublicAccess.Container)
    #blob_name = doc_path.split('/')[-1][:-4] + str(uuid.uuid4()) + ".pdf"
    # Upload the created file, use blob_name for the blob name
    #blob_service_client.create_blob_from_path(container_name, blob_name, doc_path)
    blob_url = blob_service_client.make_blob_url(container_name, blob_name)
    return blob_url

def delete_blob(account_name, container_name, account_key, blob_name):
    blob_service_client = BlockBlobService(
        account_name=account_name,
        account_key=account_key)
    # Delete blob from container
    blob_service_client.delete_blob(container_name, blob_name)

def search_value(kvs, search_key):
    # Return the value of the first key that matches search_key (case-insensitive regex)
    for key, value in kvs.items():
        if re.search(search_key, key, re.IGNORECASE):
            return value
def analyze_general_documents(endpoint, api_key, doc_url):
    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(api_key)
    )
    poller = document_analysis_client.begin_analyze_document_from_url("prebuilt-document", doc_url)
    result = poller.result()
    #print("----Key-value pairs found in document----")
    kvs = {}
    # Strip newlines and carriage returns from the extracted text
    content = result.content.replace("\n", "").replace("\r", "").strip()
    for kv_pair in result.key_value_pairs:
        if kv_pair.key:
            key = kv_pair.key.content
            if kv_pair.value:
                val = kv_pair.value.content
                kvs[key] = val
    return content, kvs
def main(req: func.HttpRequest) -> func.HttpResponse:
    try:
        # Query parameters
        endpoint = ""
        api_key = ""
        account_name = ""
        container_name = ""
        account_key = ""

        if "blob_name" in req.get_json() and "search_keys" in req.get_json():
            blob_name = req.get_json()["blob_name"]
            search_keys = req.get_json()["search_keys"]
            logger.info(" search_keys = "+str(search_keys))
            # Upload file to Azure Storage container.
            logger.info("Creating blob url")
            blob_url = upload_blob(account_name, container_name, account_key, blob_name)
            #logger.info("Blob url = "+str(blob_url))
            # Analyze the document
            content, kvs = analyze_general_documents(endpoint, api_key, blob_url)
            #logger.info("content = "+content)
            #logger.info("kvs = "+str(kvs))
            # Search for specified keys
            search_results = {}
            for search_key in search_keys:
                val = search_value(kvs, search_key)
                if val:
                    search_results[search_key] = val
            #logger.info("search_results = "+str(search_results))
            # Delete the uploaded file
            delete_blob(account_name, container_name, account_key, blob_name)
            # Return search results
            return func.HttpResponse(json.dumps(search_results))
        else:
            return func.HttpResponse("Please pass in end_point, api_key, and blob_name", status_code=400)
    except Exception as e:
        return func.HttpResponse("Error: " + str(e), status_code=500)

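First of all, this may not be a complete solution to your problem, but it may help you work out the next steps. The script should really be rewritten, because it is based on an old library that may no longer be maintained. Still, here are some ideas; this is by no means a finished solution and should not be used on production data.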

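The libraries you import can stay as they are. Note that when you install them via pip install {library_name}, you will need the old azure-storage library rather than a current azure-storage-blob, because current versions of the latter no longer provide BlockBlobService.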

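Also, if you want to run the script from the command line, you will probably want to pass in the parameters that the function originally received through the HTTP request. You can use argparse for that. In addition, you probably don't want the credentials in the script file, but would rather export them as environment variables, in which case you also need the os library.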

That said, your imports would look like this:

import logging
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
from azure.storage.blob import BlockBlobService, PublicAccess
import json
import re
import uuid
# Importing argparse for being able to pass parameters
import argparse
# Importing os to read environment variables
import os

You no longer need import azure.functions as func.

Because it runs locally, you can pass the parameters blobname and searchkeys when executing the script. That requires something like this:

parser = argparse.ArgumentParser()
parser.add_argument("-n", "--blobname", type=str)
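# nargs="+" yields a list of keys; a plain string would later be iterated character by character
parser.add_argument("-s", "--searchkeys", type=str, nargs="+")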
args = parser.parse_args()
blob_name = args.blobname
search_keys = args.searchkeys
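
The same parsing can be wrapped in a small function so that it can be tried without touching Azure. This is only a minimal sketch under some assumptions that are not in the original answer: the function name parse_cli is hypothetical, both arguments are made mandatory, and several search keys are accepted as a list.

```python
import argparse


def parse_cli(argv=None):
    """Parse the two values the Azure Function used to read from the HTTP body.

    argv=None makes argparse fall back to sys.argv[1:].
    """
    parser = argparse.ArgumentParser(
        description="Analyze a blob and print matching key-value pairs."
    )
    parser.add_argument("-n", "--blobname", type=str, required=True,
                        help="name of the blob to analyze")
    # nargs="+" collects one or more keys into a list of strings
    parser.add_argument("-s", "--searchkeys", type=str, nargs="+", required=True,
                        help="one or more keys (regular expressions) to look for")
    return parser.parse_args(argv)


# Passing an explicit list is handy for trying the parser interactively.
args = parse_cli(["--blobname", "testfile.pdf", "--searchkeys", "invoice", "total"])
blob_name = args.blobname
search_keys = args.searchkeys
```

With required=True, argparse exits with a usage message when an argument is missing, which roughly takes over the role of the 400 response in the original function.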

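This lets you keep the variable names as they are now. As mentioned at the start, the functions can stay as they are, but the credentials should not be in the script. After importing os, you can add the following: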

# Query parameters
endpoint = os.getenv('form_recognizer_endpoint')
api_key = os.getenv('form_recognizer_api_key')
account_name = os.getenv('storage_account_name')
container_name = os.getenv('storage_container_name')
account_key = os.getenv('storage_account_key')

…and then add them as environment variables using the shell's export command, i.e.:

export form_recognizer_endpoint="your_endpoint"
export form_recognizer_api_key="your_api_key"
export storage_account_name="your_account_name"
export storage_container_name="your_container"
export storage_account_key="your_key"
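
os.getenv returns None for a variable that was never exported, and that None would only surface later as a confusing error inside the Azure client. A minimal sketch of failing early instead, assuming the variable names from the export lines above; load_settings and REQUIRED_VARS are hypothetical names introduced for illustration.

```python
import os

# Variable names as used in the export commands above.
REQUIRED_VARS = (
    "form_recognizer_endpoint",
    "form_recognizer_api_key",
    "storage_account_name",
    "storage_container_name",
    "storage_account_key",
)


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

In the script you would call settings = load_settings() once at startup and then read, for example, settings["storage_account_key"].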

Finally, you can remove the surrounding def main, the try-except block, and the if statement, so that your main block looks along these lines:

logger.info(" search_keys = "+str(search_keys))
# Upload file to Azure Storage container.
logger.info("Creating blob url")
blob_url = upload_blob(account_name, container_name, account_key, blob_name)
#logger.info("Blob url = "+str(blob_url))
# Analyze the document
content, kvs = analyze_general_documents(endpoint, api_key, blob_url)
#logger.info("content = "+content)
#logger.info("kvs = "+str(kvs))
# Search for specified keys
search_results = {}
for search_key in search_keys:
    val = search_value(kvs, search_key)
    if val:
        search_results[search_key] = val
#logger.info("search_results = "+str(search_results))
# Delete the uploaded file
delete_blob(account_name, container_name, account_key, blob_name)
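
One thing to be aware of: in a standalone script nothing configures logging for you, and Python's default level is WARNING, so the logger.info calls above produce no output. A minimal sketch of switching them on, assuming Python 3.8 or newer (for the force argument); configure_logging is a hypothetical helper name.

```python
import logging


def configure_logging(level=logging.INFO):
    """Send log records of the given level and above to stderr."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # Python 3.8+: replace any handlers that were set up earlier
    )


configure_logging()
logger = logging.getLogger(__name__)
logger.info("Creating blob url")  # now visible on stderr
```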

Lastly, you can change the return line to print the result, i.e.:

# Return search results
print(json.dumps(search_results))

It can be executed like this: python script.py --blobname testfile.pdf --searchkeys "text"
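
Before spending Form Recognizer calls, the search step can be tried offline. The following is a minimal sketch that reuses search_value from the question and the filtering loop from the main block; the helper name collect_results and the sample dictionary are made up for illustration and do not come from a real document.

```python
import json
import re


def search_value(kvs, search_key):
    # Same logic as in the question: first key matching the pattern wins.
    for key, value in kvs.items():
        if re.search(search_key, key, re.IGNORECASE):
            return value


def collect_results(kvs, search_keys):
    """Keep only the search keys for which a non-empty value was found."""
    search_results = {}
    for search_key in search_keys:
        val = search_value(kvs, search_key)
        if val:
            search_results[search_key] = val
    return search_results


# Hand-written stand-in for the kvs dict returned by analyze_general_documents.
sample_kvs = {"Invoice Number": "INV-100", "Total Amount": "42.00"}
results = collect_results(sample_kvs, ["invoice", "total", "address"])
print(json.dumps(results))
```

Note that each search key is treated as a regular expression, so characters such as "." or "(" in a key have their regex meaning unless you escape them with re.escape.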
