How to call a Python function based on values fetched from a MongoDB collection



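How do I create documents and a collection in MongoDB to configure Python code? I want to fetch the attribute name, the data type, and the name of the function to call from MongoDB.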

Sample MongoDB collection:

db.attributes.insertMany([
  { attributes_names: "email", attributes_datype: "string", attributes_isNull: "false", attributes_std_function: "email_valid" },
  { attributes_names: "address", attributes_datype: "string", attributes_isNull: "false", attributes_std_function: "address_valid" }
]);

Python script and function:

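from pyspark.sql.functions import expr, lower, regexp_replace

def email_valid(df):
    col_name = df.columns[0]
    # Lowercase the first column and strip characters that are not valid in an email address
    df1 = df.withColumn(col_name, regexp_replace(lower(col_name), "[^a-zA-Z0-9@._-]| ", ""))
    # Extract every email-like substring; assumes the column is named "emails".
    # Backslashes are doubled because Spark SQL string literals treat "\" as an escape.
    extract_expr = expr(
        r"regexp_extract_all(emails, '(\\w+([\\.-]?\\w+)*@[A-Za-z.-]+([\\.-]?\\w+)*(\\.\\w{2,3})+)', 0)")
    df2 = df1.withColumn(col_name, extract_expr).select(col_name)
    return df2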

How can I fetch all of the MongoDB values in a Python script and call the matching function for each attribute?

Create the MongoDB collection from a Python script:

import pymongo
from bson import ObjectId
from random_object_id import generate

# connect to your MongoDB client
client = pymongo.MongoClient(connection_url)
# connect to the database
db = client[database_name]
# get the collection
mycol = db[collection_name]
# create a sample dictionary for the collection data
mydict = {
    "_id": ObjectId(generate()),
    "attributes_names": "email",
    "attributes_datype": "string",
    "attributes_isNull": "false",
    "attributes_std_function": "email_valid",
}
# insert the dictionary into the collection
mycol.insert_one(mydict)

To insert multiple documents into MongoDB, use insert_many() instead of insert_one() and pass it a list of dictionaries. The list of dictionaries would look like this:

mydict = [
    {
        "_id": ObjectId(generate()),
        "attributes_names": "email",
        "attributes_datype": "string",
        "attributes_isNull": "false",
        "attributes_std_function": "email_valid",
    },
    {
        "_id": ObjectId(generate()),
        "attributes_names": "email",
        "attributes_datype": "string",
        "attributes_isNull": "false",
        "attributes_std_function": "email_valid",
    },
]
# insert every dictionary in the list with a single call
mycol.insert_many(mydict)

Load all of the data from the MongoDB collection into the Python script:

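import pandas as pd

# collect every document in the collection into a list
data = list()
for x in mycol.find():
    data.append(x)
# flatten the documents into a pandas DataFrame
data = pd.json_normalize(data)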

Because pd.json_normalize() returns a pandas DataFrame, read a value by row label and column name:

value = data.loc[0, "attributes_names"]
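
One way to then call the function named in attributes_std_function is to look the name up in a dictionary of allowed functions. The following is only a minimal sketch under some assumptions: the validators are simplified, hypothetical stand-ins that work on plain strings rather than Spark DataFrames, the VALIDATORS registry and run_validators helper are names invented for illustration, and the records list stands in for the documents returned by mycol.find().

```python
import re

# Hypothetical, simplified validators that work on plain strings
def email_valid(value):
    """Return True if value roughly looks like an email address."""
    return re.fullmatch(r"[\w.-]+@[\w-]+(\.[\w-]+)+", value) is not None

def address_valid(value):
    """Return True if value is non-empty after trimming whitespace."""
    return bool(value.strip())

# Registry (assumed name): only functions listed here can be called by name
VALIDATORS = {
    "email_valid": email_valid,
    "address_valid": address_valid,
}

# Stand-in for list(mycol.find()); uses the same field names as the collection
records = [
    {"attributes_names": "email", "attributes_std_function": "email_valid"},
    {"attributes_names": "address", "attributes_std_function": "address_valid"},
]

def run_validators(records, row):
    """Apply the configured function to each attribute present in row."""
    results = {}
    for rec in records:
        name = rec["attributes_names"]
        func_name = rec["attributes_std_function"]
        func = VALIDATORS.get(func_name)
        if func is None:
            raise KeyError(f"No function registered as {func_name!r}")
        results[name] = func(row[name])
    return results

result = run_validators(records, {"email": "user@example.com", "address": "1 Main St"})
```

An explicit registry is used here instead of eval() or globals() so that a value stored in the database can only trigger functions that were deliberately exposed.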
