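Here I am using the MongoDB driver for Ruby. Once this works perfectly, I want to run it as a scheduled task in Ruby on Rails 3 with the Mongoid ODM.
So for now, I am just experimenting in plain Ruby.
I noticed that the crack gem is very handy when it comes to converting an XML file into a format that can be inserted into MongoDB. crack parses the XML into a Ruby Hash, which prints in a form close to JSON (it uses "=>" instead of ":"), and that is what the MongoDB driver for Ruby needs before I insert the data into the MongoDB database, as shown below.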
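To make the "=>" versus ":" point concrete, here is a minimal sketch using only Ruby's standard library. The `parsed` hash is a made-up fragment shaped like crack's output, not the real parse result; it shows that the value is an ordinary Hash and that `to_json` produces actual JSON text when that is needed.

```ruby
require "json"

# Made-up fragment shaped like what Crack::XML.parse returns (an assumption for illustration).
parsed = { "Header" => { "MemberId" => "A00000001", "Payments" => [{ "PayType" => "Points" }] } }

# The parsed result is an ordinary Ruby Hash, not a JSON string;
# "=>" is simply how Ruby displays a Hash.
puts parsed.class

# to_json serializes the Hash to real JSON text, with ":" between keys and values.
json_text = parsed.to_json
puts json_text
```

The insert call in the script below receives the Hash itself, so no JSON conversion is needed for the import.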
The problem is that, the way I am using crack below, it imports everything that is in the XML file.
sample.xml
<?xml version="1.0" encoding="utf-8"?>
<ShipmentRequest>
  <Envelope>
    <TransmissionDateTime>05/08/2013 23:06:02</TransmissionDateTime>
  </Envelope>
  <Message>
    <Comment />
    <Header>
      <MemberId>A00000001</MemberId>
      <MemberName>Bruce</MemberName>
      <DeliveryId>6377935</DeliveryId>
      <ShipToAddress1>123-4567</ShipToAddress1>
      <OrderDate>05/08/13</OrderDate>
      <Payments>
        <PayType>Credit Card</PayType>
        <Amount>1000</Amount>
      </Payments>
      <Payments>
        <PayType>Points</PayType>
        <Amount>5390</Amount>
      </Payments>
    </Header>
    <Line>
      <LineNumber>3.1</LineNumber>
      <Item>fruit-004</Item>
      <Description>Peach</Description>
      <Quantity>1</Quantity>
      <UnitCost>1610</UnitCost>
      <DeclaredValue>0</DeclaredValue>
      <PointValue>13</PointValue>
    </Line>
    <Line>
      <LineNumber>8.1</LineNumber>
      <Item>fruit-001</Item>
      <Description>Fruit Set</Description>
      <Quantity>1</Quantity>
      <UnitCost>23550</UnitCost>
      <PointValue>105</PointValue>
      <PickLine>
        <PickLineNumber>8.1..1</PickLineNumber>
        <PickItem>fruit-002</PickItem>
        <PickDescription>Apple</PickDescription>
        <PickQuantity>1</PickQuantity>
      </PickLine>
      <PickLine>
        <PickLineNumber>8.1..2</PickLineNumber>
        <PickItem>fruit-003</PickItem>
        <PickDescription>Orange</PickDescription>
        <PickQuantity>2</PickQuantity>
      </PickLine>
    </Line>
  </Message>
</ShipmentRequest>
sample_crack.rb
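#!/usr/bin/ruby
require "crack"
require "mongo"
include Mongo
# Connect to the local MongoDB server and pick the target collection.
mongo_client = MongoClient.new("localhost", 27017)
db = mongo_client.db("somedb")
coll = db.collection("somecoll")
# Parse the whole XML file into a nested Ruby Hash and insert it as one document.
myXML = Crack::XML.parse(File.read("sample.xml"))
coll.insert(myXML)
puts myXML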
This prints the following on the console:
{"ShipmentRequest"=>{"Envelope"=>{"TransmissionDateTime"=>"05/08/2013 23:06:02"}, "Message"=>{"Comment"=>nil, "Header"=>{"MemberId"=>"A00000001", "MemberName"=>"Bruce", "DeliveryId"=>"6377935", "ShipToAddress1"=>"123-4567", "OrderDate"=>"05/08/13", "Payments"=>[{"PayType"=>"Credit Card", "Amount"=>"1000"}, {"PayType"=>"Points", "Amount"=>"5390"}]}, "Line"=>[{"LineNumber"=>"3.1", "Item"=>"fruit-004", "Description"=>"Peach", "Quantity"=>"1", "UnitCost"=>"1610", "DeclaredValue"=>"0", "PointValue"=>"13"}, {"LineNumber"=>"8.1", "Item"=>"fruit-001", "Description"=>"Fruit Set", "Quantity"=>"1", "UnitCost"=>"23550", "PointValue"=>"105", "PickLine"=>[{"PickLineNumber"=>"8.1..1", "PickItem"=>"fruit-002", "PickDescription"=>"Apple", "PickQuantity"=>"1"}, {"PickLineNumber"=>"8.1..2", "PickItem"=>"fruit-003", "PickDescription"=>"Orange", "PickQuantity"=>"2"}]}]}}, :_id=>BSON::ObjectId('51ad8d83a3d24b3b9f000001')}
In MongoDB, the converted XML file looks like this:
{
  "_id" : ObjectId("51ad8d83a3d24b3b9f000001"),
  "ShipmentRequest" : {
    "Envelope" : {
      "TransmissionDateTime" : "05/08/2013 23:06:02"
    },
    "Message" : {
      "Comment" : null,
      "Header" : {
        "MemberId" : "A00000001",
        "MemberName" : "Bruce",
        "DeliveryId" : "6377935",
        "ShipToAddress1" : "123-4567",
        "OrderDate" : "05/08/13",
        "Payments" : [
          {
            "PayType" : "Credit Card",
            "Amount" : "1000"
          },
          {
            "PayType" : "Points",
            "Amount" : "5390"
          }
        ]
      },
      "Line" : [
        {
          "LineNumber" : "3.1",
          "Item" : "fruit-004",
          "Description" : "Peach",
          "Quantity" : "1",
          "UnitCost" : "1610",
          "DeclaredValue" : "0",
          "PointValue" : "13"
        },
        {
          "LineNumber" : "8.1",
          "Item" : "fruit-001",
          "Description" : "Fruit Set",
          "Quantity" : "1",
          "UnitCost" : "23550",
          "PointValue" : "105",
          "PickLine" : [
            {
              "PickLineNumber" : "8.1..1",
              "PickItem" : "fruit-002",
              "PickDescription" : "Apple",
              "PickQuantity" : "1"
            },
            {
              "PickLineNumber" : "8.1..2",
              "PickItem" : "fruit-003",
              "PickDescription" : "Orange",
              "PickQuantity" : "2"
            }
          ]
        }
      ]
    }
  }
}
But I would like to import it like this, eliminating the nodes I do not need and ignoring empty nodes:
{
  "_id" : ObjectId("51ad8d83a3d24b3b9f000001"),
  "MemberId" : "A00000001",
  "MemberName" : "Bruce",
  "DeliveryId" : "6377935",
  "ShipToAddress1" : "123-4567",
  "OrderDate" : "05/08/13",
  "Payments" : [
    {
      "PayType" : "Credit Card",
      "Amount" : "1000"
    },
    {
      "PayType" : "Points",
      "Amount" : "5390"
    }
  ],
  "Line" : [
    {
      "LineNumber" : "3.1",
      "Item" : "fruit-004",
      "Description" : "Peach",
      "Quantity" : "1",
      "UnitCost" : "1610",
      "DeclaredValue" : "0",
      "PointValue" : "13"
    },
    {
      "LineNumber" : "8.1",
      "Item" : "fruit-001",
      "Description" : "Fruit Set",
      "Quantity" : "1",
      "UnitCost" : "23550",
      "PointValue" : "105",
      "PickLine" : [
        {
          "PickLineNumber" : "8.1..1",
          "PickItem" : "fruit-002",
          "PickDescription" : "Apple",
          "PickQuantity" : "1"
        },
        {
          "PickLineNumber" : "8.1..2",
          "PickItem" : "fruit-003",
          "PickDescription" : "Orange",
          "PickQuantity" : "2"
        }
      ]
    }
  ]
}
Can this be done with crack? Or would it be better done with Nokogiri?
Many thanks to @Alex Peachey. Here is my updated code.
sample_crack.rb (updated):
#!/usr/bin/ruby
require "crack"
require 'mongo'
include Mongo
mongo_client = MongoClient.new("localhost", 27017)
db = mongo_client.db("somedb")
coll = db.collection("somecoll")
myXML = Crack::XML.parse(File.read("sample.xml"))
# Hoist the children of each wrapper node up to the top level.
myXML.merge!(myXML.delete("ShipmentRequest"))
myXML.merge!(myXML.delete("Message"))
myXML.merge!(myXML.delete("Header"))
myXML.delete("Envelope") # drop the unwanted Envelope node
# TODO: remove keys whose values are empty (e.g. "Comment" => nil)
coll.insert(myXML)
puts myXML
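It is hard to say how you define "unneeded" nodes, but empty nodes are easy enough to understand. Either way, Crack is very good at what it does for you, which is basically converting XML into a Hash. Once you have the hash, prune it according to your own rules before you insert it into Mongo.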
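As one possible way to do that pruning, here is a minimal sketch in plain Ruby. The helpers `prune_empty` and `blank_node?` are hypothetical names, not part of crack or the Mongo driver, and the definition of "empty" (nil, empty string, empty Hash or Array) is an assumption you may want to adjust. The sample hash is made up to mimic crack's output.

```ruby
# Hypothetical helper: recursively drop nil values, empty strings, and empty containers.
def prune_empty(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, child), kept|
      pruned = prune_empty(child)
      kept[key] = pruned unless blank_node?(pruned)
    end
  when Array
    value.map { |child| prune_empty(child) }.reject { |child| blank_node?(child) }
  else
    value
  end
end

# Assumed rule: nil, "" and an empty Hash/Array count as empty; "0" is kept.
def blank_node?(value)
  value.nil? || (value.respond_to?(:empty?) && value.empty?)
end

# Made-up data shaped like a fragment of the parsed XML.
doc = {
  "Comment" => nil,
  "Header"  => { "MemberId" => "A00000001", "Note" => "" },
  "Line"    => [{ "Item" => "fruit-004", "DeclaredValue" => "0", "Extra" => {} }]
}

pruned_doc = prune_empty(doc)
p pruned_doc
```

The sketch builds a new hash rather than mutating the input, so the original parse result stays available for debugging. Note that a value such as "0" survives, which matches the DeclaredValue field in the desired document.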
Based on your comment, I understand your problem better. My answer still holds: just manipulate the hash. Specifically, you could do this:
myXML.merge!(myXML.delete("ShipmentRequest"))
myXML.delete("Envelope")
myXML.merge!(myXML.delete("Message"))
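The same merge-and-delete idea can be tried without crack or MongoDB installed. The sketch below uses a hand-written hash as a stand-in for the parsed XML (trimmed to a few fields), and a hypothetical helper `hoist!` that guards against a missing wrapper, since `merge!(nil)` raises a TypeError when the key is absent.

```ruby
# Stand-in for the Hash that Crack::XML.parse returns, trimmed to a few fields.
doc = {
  "ShipmentRequest" => {
    "Envelope" => { "TransmissionDateTime" => "05/08/2013 23:06:02" },
    "Message"  => {
      "Comment" => nil,
      "Header"  => { "MemberId" => "A00000001", "MemberName" => "Bruce" },
      "Line"    => [{ "LineNumber" => "3.1", "Item" => "fruit-004" }]
    }
  }
}

# Hypothetical helper: replace a wrapper key with its children, if the wrapper exists.
def hoist!(hash, key)
  children = hash.delete(key)
  hash.merge!(children) if children.is_a?(Hash)
  hash
end

# Flatten the wrappers from the outside in, then drop what is not wanted.
%w[ShipmentRequest Message Header].each { |wrapper| hoist!(doc, wrapper) }
doc.delete("Envelope")            # unwanted node
doc.reject! { |_, v| v.nil? }     # top-level empty nodes such as Comment

p doc
```

The order of the wrapper names matters: each wrapper only becomes a top-level key after its parent has been hoisted. Hoisting also overwrites any top-level key that shares a name with a child, which is harmless for this sample but worth checking on other files.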