我有一个递归树数据结构,其中每个节点可以有零个或多个子节点,就像代码片段中一样。
public class TreeNode
{
public string Value { get; set; }
public List<TreeNode> Children { get; set; }
}
有一个根节点,所有其他节点都是它的子节点(或其子节点的子节点…(这是一个大集合(数百万个项目(,所以我不想在内存中加载。我正在考虑一种将其保存/持久保存到文件中的方法。对于每一项,我都要保存每个子项的值(字符串(和文件中的位置。
|value of node1|number of children|child1 location (int), child2, ...,childN|
|value of node2|number of childs|child locations of node 2|
不确定我是否解释得很好,但我希望Children的指针/引用是文件中的实际位置。我的问题是预先计算每个节点项开始的位置。我想这将是一种递归方法,为每个节点保存其值、子节点的数量和每个子节点的位置。问题是,对于每个节点,值的长度都是可变的,并且项的数量不是固定的。
public class TreeNode
{
public string Value { get; set; }
public void Test()
{
BinaryWriter binaryWriter = new BinaryWriter(File.OpenWrite("my file.dat"));
//node 1 has 2 childs
//node 1
binaryWriter.Write("node1 value");
binaryWriter.Write(2); //2 children
binaryWriter.Write(10); // location of child 1
binaryWriter.Write(20); // location of child 2
//child 1
binaryWriter.BaseStream.Position = 10;
binaryWriter.Write("child1 value");
binaryWriter.Write(5); //5 children
// write locations of childen of child 1
//child2
binaryWriter.BaseStream.Position = 20;
binaryWriter.Write("child2 value");
binaryWriter.Write(3); //3 children
// write locations of childen of child 2
//...
//continue for every
}
}
文件的使用将类似
- 转到根位置0
- 读取值和任何子位置,即456和789
- 到达456号位置并阅读
- 转到789
这还没有经过测试,但类似的东西应该可以工作:
public long WriteNode(BinaryWriter writer, TreeNode node)
{
// record the position where we will write this node
long initialPosition = writer.BaseStream.Position;
// write node content
writer.Write(node.Value);
// write number of children
writer.Write(node.Children.Count);
// write child position vector initialized with zeros
long vectorPosition = writer.BaseStream.Position;
for (int i = 0; i < node.Children.Count; ++i)
{
writer.Write(0L);
}
// write children, updating position vector as we go
for (int i = 0; i < node.Children.Count; ++i)
{
// write the child and get the position where it was written
var childNode = node.Children[i];
long childPosition = WriteNode(writer, childNode);
// compute the file position where we need to store the child's position
long childVectorPosition = vectorPosition + (i * sizeof(long));
// write the child's file position into the vector while retaining the current file position
long currentPosition = writer.BaseStream.Position;
writer.BaseStream.Seek(childVectorPosition, SeekOrigin.Begin);
writer.Write(childPosition);
writer.BaseStream.Seek(currentPosition, SeekOrigin.Begin);
}
// return file position where this node was written
return initialPosition;
}
递归地编写节点,将子位置初始化为零,并在必要时回溯以使用实际值更新子位置。