How to recursively descend a System.Text.Json JsonNode hierarchy (equivalent to Json.NET's JToken.DescendantsAndSelf())?



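I have an arbitrary JSON document (i.e. one without a fixed schema known in advance), and I would like to descend it recursively to find all nodes, at any level of the document, that match some predicate, so that I can make some necessary modifications. How can I perform such a recursive search using the JsonNode document object model?

Details as follows.

Say I have some JSON such as the following, which may contain one or more instances of a property "password":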

[
  {
    "column1": "val_column1",
    "column2": "val_column2",
    "sheet2": [
      {
        "sheet2col1": "val_sheet2column1",
        "sheet3": [
          {
            "sheet3col1": "val_sheet3column1",
            "password": "password to remove"
          }
        ]
      },
      {
        "sheet2col1": "val_sheet2column1",
        "sheet3": [
          {
            "sheet3col1": "val_sheet3column1"
          }
        ]
      }
    ]
  },
  {
    "column1": "val2_column1",
    "column2": "val2_column2",
    "password": "password to remove",
    "sheet2": [
      {
        "sheet2col1": "val_sheet2column1",
        "sheet3": [
          {
            "sheet3col2": "val_sheet3column2"
          },
          null,
          null,
          19191
        ],
        "password": "password to remove"
      },
      {
        "sheet2col1": "val_sheet2column1",
        "sheet3": [
          {
            "sheet3col2": "val_sheet3column2"
          }
        ]
      }
    ]
  }
]

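I need to parse it into a JsonNode hierarchy and remove all the "password" properties, wherever they appear in the JSON hierarchy. With Json.NET, I can parse to a JToken and use DescendantsAndSelf():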

var root = JToken.Parse(json);
var propertyToRemove = "password";
if (root is JContainer c)
    foreach (var obj in c.DescendantsAndSelf().OfType<JObject>().Where(o => o.ContainsKey(propertyToRemove)))
        obj.Remove(propertyToRemove);
var newJson = root.ToString();

JsonNode has no equivalent method. How can I do this with System.Text.Json?

Since JsonNode has no equivalent of DescendantsAndSelf(), we will have to create one ourselves:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json.Nodes;

public static partial class JsonExtensions
{
    /// Recursively enumerates all descendant JsonNodes of the given JsonNode (excluding the node itself) in document order.
    public static IEnumerable<JsonNode?> Descendants(this JsonNode? root) => root.DescendantsAndSelf(false);

    /// Recursively enumerates all JsonNodes in the given JsonNode object in document order.
    public static IEnumerable<JsonNode?> DescendantsAndSelf(this JsonNode? root, bool includeSelf = true) =>
        root.DescendantItemsAndSelf(includeSelf).Select(i => i.node);

    /// Recursively enumerates all JsonNodes (including their index or name and parent) in the given JsonNode object in document order.
    public static IEnumerable<(JsonNode? node, int? index, string? name, JsonNode? parent)> DescendantItemsAndSelf(this JsonNode? root, bool includeSelf = true) =>
        RecursiveEnumerableExtensions.Traverse(
            (node: root, index: (int?)null, name: (string?)null, parent: (JsonNode?)null),
            (i) => i.node switch
            {
                // Object members carry their property name; array elements carry their index.
                JsonObject o => o.AsDictionary().Select(p => (p.Value, (int?)null, p.Key.AsNullableReference(), i.node.AsNullableReference())),
                JsonArray a => a.Select((item, index) => (item, index.AsNullableValue(), (string?)null, i.node.AsNullableReference())),
                // JsonValue and null have no children.
                _ => i.ToEmptyEnumerable(),
            }, includeSelf);

    // Small helpers that let the compiler infer matching tuple element types in the switch arms above.
    static IEnumerable<T> ToEmptyEnumerable<T>(this T item) => Enumerable.Empty<T>();
    static T? AsNullableReference<T>(this T item) where T : class => item;
    static Nullable<T> AsNullableValue<T>(this T item) where T : struct => item;
    static IDictionary<string, JsonNode?> AsDictionary(this JsonObject o) => o;
}

public static partial class RecursiveEnumerableExtensions
{
    // Rewritten from the answer by Eric Lippert https://stackoverflow.com/users/88656/eric-lippert
    // to "Efficient graph traversal with LINQ - eliminating recursion" http://stackoverflow.com/questions/10253161/efficient-graph-traversal-with-linq-eliminating-recursion
    // to ensure items are returned in the order they are encountered.
    public static IEnumerable<T> Traverse<T>(
        T root,
        Func<T, IEnumerable<T>> children, bool includeSelf = true)
    {
        if (includeSelf)
            yield return root;
        // An explicit stack of enumerators replaces recursion, so deep documents do not nest iterators.
        var stack = new Stack<IEnumerator<T>>();
        try
        {
            stack.Push(children(root).GetEnumerator());
            while (stack.Count != 0)
            {
                var enumerator = stack.Peek();
                if (!enumerator.MoveNext())
                {
                    stack.Pop();
                    enumerator.Dispose();
                }
                else
                {
                    yield return enumerator.Current;
                    stack.Push(children(enumerator.Current).GetEnumerator());
                }
            }
        }
        finally
        {
            // Dispose any enumerators left over if the caller stops enumerating early.
            foreach (var enumerator in stack)
                enumerator.Dispose();
        }
    }
}

Now we will be able to do:

var root = JsonNode.Parse(json);
var propertyToRemove = "password";
foreach (var obj in root.DescendantsAndSelf().OfType<JsonObject>().Where(o => o.ContainsKey(propertyToRemove)))
    obj.Remove(propertyToRemove);
var options = new JsonSerializerOptions { WriteIndented = true /* Use whatever you want here */ };
var newJson = JsonSerializer.Serialize(root, options);

Demo fiddle #1 here.

Keep in mind the following differences from Json.NET's LINQ to JSON:

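  1. The JsonNode returned for a null JSON value (e.g. {"value":null}) will actually be null. LINQ to JSON represents a null JSON value as a non-null JValue whose JValue.Type equals JTokenType.Null.

  2. JsonNode has no equivalent of Json.NET's JProperty. The parent of a value in an object is the object itself. Thus there is no direct way to determine the property name of a selected JsonNode property value through the JsonNode document object model.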
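
Both differences listed above can be checked with a few lines. This is only a minimal sketch using top-level statements; the sample JSON and the variable names are made up for illustration.

```csharp
using System.Text.Json.Nodes;

// Illustrative document: one null-valued property and one nested object.
var sample = JsonNode.Parse("{\"value\":null,\"child\":{\"x\":1}}")!.AsObject();

// Difference 1: a JSON null is surfaced as a C# null reference, not as a node object...
bool valueIsNull = sample["value"] is null;                             // true
// ...even though the property itself is present in the object.
bool hasValueKey = sample.ContainsKey("value");                         // true

// Difference 2: the parent of a property value is the containing JsonObject itself;
// there is no intermediate JProperty-like node in between.
bool parentIsObject = ReferenceEquals(sample["child"]!.Parent, sample); // true
```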

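Thus, if you need to search for and modify properties by value (rather than by name), you can use the second extension method DescendantItemsAndSelf(), which includes the parent node and the name or index along with the current node. For example, to remove all null property values, do the following (the ToList() call materializes the matches first, so that no object is modified while it is being enumerated):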

foreach (var item in root.DescendantItemsAndSelf().Where(i => i.name != null && i.node == null).ToList())
    ((JsonObject)item.parent!).Remove(item.name!);

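Demo fiddle #2 here.

Building on @dbc's comprehensive answer, you can avoid much of the helper code and many of the extensions by leveraging MoreLINQ's TraverseDepthFirst (or TraverseBreadthFirst).

Here is a simple example: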

using MoreLinq;

var node = JsonSerializer.Deserialize<JsonNode>("""
    { "array": [true, false, null, 42, {}] }
    """);

var result = MoreEnumerable.TraverseDepthFirst((object?)node, node => node switch
{
    JsonObject obj => from m in obj
                      from mn in new object?[] { m, m.Value }
                      select mn,
    JsonArray arr  => from object? e in arr.Index() // Index from MoreLINQ
                      select e,
    _              => Enumerable.Empty<object?>(),
});

result will be a sequence in which each element is one of the following:

  • null
  • JsonArray
  • JsonObject
  • JsonValue for JSON scalars (strings, numbers, true, false)
  • KeyValuePair<string, JsonNode?> for an object member/property, where the key is the name
  • KeyValuePair<int, JsonNode?> for an array element, where the key is the index

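Unfortunately, there are three problems:

  1. result will be typed as IEnumerable<object?>, because object is the only common ancestor of the types above
  2. The conversion to object causes the key-value pairs (KeyValuePair<string, JsonNode?> and KeyValuePair<int, JsonNode?>) to be boxed
  3. You lose the link from object members and array elements to their parents

These problems can be solved by introducing a lightweight struct that acts as a union covering all the cases: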

readonly record struct JsonTreeNode
{
    // Stored as (array index + 1) so that the default value 0 means "not an array element".
    readonly int index; // 0 = undefined

    JsonTreeNode(int index, string? name, JsonNode? node, JsonNode? parent) =>
        (this.index, Name, Node, Parent) = (index, name, node, parent);

    public JsonTreeNode(JsonNode? node, JsonNode? parent) : this(0, null, node, parent) { }
    public JsonTreeNode(string name, JsonNode? node, JsonObject parent) : this(0, name, node, parent) { }
    public JsonTreeNode(int index, JsonNode? node, JsonArray parent) : this(index + 1, null, node, parent) { }

    public int Index => this.index - 1; // -1 when the node is not an array element
    public string? Name { get; }
    public JsonNode? Node { get; }
    public JsonNode? Parent { get; }

    public bool IsObjectMember => Name is not null;
    public bool IsArrayElement => Index >= 0;

    public KeyValuePair<string, JsonNode?> AsObjectMember() =>
        Name is { } name
            ? KeyValuePair.Create(name, Node)
            : throw new InvalidOperationException("Node is not a JSON object member.");

    public KeyValuePair<int, JsonNode?> AsArrayElement() =>
        IsArrayElement
            ? KeyValuePair.Create(Index, Node)
            : throw new InvalidOperationException("Node is not a JSON array element.");
}

With this, you can use TraverseDepthFirst as follows:

var result = MoreEnumerable.TraverseDepthFirst(new JsonTreeNode(node, null), node => node switch
{
    { Node: JsonObject obj } => obj.Select(p => new JsonTreeNode(p.Key, p.Value, obj)),
    { Node: JsonArray arr  } => arr.Select((e, i) => new JsonTreeNode(i, e, arr)),
    _                        => Enumerable.Empty<JsonTreeNode>(),
});

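result will be typed as IEnumerable<JsonTreeNode>, the parents will be linked, and there won't be any boxing.

See also the demo on .NET Fiddle.

First, define the following extension method: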

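// Extension methods must live in a static class; the class name here is arbitrary.
public static class JsonNodeExtensions
{
    // Node and elements are nullable because a JSON null is represented by a null JsonNode.
    public static IEnumerable<JsonNode?> Descendants(this JsonNode? node)
    {
        if (node is JsonObject jObj)
        {
            foreach (var pty in jObj)
            {
                yield return pty.Value;
                foreach (var child in pty.Value.Descendants())
                    yield return child;
            }
        }
        else if (node is JsonArray jArray)
        {
            foreach (var elt in jArray)
            {
                yield return elt;
                foreach (var child in elt.Descendants())
                    yield return child;
            }
        }
        // JsonValue and null have no descendants, so nothing is yielded for them.
    }
}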

You can use it as follows:

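var doc = JsonNode.Parse(raw);
// ToArray() materializes the descendants first, so no object is modified while it is being enumerated.
var descendants = doc.Descendants().ToArray();
foreach (var o in descendants.OfType<JsonObject>().Where(o => o.ContainsKey("password")))
{
    o.Remove("password");
}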
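
If removal by name is the only goal, a direct recursive walk may be enough, without enumerating descendants at all. The following is only a sketch under that assumption: RemovePropertyRecursively is a hypothetical helper name, not part of System.Text.Json, and the sample JSON is made up.

```csharp
using System.Linq;
using System.Text.Json.Nodes;

// Hypothetical helper (not a System.Text.Json API): removes every property
// called `name` from the node and from all of its descendants.
static void RemovePropertyRecursively(JsonNode? node, string name)
{
    switch (node)
    {
        case JsonObject obj:
            obj.Remove(name);
            // ToList() is a defensive snapshot; the recursive calls only modify descendants.
            foreach (var child in obj.Select(p => p.Value).ToList())
                RemovePropertyRecursively(child, name);
            break;
        case JsonArray arr:
            foreach (var item in arr.ToList())
                RemovePropertyRecursively(item, name);
            break;
        // JsonValue and null: nothing to do.
    }
}

var sampleDoc = JsonNode.Parse("[{\"a\":1,\"password\":\"x\",\"b\":[{\"password\":\"y\",\"c\":null}]}]");
RemovePropertyRecursively(sampleDoc, "password");
var cleaned = sampleDoc!.ToJsonString();
// cleaned is now: [{"a":1,"b":[{"c":null}]}]
```

Unlike the enumerator-based approaches above, this walk recurses on the call stack, so an extremely deep document could in principle overflow it.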
