Using jq to assign parts of a key's string value to other keys in a JSON file



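The goal is to use jq to assign parts of hrefFull to hrefSimple and hrefSubsite in a JSON file. There may be a better way to do this, but I have been approaching it by looking for a solution that removes everything before the string articles in the key's value while keeping that string. The JSON file contains multiple objects like the examples below, wrapped in [ at the start and ] at the end.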

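Desired results:

  • hrefFull is unchanged. The strings extracted from hrefFull are used for hrefSimple and hrefSubsite.
  • hrefSimple is articles and everything after it. If articles is not in the string, hrefSimple is the string after the last /. See example object 7.
  • hrefSubsite is the string between https://docs.mysite.com/ and /articles....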

Example result, object 1:

{
"hrefFull": "https://docs.mysite.com/product-a/articles/page-a.html",
"hrefSimple": "articles/page-a.html",
"hrefSubsite": "product-a"
}

Example result, object 2:

{
"hrefFull": "https://docs.mysite.com/product-b/articles/guide-b/page-b.html",
"hrefSimple": "articles/guide-b/page-b.html",
"hrefSubsite": "product-b"
}

Example result, object 3:

{
"hrefFull": "https://docs.mysite.com/product-c/articles/guide-c/section-c/page-c.html",
"hrefSimple": "articles/guide-c/section-c/page-c.html",
"hrefSubsite": "product-c"
}

Example result, object 4:

{
"hrefFull": "https://docs.mysite.com/product-d/sub-product-d/articles/page-d.html",
"hrefSimple": "articles/page-d.html",
"hrefSubsite": "product-d/sub-product-d"
}

Example result, object 5:

{
"hrefFull": "https://docs.mysite.com/product-e/sub-product-e/articles/guide-e/page-e.html",
"hrefSimple": "articles/guide-e/page-e.html",
"hrefSubsite": "product-e/sub-product-e"
}

Example result, object 6:

{
"hrefFull": "https://docs.mysite.com/product-f/sub-product-f/articles/guide-f/section-f/page-f.html",
"hrefSimple": "articles/guide-f/section-f/page-f.html",
"hrefSubsite": "product-f/sub-product-f"
}

Example result, object 7:

{
"hrefFull": "https://docs.mysite.com/product-g/index.html",
"hrefSimple": "index.html",
"hrefSubsite": "product-g"
}

Failed attempt (in a Bash script):

siteUrl="docs.mysite.com"
# --arg passes the shell variable into jq; \(...) interpolates it into the regex.
jq --arg siteUrl "$siteUrl" '
(.hrefSimple = .hrefFull)
| .hrefSimple |= (gsub("https://\($siteUrl)/.*?/"; ""))
| (.hrefSubsite = .hrefFull)
| .hrefSubsite |= (gsub("https://\($siteUrl)/"; ""))
' file-1.json > file-2.json

The script produces both accurate and inaccurate results.

Accurate results:

  • Object 1
  • Object 2
  • Object 3
  • Object 7

Inaccurate results:

  • Object 4:
    • hrefSimple is incorrectly sub-product-d/articles/page-d.html instead of articles/page-d.html
    • hrefSubsite is incorrectly sub-product-d instead of product-d/sub-product-d
  • Object 5:
    • hrefSimple is incorrectly sub-product-e/articles/guide-e/page-e.html instead of articles/guide-e/page-e.html
    • hrefSubsite is incorrectly sub-product-e instead of product-e/sub-product-e
  • Object 6:
    • hrefSimple is incorrectly sub-product-f/articles/guide-f/section-f/page-f.html instead of articles/guide-f/section-f/page-f.html
    • hrefSubsite is incorrectly sub-product-f instead of product-f/sub-product-f
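
The hrefSimple mismatch for object 4 can be reproduced in isolation. This is a minimal sketch, assuming the host is hard-coded instead of coming from siteUrl and that sub is applied to a single string instead of the whole file:

```shell
# Minimal reproduction for object 4; the host is hard-coded here (an assumption).
url='https://docs.mysite.com/product-d/sub-product-d/articles/page-d.html'

# -n: no input file, -r: raw string output; $u is the URL passed in via --arg.
lazy=$(jq -rn --arg u "$url" '$u | sub("https://docs.mysite.com/.*?/"; "")')

echo "$lazy"
```

The lazy .*?/ seems to stop at the first slash after the host, so only product-d/ is removed and sub-product-d/ stays in front of articles.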

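Other unsuccessful attempts (I can provide the exact results if that helps):

  • Various iterations of adding articles to .hrefSimple |= (gsub("https://\($siteUrl)/.*?/"; "")) and .hrefSubsite |= (gsub("https://\($siteUrl)/"; ""))
  • Various iterations of .hrefSimple |= split("articles")[0] (also for .hrefSubsite)

For context, in case it matters: hrefFull comes from page views of a documentation website exported from Azure App Insights. The exported data will be used for analytics reports. I am creating hrefSimple to join two tables and want to filter on hrefSubsite. The paths in hrefFull are produced when the site is generated with the DocFx static site generator and deployed to Azure Blob.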

I would use capture with a regular expression:

. + (.hrefFull | capture(
"^https://docs.mysite.com/(?<hrefSubsite>.*?)/(?<hrefSimple>articles.*|[^/]*)$"
))

Output:

{
"hrefFull": "https://docs.mysite.com/product-a/articles/page-a.html",
"hrefSubsite": "product-a",
"hrefSimple": "articles/page-a.html"
}
{
"hrefFull": "https://docs.mysite.com/product-b/articles/guide-b/page-b.html",
"hrefSubsite": "product-b",
"hrefSimple": "articles/guide-b/page-b.html"
}
{
"hrefFull": "https://docs.mysite.com/product-c/articles/guide-c/section-c/page-c.html",
"hrefSubsite": "product-c",
"hrefSimple": "articles/guide-c/section-c/page-c.html"
}
{
"hrefFull": "https://docs.mysite.com/product-d/sub-product-d/articles/page-d.html",
"hrefSubsite": "product-d/sub-product-d",
"hrefSimple": "articles/page-d.html"
}
{
"hrefFull": "https://docs.mysite.com/product-e/sub-product-e/articles/guide-e/page-e.html",
"hrefSubsite": "product-e/sub-product-e",
"hrefSimple": "articles/guide-e/page-e.html"
}
{
"hrefFull": "https://docs.mysite.com/product-f/sub-product-f/articles/guide-f/section-f/page-f.html",
"hrefSubsite": "product-f/sub-product-f",
"hrefSimple": "articles/guide-f/section-f/page-f.html"
}
{
"hrefFull": "https://docs.mysite.com/product-g/index.html",
"hrefSubsite": "product-g",
"hrefSimple": "index.html"
}
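
One thing to watch: capture emits nothing when the pattern does not match, so the addition emits nothing for that object either. A possible guard, sketched here with a made-up URL on another host (not taken from the question), falls back to an empty object via // so the input passes through unchanged:

```shell
# Hypothetical object whose hrefFull is on a different host, so the regex cannot match.
obj='{"hrefFull": "https://other.example.com/page.html"}'

# `//` supplies {} when capture produces no output, leaving the object as it was.
guarded=$(printf '%s\n' "$obj" | jq -c '. + ((.hrefFull | capture(
  "^https://docs.mysite.com/(?<hrefSubsite>.*?)/(?<hrefSimple>articles.*|[^/]*)$"
)) // {})')

echo "$guarded"
```

Whether unmatched objects should be kept as-is or dropped depends on how the exported data is used afterwards, so treat the fallback as optional.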

If your input objects are in an array, wrap this filter in map(…).
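
As a sketch of the array case, the whole command might look like the following. The two-object sample input is made up for illustration; the file names file-1.json and file-2.json follow the question:

```shell
# Hypothetical two-object sample; the file names follow the question.
cat > file-1.json <<'EOF'
[
  {"hrefFull": "https://docs.mysite.com/product-d/sub-product-d/articles/page-d.html"},
  {"hrefFull": "https://docs.mysite.com/product-g/index.html"}
]
EOF

# map(...) applies the filter to every array element and returns a new array.
jq 'map(. + (.hrefFull | capture(
  "^https://docs.mysite.com/(?<hrefSubsite>.*?)/(?<hrefSimple>articles.*|[^/]*)$"
)))' file-1.json > file-2.json

cat file-2.json
```

The first object exercises the nested sub-product path and the second the case without articles, which are the two shapes that differ in the question.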
