How to access each element of a nested list to name outputs in Snakemake

This is similar to the question "Snakemake: use of Python nested list comprehension for conditional analysis".

I have the following:

RUN_ID = ["run1", "run2"]
SAMPLES = [["A", "B", "C"], ["D", "E", "F"]]

rule all:
    input:
        summary = expand("foo/{run}/{sample}/outs/{sample}_report.html", run=RUN_ID, sample=SAMPLES)

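Problem 1: Each run in RUN_ID should be associated only with the corresponding samples in SAMPLES (based on index). So run1 pairs only with A, B, C, and run2 pairs only with D, E, F.

Problem 2: The name of each output file should reflect this index-based pairing. Currently, I am struggling to get each element of each nested list in SAMPLES to pair with the corresponding RUN_ID.

Based on the above, I want the following output: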

"foo/run1/A/outs/A_report.html"
"foo/run1/B/outs/B_report.html"
"foo/run1/C/outs/C_report.html"
"foo/run2/D/outs/D_report.html"
"foo/run2/E/outs/E_report.html"
"foo/run2/F/outs/F_report.html"

Initially I was getting:

"foo/run1/["A", "B", "C"]/outs/["A", "B", "C"]_report.html"
"foo/run1/["D", "E", "F"]/outs/["D", "E", "F"]_report.html"
"foo/run2/["A", "B", "C"]/outs/["A", "B", "C"]_report.html"
"foo/run2/["D", "E", "F"]/outs/["D", "E", "F"]_report.html"

I got past the unwanted pairings by using zip in the expand function:

summary= expand(["foo/{run}/{sample}/outs/{sample}_report.html", "foo/{run}/{sample}/outs/{sample}_report.html"], zip, run=RUN_ID, sample=SAMPLES)

This leaves me with the desired pairing between RUN_ID and SAMPLES:

"foo/run1/["A", "B", "C"]/outs/["A", "B", "C"]_report.html"
"foo/run2/["D", "E", "F"]/outs/["D", "E", "F"]_report.html"

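However, as shown above, each nested list as a whole is passed into the output path, rather than each element of each nested list. I can get what I want by splitting SAMPLES into two separate lists, but I would like a more elegant and automated approach.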
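The list-in-the-path symptom can be reproduced in plain Python, without Snakemake. This is only a sketch of the likely cause, assuming that expand ends up formatting each inner list as a single value: str.format() renders a list through str(), so the whole list lands in the path. The variable names pattern and path are made up for this illustration.

```python
# Plain Python, no Snakemake required.
# str.format() converts a list argument with str(), so the entire
# inner list is written into the path as one value.
pattern = "foo/{run}/{sample}/outs/{sample}_report.html"
path = pattern.format(run="run1", sample=["A", "B", "C"])
print(path)
```

So the pairing produced by zip is right, but each sample value still needs to be a single string rather than a list.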

I am also not tied to using nested lists; any insight into a solution or a better approach is appreciated. Thanks.

expand is a convenience utility; for more complex cases it is often quicker to generate the desired list directly in Python:

RUN_ID = ["run1", "run2"]
SAMPLES = [["A", "B", "C"], ["D", "E", "F"]]

# Pair each run with its own sample list by index, then build one path per sample.
desired_files = []
for run, run_samples in zip(RUN_ID, SAMPLES):
    for sample in run_samples:
        file = f"foo/{run}/{sample}/outs/{sample}_report.html"
        desired_files.append(file)

rule all:
    input: desired_files
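Since the question is open to alternatives to nested lists, here is a minimal sketch of the same idea with a dictionary that maps each run to its samples, so the pairing is explicit rather than index-based. The name RUNS is an assumption made for this example and does not come from the original code.

```python
# Hypothetical layout: each run ID maps directly to its own samples.
RUNS = {
    "run1": ["A", "B", "C"],
    "run2": ["D", "E", "F"],
}

# One path per (run, sample) pair; dicts keep insertion order in Python 3.7+.
desired_files = [
    f"foo/{run}/{sample}/outs/{sample}_report.html"
    for run, samples in RUNS.items()
    for sample in samples
]

for path in desired_files:
    print(path)
```

In a Snakefile, desired_files would then be passed to the input of rule all exactly as above.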
