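I am trying to build a pipeline with snakemake, and I am having trouble accessing the values of the "small_reference" dictionary in my config file. Depending on the sample, I want to align against a different reference.
Config file: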
samples: ['C130165', 'C014044p', 'C130166', 'C130157', 'C014040p', 'C014054b-1', 'C051198-A', 'C014042p', 'C052007W-C', 'C130167', 'C051198-B', 'C130157A', 'C130165A', 'C014038p', 'C052004-B', 'C051198-C', 'C052004-C', 'C130167', 'C052003-B', 'C130165', 'C052003-A', 'C052004-A', 'C052002-C', 'C130157', 'C052005-C', 'C130157W', 'C130167A', 'C130157A', 'C130166A', 'C052002-A', 'C130157N', 'C052006-B', 'C014063pW', 'C130157W', 'C130157N', 'C014054b-2', 'C052002-B', 'C130167A', 'C052006-C', 'C130166A', 'C052007W-B', 'C052003-C', 'C130165A', 'C014064bW', 'C052005-B', 'C130166', 'C052006-A', 'C052005-A']
reference: "/mnt/storage/refs/human_1kg/human_g1k_v37.fasta"
index: "/mnt/storage/refs/human_1kg/human_g1k_v37.fasta.fai"
dbsnp: "/mnt/storage/refs/human_1kg/dbsnp_137.b37.vcf"
small_reference: {
C01: "/mnt/storage/projects/hiv_data/refs/BRCA/BRCA12_PALB2.fasta",
Z01: "/mnt/storage/projects/hiv_data/refs/BRCA/BRCA12.fasta",
C02: "/mnt/storage/projects/hiv_data/refs/STICKLERS/STICKERS_ext.fasta",
C03: "/mnt/storage/projects/hiv_data/refs/TS/TS.fasta",
C04: "/mnt/storage/projects/hiv_data/refs/STICKLERS/STICKERS.fasta",
C05: "/mnt/storage/projects/hiv_data/refs/PKD_GANAB/PKD.fasta",
C07: "/mnt/storage/projects/hiv_data/refs/NEMO/NEMO.fasta",
C08: "/mnt/storage/projects/hiv_data/refs/HNPCC/HNPCC.fasta",
C09: "/mnt/storage/projects/hiv_data/refs/TAU/TAU.fasta",
C10: "/mnt/storage/projects/hiv_data/refs/THYROID/THYROID.fasta",
C12: "/mnt/storage/projects/hiv_data/refs/VWF/VWF.fasta",
C13: "/mnt/storage/refs/human_1kg/human_g1k_v37.fasta",
C17: "/mnt/storage/projects/hiv_data/refs/DICER_PALB2/DICER_PALB2.fasta",
C18: "/mnt/storage/projects/hiv_data/refs/DICER_PALB2/DICER_PALB2.fasta",
}
baits: {
C01: "/mnt/storage/projects/hiv_data/refs/BRCA/BRCA12_PALB2.bed",
Z01: "/mnt/storage/projects/hiv_data/refs/BRCA/BRCA12_exons.bed",
C02: "/mnt/storage/projects/hiv_data/refs/STICKLERS/STICKERS_ext.bed",
C03: "/mnt/storage/projects/hiv_data/refs/TS/TS_exons.bed",
C04: "/mnt/storage/projects/hiv_data/refs/STICKLERS/STICKERS.bed",
C05: "/mnt/storage/projects/hiv_data/refs/PKD_GANAB/PKD.bed",
C07: "/mnt/storage/projects/hiv_data/refs/NEMO/NEMO.bed",
C08: "/mnt/storage/projects/hiv_data/refs/HNPCC/HNPCC.bed",
C09: "/mnt/storage/projects/hiv_data/refs/TAU/TAU.bed",
C10: "/mnt/storage/projects/hiv_data/refs/THYROID/THYROID_v2.bed",
C12: "/mnt/storage/projects/hiv_data/refs/VWF/VWF.bed",
C13: "/mnt/storage/refs/human_1kg/human_g1k_v37.bed",
C17: "/mnt/storage/projects/hiv_data/refs/DICER_PALB2/DICER_PALB2.bed",
C18: "/mnt/storage/projects/hiv_data/refs/DICER_PALB2/DICER_PALB2.bed",
}
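I want to select the reference based on the first 3 characters of the sample name. I wrote a function that works when config["samples"] is a single string. But now I want to process whole runs, so config["samples"] is a list of samples.
Function that worked for a single sample:
def get_ref(wildcards):
    # Works only while config["samples"] is one string, e.g. "C014038p"
    prefix = config["samples"][0:3]
    return config["small_reference"][prefix]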
After changing the config file, I first got the error "Duplicate output file pattern in rule" when running the full pipeline.
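One possible contributor to that error, which is an assumption and not something confirmed in this post: the samples list in the config contains repeated entries (for example 'C130165' and 'C130167' each appear twice), so expanding an output pattern over it would name the same file more than once. A minimal Python sketch for spotting and dropping repeats, using a shortened, illustrative copy of the list:

```python
from collections import Counter

# Shortened, illustrative copy of config["samples"]; the real list is longer.
samples = ["C130165", "C014044p", "C130166", "C130167", "C130165", "C130167"]

# Names that occur more than once in the list.
duplicates = sorted(name for name, count in Counter(samples).items() if count > 1)

# dict.fromkeys keeps the first occurrence of each name and preserves order.
unique_samples = list(dict.fromkeys(samples))

print(duplicates)
print(unique_samples)
```

If the repeats are intentional (for example the same sample sequenced in two runs), the file names would need something extra, such as a run identifier, to tell them apart.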
Test rule:
rule test:
    input:
        fq = expand("{sample}.1.fq.gz", sample = config["samples"]),
        ref = get_ref
    shell:
        "echo {input.fq} {input.ref}"
Now I get this error when running the test rule:
InputFunctionException in line 17 of /mnt/storage/home/kimy/projects/automate_CP/scripts/Snakefile:
TypeError: unhashable type: 'list'
Wildcards:
Example of what I want: C014038p -> C01 -> /mnt/storage/projects/hiv_data/refs/BRCA/BRCA12_PALB2.fasta
How can I get the right "small_reference" based on the prefix of the sample the pipeline is analysing?
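For what it's worth, the TypeError can be reproduced outside snakemake. Slicing a string with [0:3] gives its first three characters, but slicing a list gives its first three elements, and a list cannot be used as a dictionary key. A small sketch, where the config dictionary is a trimmed-down stand-in for the real one:

```python
# Trimmed-down stand-in for the snakemake config dictionary.
config = {
    "samples": ["C130165", "C014044p", "C130166", "C130157"],
    "small_reference": {
        "C01": "/mnt/storage/projects/hiv_data/refs/BRCA/BRCA12_PALB2.fasta",
        "C13": "/mnt/storage/refs/human_1kg/human_g1k_v37.fasta",
    },
}

# On a list, [0:3] returns the first three samples, not a 3-character prefix.
prefix = config["samples"][0:3]
print(prefix)

try:
    config["small_reference"][prefix]
    error_message = None
except TypeError as err:
    # A list is unhashable, so it cannot be looked up as a dict key.
    error_message = str(err)

print(error_message)
```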
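It turns out I had misunderstood how snakemake works: expand() is what generates the concrete file names from the sample list, so it should only be specified in the target rule (typically rule all), not in every rule. The other rules then receive one sample at a time through the {sample} wildcard.
Modified function that works:
def get_ref(wildcards):
    # wildcards.sample is the single sample name of the current job
    prefix = wildcards.sample[0:3]
    return config["small_reference"][prefix]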
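To illustrate why this version works, here is a sketch that calls the function once per sample, roughly the way snakemake would call it once per job. SimpleNamespace is only a stand-in for snakemake's wildcards object, and the config is a trimmed-down copy; both are assumptions made for illustration.

```python
from types import SimpleNamespace

# Trimmed-down stand-in for the snakemake config dictionary.
config = {
    "samples": ["C130165", "C014038p", "C051198-A"],
    "small_reference": {
        "C01": "/mnt/storage/projects/hiv_data/refs/BRCA/BRCA12_PALB2.fasta",
        "C05": "/mnt/storage/projects/hiv_data/refs/PKD_GANAB/PKD.fasta",
        "C13": "/mnt/storage/refs/human_1kg/human_g1k_v37.fasta",
    },
}


def get_ref(wildcards):
    # Each job carries exactly one sample name in wildcards.sample,
    # so [0:3] is a string slice and yields a valid dictionary key.
    prefix = wildcards.sample[0:3]
    return config["small_reference"][prefix]


# Simulate one call per sample, as would happen for one job per sample.
refs = {s: get_ref(SimpleNamespace(sample=s)) for s in config["samples"]}
for sample, ref in refs.items():
    print(sample, "->", ref)
```

In the Snakefile this would correspond to a rule whose input is "{sample}.1.fq.gz" without expand(), with expand() over config["samples"] appearing only in the target rule.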