R - How do I use biomaRt to get the corresponding gene IDs?



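I have a txt file containing RefSeq IDs and their peptides. In R, I need to use biomaRt to get the corresponding gene ID for every entry in the full list of RefSeq IDs and peptides. I also need to keep the peptide sequence in the final result. How can I do this? Please help.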

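# Assumes the file has a header row naming the columns RefSeq and Peptide;
# the code below refers to the columns by those names.
myData <- read.delim("phosphopeptides.txt", header = TRUE)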

Match our IDs using the refseq_peptide filter:

library(biomaRt)
ensembl <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl")
refseq_peptide = unique(myData$RefSeq)
res <- getBM(attributes = c("refseq_peptide", "hgnc_symbol"), 
filters = "refseq_peptide",
values = refseq_peptide, 
mart = ensembl)
res
#   refseq_peptide hgnc_symbol
# 1      NP_000007       ACADM
# 2      NP_000009      ACADVL
# 3      NP_000012       PSEN1
# merge the gene symbols back onto the original data by RefSeq ID
merge(myData, res, by.x = "RefSeq", by.y = "refseq_peptide")
#      RefSeq                            Peptide hgnc_symbol
# 1 NP_000007                    R.SDPDPKAPANK.A       ACADM
# 2 NP_000009                    K.SDSHPSDALTR.K      ACADVL
# 3 NP_000012 K.YNAESTERESQDTVAENDDGGFSEEWEAQR.D       PSEN1
# 4 NP_000012            R.AAVQELSSSILAGEDPEER.G       PSEN1
# 5 NP_000012            R.AAVQELSSSILAGEDPEER.G       PSEN1
# 6 NP_000012                  R.S*LGHPEPLSNGR.P       PSEN1
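
By default, merge() keeps only rows whose RefSeq ID was found in the biomaRt result, so peptides with unmapped IDs silently disappear, and duplicate input rows (rows 4 and 5 above) are carried through. The sketch below shows one way to handle both, using a toy copy of the data above. The ID NP_999999 and its peptide are made up to stand in for an unmapped accession, and `annotated` and `unmapped` are illustrative names.

```r
# Toy data based on the output above, plus one made-up row (NP_999999)
# that stands in for an accession biomaRt could not map.
myData <- data.frame(
  RefSeq  = c("NP_000007", "NP_000012", "NP_000012", "NP_999999"),
  Peptide = c("R.SDPDPKAPANK.A", "R.AAVQELSSSILAGEDPEER.G",
              "R.AAVQELSSSILAGEDPEER.G", "K.HYPOTHETICAL.R"),
  stringsAsFactors = FALSE
)

# Stand-in for the getBM() result; note there is no entry for NP_999999.
res <- data.frame(
  refseq_peptide = c("NP_000007", "NP_000012"),
  hgnc_symbol    = c("ACADM", "PSEN1"),
  stringsAsFactors = FALSE
)

# unique() drops exact duplicate rows; all.x = TRUE keeps every peptide,
# filling hgnc_symbol with NA where no match was found.
annotated <- merge(unique(myData), res,
                   by.x = "RefSeq", by.y = "refseq_peptide",
                   all.x = TRUE)

# IDs that still need attention (for example, retired or predicted accessions).
unmapped <- annotated$RefSeq[is.na(annotated$hgnc_symbol)]
```

Whether duplicate peptide rows should be dropped depends on the analysis; skip the unique() call if repeated observations are meaningful.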

Note: when we don't know the exact attribute name, searchAttributes is a useful function for looking it up:

searchAttributes(mart = ensembl, pattern = "refseq")
#                        name                 description         page
# 86              refseq_mrna              RefSeq mRNA ID feature_page
# 87    refseq_mrna_predicted    RefSeq mRNA predicted ID feature_page
# 88             refseq_ncrna             RefSeq ncRNA ID feature_page
# 89   refseq_ncrna_predicted   RefSeq ncRNA predicted ID feature_page
# 90           refseq_peptide           RefSeq peptide ID feature_page
# 91 refseq_peptide_predicted RefSeq peptide predicted ID feature_page
searchAttributes(mart = ensembl, pattern = "hgnc")
#               name        description         page
# 64         hgnc_id            HGNC ID feature_page
# 65     hgnc_symbol        HGNC symbol feature_page
# 95 hgnc_trans_name Transcript name ID feature_page
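
Since the question asks for gene IDs and not only symbols, the same query can request extra attributes. The following is a minimal sketch, not a definitive implementation: `fetch_gene_ids` and its `query` argument are hypothetical names introduced here, and `ensembl_gene_id` and `entrezgene_id` are attribute names I believe current Ensembl releases provide; confirm them with searchAttributes before relying on them.

```r
# Sketch: look up several gene identifiers for a vector of RefSeq peptide IDs.
# `query` defaults to biomaRt::getBM and exists only so the function can be
# tried offline with a stand-in; it is not part of biomaRt.
fetch_gene_ids <- function(ids, mart, query = biomaRt::getBM) {
  query(
    attributes = c("refseq_peptide", "ensembl_gene_id",
                   "entrezgene_id", "hgnc_symbol"),
    filters = "refseq_peptide",
    values = unique(ids),   # query each accession only once
    mart = mart
  )
}

# Intended use (needs biomaRt installed and a network connection):
# ensembl <- biomaRt::useEnsembl(biomart = "genes",
#                                dataset = "hsapiens_gene_ensembl")
# res <- fetch_gene_ids(myData$RefSeq, ensembl)
# merge(myData, res, by.x = "RefSeq", by.y = "refseq_peptide")
```

One RefSeq peptide can map to more than one gene identifier, so the merged table may contain more rows than the input.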
