R - Negative values after limma batch effect correction



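I have several RNA-seq datasets. Using a deconvolution method, I ran a cell type enrichment analysis on each of them and then combined the results into one data frame with 1000+ samples in columns and 38 cell types in rows. The datasets come from different cancer papers. So, naturally, before visualizing the data with volcano plots or t-SNE, I need to correct for the batch effects of cancer type AND dataset so that the source does not affect the results. I used the following code: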

scores.batch = limma::removeBatchEffect(scores, metadata$Cancer_Type, metadata$dataset)

However, for some cell types in some samples I get negative scores, which of course makes no sense. Something went wrong.

dput of the score matrix:

structure(c(0, 0.0853672252935787, 0.0472255148477786, 0.0505467828272972, 
0, 0.0308325695761715, 0, 0.157955518619051, 0.00989687292281167, 
0.03263377453636, 0.174135667551838, 0.0287360256296349, 0.0647519562755579, 
0, 0, 0, 0.0131324709303641, 0, 0, 0.131356081785285, 0, 0.0389487771231123, 
0.143102950679691, 0, 0, 0, 0, 0, 0.0120903254374909, 0, 0, 0.0146819273419876, 
0.00547214738400891, 0.0128837171879466, 0, 0, 0.0458955588957287, 
0, 0, 0.0132395370608289, 0, 0.188105588373935, 0.458389955317805, 
0, 0, 0, 0, 0.202322209601319, 0.0070140951370079, 0.0674561160550705, 
0.257105522741856, 0.0187125792218268, 0.132650077873857, 0.0464882832616245, 
0, 0.267398408455589, 0.257988913719892, 0, 0.0327859369672344, 
0.190621289930972, 0.00595393276866058, 0.0257929623669804, 0.0286417150045293, 
0.00692628582485207, 0, 0, 0, 0.103067773960521, 0.0254486580186314, 
0.0280937010981759, 0.0571003379986667, 0, 0.0129979208251825, 
0.0665627159432736, 0, 0, 0.224047712805128, 0, 0.0136182944729644, 
0.0432680414524333, 0.0399461338850251, 0.0292693281178669, 0.366507229736257, 
0, 0, 0, 0, 0, 0.00669552195301393, 0.0185218739472336, 0.0255519328964942, 
0, 0.0733344287554076, 0.0255903177243924, 0, 0.39146057499213, 
0.0111881442508292, 0, 0, 0.0511976959994528, 0, 0.00579928556115081, 
0.0732688902065305, 0, 0, 0, 0, 0, 0.0110070088566591, 0, 0, 
0.0106345827415758, 0, 0.0161657089454384, 0, 0, 0.181452946136114, 
0, 0, 0.0142759665417351, 0, 0.0462555010600369, 0.0827203733228943, 
0.0145026248884816, 0, 0.013260218865482, 0, 0.0933479614509043, 
0.00774602641682057, 0.0119668338387216, 0.129131677414995, 0.0239962329230613, 
0.0204322461340539, 0.0493841568846939, 0, 0, 0.121244199785082, 
0, 0.0121972859914857, 0.140024857933727, 0.174619321250637, 
0.220714394591806, 0.0357916262655448, 0.0545063692410225, 0, 
0, 0, 0.137161377602957, 0.00590382216872294, 0.0201750503599633, 
0.142034903521219, 0.0985879590151414, 0.0335131516620065, 0.090677099547935, 
0.00507741177479126, 0.037356635316015, 0.201168399838889, 0, 
0.0314008923657083, 0.365359437170722, 0, 0.0335843135244289, 
0.0715582133522154, 0, 0, 0, 0, 0.0474378875432263, 0.0209691496515952, 
0.0172794413473455, 0.0135720611847538, 0.033428514409707, 0.0105720693466205, 
0, 0, 0, 0.06773450392978, 0, 0, 0.00586743854873324, 0.0168258454272402, 
0.0210951521159853, 0.243788369183411, 0.0220752365135898, 0, 
0, 0, 0, 0.0306987963385464, 0, 0, 0.0214207906806456, 0, 0.00976336517826329, 
0.0271958080474159, 0.00901990828923357, 0.0107550873887369, 
0, 0, 0.0447193474549512, 0, 0.0893397248630643, 0.0628755200633895, 
0.00689545950391347, 0, 0, 0, 0, 0.0317975383226698, 0.0326506928899438, 
0.0585870944941104, 0.0325612902279177, 0.015309108818366, 0.00884806480492375, 
0, 0, 0.090018886142378, 0, 0, 0.0252109197429766, 0, 0.0575320783388593, 
0.0360786525651136, 0, 0, 0, 0, 0.033985164346289, 0.0224789756565266, 
0, 0.0110759152952279, 0.0117488957883667, 0.0308459132319819, 
0.0280619366351415, 0, 0.118913468206155, 0.104597268143716, 
0, 0, 0.0725014944946794, 0.0178909285824974, 0.107561160668656, 
0.103657490882649, 0.00912992981258696, 0, 0, 0, 0.0705798568396863, 
0.0358671574380446, 0.0436978038106949, 0.0947966583779633, 0.00754414348305365, 
0.0427209871099505, 0.0293558198269896, 0, 0, 0.186905499424745, 
0, 0.00921431451770207, 0.0392728923365106, 0.457917600754677, 
0.0346030375240686, 0, 0.00559973035365259, 0, 0, 0, 0.0752873255178603, 
0, 0, 0.0146401887588438, 0.0149177458753822, 0.0746262560762416, 
0.263898927149848, 0.00724132393694323, 0.0356656672388469, 0.408802700748097, 
0, 0.105516874044416, 0.0759265575312881, 0, 0, 0, 0, 0, 0, 0, 
0.112847654739959, 0.0142421329460783, 0.0261230668401576, 0, 
0.00638014939397572, 0.0315337646404048, 0.0165989987896856, 
0, 0.359720477023771, 0.119577703639366, 0, 0, 0.00856089558535689, 
0, 0.00683428737177429, 0.0668329268575581, 0, 0, 0, 0, 0.047734960135924, 
0, 0, 0, 0, 0, 0, 0, 0, 0.107598342312041, 0, 0, 0.0121935326110086, 
0, 0, 0.135919574921458, 0, 0, 0, 0, 0, 0.0128474874078213, 0, 
0, 0, 0, 0.0153109234303942, 0, 0, 0.158969240948426, 0, 0, 0, 
0, 0.0232230943661847, 0.140426779187137, 0, 0, 0, 0, 0.0336919132251274, 
0.0340940005017551, 0.00546712364093891, 0.013544098663573, 0.00839243775091744, 
0, 0.00548575092813788, 0, 0, 0.0553411208343392, 0, 0, 0.0125206755368664, 
0.0182216981318242, 0.121162914437175, 0.114036773041914, 0.0357279266394587, 
0, 0, 0, 0.211648365695196, 0, 0.0354172784066678, 0.169262066444321, 
0.0630110794062426, 0.0606400985951092, 0.101323746391281, 0, 
0.021421960793034, 0.288751473459872, 0, 0.024183015113392, 0.352175638353027, 
0, 0.0258095846797072, 0.0228475888849942, 0, 0, 0, 0, 0.0802318746420269, 
0, 0, 0.0209026176594105, 0.0167803844651298, 0.0668381647275034, 
0.0264858410661633, 0, 0.00902616117758849, 0.10613905468228, 
0, 0, 0.0868373941926339), .Dim = c(20L, 20L), .Dimnames = list(
c("Adipocytes", "B-cells", "Basophils", "CD4+ memory T-cells", 
"CD4+ naive T-cells", "CD4+ T-cells", "CD4+ Tcm", "CD4+ Tem", 
"CD8+ naive T-cells", "CD8+ T-cells", "CD8+ Tcm", "Class-switched memory B-cells", 
"DC", "Endothelial cells", "Eosinophils", "Epithelial cells", 
"Fibroblasts", "Hepatocytes", "ly Endothelial cells", "Macrophages"
), c("Pt1", "Pt10", "Pt103", "Pt106", "Pt11", "Pt17", "Pt2", 
"Pt24", "Pt26", "Pt27", "Pt28", "Pt29", "Pt31", "Pt36", "Pt37", 
"Pt38", "Pt39", "Pt4", "Pt46", "Pt47")))

I also tried correcting for Cancer_Type only, but the results were not good (t-SNE did not cluster the data the way I needed).

What could be the problem here?

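limma assumes that the two batch effects are additive and therefore independent. Writing limma::removeBatchEffect(x = scores, batch = cancer_type, batch2 = study) implies that there is no relationship between cancer type and study dataset. However, it is quite likely that one study covers one cancer type and another study covers a different one. In that case, the assumption behind the limma model is probably violated.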
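One way to see whether cancer type and dataset are entangled is to cross-tabulate them. The following is a minimal base-R sketch; the `metadata` data frame here is made-up toy data (only the column names `Cancer_Type` and `dataset` come from the question), and the variable names `crosstab`, `types_per_dataset` and `fully_nested` are illustrative.

```r
# Hypothetical metadata for illustration; replace with the real metadata table.
metadata <- data.frame(
  Cancer_Type = c("melanoma", "melanoma", "lung", "lung", "lung", "renal"),
  dataset     = c("study_A", "study_A", "study_B", "study_B", "study_C", "study_D")
)

# Cross-tabulate the two factors: rows are cancer types, columns are datasets.
crosstab <- table(metadata$Cancer_Type, metadata$dataset)
print(crosstab)

# Count how many distinct cancer types appear in each dataset.
types_per_dataset <- colSums(crosstab > 0)

# TRUE when every dataset contains exactly one cancer type, i.e. dataset is
# nested within cancer type and the two effects cannot be separated.
fully_nested <- all(types_per_dataset == 1)
print(fully_nested)
```

If most columns of the table have a single non-zero cell, the two factors are largely confounded, and treating them as two independent additive batches is questionable.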

However, you can create a single combined batch variable and pass it as the only batch argument:

metadata$merged_batch <- paste0(metadata$Cancer_Type, metadata$dataset)
scores.batch = limma::removeBatchEffect(scores, batch = metadata$merged_batch)
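Note that any additive correction, including the merged-batch one, shifts every value in a batch by the same amount, so a score that is exactly zero can still end up below zero. The toy example below sketches this in base R for one cell type and a single batch factor. It is an illustration of the arithmetic (centering each batch on the average of the batch means), not limma's actual implementation, and the numbers are invented.

```r
# One hypothetical cell type measured in four samples from two batches.
x     <- c(0, 0.6, 0, 0)
batch <- factor(c("A", "A", "B", "B"))

# Mean score per batch.
batch_means <- tapply(x, batch, mean)

# Additive batch effect: deviation of each batch mean from the average of
# the batch means (the effects sum to zero across batches).
batch_effect <- batch_means - mean(batch_means)

# Subtract the effect of each sample's own batch.
adjusted <- as.numeric(x - batch_effect[as.character(batch)])
print(adjusted)

# Zero scores in the batch with the higher mean are pushed below zero.
print(any(adjusted < 0))
```

Here the first sample starts at 0 but belongs to the batch with the higher mean, so subtracting that batch's positive effect makes it negative.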
