我最近学习了多维数组,并被分配了分析RNA链并将其翻译成蛋白质序列的任务。我决定用我的多维数组知识来创建一个密码子(一组3个RNA碱基)将翻译成的每个氨基酸的定义。
//RNA codon to amino acid mapping
char aminoAcid[4][4][4];
//A = 0, C = 1, G = 2, U = 3
//phenylalanine - F
aminoAcid[3][3][3] = 'F';
aminoAcid[3][3][1] = 'F';
//Leucine - L
aminoAcid[3][3][0] = 'L';
aminoAcid[3][3][2] = 'L';
//Serine - S
aminoAcid[3][1][3] = 'S';
aminoAcid[3][1][1] = 'S';
aminoAcid[3][1][0] = 'S';
aminoAcid[3][1][2] = 'S';
//tyrosine - Y
aminoAcid[3][0][3] = 'Y';
aminoAcid[3][0][1] = 'Y';
//stop codon
aminoAcid[3][0][0] = '-';
aminoAcid[3][0][2] = '-';
//cysteine - C
aminoAcid[3][2][3] = 'C';
aminoAcid[3][2][1] = 'C';
//stop codon
aminoAcid[3][2][0] = '-';
//tryptophan - W
aminoAcid[3][2][2] = 'W';
//leucine - L
aminoAcid[1][3][3] = 'L';
aminoAcid[1][3][1] = 'L';
aminoAcid[1][3][0] = 'L';
aminoAcid[1][3][2] = 'L';
//proline - P
aminoAcid[1][1][3] = 'P';
aminoAcid[1][1][1] = 'P';
aminoAcid[1][1][0] = 'P';
aminoAcid[1][1][2] = 'P';
//histidine - H
aminoAcid[1][0][3] = 'H';
aminoAcid[1][0][1] = 'H';
//glutamine - Q
aminoAcid[1][0][0] = 'Q';
aminoAcid[1][0][2] = 'Q';
//arginine - R
aminoAcid[1][2][3] = 'R';
aminoAcid[1][2][1] = 'R';
aminoAcid[1][2][0] = 'R';
aminoAcid[1][2][2] = 'R';
//isoleucine - I
aminoAcid[0][3][3] = 'I';
aminoAcid[0][3][1] = 'I';
aminoAcid[0][3][0] = 'I';
//methionine(start codon) - M
aminoAcid[0][3][2] = 'M';
//threonine -T
aminoAcid[0][1][3] = 'T';
aminoAcid[0][1][1] = 'T';
aminoAcid[0][1][0] = 'T';
aminoAcid[0][1][2] = 'T';
//asparagine - N
aminoAcid[0][0][3] = 'N';
aminoAcid[0][0][1] = 'N';
//lysine - K
aminoAcid[0][0][0] = 'K';
aminoAcid[0][0][2] - 'K';
//serine - S
aminoAcid[0][2][3] = 'S';
aminoAcid[0][2][1] = 'S';
//arginine - R
aminoAcid[0][2][0] = 'R';
aminoAcid[0][2][2] = 'R';
//valine - V
aminoAcid[2][3][3] = 'V';
aminoAcid[2][3][1] = 'V';
aminoAcid[2][3][0] = 'V';
aminoAcid[2][3][2] = 'V';
//alanine - A
aminoAcid[2][1][3] = 'A';
aminoAcid[2][1][1] = 'A';
aminoAcid[2][1][0] = 'A';
aminoAcid[2][1][2] = 'A';
//aspartic acid - D
aminoAcid[2][0][3] = 'D';
aminoAcid[2][0][1] = 'D';
//glutamic acid - E
aminoAcid[2][0][0] = 'E';
aminoAcid[2][0][2] = 'E';
//glycine - G
aminoAcid[2][2][3] = 'G';
aminoAcid[2][2][1] = 'G';
aminoAcid[2][2][0] = 'G';
aminoAcid[2][2][2] = 'G';
我创建了下面的函数来翻译链。在这种情况下,请注意我的rna链是:
AUGCUUAUUAACUGAAAACAUAUGGGUAGUCGAUGA
string rnaAnalysis::translateRna()
{
string protein = "";
int firstBase, secondBase, thirdBase;
for(int i = 0; i < newSequence.length() - 2; i+3)
{
if(newSequence[i] == 'A')
{
firstBase = 0;
}
else if(newSequence[i] == 'C')
{
firstBase = 1;
}
else if(newSequence[i] == 'G')
{
firstBase = 2;
}
else if(newSequence[i] == 'U')
{
firstBase = 3;
}
if(newSequence[i+1] == 'A')
{
secondBase = 0;
}
else if(newSequence[i+1] == 'C')
{
secondBase = 1;
}
else if(newSequence[i+1] == 'G')
{
secondBase = 2;
}
else if(newSequence[i+1] == 'U')
{
secondBase = 3;
}
if(newSequence[i+2] == 'A')
{
thirdBase = 0;
}
else if(newSequence[i+2] == 'C')
{
thirdBase = 1;
}
else if(newSequence[i+2] == 'G')
{
thirdBase = 2;
}
else if(newSequence[i+2] == 'U')
{
thirdBase = 3;
}
bool readSequence = true;
if (aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[0][3][2])
{
readSequence = true;
}
else if (aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[3][0][0] ||
aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[3][0][2] ||
aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[3][2][0])
{
readSequence = false;
}
if(readSequence)
{
protein = protein + aminoAcid[firstBase][secondBase][thirdBase] + " ";
}
else
{
continue;
}
}
return protein;
}
bool用于"开始密码子"one_answers"停止密码子",基本上是链内的密码子,它将告诉您何时记录和何时停止。newSequence将是RNA链。
编辑:我是新手,所以我知道我的代码可能看起来很丑。任何关于如何清理它的反馈也是非常感谢的。for(int i = 0; i < newSequence.length() - 2; i+3)
应该for(int i = 0; i < newSequence.length() - 2; i += 3)
你的代码永远不会改变i
的值,这就是为什么它永远不会停止运行。
循环从同一段代码开始三次,将字母转换为'基索引'。这显然是使用函数
的地方for (int i = 0; i < newSequence.length() - 2; i += 3)
{
int firstBase = baseIndex(newSequence[i]);
int secondBase = baseIndex(newSequence[i + 1]);
int thirdBase = baseIndex(newSequence[i + 2]);
...
baseIndex函数由您来编写