我想将文件 2 上的每个字母与文件 1 进行比较。例:
文件1:我的名字
文件 2 : 小米n@mes
我想得到差异数是 3,在文件 2 上:(i、@ 和 s(。你可以帮我吗
这是我的代码
public float getCER(String originalteks,String extractteks){
int end=0;
int start=0;
int different_char=0;
if(originalteks.length()!=extractteks.length()){
different_char=Math.abs(originalteks.length()-extractteks.length());
}
while(start<end){
if(originalteks.charAt(start)!=originalteks.charAt(start++))
different_char++;//jumlah diferent chart
}
return (float) different_char/originalteks.length();
}
它只计算字符的数量,而不是不同的字符。
以下实现测试您需要的总差异,并且能够通过比较较短的字符串与较长的子字符串的每个子字符串,直到其差异的最大偏移量来处理具有不同长度的字符串。从这些差异中选择最小的。当然,如果 handleOffset 为 false,那么我们将自己限制为字符串的开头并将差值添加到结果中;
public int getCER(String originalteks,String extractteks, boolean handleOffset){
String shorter = originalteks;
String longer = extractteks;
if (shorter.length() > longer.length()) {
shorter = extractteks;
longer = originalteks;
}
int[] differences = new int[handleOffset ? (longer.length() - shorter.length + 1) : 1];
for (int i = 0; i < differences.length; i++) differences[i] = 0;
for (int i = 0; i < minLength; i++) {
for (j = 0; j < differences.length; j++) {
if (shorter.charAt(i) !== longer.charAt(i + j)) {
differences[j]++;
}
}
}
int min = shorter.length() + 1;
for (int i = 0; i < differences.length; i++) {
if (differences[i] < min) min = differences[i];
}
if (!handleOffset) min += longer.length() - shorter.length();
return min;
}
这应该适合您。我只是在示例中注释我的更改。
public int getCER(String originalteks,String extractteks){
int end;
int different_char=0;
//define the shorter end
if(originalteks.length < extractteks.length)
end = originalteks.length();
else
end = extractteks.length();
//no if needed -> same length, diff will be 0
different_char=Math.abs(originalteks.length()-extractteks.length());
for(int start = 0; start < end; start++){
if(originalteks.charAt(start)!=extractteks.charAt(start))
different_char++;//jumlah diferent chart
}
return different_char;
}