Comparing strings in a cell array

I am trying to find the most frequently used word in a list of words. This is my code so far:

uniWords = unique(lower(words));
for i = 1:length(words)
    for j = 1:length(uniWords)
        if (uniWords(j) == lower(words(i)))
            freq(j) = freq(j) + 1;
        end
    end
end

When I try to run the script, I get the following error:

Undefined function 'eq' for input arguments of
type 'cell'.
Error in Biweekly3 (line 106)
    if (uniWords(j) == lower(words(i)))

Any help is appreciated!

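You need to use {} to extract the contents of a cell. Indexing with () returns a 1-by-1 cell, and == is not defined for cells, which is what the error message is telling you: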

strcmpi(uniWords{j},words{i})

Also, I recommend comparing strings with strcmp or, in this case, strcmpi, which ignores case, so you don't need to call lower.
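Putting both suggestions together, the loop from the question might look like the sketch below. It assumes words is a cell array of character vectors. The sample words list, the loop variable names wi and ui, and the zeros preallocation of freq are illustrative additions, not part of the original post.

```matlab
% Sample input (hypothetical; substitute your own cell array of strings)
words = {'The','hello','one','bye','the','one'};

uniWords = unique(lower(words));      % distinct words, lower-cased and sorted
freq = zeros(1, numel(uniWords));     % one counter per distinct word

for wi = 1:numel(words)
    for ui = 1:numel(uniWords)
        % {} extracts the character vector; strcmpi ignores case
        if strcmpi(uniWords{ui}, words{wi})
            freq(ui) = freq(ui) + 1;
        end
    end
end
```

With this sample input, freq holds the count for each entry of uniWords in the same order.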

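Be careful when using == on strings: it compares character by character, so the strings must have the same length, otherwise you get an error: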

>> s1='first';
>> s2='second';
>> s1==s2
Error using  == 
Matrix dimensions must agree.
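By contrast, strcmp accepts inputs of different lengths and simply returns false. A small sketch reusing s1 and s2 from above (the names sameExact, sameNoCase and perChar are illustrative):

```matlab
s1 = 'first';
s2 = 'second';

sameExact  = strcmp(s1, s2);          % false, and no error despite the length mismatch
sameNoCase = strcmpi('The', 'the');   % true, because strcmpi ignores case

% == on equal-length strings compares element by element
perChar = ('abc' == 'abd');           % a logical vector, not a single true/false
```

Note that even when the lengths match, == yields one logical value per character, so it is rarely what you want inside an if condition.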

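No loops are needed. unique assigns a unique identifier to each word, and you can then use sparse to sum the number of occurrences of each identifier. From that, you can easily find the maximum count and the word or words that attain it: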

[uniWords, ~, jj] = unique(lower(words)); % jj(k) is the index of words{k} in uniWords
freq = full(sparse(ones(1,length(jj)),jj,1)); % number of occurrences of each unique word
m = max(freq);
result = uniWords(freq==m); % returns more than one word if there's a tie

For example, with

words = {'The','hello','one','bye','the','one'}

the result is

>> result
result = 
    'one'    'the'
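The same counts can also be obtained with accumarray in place of sparse. This is an alternative sketch, not the approach described above; the variable name mostCommon is illustrative.

```matlab
words = {'The','hello','one','bye','the','one'};

[uniWords, ~, jj] = unique(lower(words));  % jj(k) is the index of words{k} in uniWords
freq = accumarray(jj(:), 1).';             % occurrences of each unique word, as a row
mostCommon = uniWords(freq == max(freq));  % every word tied for the maximum count
```

accumarray adds 1 to the bin given by each entry of jj, which is the same summation that sparse performs on duplicate indices.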

I think you need to do:

if (uniWords{j} == lower(words{i}))

Also, I recommend not using i and j as variable names in MATLAB, since both are built-in names for the imaginary unit.
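A short sketch of why reusing i can be surprising (the values and the names z1, z2, z3 are purely illustrative):

```matlab
z1 = 3 + 2*i;   % i is the imaginary unit here, so z1 is complex
i = 5;          % now i is an ordinary variable, as it would be after a for-loop over i
z2 = 3 + 2*i;   % uses the variable, so z2 is a real number
z3 = 3 + 2i;    % the literal 2i is always imaginary, whatever the variable i holds
clear i         % remove the variable so i means the imaginary unit again
```

Writing complex literals as 2i or 1j, and choosing other loop variable names, avoids the ambiguity.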

Update

As Chappjc pointed out, it is better to use strcmp (or, in your case, strcmpi, and skip lower), since you want to ignore case.
