
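I have two lists of strings, each stored as a column in a table (PM25_spr{i}.MonitorID and O3_spr{i}.MonitorID). The lists have different lengths. I want to compare the first 11 characters of each entry and extract, for each list, the indices where they match.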

List 1:
List 2:

I tried intersect, but it is not the right way to do what I want. I am not sure how to use ismember, because I only want to look at the first 11 characters.

I also tried strncmp, but it fails with the error: Inputs must be the same size or either one can be a scalar.

chars2compare = length('18-097-0083'); 
strncmp(O3_spr{i}.MonitorID, PM25_spr{i}.MonitorID,chars2compare)
PM25_spr_MID = cell(length(years),1); % Preallocate cell array
for n = 1:length(PM25_spr{i}.MonitorID) 
    s = char(PM25_spr{i}.MonitorID(n)); % Convert string to char
    PM25_spr_MID{i}(n) = cellstr(s(1:11)); % Pull out 1-11 characters and convert to cell
end
O3_spr_MID = cell(length(years),1); % Preallocate cell array
for n = 1:length(O3_spr{i}.MonitorID)
    s = char(O3_spr{i}.MonitorID(n)); % Convert string to char
    O3_spr_MID{i}(n) = cellstr(s(1:11)); % Pull out 1-11 characters and convert to cell
end
[C, ia, ib] = intersect(O3_spr_MID{i}, PM25_spr_MID{i}) 
PerCap_spr_O3{i} = O3_spr{i}(ia,:);
PerCap_spr_PM25{i} = PM25_spr{i}(ib,:);


I. Operating on cell arrays

With intersect -

%// Clip off after first 11 characters in each cell of the input cell arrays
list1_f11 = arrayfun(@(n) list1{n}(1:11),1:numel(list1),'uni',0)
list2_f11 = arrayfun(@(n) list2{n}(1:11),1:numel(list2),'uni',0)
%// Use intersect to find common indices in the input cell arrays
[~,idx_list1,idx_list2] = intersect(list1_f11,list2_f11)

With ismember -

%// Clip off after first 11 characters in each cell of the input cell arrays
list1_f11 = arrayfun(@(n) list1{n}(1:11),1:numel(list1),'uni',0)
list2_f11 = arrayfun(@(n) list2{n}(1:11),1:numel(list2),'uni',0)
%// Use ismember to find common indices in the input cell arrays
[LocA,LocB] = ismember(list1_f11,list2_f11);
idx_list1 = find(LocA)
idx_list2 = LocB(LocA)
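
As a quick check of the two approaches above, here is a minimal sketch on made-up data. The monitor IDs are hypothetical, cellfun is swapped in for arrayfun purely for brevity, and every entry is assumed to have at least 11 characters. It also shows one difference worth knowing: intersect returns the matches sorted by the common value (and drops duplicates), whereas ismember keeps the order of list1.

```matlab
% Hypothetical monitor IDs (not taken from the original data)
list1 = {'18-097-0083-44201-1'; '18-097-0057-44201-1'; '17-031-0001-44201-1'};
list2 = {'17-031-0001-88101-1'; '18-097-0083-88101-3'};

% Keep only the first 11 characters of every entry
list1_f11 = cellfun(@(s) s(1:11), list1, 'UniformOutput', false);
list2_f11 = cellfun(@(s) s(1:11), list2, 'UniformOutput', false);

% intersect: index pairs ordered by the sorted common values
[~, ia, ib] = intersect(list1_f11, list2_f11);

% ismember: index pairs ordered as the entries appear in list1
[tf, loc] = ismember(list1_f11, list2_f11);
idx_list1 = find(tf);
idx_list2 = loc(tf);
```

Both calls identify the same matching pairs here; only the order of the returned indices differs.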

II. Operating on char arrays


With intersect + 'rows' -

%// Convert to char arrays
list1c = char(list1)
list2c = char(list2)
%// Clip char arrays after first 11 columns
list1c_f11 = list1c(:,1:11)
list2c_f11 = list2c(:,1:11)
%// Use intersect with 'rows' option
[~,idx_list1,idx_list2] = intersect(list1c_f11,list2c_f11,'rows')

III. Operating on numeric arrays

We can go one step further and convert the char arrays into single-column numeric arrays, since comparing numbers may lead to a faster solution.

%// Convert to char arrays
list1c = char(list1)
list2c = char(list2)
%// Clip char arrays after first 11 columns
list1c_f11 = list1c(:,1:11)
list2c_f11 = list2c(:,1:11)
%// Remove char columns of hyphens (3 and 7 for the given input)
list1c_f11(:,[3 7])=[];
list2c_f11(:,[3 7])=[];
%// Convert char arrays to numeric arrays
ncols = size(list1c_f11,2);
list1c_f11num = (list1c_f11 - '0')*(10.^(ncols-1:-1:0))'
list2c_f11num = (list2c_f11 - '0')*(10.^(ncols-1:-1:0))'
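
To make the conversion concrete, here is a small sketch for a single ID, using the sample value '18-097-0083' from the question. It assumes the hyphens sit at positions 3 and 7 and that every other character is a digit; the variable names are illustrative only.

```matlab
% Worked example of the char-to-number conversion for one ID
id = '18-097-0083';

% Drop the hyphens at positions 3 and 7, leaving the 9 digit characters
digits_only = id;
digits_only([3 7]) = [];

% Subtracting '0' maps each digit character to its numeric value;
% multiplying by descending powers of ten assembles the full number
ncols = numel(digits_only);
id_num = (digits_only - '0') * (10.^(ncols-1:-1:0))';
```

Nine digits fit comfortably within the exact integer range of a double, so the resulting numbers can be compared with == without rounding concerns.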


With ismember (would save memory, but might not be fast across all data sizes) -

[LocA,LocB] = ismember(list1c_f11num,list2c_f11num);
idx_list1 = find(LocA)
idx_list2 = LocB(LocA)


With intersect -

[~,idx_list1,idx_list2] = intersect(list1c_f11num,list2c_f11num)


With bsxfun -

[idx_list1,idx_list2] = find(bsxfun(@eq,list1c_f11num,list2c_f11num'))
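
In MATLAB R2016b and later, implicit expansion should allow the same all-pairs comparison as the bsxfun call to be written with a plain == between a column and a row vector. A minimal sketch with hypothetical numeric IDs (the values and variable names are made up for illustration):

```matlab
% Hypothetical numeric IDs, as produced by the char-to-number conversion
list1_num = [180970083; 180970057; 170310001];
list2_num = [170310001; 180970083];

% Column vector == row vector expands to a 3-by-2 logical matrix whose
% (r,c) entry is true when list1_num(r) equals list2_num(c)
match = (list1_num == list2_num');

% Row subscripts index into list1, column subscripts into list2
[idx_list1, idx_list2] = find(match);
```

As with bsxfun, the full comparison matrix is built in memory, so this form suits small to moderately sized inputs.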


