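I have two data arrays from two separate vendors. Both arrays have the same layout:
Dimension 1: dates
Dimension 2: instruments (different futures deliveries)
Dimension 3: six instrument attributes (open, high, low, close, volume, open interest)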
For each 3D array, I have two variables holding the dates and the instruments (e.g., A1Times and A1Inst in my code).
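However, although there is significant overlap, the dates and instruments are not identical in the two arrays. Some dates and/or instruments may be present in Array1 but not in Array2, and vice versa.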
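I'm trying to create Array3, a third data array whose first dimension is the union of the dates from both sources, whose second dimension is the union of the available instruments, and whose third dimension is the six instrument attributes.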
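Where possible, I'd like to populate Array3 from Array2, and fall back to Array1 only where Array2 has nothing. So, for a given instrument and date, if data exists in both Array1 and Array2, I want to populate Array3 from Array2.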
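I tried a solution that converts slices of the arrays to timetables, uses retime to bring the slices to the same length in time, and copies the data into the third array. This is slow, and I think there must be a better way. I'd be grateful if someone could show me a vectorized approach.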
Array1 = randn(4,5,6); % time x instrument x attribute
A1Times = datetime([today-3:today]', 'ConvertFrom','datenum'); % times of first dimension of Array1
A1Inst = [3 4 5 6 7]'; % instruments of second dimension of Array1
Array1(round(1 + (numel(Array1)-1).*rand(round(numel(Array1)/5),1))) = NaN; % put a few random NaNs in the array
Array2 = randn(6,8,6);
A2Times = datetime([today-2:today+3]','ConvertFrom','datenum'); % times of first dimension of Array2
A2Inst = [1 2 5 6 7 8 9 10]'; % instruments of second dimension of Array2
Array2(round(1 + (numel(Array2)-1).*rand(round(numel(Array2)/5),1))) = NaN; % put a few random NaNs in the array
% third dimension will always be the same for both matrices
dateUnion = union(A1Times,A2Times);
instrumentUnion = union(A1Inst,A2Inst);
% Initialize A3:
Array3 = NaN(numel(dateUnion),numel(instrumentUnion),6);
% what I want to do:
% if data exists for both Array1 and Array2, populate Array3 with data from Array2
% if data doesn't exist for Array2 and does exist for Array1, populate Array3 from Array1
%% clumsy retime solution, with two for loops
A1varnames = matlab.lang.makeValidName(cellstr([repmat('Array1Instrument',numel(A1Inst),1) num2str(A1Inst)]));
A2varnames = matlab.lang.makeValidName(cellstr([repmat('Array2Instrument',numel(A2Inst),1) num2str(A2Inst)]));
for ij = 1:6 % looping through third dimension
    A1layer = array2timetable(Array1(:,:,ij),'RowTimes',A1Times);
    A1layer.Properties.VariableNames = A1varnames;
    A2layer = array2timetable(Array2(:,:,ij),'RowTimes',A2Times);
    A2layer.Properties.VariableNames = A2varnames;
    A1layer = retime(A1layer,dateUnion);
    A2layer = retime(A2layer,dateUnion);
    for ii = 1:numel(instrumentUnion)
        [~,A1loc] = ismember(instrumentUnion(ii),A1Inst);
        [~,A2loc] = ismember(instrumentUnion(ii),A2Inst);
        if (A1loc == 0)
            Array3(:,ii,ij) = A2layer{:,A2loc};
        elseif A2loc == 0
            Array3(:,ii,ij) = A1layer{:,A1loc};
        else % if instrument exists in both sources
            A1vec = A1layer{:,A1loc};
            A2vec = A2layer{:,A2loc};
            % if data exists in Array2 and Array1, choose Array2
            % if data exists in Array2 and not Array1, choose Array2
            % if data exists in Array1 and not Array2, choose Array1
            bothpopulated = ~isnan(A1vec) & ~isnan(A2vec);
            onlyA2populated = ~isnan(A2vec) & isnan(A1vec);
            onlyA1populated = isnan(A2vec) & ~isnan(A1vec);
            Array3(bothpopulated,ii,ij) = A2vec(bothpopulated);
            Array3(onlyA2populated,ii,ij) = A2vec(onlyA2populated);
            Array3(onlyA1populated,ii,ij) = A1vec(onlyA1populated);
        end
    end
end
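First, you need to map AxTimes and AxInst to sequential integers so that they can be used for multidimensional array indexing. The third output of unique gives you these indices. After that, you only need logical and multidimensional array indexing to assign the values. Here I have simplified your example and changed the times to plain numbers.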
Array1 = randn(4,5,6);
A1Times = [1 2 3 4].';
A1Inst = [3 4 5 6 7].';
Array1(round(1 + (numel(Array1)-1).*rand(round(numel(Array1)/5),1))) = NaN;
Array2 = randn(6,8,6);
A2Times = [3 4 5 6 7 8].';
A2Inst = [1 2 5 6 7 8 9 10].';
Array2(round(1 + (numel(Array2)-1).*rand(round(numel(Array2)/5),1))) = NaN;
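% Map times and instruments to row/column positions in the sorted union
[ut,~,iut] = unique([A1Times; A2Times]);
[ui,~,iui] = unique([A1Inst; A2Inst]);
Array3 = NaN(numel(ut), numel(ui), 6);
% Array2 has priority, so copy it in first
Array3(iut(numel(A1Times)+1:end), iui(numel(A1Inst)+1:end), :) = Array2;
% Cells covered by Array1 that are still NaN after copying Array2
idx3 = false(size(Array3));
idx3(iut(1:numel(A1Times)), iui(1:numel(A1Inst)), :) = true;
idx3 = idx3 & isnan(Array3);
% The same mask expressed in Array1's own coordinates
idx1 = idx3(iut(1:numel(A1Times)), iui(1:numel(A1Inst)), :);
Array3(idx3) = Array1(idx1);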
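The same idea should carry over to the original datetime values without converting them to numbers, since unique also accepts datetime arrays. Below is a minimal sketch under a few assumptions: the dates and NaN positions are made up for illustration, the helper names r1, c1, r2, c2, sub and gap are not from the question, and each source has no duplicate dates or instruments. It is a slight variant of the approach above: it fills the gaps through a temporary block aligned with Array1 instead of a full-size logical mask, so it does not depend on A1Times and A1Inst already being sorted (the masked assignment above does, because it pairs elements by linear order).

```matlab
% Small deterministic inputs (hypothetical values, for illustration only)
A1Times = datetime(2024,1,1) + days((0:3).');   % 4 dates
A2Times = datetime(2024,1,2) + days((0:5).');   % 6 dates, 3 of them shared
A1Inst  = [3 4 5 6 7].';
A2Inst  = [1 2 5 6 7 8 9 10].';
Array1  = randn(4,5,6);  Array1(1:5:end) = NaN; % sprinkle some NaNs
Array2  = randn(6,8,6);  Array2(1:7:end) = NaN;

% The third output of unique maps every input element to its position
% in the sorted union; this works for datetime as well as for numbers
[ut,~,iut] = unique([A1Times; A2Times]);
[ui,~,iui] = unique([A1Inst;  A2Inst]);
r1 = iut(1:numel(A1Times));   r2 = iut(numel(A1Times)+1:end);  % rows
c1 = iui(1:numel(A1Inst));    c2 = iui(numel(A1Inst)+1:end);   % columns

Array3 = NaN(numel(ut), numel(ui), size(Array1,3));
Array3(r2,c2,:) = Array2;          % Array2 has priority

% Fill from Array1 only where Array3 is still NaN
sub = Array3(r1,c1,:);             % block of Array3 aligned with Array1
gap = isnan(sub);
sub(gap) = Array1(gap);
Array3(r1,c1,:) = sub;
```

After this runs, ut and ui label the rows and columns of Array3, so a given date or instrument can be looked up with ismember or a logical comparison against them.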