How to use "textscan" to convert a string into a table

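I am using MATLAB's urlread to read the COVID-19 data provided by Johns Hopkins University as a .csv file, but I am not sure how to use textscan in the next step to convert the string into a table. The first two columns of the .csv file are strings specifying the region, followed by a large number of columns containing the number of registered infections by date.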

Currently, I just save the string returned by urlread locally and then open that file with importdata, but surely there must be a more elegant solution.

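You are mixing up two things: either you want to use "textscan" (and, of course, "fopen" and "fclose") to read from the downloaded csv file, or you want to use "urlread" (or rather "webread", since MATLAB recommends no longer using "urlread"). I went for the latter, since I have never done this myself ^^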
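For completeness, the "textscan" route asked about in the question does not strictly need a file, because textscan also accepts a character vector as input. The following is only a minimal sketch under assumptions, not the approach used in the rest of this answer: the sample text txt, its column names and its values are made up for illustration, and the layout (two text columns followed by numeric columns) is taken from the question.

```matlab
% Hypothetical sample standing in for the text returned by webread/urlread
txt = sprintf('%s\n', ...
    'Province,Country,d1,d2', ...
    'Hubei,China,444,549', ...
    ',"Korea, South",1,2');

% Separate the header line from the body
idx   = find(txt == newline, 1);
names = strsplit(txt(1:idx-1), ',');
body  = txt(idx+1:end);

% %q reads text fields and honours double quotes; %f reads numbers
fmt = [repmat('%q', 1, 2), repmat('%f', 1, numel(names)-2)];
C   = textscan(body, fmt, 'Delimiter', ',');

% Assemble the table: two string columns plus the numeric block
T = [table(string(C{1}), string(C{2}), 'VariableNames', names(1:2)), ...
     array2table([C{3:end}], 'VariableNames', names(3:end))];
```

The real header contains names such as Province/State and 1/22/20, which are not valid MATLAB identifiers. Releases before R2019b would need something like matlab.lang.makeValidName before using them as variable names.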

So, first we read the data and split it into rows:

url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv";
% read raw data as single character array
web = webread(url);
% split the array into a cell array representing each row of the table
% (the delimiter is the newline escape sequence '\n', not the letter 'n')
row = strsplit(web,'\n');
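As a side note, the row split can be made a little more robust against Windows line endings and a trailing newline. This is a sketch on a made-up sample string, assuming a release that has splitlines (R2016b or later); whether the real file needs it has not been checked here.

```matlab
% Hypothetical sample with CRLF line endings and a trailing newline
sample = sprintf('a,b\r\n1,2\r\n');

% strtrim drops the trailing line break, splitlines handles \n and \r\n alike
rows = splitlines(strtrim(sample));

% rows is a column cell array with one character vector per line
nRows = numel(rows);
```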

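Then we allocate a table (preallocation is a good idea in MATLAB, because it stores variables at contiguous addresses in RAM, so tell MATLAB beforehand how much space you need):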

len = length(row);
% get the CSV-header as information about the number of columns
Head = strsplit(row{1},',');
% allocate table 
S = strings(len,2);
N = NaN(len,length(Head)-2);
T = [table(strings(len,1),strings(len,1),'VariableNames',Head(1:2)),...
repmat(table(NaN(len,1)),1,length(Head)-2)];
% rename columns of table
T.Properties.VariableNames = Head;

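Note that I used a little trick to allocate that many "NaN" columns by repeating a single table. However, concatenating this table with the table of strings is difficult, because both contain the column names Var1 and Var2. That is why I renamed the columns of the first table right away.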
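If your release supports it (the 'Size' syntax of table, R2018b or later), the preallocation can be written without the repmat trick. This is an alternative sketch, not what the code above does; the column names and the row count below are placeholders for illustration.

```matlab
% Hypothetical header and row count standing in for Head and len
names = {'Province','Country','d1','d2','d3'};
len   = 4;
nNum  = numel(names) - 2;

% Two string columns followed by double columns
types = [repmat({'string'}, 1, 2), repmat({'double'}, 1, nNum)];
T = table('Size', [len, numel(names)], ...
          'VariableTypes', types, ...
          'VariableNames', names);

% Doubles are preallocated as 0, so overwrite them with NaN explicitly
T{:, 3:end} = NaN(len, nNum);
```

Because the names are passed in directly, there is no clash between default names such as Var1 and Var2.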

Now we can actually fill the table (which is a bit nasty, because someone thought it was a good idea to write "Korea, South" into a comma-separated file):

for i = 2:len
    % split this row into columns
    col = strsplit(row{i},',');
    % quick conversion
    num = str2double(col);
    % keep strings where the result is NaN
    lg = isnan(num);
    str = cellfun(@string,col(lg));
    T{i,1} = str(1);
    T{i,2} = strjoin(str(2:end)); % this is a nasty workaround necessary due to "Korea, South"
    T{i,3:end} = num(~lg);
end
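An alternative to the strjoin workaround is to split only at commas that sit outside double quotes. The sketch below shows this on a single made-up row; the regular expression is my own suggestion rather than part of the loop above, and it assumes that quotes in the file are always balanced.

```matlab
% Hypothetical row with a quoted field that contains a comma
line = ',"Korea, South",36,128,1,2';

% Split at commas followed by an even number of quotes (i.e. outside quotes)
fields = regexp(line, ',(?=(?:[^"]*"[^"]*")*[^"]*$)', 'split');

% Remove the surrounding quotes and convert the numeric part
fields = strrep(fields, '"', '');
nums   = str2double(fields(3:end));
```

With this split, the first two fields are always the region columns, so the isnan test is no longer needed to tell text from numbers.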

This should also work for the days to come. Let me know what you end up doing with the data.
