MATLAB: Create an array from a folder, with columns for the split file name and the full path



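I have a folder containing pictures of receipts that are named in a specific way. The date comes first, in reversed format (e.g. 21/11/2015 -> 15_11_21), followed by a space and then the value of the receipt (e.g. 18,45 -> 18_45).

Assume the files are stored in C:\pictures\receipts\. In this folder I have 3 files: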

15_11_21 18_45.jpg
15_11_22 115_28.jpg
15_12_02 3_00.jpg

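I want to create an array with 3 columns. The first column contains the receipt date in normal format, the second contains the value as a negative number, and the third contains the absolute path of the file. The array should look like this: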

Receipts = [21/11/2015|-18,45 |C:\pictures\receipts\15_11_21 18_45.jpg
            22/11/2015|-115,28|C:\pictures\receipts\15_11_22 115_28.jpg
            02/12/2015| -3,00 |C:\pictures\receipts\15_12_02 3_00.jpg];

I have tried modifying and combining various functions, for example this one to get the full paths:

[status, list] = system( 'dir /B /S *.mp3' );
result = textscan( list, '%s', 'delimiter', '\n' );
fileList = result{1}

I also tried strsplit to separate the values in the file name, and even this function, but I could not get the desired result.

It looks like strsplit should do what you want. Try:

strsplit(filename, {' ', '.'})

Also, I would use dir rather than system, since it is less likely to depend on the operating system.
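As a rough sketch of that suggestion, the snippet below lists the JPEGs with dir, builds absolute paths with fullfile, and splits each name with strsplit. The temporary demo folder and the variable names (folder, demoNames, fullPaths, parts) are assumptions made so the example runs anywhere; in practice folder would point at the real receipts directory.

```matlab
% Hypothetical demo folder so the sketch runs anywhere; in practice set
% folder to the real receipts directory instead.
folder = tempname;
mkdir(folder);
demoNames = {'15_11_21 18_45.jpg', '15_11_22 115_28.jpg'};
for k = 1:numel(demoNames)
    fclose(fopen(fullfile(folder, demoNames{k}), 'w'));  % create an empty file
end

% List the JPEGs with dir and build absolute paths
files = dir(fullfile(folder, '*.jpg'));
names = sort({files.name});
fullPaths = cellfun(@(n) fullfile(folder, n), names, 'UniformOutput', false);

% Split each name into date part, value part, and extension
parts = cellfun(@(n) strsplit(n, {' ', '.'}), names, 'UniformOutput', false);
```

Here parts{1}{1} holds the date part and parts{1}{2} the value part of the first file, each of which can be split again on '_'.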

A bit of a hack:

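fullpath = 'C:\pictures\receipts\15_11_21 18_45.jpg';
parts = strsplit(fullpath, '\');   % split the path on backslashes
filename = parts{end};             % the last piece is the file name
d = textscan(filename, '%d_%d_%d %d_%d.jpg');
year  = d{1};    % the name starts with the two-digit year
month = d{2};
day   = d{3};
a     = -d{4};   % whole part of the value, negated
b     = d{5};    % fractional part of the value
receipt = sprintf('%02d/%02d/20%02d|%d,%02d|%s', day, month, year, a, b, fullpath)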

Have a look at the format operators (e.g. type doc sprintf). You may need to add some flags for alignment and spacing.
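To illustrate the kind of flags meant here, this is a minimal sketch assuming the value is already available as a single negative number rather than as two integers. The variable names and sample values are hypothetical. A field width in the format operator right-aligns the amount, and strrep swaps the decimal point for a comma.

```matlab
% Hypothetical values for one receipt
dateStr  = '21/11/2015';
amount   = -18.45;
fullpath = 'C:\pictures\receipts\15_11_21 18_45.jpg';

% %7.2f right-aligns the amount in a 7-character field with 2 decimals;
% strrep then swaps the decimal point for a comma
amountStr = strrep(sprintf('%7.2f', amount), '.', ',');
receipt = sprintf('%s|%s|%s', dateStr, amountStr, fullpath);
```

Working with one numeric value also avoids losing the minus sign when the whole part of the value is zero.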

One option, using regular expressions and a structure array as the final output:

% Get list of JPEGs in the current directory + subdirectories
[~, list] = system( 'dir /B /S *.jpg' );
result = textscan( list, '%s', 'delimiter', '\n' );
fileList = result{1};
% Split out file names, could use a regex but why bother. Using cellfun
% rather than an explicit loop
[~, filenames] = cellfun(@fileparts, fileList, 'UniformOutput', false);
% Use named tokens to pull out our data for analysis
Receipts = regexp(filenames, '(?<date>\d*_\d*_\d*)\s*(?<cost>\d*_\d*)', 'names');
Receipts = [Receipts{:}];  % Dump out our nested data
[Receipts(:).fullpath] = fileList{:};  % Add file path to our structure
% Reformat costs
% Replace underscore with decimal, convert to numeric array and negate
tmp = -str2double(strrep({Receipts(:).cost}, '_', '.')); 
tmp = num2cell(tmp);  % Necessary intermediate step, because MATLAB...
[Receipts(:).cost] = tmp{:};  % Replace field in our data structure
clear tmp
% Reformat dates
formatIn = 'yy_mm_dd';
formatOut = 'dd/mm/yyyy';
pivotYear = 2000;  % Pivot year needed since we have 2-digit years
% datenum needed because we have a custom input date format
tmp = datestr(datenum({Receipts(:).date}, formatIn, pivotYear), formatOut);
tmp = cellstr(tmp);  % Necessary intermediate step, because MATLAB...
[Receipts(:).date] = tmp{:};
clear tmp

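This produces a structure array, Receipts. I went this route because it makes accessing the data later more explicit. For example, if I want the cost of the second receipt, I can do: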

Employee2Cost = Receipts(2).cost;

Which returns:

Employee2Cost =
 -115.2800
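If the three-column layout from the question is still wanted, a structure array like this can be flattened into a cell array. The sketch below assumes the fields date, cost and fullpath from the code above; the sample values and the names ReceiptTable and totalCost are hypothetical.

```matlab
% Hypothetical struct array with the same fields as Receipts above
Receipts = struct( ...
    'date',     {'21/11/2015', '22/11/2015'}, ...
    'cost',     {-18.45, -115.28}, ...
    'fullpath', {'C:\pictures\receipts\15_11_21 18_45.jpg', ...
                 'C:\pictures\receipts\15_11_22 115_28.jpg'});

% One row per receipt, one column per field
ReceiptTable = [{Receipts.date}', {Receipts.cost}', {Receipts.fullpath}'];

% Numeric fields can also be gathered into a plain array
totalCost = sum([Receipts.cost]);
```

A cell array is needed because the columns mix text and numbers, which a regular numeric matrix cannot hold.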
