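I have a folder containing receipt images that are named in a specific way. The date comes first, in reversed format (e.g. 21/11/2015 -> 15_11_21), followed by a space and then the value of the receipt (e.g. 18,45 -> 18_45).
Suppose the files are stored in C:\pictures\receipts. In this folder I have 3 files: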
15_11_21 18_45.jpg
15_11_22 115_28.jpg
15_12_02 3_00.jpg
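I want to create an array with 3 columns. The first column holds the receipt date in normal format, the second holds the value as a negative number, and the third holds the absolute path of the file. The array should look like this: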
Receipts = [21/11/2015|-18,45 |C:\pictures\receipts\15_11_21 18_45.jpg
22/11/2015|-115,28|C:\pictures\receipts\15_11_22 115_28.jpg
02/12/2015| -3,00 |C:\pictures\receipts\15_12_02 3_00.jpg];
I have tried modifying and combining various functions, such as this one to get the full paths:
[status, list] = system( 'dir /B /S *.mp3' );
result = textscan( list, '%s', 'delimiter', '\n' );
fileList = result{1}
strsplit to separate the values in the file name, and others, but I could not get the desired result.
It looks like strsplit should do what you want. Try:
strsplit(filename, {' ', '.'})
Also, I would use dir rather than system, since it is less dependent on the operating system.
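The two suggestions above could be combined roughly as follows. This is only a sketch: the folder path is taken from the question, the variable names (folder, files, parts, ymd) are made up for illustration, and it assumes every file name follows the "yy_mm_dd value.jpg" pattern exactly. It builds a cell array rather than a plain matrix, because the three columns mix text and numbers.

```matlab
% Sketch only: folder path as given in the question.
folder = 'C:\pictures\receipts';
files = dir(fullfile(folder, '*.jpg'));   % struct array, one element per file

Receipts = cell(numel(files), 3);         % columns: date, value, full path
for k = 1:numel(files)
    % '15_11_21 18_45.jpg' -> {'15_11_21', '18_45', 'jpg'}
    parts = strsplit(files(k).name, {' ', '.'});
    ymd = strsplit(parts{1}, '_');        % {'15', '11', '21'}
    % Reverse the date fields and assume the century is 20xx
    Receipts{k, 1} = sprintf('%s/%s/20%s', ymd{3}, ymd{2}, ymd{1});
    % '18_45' -> -18.45
    Receipts{k, 2} = -str2double(strrep(parts{2}, '_', '.'));
    Receipts{k, 3} = fullfile(folder, files(k).name);
end
```

Storing the value as a number keeps it usable for sums; it can be formatted with a decimal comma later when it is displayed.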
A bit "hacky":
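fullpath = 'C:\pictures\receipts\15_11_21 18_45.jpg';
parts = strsplit(fullpath, '\\');   % '\\' is the escape sequence for a backslash
filename = parts{end};
% File names are yy_mm_dd, so the first field is the year
d = textscan(filename, '%d_%d_%d %d_%d.jpg');
year = d{1};
month = d{2};
day = d{3};
a = d{4};   % whole part of the value
b = d{5};   % decimal part of the value
% %02d keeps leading zeros (02/12/2015, -3,00); the minus sign is literal
receipt = sprintf('%02d/%02d/20%02d|-%d,%02d|%s', day, month, year, a, b, fullpath)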
Have a look at the format operators (e.g. type doc sprintf). You may need to add some flags for alignment and spacing.
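As a small illustration of such flags, the sketch below formats the third receipt from the question. The variable names and the column width of 7 are arbitrary choices for this example, not something the question prescribes.

```matlab
% Parsed fields of '15_12_02 3_00.jpg', hard-coded for illustration.
day = 2; month = 12; year = 15; whole = 3; cents = 0;
fullpath = 'C:\pictures\receipts\15_12_02 3_00.jpg';

% %02d zero-pads a number to two digits.
value = sprintf('-%d,%02d', whole, cents);
% %7s right-aligns the value in a field of 7 characters.
receipt = sprintf('%02d/%02d/20%02d|%7s|%s', day, month, year, value, fullpath)
```

Using %-7s instead of %7s would left-align the value within the same width.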
One option, using regular expressions and a struct array as the final output:
% Get list of JPEGs in the current directory + subdirectories
[~, list] = system( 'dir /B /S *.jpg' );
result = textscan( list, '%s', 'delimiter', '\n' );
fileList = result{1};
% Split out file names, could use a regex but why bother. Using cellfun
% rather than an explicit loop
[~, filenames] = cellfun(@fileparts, fileList, 'UniformOutput', false);
% Use named tokens to pull out our data for analysis
Receipts = regexp(filenames, '(?<date>\d*_\d*_\d*)\s*(?<cost>\d*_\d*)', 'names');
Receipts = [Receipts{:}]; % Dump out our nested data
[Receipts(:).fullpath] = fileList{:}; % Add file path to our structure
% Reformat costs
% Replace underscore with decimal, convert to numeric array and negate
tmp = -str2double(strrep({Receipts(:).cost}, '_', '.'));
tmp = num2cell(tmp); % Necessary intermediate step, because MATLAB...
[Receipts(:).cost] = tmp{:}; % Replace field in our data structure
clear tmp
% Reformat dates
formatIn = 'yy_mm_dd';
formatOut = 'dd/mm/yyyy';
pivotYear = 2000; % Pivot year needed since we have 2-digit years
% datenum needed because we have a custom input date format
tmp = datestr(datenum({Receipts(:).date}, formatIn, pivotYear), formatOut);
tmp = cellstr(tmp); % Necessary intermediate step, because MATLAB...
[Receipts(:).date] = tmp{:};
clear tmp
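This produces a struct array, Receipts. I went this route because accessing the data later is more explicit. For example, if I want the cost of the second receipt, I can do: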
Employee2Cost = Receipts(2).cost;
which returns:
Employee2Cost =
-115.2800
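If the three-column layout from the question is still wanted, the struct array can be flattened into a cell array. The sketch below is one possible way to do that; it uses a hand-built stand-in for Receipts with the values from the question, and ReceiptsCell and total are names invented here.

```matlab
% Stand-in for the struct array built above (values from the question).
Receipts = struct( ...
    'date', {'21/11/2015', '22/11/2015'}, ...
    'cost', {-18.45, -115.28}, ...
    'fullpath', {'C:\pictures\receipts\15_11_21 18_45.jpg', ...
                 'C:\pictures\receipts\15_11_22 115_28.jpg'});

% One row per receipt, columns: date, cost, full path
ReceiptsCell = [{Receipts.date}', {Receipts.cost}', {Receipts.fullpath}'];

% All costs as a numeric vector, e.g. to total them
total = sum([Receipts.cost]);
```

A cell array is needed here because the columns mix text and numbers, which a plain numeric matrix cannot hold.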