matlab - Importing plain text as numbers into a MATLAB matrix for semantic neural network analysis
Problem description
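It seems that importing plain text into MATLAB, converting it to a numeric format, and storing it in a matrix requires a script. This step is needed for any kind of analysis of text data in MATLAB, such as extracting semantic features with a neural network or classifying text with known features. I have not found any way to do this without a script. Here is what I wrote: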
%texttoconvertincellarray - a cell array in which each row holds text imported from a csv file
maxrowsize=max(cellfun('length',texttoconvertincellarray)); %length of the longest string; stays a scalar even when several rows tie for longest
A=zeros(maxrowsize,length(texttoconvertincellarray)); %preallocate a zero-filled matrix: one column per row of text, padded to the longest string
%after the final transpose:
%Y (rows) - rows of the imported cell array
%X (columns) - character positions, up to the length of the longest string
j=1; %linear index of the first element of the current column
for i=1:length(texttoconvertincellarray) %loop through each row of text imported into the cell array
    %convert the characters of this row to their integer character codes
    conv = double(texttoconvertincellarray{i});
    %write the codes into the current column, starting at its first element
    A(j:j+length(conv)-1) = conv;
    %advance the linear index to the first element of the next column
    j=j+size(A,1);
end
A=transpose(A); %transpose so that each text row becomes a matrix row; needed because MATLAB linear indexing runs down the columns first (vertically), then across (horizontally)
%each row of A now represents one sentence from the imported cell array, with its characters converted to
%numeric codes
Can this be simplified further, if for no other reason than elegance?
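For comparison, here is a minimal sketch of one possible shorter form. It assumes the cell array is non-empty and contains only character row vectors; the three sample strings are hypothetical and stand in for the imported CSV rows. It fills each matrix row directly with row/column indexing, so neither the running linear index nor the final transpose is needed.

```matlab
% Hypothetical sample data standing in for the rows imported from the csv file
texttoconvertincellarray = {'abc'; 'de'; 'fghij'};

n    = numel(texttoconvertincellarray);              % number of text rows
lens = cellfun('length', texttoconvertincellarray);  % length of each string
A    = zeros(n, max(lens));                          % zero-padded result, one matrix row per text row

for i = 1:n
    % write the character codes of row i into the first lens(i) columns
    A(i, 1:lens(i)) = double(texttoconvertincellarray{i});
end
```

An even shorter option is `double(char(texttoconvertincellarray))`, but `char` pads shorter rows with spaces (code 32) rather than zeros, so the padding cannot be told apart from genuine trailing spaces in the text.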