Unexpected textscan behavior in MATLAB when skipping characters

Problem description

I'm trying to skip a fixed number of characters in MATLAB using textscan. The documentation states that %*nc skips n characters, including delimiters.

I thought the following would work:

str = '   0:   1/ 1    1|        ASDFASDF |   1 |   5 ';
pat = '%f: %f/%f %f|%*17c|%f | %f';
wtf = textscan(str,pat,'MultipleDelimsAsOne',true)

which yields:

{[0]}    {[1]}    {[1]}    {[1]}    {0×1 double}    {0×1 double}

This is not correct: the last two entries should be 1 and 5.

By luck, I tried capturing those characters rather than skipping them:

%Notice I removed the '*' character
pat = '%f: %f/%f %f|%17c|%f | %f'; 
wtf = textscan(str,pat,'MultipleDelimsAsOne',true)

which yields:

{[0]}    {[1]}    {[1]}    {[1]}    {'ASDFASDF |   1 | '}    {0×1 double}    {0×1 double}

This was completely unexpected: the returned string starts at the first non-space character and runs past the point where I intended it to stop. Based on this I tried:

%I've increased the size and removed the bordering '|' characters 
old_pat = '%f: %f/%f %f|%*17c|%f | %f';
new_pat = '%f: %f/%f %f%*19c%f | %f';
wtf = textscan(str,new_pat,'MultipleDelimsAsOne',true)

which yields my targeted result:

{[0]}    {[1]}    {[1]}    {[1]}    {[1]}    {[5]}
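The character counts behind these three results can be checked with plain indexing, independent of textscan. This is only a sketch for sanity-checking the widths; the variable names (bars, field, captured, skipped) are made up for illustration, and it assumes the capture begins at the first non-space character after the first '|', as the %17c output above suggests.

```matlab
str = '   0:   1/ 1    1|        ASDFASDF |   1 |   5 ';

% Positions of the '|' characters in the line
bars = strfind(str, '|');

% The field I wanted to skip: everything between the first two bars
field = str(bars(1)+1 : bars(2)-1);
fieldWidth = numel(field);

% Assumption: %17c started at the first non-space character after the
% first bar and took 17 characters from there
firstNonSpace = bars(1) + find(str(bars(1)+1:end) ~= ' ', 1);
captured = str(firstNonSpace : firstNonSpace + 16);

% What %*19c covers when counting starts at the first bar itself
skipped = str(bars(1) : bars(1) + 18);

disp(fieldWidth)
disp(['[' captured ']'])
disp(['[' skipped ']'])
```

If the assumption holds, captured should match the string returned by the %17c pattern, and skipped should end exactly on the second '|', which would explain why the 19-character version lines up with the remaining '%f | %f'.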

I would have thought that, after matching the literal character '|', %*nc would start consuming/skipping at the space, not at the first non-space character. Is this behavior expected based on the documentation? What's happening here?

Note, I originally had %[^|] in this area but I ran into problems when the text itself was '|' or '||'. So now I'm going to do a double pass (or maybe actually capture and then trim the whitespace) since I couldn't figure out how to adjust the textscan call to gracefully ignore those cases.
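For reference, the "capture and then trim" idea could look roughly like the sketch below. It bypasses textscan and slices the line by position instead, so a text field that is itself '|' or '||' should not matter. It assumes the text field is always exactly 17 characters wide and sits right after the first '|'; the names bar1, label, head and tail are hypothetical.

```matlab
str = '   0:   1/ 1    1|        ASDFASDF |   1 |   5 ';

% The first bar ends the numeric prefix; the text field cannot appear before it
bar1 = find(str == '|', 1);

% Assumed fixed-width text field (17 characters), trimmed after capture
label = strtrim(str(bar1+1 : bar1+17));

% Numbers before the first bar
head = sscanf(str(1:bar1-1), '%f: %f/%f %f');

% Numbers after the bar that closes the text field (17 characters + 1 bar)
tail = sscanf(str(bar1+19:end), '%f | %f');

disp(label)
disp(head.')
disp(tail.')
```

The trade-off is that this hard-codes the column width, so it only fits input where that field really is fixed-width.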

Tags: matlab, textscan
