Overlapping Regexps
Bill Denney
bill at denney.ws
Mon Mar 31 16:50:41 CDT 2008
Kim Hansen wrote:
> On Sun, Mar 30, 2008 at 6:26 PM, Bill Denney <bill at denney.ws> wrote:
>
>> When running the following,
>>
>> frag = {"MGTGGR" "R" "GAAAAPLLVAVAALLLGAAGHLYPGEVCPGMDIR" "NNLTR" ...
>> "LHELENCSVIEGHLQILLMFK" "TRPEDFR" "DLSFPK" "LIMITDYLLLFR" ...
>> "VYGLESLK" "DLFPNLTVIR"};
>> seq = strcat (frag{:});
>> cuts = regexp (seq, '[KR][^P]');
>>
>> the result is
>> cuts = [6 41 46 67 74 80 92 100],
>> but I expected cuts to also include 7. In other words, I expected
>> cuts = [6 7 41 46 67 74 80 92 100].
>>
>> On a related note, if there is overlap in matches, is there a way to
>> make regexp return the overlapping matches? For example:
>>
>> a = "ababababab"
>> b = regexp (a, "aba")
>>
>> returns b = [1 5] when I would like for it to return b = [1 3 5 7].
>>
>> Is this a bug in my understanding of regexp or in regexp?
>>
>
> What you need is the "zero-width positive look-ahead assertion", which is
> documented for Perl in "man perlre". I have just tested it in Octave
> and it works there too (Octave uses libpcre for regexps).
>
> Your first regexp should be: "[KR](?=[^P])" or "[KR](?!P)"
>
> The second: "a(?=ba)"
Thanks, that was just what I was looking for. I didn't know about those
(and I thought that I knew regexps; apparently there is always more to
know about them).
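
A minimal sketch of the difference, reusing the data from the question.
The variable names (cuts_plain, cuts_ahead, cuts_neg) are only
illustrative. The plain pattern consumes the character after K/R, so
the R at position 7 is already part of the match that starts at 6 and
cannot start a match of its own. The look-ahead only tests the next
character without consuming it. Note that the two suggested forms are
not quite equivalent: (?=[^P]) requires that a following character
exists, while (?!P) is also satisfied at the end of the string, so it
should additionally report the final R at position 110.

```octave
% Sequence from the original question, joined into one string.
frag = {"MGTGGR", "R", "GAAAAPLLVAVAALLLGAAGHLYPGEVCPGMDIR", "NNLTR", ...
        "LHELENCSVIEGHLQILLMFK", "TRPEDFR", "DLSFPK", "LIMITDYLLLFR", ...
        "VYGLESLK", "DLFPNLTVIR"};
seq = strcat (frag{:});

% Consuming pattern: the match at 6 also swallows position 7.
cuts_plain = regexp (seq, '[KR][^P]');

% Positive look-ahead: the next character is tested but not consumed.
cuts_ahead = regexp (seq, '[KR](?=[^P])');

% Negative look-ahead: also accepts a K/R that ends the string.
cuts_neg = regexp (seq, '[KR](?!P)');

% Overlapping "aba" matches: match only the leading "a".
b = regexp ("ababababab", 'a(?=ba)');
```

With these, cuts_ahead is [6 7 41 46 67 74 80 92 100] and b is
[1 3 5 7], as wanted above.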
Have a good day,
Bill