Overlapping Regexps

David Bateman David.Bateman at motorola.com
Mon Mar 31 03:25:48 CDT 2008


Bill Denney wrote:
> When running the following,
>
> frag = {"MGTGGR" "R" "GAAAAPLLVAVAALLLGAAGHLYPGEVCPGMDIR" "NNLTR" \
>         "LHELENCSVIEGHLQILLMFK" "TRPEDFR" "DLSFPK" "LIMITDYLLLFR" \
>         "VYGLESLK" "DLFPNLTVIR"};
> seq = strcat (frag{:});
> cuts = regexp (seq, '[KR][^P]');
>
> the result is
> cuts = [6 41 46 67 74 80 92 100],
> but I expected cuts to also include 7.  In other words, I expected
> cuts = [6 7 41 46 67 74 80 92 100].
>
> On a related note, if there is overlap in matches, is there a way to 
> make regexp return the overlapping matches?  For example:
>
> a = "ababababab"
> b = regexp (a, "aba")
>
> returns b = [1 5] when I would like for it to return b = [1 3 5 7].
>
> Is this a bug in my understanding of regexp or in regexp?
>
>   
This seems to be the MATLAB-compatible behavior. See

>> frag = {'MGTGGR' 'R' 'GAAAAPLLVAVAALLLGAAGHLYPGEVCPGMDIR' 'NNLTR' ...
        'LHELENCSVIEGHLQILLMFK' 'TRPEDFR' 'DLSFPK' 'LIMITDYLLLFR' ...
        'VYGLESLK' 'DLFPNLTVIR'};
>> seq = strcat (frag{:});
>> cuts = regexp (seq, '[KR][^P]');
>> cuts

cuts =

     6    41    46    67    74    80    92   100

with MATLAB R2007b.
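
For reference, the result follows from how regexp scans the string: matches
do not overlap, so the search resumes after the last character of the
previous match. The match at 6 consumes both the R at 6 and the R at 7, and
in "ababababab" the match at 1 consumes positions 1 to 3. One possible
workaround, sketched below, is to move the trailing part of the pattern into
a lookahead so that each match consumes only one character. This assumes the
regexp build supports PCRE-style lookahead, (?=...); the variable names
b_plain, b_ahead and cuts_ahead are only illustrative.

```octave
% Sketch of a lookahead workaround, assuming regexp supports (?=...).
a = 'ababababab';
b_plain = regexp (a, 'aba');       % each match consumes 3 characters: [1 5]
b_ahead = regexp (a, 'a(?=ba)');   % only the leading 'a' is consumed: [1 3 5 7]

frag = {'MGTGGR' 'R' 'GAAAAPLLVAVAALLLGAAGHLYPGEVCPGMDIR' 'NNLTR' ...
        'LHELENCSVIEGHLQILLMFK' 'TRPEDFR' 'DLSFPK' 'LIMITDYLLLFR' ...
        'VYGLESLK' 'DLFPNLTVIR'};
seq = strcat (frag{:});

% K or R followed by a non-P character; the following character is checked
% but not consumed, so the R at 7 is still available as a match start.
cuts_ahead = regexp (seq, '[KR](?=[^P])');   % [6 7 41 46 67 74 80 92 100]
```

A negative lookahead, '[KR](?!P)', is similar but would also report a K or R
at the very end of the string, since (?!P) succeeds when no character
follows.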

D.

