
Re: [Gramps-users] Regular expression


robhealey1
Greetings:

Here is what I would like to be able to do (and it might already be possible):

I would like to see all people who have the last name of ? and who lived in Ohio...

I do not know much about the filtering system that is built into Gramps!

Sincerely yours,
Rob G. Healey


On Wed, May 25, 2011 at 10:44 AM, Peter Landgren <[hidden email]> wrote:
Hi,

I'm definitely not an expert on regular expressions, so I need some help:
I would like to easily find people with names spelled with one or two "s":
Like Nilson and Nilsson in the same person filter search.

/Peter

------------------------------------------------------------------------------
vRanger cuts backup time in half-while increasing security.
With the market-leading solution for virtual backup and recovery,
you get blazing-fast, flexible, and affordable data protection.
Download your free trial now.
http://p.sf.net/sfu/quest-d2dcopy1
_______________________________________________
Gramps-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-users





_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel

Re: [Gramps-users] Regular expression

robhealey1
Greetings:

I did not even know about the [] and (), so I am grateful that someone asked the question...

Sincerely yours,
Rob G. Healey


On Thu, May 26, 2011 at 3:27 AM, doug <[hidden email]> wrote:
On 25/05/11 21:08, Serge Noiraud wrote:
> On 25/05/2011 20:36, doug wrote:
>> On 25/05/11 18:44, Peter Landgren wrote:
>>> Hi,
>>>
>>> I'm definitely not an expert on regular expressions, so I
>>> need some help:
>>> I would like to easily find people with names spelled
>>> with one or two "s":
>>> Like Nilson and Nilsson in the same person filter search.
>>>
>>> /Peter
>> Does this work?
>>
>> \s*[a-rt-zA-Z]*[s|ss]\w*
> I don't really know how it works in gramps, but the solution
> should be:
> (s|ss)
>
> [] matches exactly one character from the set inside it, e.g. [a-zA-Z]
> is one letter from a to z or from A to Z;
> () groups several characters, and | separates alternatives: in our
> case s or ss
>
>> Doug
>
>
Ah! thanks for that. I hadn't appreciated the difference
between [] and ()

Doug
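
To make the difference between [] and () concrete, here is a small Python sketch. It assumes that Gramps hands the pattern to Python's `re` module more or less unchanged, which the thread does not confirm; the sample strings are only illustrations built from the names in the question.

```python
import re

# Inside [], every character stands for itself: [s|ss] is the set {'s', '|'}
# and matches exactly ONE character.
char_class = re.compile(r"Nil[s|ss]on")

# Inside (), | separates alternatives: (s|ss) matches "s" or "ss".
group = re.compile(r"Nil(s|ss)on")

names = ["Nilson", "Nilsson", "Nil|on"]

class_hits = [n for n in names if char_class.fullmatch(n)]
group_hits = [n for n in names if group.fullmatch(n)]

print(class_hits)  # ['Nilson', 'Nil|on']
print(group_hits)  # ['Nilson', 'Nilsson']
```

The character class version misses "Nilsson" and accepts a literal "|", which is why the grouped form is the one to use here.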


Re: [Gramps-users] Regular expression

John Ralls-2


Better and easier to use a lazy quantifier: \b[a-zA-Z]+?(s|ss)[a-z]*\b. Note that \w adds [0-9_], and you probably don't want that when you're matching names. I trust that the code behind this has re.M set so that [a-z] will be interpreted correctly (i.e., not literally, but as any unicode alphabetic character).  "\b" means word boundary, and is better than \s (whitespace) for isolating words... especially "zero or more" whitespace (\s*).
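
A quick way to try the suggested pattern outside Gramps is a few lines of Python. This is only a sketch: the sample text is made up, and it says nothing about which flags the Gramps filter code actually sets.

```python
import re

# The pattern suggested above, applied to a line containing several words.
pattern = re.compile(r"\b[a-zA-Z]+?(s|ss)[a-z]*\b")

text = "Nilson Nilsson Berg"
# finditer + group(0) yields whole matches; findall would return only the
# contents of the (s|ss) group.
words = [m.group(0) for m in pattern.finditer(text)]
print(words)  # ['Nilson', 'Nilsson']

# \w also accepts digits and underscores, which [a-zA-Z] does not.
w_accepts_digit = re.fullmatch(r"\w+", "Nilsson_2") is not None
az_accepts_digit = re.fullmatch(r"[a-zA-Z]+", "Nilsson_2") is not None
print(w_accepts_digit, az_accepts_digit)  # True False
```

Note that the pattern matches any word with an "s" after its first letter, so it is broader than the Nilson/Nilsson question strictly needs.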

Regards,
John Ralls



Re: [Gramps-users] Regular expression

Benny Malengier
Somebody should add some examples to the manual about regex search in the filter sidebar gramplet.
Peter, as you obtained an answer ....

Benny


Re: [Gramps-users] Regular expression

John Ralls-2
In reply to this post by John Ralls-2


Oops, that's wrong. There isn't any unicode magic in [a-z] with re.M, so the only way to make it work with non-ascii characters is \b\w+?(s|ss)\w*\b . Python 3 is supposed to support POSIX character classes, so eventually you'll be able to use \b[[:alpha:]]+?(s|ss)[[:alpha:]]*\b, which will avoid matching numbers and underscores.
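
For anyone testing this in Python 3: to my knowledge the standard `re` module still does not accept POSIX bracket classes such as [[:alpha:]] (the third-party `regex` package does). A common workaround within plain `re` is the class [^\W\d_]. The sketch below compares the three options; the names are invented test data.

```python
import re

# "\u00c5kesson" is "Åkesson", written with an escape so the example does
# not depend on how the file is encoded.
names = ["Nilsson", "\u00c5kesson", "Nilsson2"]

# [a-zA-Z] is limited to ASCII letters, so "Å" is rejected.
ascii_only = [n for n in names if re.fullmatch(r"[a-zA-Z]+", n)]

# In Python 3, \w on str patterns is Unicode-aware, but it also lets
# digits and underscores through.
word_chars = [n for n in names if re.fullmatch(r"\w+", n)]

# [^\W\d_] means "a word character that is neither a decimal digit nor an
# underscore", which in practice leaves letters.
letters_only = [n for n in names if re.fullmatch(r"[^\W\d_]+", n)]

print(ascii_only)    # ['Nilsson']
print(word_chars)    # ['Nilsson', 'Åkesson', 'Nilsson2']
print(letters_only)  # ['Nilsson', 'Åkesson']
```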

Regards,
John Ralls




Re: [Gramps-users] Regular expression

Peter Landgren

Thanks for all input.

But I only needed a very simple regular expression. I wanted to find people whose
surnames are spelled slightly differently. There are four versions of "Eriksson":
Erikson
Eriksson
Ericson
Ericsson

Which I get with:
eri[ck](s|ss)on
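
The same check can be reproduced in Python. This is a sketch under one assumption: `re.IGNORECASE` is used here so that the lower-case pattern also matches capitalised surnames; whether Gramps sets that flag itself is not stated in this thread. "Eriksen" is an extra, made-up counterexample.

```python
import re

# re.IGNORECASE is an assumption, see the note above.
pattern = re.compile(r"eri[ck](s|ss)on", re.IGNORECASE)

surnames = ["Erikson", "Eriksson", "Ericson", "Ericsson", "Eriksen"]
matched = [s for s in surnames if pattern.search(s)]
print(matched)  # ['Erikson', 'Eriksson', 'Ericson', 'Ericsson']
```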

Regards,

Peter





Re: [Gramps-users] Regular expression

Peter Landgren
In reply to this post by Benny Malengier
> Somebody should add some examples in the manual about regex search in the
> filter sidebar gramplet
> Peter, as you obtained an answer ....
>
Benny,

I have inserted a small example here:

http://www.gramps-project.org/wiki/index.php?title=People_screenshot#Main_window.2C_People_with_filter_sidebar

/Peter
 


Re: [Gramps-users] Regular expression

Serge Noiraud-2
In reply to this post by John Ralls-2
On 26/05/2011 19:24, John Ralls wrote:

> Oops, that's wrong. There isn't any unicode magic in [a-z] with re.M, so the only way to make it work with non-ascii characters is \b\w+?(s|ss)\w*\b . Python 3 is supposed to support POSIX character classes, so eventually you'll be able to use \b[[:alpha:]]+?(s|ss)[[:alpha:]]*\b, which will avoid matching numbers and underscores.

I think this is the best regexp for that. It will work in any language.

Feature request?
Perhaps we could have a combobox (a choice between several values) in the regex entry widget that proposes some examples.
You select the regexp that corresponds to your need, then modify the string to match your search.
This way, it is easier for end users who do not know how to write regular expressions.

What kinds of search do we need (to fill the combobox)? If you don't find a solution in the combobox, you can always write your own search string. It could also be useful to save this new string for later use (in an .ini file?).

Serge


Re: [Gramps-users] Regular expression

Serge Noiraud-2
In reply to this post by Peter Landgren
On 26/05/2011 20:45, Peter Landgren wrote:

>> Somebody should add some examples in the manual about regex search in the
>> filter sidebar gramplet
>> Peter, as you obtained an answer ....
>>
> Benny,
>
> I have inserted a small example here:
>
> http://www.gramps-project.org/wiki/index.php?title=People_screenshot#Main_window.2C_People_with_filter_sidebar
>
> /Peter
If you want to see a complex example, I often use the following to select all people named Noiraud:

n(e|es|o[aiy])r(on|(e|)au(d|lt|t|x|))

This string covers all known spellings.
I can comment on it.
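
Until such a commentary is posted, here is one possible reading of the pattern, written with Python's `re.VERBOSE` flag so that each part can carry a comment. The annotations are an interpretation, not the author's own explanation; `re.IGNORECASE` is an assumption; and the candidate strings are derived mechanically from the pattern rather than taken from any real records.

```python
import re

# The same pattern as above, spread over several lines with comments.
pattern = re.compile(
    r"""
    n
    (e|es|o[aiy])      # "e", "es", or "o" followed by a, i or y
    r
    (
        on             # either the ending "on"
      |
        (e|)au         # or an optional "e" followed by "au"
        (d|lt|t|x|)    # then "d", "lt", "t", "x", or nothing
    )
    """,
    re.VERBOSE | re.IGNORECASE,
)

candidates = ["Noiraud", "Noirault", "Neraud", "Noiron", "Noyreaux", "Noirot"]
matched = [c for c in candidates if pattern.fullmatch(c)]
print(matched)  # ['Noiraud', 'Noirault', 'Neraud', 'Noiron', 'Noyreaux']
```

An empty alternative such as (e|) or (d|lt|t|x|) simply makes that part optional.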


Re: [Gramps-users] Regular expression

Martin Steer-2
In reply to this post by Peter Landgren
On Thu, May 26, 2011 at 08:18:26PM +0200, Peter Landgren wrote:

>
>But I needed a very simple regular expression. I wanted to filter out persons, spelling their
>surnames a little different: There are four versions of "Eriksson":
>Erikson
>Eriksson
>Ericson
>Ericsson
>
>Which I get with:
>eri[ck](s|ss)on

Slightly less typing (as Johnny suggested):

eri[ck]ss?on

I.e. '(s|ss)' means 'either s or ss', whereas 'ss?' means 'one s and
perhaps another'.
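
That the two spellings of the pattern are interchangeable can be checked with a short Python sketch. As before, `re.IGNORECASE` is an assumption about how the filter behaves, and "Erikon" is a made-up name included as a non-match.

```python
import re

alternation = re.compile(r"eri[ck](s|ss)on", re.IGNORECASE)
optional_s = re.compile(r"eri[ck]ss?on", re.IGNORECASE)

surnames = ["Erikson", "Eriksson", "Ericson", "Ericsson", "Erikon"]

# Both patterns should accept and reject exactly the same names.
same_result = all(
    bool(alternation.fullmatch(s)) == bool(optional_s.fullmatch(s))
    for s in surnames
)
print(same_result)  # True
```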

--
Martin


Re: [Gramps-users] Regular expression

Peter Landgren
In reply to this post by Serge Noiraud-2
On Saturday 28 May 2011 00.39.49, Serge Noiraud wrote:

> If you want to see a complex example, I often use the following to select
> all people named noiraud :
>
> n(e|es|o[aiy])r(on|(e|)au(d|lt|t|x|))
>
> This string contains all known entries.
> I can comment this.

Yes, please do that.

/Peter
 


Re: [Gramps-users] Regular expression

Peter Landgren
In reply to this post by Martin Steer-2
On Saturday 28 May 2011 06.42.39, Martin Steer wrote:

> Slightly less typing (as Johnny suggested):
>
> eri[ck]ss?on
>
> I.e. '(s|ss)' means 'either s or ss', whereas 'ss?' means 'one s and
> perhaps another'.
>
> --
> Martin

Even simpler. Changed in the example.
Thanks!
/Peter


Re: [Gramps-users] Regular expression

DS Blank
>> eri[ck]ss?on

> Even simpler. Changed in the example.

Ok, how about even simpler:

eri[ck]s+on

where + means one or more.

-Doug


Re: [Gramps-users] Regular expression

Helge@Gramps
I am trying to use regular expressions more and more, but I have a problem writing a working expression for NOT.
Example:
In the Event list I am trying to filter for all events except those whose Place Name is "Berlin".
I tried "^(Berlin)", but I always get the same result as for "Berlin": all events whose Place is "Berlin".
It seems to me that the ^ character matches all strings starting with the next character or group, instead of performing the NOT operation I expected: "^(Ber)" matches all places starting with "Ber".
What am I doing wrong?
Thank you
-Helge

Re: [Gramps-users] Regular expression

John Ralls-2


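When ^ is the first character in the RE, it means "at the beginning of the line". It only means "not" when it's the first character in a character class (e.g., [^sc] means "any character except s and c"). To negate Berlin, use a negative lookahead, (?!Berlin). On its own that only rejects a match at the current position, so to exclude every string that contains Berlin, anchor it: ^(?!.*Berlin).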
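
The three roles of ^ and the lookahead can be compared side by side in Python. This is a sketch using `re.search`; whether the Gramps filter searches anywhere in the string or only at its start is an assumption here, and the place names are sample data.

```python
import re

places = ["Berlin", "Bern", "East Berlin", "Hamburg"]

# ^ at the start of a pattern anchors the match; it does not negate.
starts_with_ber = [p for p in places if re.search(r"^(Ber)", p)]

# ^ as the first character inside [] negates the class.
not_b_or_h = [p for p in places if re.search(r"^[^BH]", p)]

# A negative lookahead anchored at the start rejects any string that
# contains "Berlin" anywhere.
without_berlin = [p for p in places if re.search(r"^(?!.*Berlin)", p)]

# Without the anchor, a search can still succeed one position further on,
# so even "Berlin" itself is matched.
unanchored_matches_berlin = re.search(r"(?!Berlin)", "Berlin") is not None

print(starts_with_ber)            # ['Berlin', 'Bern']
print(not_b_or_h)                 # ['East Berlin']
print(without_berlin)             # ['Bern', 'Hamburg']
print(unanchored_matches_berlin)  # True
```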

See http://docs.python.org/library/re.html for all of the rules.
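
To make the two meanings of ^ concrete, here is a small sketch using Python's re module directly. The place names and words are invented for illustration, and it assumes the filter looks for a match anywhere in the text (as re.search does); how Gramps applies the pattern internally is not confirmed here.

```python
import re

# Invented sample data, not taken from any real Gramps database.
places = ["Berlin", "Bern", "Hamburg", "East Berlin"]

# At the start of a pattern, ^ is an anchor: "match at the beginning".
# It selects names that start with "Ber"; it does not negate anything.
starts_with_ber = [p for p in places if re.search(r"^Ber", p)]

# Without the anchor, "Ber" may match anywhere in the name.
contains_ber = [p for p in places if re.search(r"Ber", p)]

# As the first character inside [...], ^ negates the character class:
# [^sc] matches exactly one character that is neither "s" nor "c".
words = ["cat", "sat", "mat"]
not_s_or_c = [w for w in words if re.search(r"^[^sc]at$", w)]

print(starts_with_ber)  # ['Berlin', 'Bern']
print(contains_ber)     # ['Berlin', 'Bern', 'East Berlin']
print(not_s_or_c)       # ['mat']
```

Note that the same character does two different jobs in "^[^sc]at$": the first ^ anchors the match, the second negates the class.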

Regards,
John Ralls




Re: [Gramps-users] Regular expression

Martin Steer-2
In reply to this post by DS Blank
On Sat, May 28, 2011 at 12:57:44PM -0400, Doug Blank wrote:
>>> eri[ck]ss?on
>
>> Even simpler. Changed in the example.
>
>Ok, how about even simpler:
>
>eri[ck]s+on
>
>where + means one or more.

Okay, maybe, for the OP's problem, but not too good as an example, given
that it allows e.g. 'ericsssssssssssssssson' (there are too many s's
here).
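
The difference between the two patterns can be checked with a short Python sketch. The helper name full_matches and the made-up spelling "Ericssssson" are only for this illustration; matching the whole name and ignoring case are assumptions for the demo, not a statement about how the Gramps filter behaves.

```python
import re

names = ["Erikson", "Eriksson", "Ericson", "Ericsson", "Ericssssson"]

def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches from start to end."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return [name for name in candidates if compiled.fullmatch(name)]

# s+ means "one or more s", so any number of s's is accepted.
plus_hits = full_matches(r"eri[ck]s+on", names)

# ss? means "one s, optionally followed by one more", so at most two.
optional_hits = full_matches(r"eri[ck]ss?on", names)

print(plus_hits)      # all five names, including 'Ericssssson'
print(optional_hits)  # only the four real spellings
```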

Martin


Re: [Gramps-users] Regular expression

jerome
In reply to this post by Peter Landgren
A side note: maybe a regex wizard could improve the current filter rules on the wiki?

I do not know if this could work on MediaWiki, but I tried to edit (not
enable) a "BadContent" filter:

http://www.gramps-project.org/wiki/index.php?title=Special:Undelete&target=Rechercher+%3ABadContent&timestamp=20110225151128

like http://portland.freedesktop.org/wiki/BadContent

because the current blacklists [1][2] seem to be of limited use against the
latest spam (which arrives every day)...

[1]
http://www.gramps-project.org/wiki/index.php?title=MediaWiki:Titleblacklist
[2] http://www.gramps-project.org/wiki/index.php?title=Usernameblacklist

Peter Landgren wrote:

> On Saturday 28 May 2011 06.42.39, Martin Steer wrote:
>> On Thu, May 26, 2011 at 08:18:26PM +0200, Peter Landgren wrote:
>>> But I needed a very simple regular expression. I wanted to find
>>> persons whose surnames are spelled slightly differently. There are four
>>> versions of "Eriksson":
>>> Erikson
>>> Eriksson
>>> Ericson
>>> Ericsson
>>>
>>> Which I get with:
>>> eri[ck](s|ss)on
>> Slightly less typing (as Johnny suggested):
>>
>> eri[ck]ss?on
>>
>> I.e. '(s|ss)' means 'either s or ss', whereas 'ss?' means 'one s and
>> perhaps another'.
>>
>> --
>> Martin
>
> Even simpler. Changed in the example.
> Thanks!
> /Peter

Re: [Gramps-users] Regular expression

Jon Clements-2
Hi All,

A slight note: it's preferred to write (ss|s) instead of (s|ss),
especially when capturing, as the regex engine tries the alternatives
from left to right, stops at the first one that matches, and takes that
as the group result (i.e., it won't try to find the longest match in the
group).
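
A quick way to see the ordering effect in Python (a sketch with made-up variable names, using re.match and re.fullmatch from the standard library):

```python
import re

text = "ss"

# Alternatives are tried left to right; the first one that works wins.
short_first = re.match(r"(s|ss)", text).group(1)
long_first = re.match(r"(ss|s)", text).group(1)

print(short_first)  # 's'  (the group stops after the first s)
print(long_first)   # 'ss' (the longer alternative is tried first)

# When more of the pattern follows the group, the engine backtracks,
# so both orders still find the full surname.
either_order = [
    bool(re.fullmatch(r"eri[ck](s|ss)on", "ericsson")),
    bool(re.fullmatch(r"eri[ck](ss|s)on", "ericsson")),
]
print(either_order)  # [True, True]
```

So for the surname filter both orders give the same hits; the order matters mainly when the captured group itself is used afterwards.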

Regarding the OP, would it not just be easier to include a (double)
metaphone, soundex or similar algorithm? Then spelling variants such as
"Clements|Clemence|Clemance" / "Erikkson|Ericson|Ericcsun" etc. would
all be found together...

There is a GPL implementation of the above on pypi called advas (which
I've used briefly in the past).
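
For readers who have not met Soundex, here is a minimal sketch of the classic American Soundex coding in plain Python. It only illustrates the idea; it is not the advas API and not a description of anything in Gramps, and the function name soundex is made up for this example.

```python
# Letter groups of classic American Soundex. Vowels, y, h and w get no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(name):
    """Return the 4-character Soundex code (letter plus three digits)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        # Emit a digit only when it differs from the previous code.
        if code and code != prev:
            result += code
        # h and w do not separate equal codes; vowels do (they reset prev).
        if ch not in "hw":
            prev = code
    # Pad with zeros or truncate to exactly four characters.
    return (result + "000")[:4]

for surname in ("Erikson", "Ericsson", "Erikkson", "Ericcsun"):
    print(surname, soundex(surname))   # each one gives E625

for surname in ("Clements", "Clemence", "Clemance"):
    print(surname, soundex(surname))   # each one gives C455
```

All spellings in each group get the same code, so a filter comparing codes instead of spellings would find them together, at the price of also catching some unrelated names that happen to sound alike.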

Just my 2p.

Cheers,

Jon.



On 30/05/11 08:38, Jérôme wrote:

> Note, maybe a regex wizard could improve current filter rules on wiki ?
>
> I do not know if this could work on MediaWiki, but I tried to edit (not
> enable) a "BadContent" filter:
>
> http://www.gramps-project.org/wiki/index.php?title=Special:Undelete&target=Rechercher+%3ABadContent&timestamp=20110225151128
>
> like http://portland.freedesktop.org/wiki/BadContent
>
> because current blacklists[1][2] seem to be limited with last spams
> (every day)...
>
> [1]
> http://www.gramps-project.org/wiki/index.php?title=MediaWiki:Titleblacklist
> [2] http://www.gramps-project.org/wiki/index.php?title=Usernameblacklist



Re: [Gramps-users] Regular expression

Helge@Gramps
In reply to this post by John Ralls-2
Hi John,
thank you for the reply. "(?!Berlin)" doesn't work, but it seems to be close to the solution.
I switched to "^(?!Berlin$)", which works for this task.
Maybe there are better solutions...
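
A possible explanation, sketched in Python with invented place names: if the filter searches for a match anywhere in the text (as re.search does, which is an assumption about Gramps here), an unanchored lookahead can always succeed at some later position in the string, so nothing is excluded. The helper name kept is made up for this example.

```python
import re

places = ["Berlin", "Berlin-Mitte", "Hamburg", "East Berlin"]

def kept(pattern, candidates):
    """Return the candidates in which the pattern matches somewhere."""
    return [p for p in candidates if re.search(pattern, p)]

# Unanchored: in "Berlin" the lookahead fails at position 0 but succeeds
# at position 1 ("erlin" does not start with "Berlin"), so it matches.
unanchored = kept(r"(?!Berlin)", places)

# Anchored with ^ and $: rejects only the exact name "Berlin".
exact = kept(r"^(?!Berlin$)", places)

# Anchored with ^ only: rejects every name that starts with "Berlin".
prefix = kept(r"^(?!Berlin)", places)

print(unanchored)  # all four places
print(exact)       # ['Berlin-Mitte', 'Hamburg', 'East Berlin']
print(prefix)      # ['Hamburg', 'East Berlin']
```

To drop every place that merely contains "Berlin", the pattern "^(?!.*Berlin)" would be the usual form.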

-Helge