On the generation and handling of _UID

Julio Sánchez-2
Hi,

Searching for something else, I at last found something that looks like a spec for _UID.  It is referenced in the following message, which also announces that having _UIDs will be convenient for future submissions to FamilySearch:

http://lists.ldsoss.org/pipermail/ldsoss/2006-May/002298.html

I am not a member of the LDS Church and I am not likely to submit anything there either, but I think generating and properly handling _UID would be a good thing, because it makes merging on import much more reliable and hands-free.  _UID tags are nonstandard, but they solve a problem that cannot be solved any other way in a multi-vendor genealogy software world.  Other methods might have been the answer, but this one *is*.

I have found that the _UIDs I generate for other applications are not acceptable, but I now think I can adapt them, with some effort, by using versions 3 and 5 from RFC 4122.

Best regards,

Julio

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel

Re: On the generation and handling of _UID

Benny Malengier
2008/2/11, Julio Sánchez <[hidden email]>:
> I am not a member of LDS and I am not likely to submit anything there either, but I think generation and proper handling of _UID will be a good thing because it makes the merge while import much more reliable and hands-free.  They are nonstandard, but they solve a problem that cannot be solved any other way in a genealogy software multi-vendor world.  Other methods might have been, but this *is*.

I think we should make this a feature of GRAMPS. With web sharing becoming prevalent, it becomes more and more important.

> I have myself found the _UIDs I generate for other applications are not acceptable, but now I think I can adapt with some effort leveraging formats 3 and 5 from RFC4122.

Can you present example code for GRAMPS? I suppose you generate those in GRAMPS and would want them stored in the person record too. It looks advantageous to move the GRAMPS handle to such a scheme, or is that not possible?

Benny

Re: On the generation and handling of _UID

Julio Sánchez-2
Hello,

I am debugging code that generates RFC 4122 version 5 UUID values for GRAMPS 2.2, to be forward-ported to 3.0.  It is very easy, but if it is done wrongly and put into use, it is ugly to change.  This version hashes together another UUID (the namespace) and a string (here, the textual form of the handle) and encodes the result as mandated by RFC 4122.  To follow the suggested format, however, the value is printed in uppercase hexadecimal without dashes instead of the usual UUID notation.  No checksum is generated.

To do this, a UUID needs to be allocated.  I have generated 516cd010-5a41-470f-99f8-eb22f1098ad6 for that purpose.
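
For illustration, the scheme described above might look roughly like this in Python. It is only a sketch, not the code under discussion: the function name make_uid and the sample handle are made up, and it is written for a current Python 3, where uuid.uuid5 accepts the name as a str.

```python
import uuid

# Namespace UUID allocated in this message for deriving _UID values.
GRAMPS_UID_NAMESPACE = uuid.UUID("516cd010-5a41-470f-99f8-eb22f1098ad6")


def make_uid(handle):
    """Derive a _UID string from the textual form of a handle.

    uuid5 hashes the namespace and the name with SHA-1, as RFC 4122
    prescribes for version 5. The result is printed as 32 uppercase
    hexadecimal digits without dashes, and no checksum is appended.
    """
    return uuid.uuid5(GRAMPS_UID_NAMESPACE, handle).hex.upper()


if __name__ == "__main__":
    # Hypothetical handle, for demonstration only.
    print(make_uid("b39fe1cfc1305ac4a21"))
```

Because the value depends only on the namespace and the handle, calling the function twice with the same handle gives the same string.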

Now, can Python 2.5 be assumed as the target, so that the built-in uuid module can be used?

For the time being I am just adding them on GEDCOM output if no _UID value is present.  A more permanent approach could be better later, but I'd rather introduce this slowly.

Another thing: should an internal attribute ID be assigned and mapped on input/output, instead of having an attribute called "_UID"?

Regards,

Julio

Re: On the generation and handling of _UID

Benny Malengier


2008/2/12, Julio Sánchez <[hidden email]>:
> Now, can it be assumed python 2.5 as target to leverage the builtin uuid module?

Yes, 3.0 requires Python 2.5.

> Another thing, should an internal attribute ID be assigned that is mapped on input/output instead of having an attribute called "_UID"?

Do you mean in the attribute tab of a person? I am not sure. I see they use GUID in the link you referred to, but that looks very cryptic to new users. Something like "Universal ID" is probably best as the attribute type, if that is what you mean here.

Benny


Re: On the generation and handling of _UID

Julio Sánchez-2
2008/2/12, Benny Malengier <[hidden email]>:
> You mean in the attribute tab of person? Not sure, I see they use GUID in the link you referred to. Looks very cryptic to new users though. Something like Universal ID is probably best as attribute type. If that is what you mean here.

I think it would be best to hide them.  Only experts can do anything sensible with them, and the operation is very rarely necessary.  I was actually thinking that, in later GRAMPS versions, English strings in the database have been evolving towards neutral codes that are translated on output; standard names have hardwired codes.  I was wondering whether the universal identifier warrants the same treatment.  I think it does.

The expert operation needed now and then is essentially nuking all _UID values.  A real wizard might be able to delete them selectively, but that is rarely needed: not doing it is not fatal, while the risk of getting it wrong is real and the effects are much worse.

The scenario where this might be needed is recovering from a wrong merge.  If the merged _UID values are kept, the wrong merge might be triggered again later, either by a future GRAMPS version that implements such a facility or by another program that knows about the wrongly merged people and, on being told that they are the same person, merges them again.  A recurring nightmare.
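
To make the scenario concrete, here is a toy sketch of _UID-based matching on import. Everything in it is hypothetical (the function name, the handle, and the placeholder _UID strings); it only shows why a record that keeps both merged _UIDs keeps attracting the same wrong merge.

```python
def find_merge_candidates(existing, incoming_uids):
    """Return handles of existing persons that share any _UID with an incoming record."""
    wanted = set(incoming_uids)
    return sorted(h for h, uids in existing.items() if wanted & set(uids))


# Hypothetical database: handle -> _UID values carried by that person.
# Two different people were wrongly merged and both _UIDs were kept.
db = {"H_MERGED": {"UID_A", "UID_B"}}

# Importing the real second person, who still carries UID_B, matches again.
print(find_merge_candidates(db, ["UID_B"]))  # -> ['H_MERGED']

# After nuking the _UID values, nothing triggers the merge any more.
db["H_MERGED"] = set()
print(find_merge_candidates(db, ["UID_B"]))  # -> []
```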

Unmerging records is very hard in general, but the above problem can be solved without exposing the _UID values to users while also helping with the un-merge.  The solution is to implement a "duplicate" operation (or some other name, such as "split"; "unmerge" would be misleading, I think, and "duplicate" might be abused).

Duplication would create two new persons out of one THAT IS DESTROYED, i.e. its handle is not kept in either.  Names, events, etc. from the person are copied into both.  As a child, the person now appears twice.  Spouse relationships are duplicated, and children are linked into both families.  _UID values are of course nuked (normal maintenance might end up recovering these values, but if not, well, nothing to lose sleep over).
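
A rough sketch of such a "duplicate" operation follows, using plain dictionaries instead of the real GRAMPS database. The data layout (persons with "attributes", "child_of" and "spouse_in"; families with "spouses" and "children"), the helper new_handle and the function name are all assumptions made for illustration, not GRAMPS API.

```python
import copy
import uuid


def new_handle():
    """Hypothetical handle generator; real handles would come from the database."""
    return uuid.uuid4().hex


def duplicate_person(persons, families, handle):
    """Split one person into two fresh copies and destroy the original."""
    old = persons.pop(handle)  # the original handle is kept in neither copy
    first, second = new_handle(), new_handle()
    for new in (first, second):
        clone = copy.deepcopy(old)  # names, events, etc. go into both copies
        # _UID values are nuked in both copies.
        clone["attributes"] = [(k, v) for k, v in clone["attributes"] if k != "_UID"]
        persons[new] = clone

    # As a child, the person now appears twice in each parental family.
    for fam in old["child_of"]:
        kids = families[fam]["children"]
        pos = kids.index(handle)
        kids[pos:pos + 1] = [first, second]

    # Spouse relationships are duplicated: the first copy takes over the
    # existing family, the second gets a twin family with the same partner
    # and the same children.
    persons[second]["spouse_in"] = []
    for fam in old["spouse_in"]:
        family = families[fam]
        twin = copy.deepcopy(family)
        twin_handle = new_handle()
        family["spouses"] = [first if s == handle else s for s in family["spouses"]]
        twin["spouses"] = [second if s == handle else s for s in twin["spouses"]]
        families[twin_handle] = twin
        persons[second]["spouse_in"].append(twin_handle)
        for partner in twin["spouses"]:
            if partner != second:
                persons[partner]["spouse_in"].append(twin_handle)
        for child in twin["children"]:
            persons[child]["child_of"].append(twin_handle)
    return first, second
```

The function returns the two new handles, so a user interface could open both persons side by side for the clean-up step.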

After this duplication, the user can go to each person and remove incorrect data or relationships until the effect of the wrong merge is undone.  It seems like hard work, but imagine what it entails by hand.

In the end, exposing _UIDs would be unnecessary.

Regards,

Julio


Re: On the generation and handling of _UID

Benny Malengier
Julio, did you ever finish this code? Could you share it?
I was thinking of having a go at creating a third-party plugin tool to create UID attributes as a first step, and generating the UID would obviously be step one.

In order to avoid extra tabs, my idea is to have the UID be part of the GRAMPS attribute system, with a fixed default attribute code. I don't think there is support at the moment for adding this as a separate part of the database, and I don't know if that would be such a good idea either.

Benny

Re: On the generation and handling of _UID

Julio Sánchez-2
Benny,

This part I don't think I have changed in a while; I also use an attribute for it.  I enclose the patch to Utils.py.  It adds two functions.  One ensures that the entry has at least one _UID (there is no need to generate new values if the entry already contains one).  The other generates a new value from the GRAMPS handle.  Since the handle's entropy is insufficient for 128 bits, I compose the value using one of the methods in the RFC (which number escapes me right now) to merge the handle with a 128-bit random value generated specifically for this purpose.  The result is that, if the GRAMPS handle is considered globally unique (I think this has been an assumption in GRAMPS for some time), the transformation result is unique as well.  Moreover, every GRAMPS instance would compute the same _UID for the same entry, so _UID generation causes no pollution.
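
For readers without the attachment, the two functions could be sketched along these lines. This is not the patch itself: the names generate_uid and ensure_uid, the attribute type string and the list-of-pairs attribute layout are assumptions, and version 5 is used because that is the version named earlier in this thread.

```python
import uuid

# Random namespace UUID quoted earlier in this thread.
NAMESPACE = uuid.UUID("516cd010-5a41-470f-99f8-eb22f1098ad6")
UID_TYPE = "_UID"  # hypothetical attribute type name


def generate_uid(handle):
    """Derive a _UID from a handle by hashing it with the namespace (version 5)."""
    return uuid.uuid5(NAMESPACE, handle).hex.upper()


def ensure_uid(handle, attributes):
    """Make sure the entry carries at least one _UID and return it.

    `attributes` is a list of (type, value) pairs and is modified in place.
    An existing _UID is left alone; otherwise one is derived from the handle.
    """
    for attr_type, value in attributes:
        if attr_type == UID_TYPE:
            return value
    value = generate_uid(handle)
    attributes.append((UID_TYPE, value))
    return value
```

Since generate_uid has no random input of its own, two installations holding the same handle would derive the same value, which is the property described above.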

Regards,

Julio

uuid.patch (1K) Download Attachment