Quantcast

Why does check.py remove rather than fix?

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Why does check.py remove rather than fix?

DS Blank
I'm trying to track down the cause of a database corruption, and
noticed in check.py that when a child has no reference to a family,
the child is removed from the family:

                    if family_handle not in \
                           child.get_parent_family_handle_list():
                        # The referenced child has no reference to the family
                        # This is tested by TestcaseGenerator where the father
                        # is "Broken8"
                        logging.warning("    FAIL: family '%(fam_gid)s' "
                                        "child '%(child_gid)s' has no reference"
                                        " to the family" %
                                        {'fam_gid' : family.gramps_id,
                                         'child_gid' : child.gramps_id})
                        family.remove_child_ref(child_ref)

Wouldn't it be better to add the family to the child, if possible? The
current behavior seems wrong.

Anyone know why this is the way that it is?

-Doug

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why does check.py remove rather than fix?

Benny Malengier
No idea.
I would guess related to a bug where this was the correct action.

Benny

2012/7/27 Doug Blank <[hidden email]>
I'm trying to track down the cause of a database corruption, and
noticed in check.py that when a child has no reference to a family,
the child is removed from the family:

                    if family_handle not in \
                           child.get_parent_family_handle_list():
                        # The referenced child has no reference to the family
                        # This is tested by TestcaseGenerator where the father
                        # is "Broken8"
                        logging.warning("    FAIL: family '%(fam_gid)s' "
                                        "child '%(child_gid)s' has no reference"
                                        " to the family" %
                                        {'fam_gid' : family.gramps_id,
                                         'child_gid' : child.gramps_id})
                        family.remove_child_ref(child_ref)

Wouldn't it be better to add the family to the child, if possible? The
current behavior seems wrong.

Anyone know why this is the way that it is?

-Doug

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel


------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why does check.py remove rather than fix?

Tim Lyons
Administrator
In reply to this post by DS Blank
DS Blank wrote
I'm trying to track down the cause of a database corruption, and
noticed in check.py that when a child has no reference to a family,
the child is removed from the family:

Wouldn't it be better to add the family to the child, if possible? The
current behavior seems wrong.

Anyone know why this is the way that it is?
I haven't looked at the code again recently, but I seem to recall that it has found a reference in one direction, but not in the reverse direction, and it removes the reference that does exist - as far as I recall it does not remove the person (which might be suggested by the message).

I think that asserting that a child belongs to a family might be regarded as a strong claim that needs adequate evidence. If there is doubt, as exemplified by the link going in only one direction, it is safer to remove the link, rather than reinforce the possibly doubtful assertion of a relationship by adding the link in the other direction. After all, the user (if he understands the messages!!!) can insert the two way relationship himself if he believes it to be correct.

Note that import which has been cleaned up for 3.4.0 (thanks Michiel Nauta) checks for certain references that are incorrect in the imported file. See bug 5466 though the solution has been applied to all sorts of import, especially GEDCOM not just XML: "one of the most common bugs we run into: the user imports a file, the file does not fully comply with the standard, the import routine does not complain and the user finds out that his family tree is malformed long after the import by doing something, like sorting, merging, generating a report or whatever". See the discussion in the bug on whether to ask the user to fix it.

We  decided to automatically correct the error if necessary (AFAIR) on import rather than remove the link, so not the same as the check and repair tool. I think the different choice here can be justified on the grounds that the problem is probably due to a failure in the exporting program which has not output all the information correctly, though the existence of part of the information implies that it believes the link to exist, so we trust the assertion from the imported file.

(The situation in 5466 relates to missing objects, but I think that the fix considers some other incorrect data as well, hence why it is related to this question).
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why does check.py remove rather than fix?

Tim Lyons
Administrator
Following this thread [1], and much private email discussion with the OP, I have changed my mind (180 degrees) about this topic.

In GEDCOM, FAM records have links to the INDI records for parents and children. The INDI records have FAMS/FAMC links back to the FAM record.

If a FAM record links to a parent, but the parent does not link back, then both import GEDCOM and Check repair the link by constructing a link back to the Family record.

However (in the situation you mention) if a FAM record links to a child, but the child does not link back, then import GEDCOM does not notice, and Check removes the link from the Family record to the child.

There are several reasons why I think the missing link should be inserted:
(1) The existence of links and reverse links in the GEDCOM seems to be entirely redundant: one can be deduced from the other.
(2) There is some difference of opinion as to whether both links are mandatory (I think they both are, but in view of the other arguments it is a waste of time to discuss which opinion is correct).
(3) There are probably lots of situations where software from which we are importing does not fully comply with the GEDCOM spec.
(4) According to the OP some other software does not make links in the GEDCOM both way mandatory. It assumes one from the other.
(5) The current situation is inconsistent.


Therefore I think import GEDCOM and Check should construct the reverse link in the Gramps database if it is omitted in the GEDCOM (or other input). I'm not sure whether the same thing applies to Import XML.

Regards,
Tim.

[1] http://gramps.1791082.n4.nabble.com/Re-Gramps-bugs-Reading-GED-com-Files-td4656078.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why does check.py remove rather than fix?

DS Blank
On Fri, Sep 7, 2012 at 9:27 AM, Tim Lyons [via GRAMPS]
<[hidden email]> wrote:

> Following this thread [1], and much private email discussion with the OP, I
> have changed my mind (180 degrees) about this topic.
>
> In GEDCOM, FAM records have links to the INDI records for parents and
> children. The INDI records have FAMS/FAMC links back to the FAM record.
>
> If a FAM record links to a parent, but the parent does not link back, then
> both import GEDCOM and Check repair the link by constructing a link back to
> the Family record.
>
> However (in the situation you mention) if a FAM record links to a child, but
> the child does not link back, then import GEDCOM does not notice, and Check
> removes the link from the Family record to the child.
>
> There are several reasons why I think the missing link should be inserted:
> (1) The existence of links and reverse links in the GEDCOM seems to be
> entirely redundant: one can be deduced from the other.
> (2) There is some difference of opinion as to whether both links are
> mandatory (I think they both are, but in view of the other arguments it is a
> waste of time to discuss which opinion is correct).
> (3) There are probably lots of situations where software from which we are
> importing does not fully comply with the GEDCOM spec.
> (4) According to the OP some other software does not make links in the
> GEDCOM both way mandatory. It assumes one from the other.
> (5) The current situation is inconsistent.
>
>
> Therefore I think import GEDCOM and Check should construct the reverse link
> in the Gramps database if it is omitted in the GEDCOM (or other input). I'm
> not sure whether the same thing applies to Import XML.
>
> Regards,
> Tim.

I had already come to agree, but with your additional points I think
that this is a necessary change. I am using a version that fixes
rather than deletes in my own use. If there is consensus on this, I'd
be glad to patch trunk.

-Doug

> [1]
> http://gramps.1791082.n4.nabble.com/Re-Gramps-bugs-Reading-GED-com-Files-td4656078.html
>
> ________________________________
> If you reply to this email, your message will be added to the discussion
> below:
> http://gramps.1791082.n4.nabble.com/Why-does-check-py-remove-rather-than-fix-tp4655799p4656302.html
> This email was sent by Tim Lyons (via Nabble)
> To receive all replies by email, subscribe to this discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why does check.py remove rather than fix?

jerome
In reply to this post by Benny Malengier
This is very old ...

http://gramps.svn.sourceforge.net/viewvc/gramps?view=revision&revision=6337

and child handle/ref was already removed before revision 6337 !


Note, maybe this could be associated with:
5466: On import and Check and Repair need to check references to absent
objects http://www.gramps-project.org/bugs/view.php?id=5466
//was person_ref//



Le 27/07/2012 15:00, Benny Malengier a écrit :

> No idea.
> I would guess related to a bug where this was the correct action.
>
> Benny
>
> 2012/7/27 Doug Blank <[hidden email] <mailto:[hidden email]>>
>
>     I'm trying to track down the cause of a database corruption, and
>     noticed in check.py that when a child has no reference to a family,
>     the child is removed from the family:
>
>                          if family_handle not in \
>                                 child.get_parent_family_handle_list():
>                              # The referenced child has no reference to
>     the family
>                              # This is tested by TestcaseGenerator where
>     the father
>                              # is "Broken8"
>                              logging.warning("    FAIL: family
>     '%(fam_gid)s' "
>                                              "child '%(child_gid)s' has
>     no reference"
>                                              " to the family" %
>                                              {'fam_gid' : family.gramps_id,
>                                               'child_gid' :
>     child.gramps_id})
>                              family.remove_child_ref(child_ref)
>
>     Wouldn't it be better to add the family to the child, if possible? The
>     current behavior seems wrong.
>
>     Anyone know why this is the way that it is?
>
>     -Doug
>
>     ------------------------------------------------------------------------------
>     Live Security Virtual Conference
>     Exclusive live event will cover all the ways today's security and
>     threat landscape has changed and how IT managers can respond.
>     Discussions
>     will include endpoint security, mobile security and the latest in
>     malware
>     threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>     _______________________________________________
>     Gramps-devel mailing list
>     [hidden email]
>     <mailto:[hidden email]>
>     https://lists.sourceforge.net/lists/listinfo/gramps-devel
>
>
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>
>
>
> _______________________________________________
> Gramps-devel mailing list
> [hidden email]
> https://lists.sourceforge.net/lists/listinfo/gramps-devel
>


------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why does check.py remove rather than fix?

Tim Lyons
Administrator
In reply to this post by DS Blank
DS Blank wrote
I had already come to agree, but with your additional points I think
that this is a necessary change. I am using a version that fixes
rather than deletes in my own use. If there is consensus on this, I'd
be glad to patch trunk.
Well, yes, but I think it needs to be fixed in gramps34 and gramps35 as well, especially as the person who raised the problem want to use Gramps for importing his GEDCOM. Note also it needs to be fixed in Import GEDCOM, not just check and repair. Bug 5466 is all about preventing incomplete/erroneous data getting into Gramps, not just fixing it when it gets there.

Does it also need to be fixed in import XML?
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why does check.py remove rather than fix?

Benny Malengier


2012/9/7 Tim Lyons <[hidden email]>

DS Blank wrote
>
> I had already come to agree, but with your additional points I think
> that this is a necessary change. I am using a version that fixes
> rather than deletes in my own use. If there is consensus on this, I'd
> be glad to patch trunk.
>

Well, yes, but I think it needs to be fixed in gramps34 and gramps35 as
well, especially as the person who raised the problem want to use Gramps for
importing his GEDCOM. Note also it needs to be fixed in Import GEDCOM, not
just check and repair. Bug 5466 is all about preventing incomplete/erroneous
data getting into Gramps, not just fixing it when it gets there.

Does it also need to be fixed in import XML?

For import XML, we need to find first bugs that cause an error. These must be fixed, and if we find them, we can construct a workaround for import of erronous exports.

So in this case, I rather have a bug that shows us there is an error in export, than you fixing something now that does not occur.

Benny



--
View this message in context: http://gramps.1791082.n4.nabble.com/Why-does-check-py-remove-rather-than-fix-tp4655799p4656307.html
Sent from the GRAMPS - Dev mailing list archive at Nabble.com.

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel


------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why does check.py remove rather than fix?

Tim Lyons
Administrator
I have raised bug 6061: http://www.gramps-project.org/bugs/view.php?id=6061 for this.

The bug also explains how the problem can occur with xml import (using a database that had been created from a GEDCOM file before the fix had been applied).

Fixed in gramps34/trunk revisions 20438/20439. Fixed in GEDCOM import, importxml and Check and repair. (Also minor fixes for improved diagnostics when importxml just completely fails and for exportxml illegal characters in mime_type).

Tested using the supplied file (gramps-1.ged) and also xml export and import of Tools->Debug->Generate Testcases for Persons and Families...
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why does check.py remove rather than fix?

DS Blank
Thanks! It looks great to me.

-Doug

On Sun, Sep 23, 2012 at 6:54 PM, Tim Lyons [via GRAMPS]
<[hidden email]> wrote:

> I have raised bug 6061: http://www.gramps-project.org/bugs/view.php?id=6061
> for this.
>
> The bug also explains how the problem can occur with xml import (using a
> database that had been created from a GEDCOM file before the fix had been
> applied).
>
> Fixed in gramps34/trunk revisions 20438/20439. Fixed in GEDCOM import,
> importxml and Check and repair. (Also minor fixes for improved diagnostics
> when importxml just completely fails and for exportxml illegal characters in
> mime_type).
>
> Tested using the supplied file (gramps-1.ged) and also xml export and import
> of Tools->Debug->Generate Testcases for Persons and Families...
>
>
> ________________________________
> If you reply to this email, your message will be added to the discussion
> below:
> http://gramps.1791082.n4.nabble.com/Why-does-check-py-remove-rather-than-fix-tp4655799p4656715.html
> This email was sent by Tim Lyons (via Nabble)
> To receive all replies by email, subscribe to this discussion
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why does check.py remove rather than fix?

jerome
Tim,


I guess you just have fixed one of the oldest problems!
//I do not ask for backport//


Thank you!
Jérôme

Le 24/09/2012 04:57, DS Blank a écrit :

> Thanks! It looks great to me.
>
> -Doug
>
> On Sun, Sep 23, 2012 at 6:54 PM, Tim Lyons [via GRAMPS]
> <[hidden email]> wrote:
>
>  > I have raised bug 6061:
> http://www.gramps-project.org/bugs/view.php?id=6061
>  > for this.
>  >
>  > The bug also explains how the problem can occur with xml import (using a
>  > database that had been created from a GEDCOM file before the fix had
> been
>  > applied).
>  >
>  > Fixed in gramps34/trunk revisions 20438/20439. Fixed in GEDCOM import,
>  > importxml and Check and repair. (Also minor fixes for improved
> diagnostics
>  > when importxml just completely fails and for exportxml illegal
> characters in
>  > mime_type).
>  >
>  > Tested using the supplied file (gramps-1.ged) and also xml export and
> import
>  > of Tools->Debug->Generate Testcases for Persons and Families...
>  >
>  >
>  > ________________________________
>  > If you reply to this email, your message will be added to the discussion
>  > below:
>  >
> http://gramps.1791082.n4.nabble.com/Why-does-check-py-remove-rather-than-fix-tp4655799p4656715.html
>  > This email was sent by Tim Lyons (via Nabble)
>  > To receive all replies by email, subscribe to this discussion
>
> ------------------------------------------------------------------------
> View this message in context: Re: Why does check.py remove rather than
> fix?
> <http://gramps.1791082.n4.nabble.com/Why-does-check-py-remove-rather-than-fix-tp4655799p4656717.html>
> Sent from the GRAMPS - Dev mailing list archive
> <http://gramps.1791082.n4.nabble.com/GRAMPS-Dev-f1791083.html> at
> Nabble.com.
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>
>
>
> _______________________________________________
> Gramps-devel mailing list
> [hidden email]
> https://lists.sourceforge.net/lists/listinfo/gramps-devel
>


------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Loading...