DB API addition; has_handle method.

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

DB API addition; has_handle method.

prculley
which discusses various API methods which were added to DBAPI but never officially made part of the Gramps API, I'm asking for a review of one particular method which I would like to use to patch a bug.

The has_handle(obj_key, handle) method takes an obj_key (PERSON_KEY, etc.) and a handle and determines if the db contains the handle, without raising a HandleError.

This was added to dbapi by Doug Blank a while ago, presumably because he expected it to be useful.

I've found a spot where I can use it as well; it turns out that when we decided to raise HandleErrors for invalid handles, we started getting a lot of these errors from our Gramplets.  The Gramplets were originally designed to deal with 'None' returns from the various db operations, but don't deal with HandleErrors.  Someone put in what I would consider to be a bandaid (a 'try' 'except' clause around our signal 'emit' code, which catches most of these, but not all.

I'd like to fix up the base Gramplet class which has a get_active method that returns the active handle, to check if the handle is valid before passing to the Gramplets, thus not having to fix all of the Gramplets individually.  But to do this, I have to check the handle for the various object types used by the Gramplets.  This can be done pretty easily via the has_handle method.

I've submitted a PR https://github.com/gramps-project/gramps/pull/412 that adds the has_handle method to bsddb and patches the bug if anyone cares to see what this looks like.

Paul Culley

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: DB API addition; has_handle method.

DS Blank
On Sun, Jun 4, 2017 at 6:44 PM, Paul Culley <[hidden email]> wrote:
which discusses various API methods which were added to DBAPI but never officially made part of the Gramps API, I'm asking for a review of one particular method which I would like to use to patch a bug.

The has_handle(obj_key, handle) method takes an obj_key (PERSON_KEY, etc.) and a handle and determines if the db contains the handle, without raising a HandleError.

This was added to dbapi by Doug Blank a while ago, presumably because he expected it to be useful.

It is useful logically (as your code shows). But I have a different feeling now: if you think about database accesses, then you could be accessing the database twice for no particular gain. Gramps code has a lot of code like that (because the database access has been fairly cheap with local, single-user databases where you can even cache data). But as you head towards more efficient database access (because it is remote or shared), I think you'll probably want to undo your beautiful logic:

       if db.has_media_handle(handle):
            obj = db.get_media_from_handle(handle)

into something that if successfully found (which is going to be a majority of the time, presumably) then you'll want it to be efficient too. But if you treat handles that don't return objects as an error, then you have your current dilemma.

My goal is to have referential integrity in gPrime databases so that they can be efficient without errors. But that will take additional processing in Gramps after imports that allow referentially invalid data.

-Doug

 

I've found a spot where I can use it as well; it turns out that when we decided to raise HandleErrors for invalid handles, we started getting a lot of these errors from our Gramplets.  The Gramplets were originally designed to deal with 'None' returns from the various db operations, but don't deal with HandleErrors.  Someone put in what I would consider to be a bandaid (a 'try' 'except' clause around our signal 'emit' code, which catches most of these, but not all.

I'd like to fix up the base Gramplet class which has a get_active method that returns the active handle, to check if the handle is valid before passing to the Gramplets, thus not having to fix all of the Gramplets individually.  But to do this, I have to check the handle for the various object types used by the Gramplets.  This can be done pretty easily via the has_handle method.

I've submitted a PR https://github.com/gramps-project/gramps/pull/412 that adds the has_handle method to bsddb and patches the bug if anyone cares to see what this looks like.

Paul Culley

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: DB API addition; has_handle method.

prculley
Doug;
I'd love to make this more efficient, even if it is only operating at human rates in the GUI.
But Gramps problem is not only one of 'referential integrity' but of signaling.  It should be a rare event that the db is indeed inconsistent.  But due to sloppiness or other design decisions, it is a common event that our db and/or signals are temporarily inconsistent during multi-commit operations.  The best that can be said is that the db is consistent before and after the overall transaction.

Our undo/redo data is stored in a dict, that doesn't always return the data in the order it was put in.  Our signals are delayed until after the complete transactions, guaranteeing that the signaled objects see inconsistencies at times.  For example, on the issue that got me started on the current set of fixes, during a delete of a person with some references from other places, we
1) locate and find references to the person
2) remove the references, and commit each object effected
3) delete the person
4) use the undo/redo data to 'play back' the signals (out of a dict in some order).
5) use very generic signal handlers in a lot of cases that update everything (Gramplets)

Even if the signals came out in the same order as the commits occurred, on getting a signal for one of the reference commits, the Gramplet will often try to re-display some portion of the (now deleted) persons data, creating a HandleError.

I expect that there are other cases where the commits themselves don't occur in an order that leaves the db consistent at all times.

None of this used to matter when db accesses with bad references (handles) just returned 'None'.  (Most) of our code tested for 'None' and dealt with it correctly.  In fact our filters used this feature and deliberately created proxy dbs that are quite inconsistent.  So now some of our code in the GUI has to deal with BOTH the 'None' and HandleError issues.

As a relatively new developer, I certainly don't have a full understanding of the history and underlying concepts and plan for how I am seeing things work.  So I'm may be missing a lot of things.  So as I patch bugs I tend to patch symptoms (which are usually small fixes and easier for the other developers to accept).

Perhaps if we had lots of experienced developers we could take a step back and rethink the whole problem.  But I fear that the best we can hope for is to keep patching things well enough to keep our user base happy.

Paul C.

On Sun, Jun 4, 2017 at 7:52 PM, Doug Blank <[hidden email]> wrote:
On Sun, Jun 4, 2017 at 6:44 PM, Paul Culley <[hidden email]> wrote:
which discusses various API methods which were added to DBAPI but never officially made part of the Gramps API, I'm asking for a review of one particular method which I would like to use to patch a bug.

The has_handle(obj_key, handle) method takes an obj_key (PERSON_KEY, etc.) and a handle and determines if the db contains the handle, without raising a HandleError.

This was added to dbapi by Doug Blank a while ago, presumably because he expected it to be useful.

It is useful logically (as your code shows). But I have a different feeling now: if you think about database accesses, then you could be accessing the database twice for no particular gain. Gramps code has a lot of code like that (because the database access has been fairly cheap with local, single-user databases where you can even cache data). But as you head towards more efficient database access (because it is remote or shared), I think you'll probably want to undo your beautiful logic:

       if db.has_media_handle(handle):
            obj = db.get_media_from_handle(handle)

into something that if successfully found (which is going to be a majority of the time, presumably) then you'll want it to be efficient too. But if you treat handles that don't return objects as an error, then you have your current dilemma.

My goal is to have referential integrity in gPrime databases so that they can be efficient without errors. But that will take additional processing in Gramps after imports that allow referentially invalid data.

-Doug

 

I've found a spot where I can use it as well; it turns out that when we decided to raise HandleErrors for invalid handles, we started getting a lot of these errors from our Gramplets.  The Gramplets were originally designed to deal with 'None' returns from the various db operations, but don't deal with HandleErrors.  Someone put in what I would consider to be a bandaid (a 'try' 'except' clause around our signal 'emit' code, which catches most of these, but not all.

I'd like to fix up the base Gramplet class which has a get_active method that returns the active handle, to check if the handle is valid before passing to the Gramplets, thus not having to fix all of the Gramplets individually.  But to do this, I have to check the handle for the various object types used by the Gramplets.  This can be done pretty easily via the has_handle method.

I've submitted a PR https://github.com/gramps-project/gramps/pull/412 that adds the has_handle method to bsddb and patches the bug if anyone cares to see what this looks like.

Paul Culley

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel




------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: DB API addition; has_handle method.

Nick Hall
On 05/06/17 14:25, Paul Culley wrote:
> As a relatively new developer, I certainly don't have a full
> understanding of the history and underlying concepts and plan for how
> I am seeing things work.  So I'm may be missing a lot of things.  So
> as I patch bugs I tend to patch symptoms (which are usually small
> fixes and easier for the other developers to accept).

To understand this you need to go back a few years when we were looking
into NoneType errors.  Some of these resulted from broken links in the
database.  Unfortunately, because we don't enforce referential integrity
at database level, and were also not raising errors when we tried to
retrieve an object from a broken link, these were almost impossible to
track down.  Patching symptoms actually made it harder to find the real
problem.

A None return from a "get" method could either mean a broken link or a
link hidden by a proxy.  The solution was to raise a HandleError
exception when a broken link was encountered.  This has already resulted
in some program errors being fixed and one major source of broken links
found.  Hopefully, we can eliminate all broken links in the future and
HandleError exceptions should never be raised.

In the bug report you are looking at, the history list is not consistent
with the database.  It is being updated to reflect the database after
the gramplet is updated.  The database itself is consistent.  The
solution would seem to be to make sure that the history objects are
updated first.

I can still see some valid reasons to want a has_handle method, for
example, when importing a Gramps XML file into an existing database.  
Perhaps instead of an object key we could use the class name or table name?

Nick.



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: DB API addition; has_handle method.

prculley
I'd be quite happy with Class name instead of object key for has_handle (would avoid a conversion in _gramplet).  Only reason for object key is that dbapi had it that way, and I try to do things in a minimal way.

Regarding the db consistency, you are right, in this case it was consistent throughout.  However, since the signals went out after the fact (in the wrong order), the GUI elements 'saw' a db that was inconsistent with their signals.  And during undo/redo we do in fact end up with temporary db inconsistencies since it uses the history.  Lots of issues here that are somewhat interrelated...

Paul C.

On Mon, Jun 5, 2017 at 10:43 AM, Nick Hall <[hidden email]> wrote:
On 05/06/17 14:25, Paul Culley wrote:
As a relatively new developer, I certainly don't have a full understanding of the history and underlying concepts and plan for how I am seeing things work.  So I'm may be missing a lot of things.  So as I patch bugs I tend to patch symptoms (which are usually small fixes and easier for the other developers to accept).

To understand this you need to go back a few years when we were looking into NoneType errors.  Some of these resulted from broken links in the database.  Unfortunately, because we don't enforce referential integrity at database level, and were also not raising errors when we tried to retrieve an object from a broken link, these were almost impossible to track down.  Patching symptoms actually made it harder to find the real problem.

A None return from a "get" method could either mean a broken link or a link hidden by a proxy.  The solution was to raise a HandleError exception when a broken link was encountered.  This has already resulted in some program errors being fixed and one major source of broken links found.  Hopefully, we can eliminate all broken links in the future and HandleError exceptions should never be raised.

In the bug report you are looking at, the history list is not consistent with the database.  It is being updated to reflect the database after the gramplet is updated.  The database itself is consistent.  The solution would seem to be to make sure that the history objects are updated first.

I can still see some valid reasons to want a has_handle method, for example, when importing a Gramps XML file into an existing database.  Perhaps instead of an object key we could use the class name or table name?

Nick.




------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: DB API addition; has_handle method.

Nick Hall
On 05/06/17 17:09, Paul Culley wrote:
> I'd be quite happy with Class name instead of object key for
> has_handle (would avoid a conversion in _gramplet).  Only reason for
> object key is that dbapi had it that way, and I try to do things in a
> minimal way.

The only reason has_handle uses the object key is that has_gramps_id
already does.  The object keys should really be internal to the database
layer.

Nick.



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: DB API addition; has_handle method.

Nick Hall
In reply to this post by prculley
On 05/06/17 17:09, Paul Culley wrote:
> Regarding the db consistency, you are right, in this case it was
> consistent throughout.  However, since the signals went out after the
> fact (in the wrong order), the GUI elements 'saw' a db that was
> inconsistent with their signals.  And during undo/redo we do in fact
> end up with temporary db inconsistencies since it uses the history.  
> Lots of issues here that are somewhat interrelated...

The database signals will always be emitted after the transaction if it
is committed, or not if it fails and is rolled back.

Don't confuse the undo/redo history with the history of active objects.

Is the signal order important?  Could we emit the delete signals first,
which would update the history objects?

Nick.



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: DB API addition; has_handle method.

prculley
Assuming we always delay emits until after the transaction completes:
At first thinking, it seems that putting deletes (and adds) first might do it.  Again assuming that ALL the get_active source data (active object history?) were updated before the various other GUI elements.
I think that whatever signals updates the GUI active object history would have to be emitted first (lest a delete signal git a GUI element, and it got the old, not yet removed, history object as active).
I've considered add, delete, and simple merges, but there may be other more complex transactions I'm not aware of.
I know we can do multiple deletes at once, but I'm not clear if this is a single transaction (it looks like it isn't based on the Undo History Gramplet).

Stream of consciousness here, not carefully examined, debugged etc.

Paul C.

On Mon, Jun 5, 2017 at 12:31 PM, Nick Hall <[hidden email]> wrote:
On 05/06/17 17:09, Paul Culley wrote:
Regarding the db consistency, you are right, in this case it was consistent throughout.  However, since the signals went out after the fact (in the wrong order), the GUI elements 'saw' a db that was inconsistent with their signals.  And during undo/redo we do in fact end up with temporary db inconsistencies since it uses the history.  Lots of issues here that are somewhat interrelated...

The database signals will always be emitted after the transaction if it is committed, or not if it fails and is rolled back.

Don't confuse the undo/redo history with the history of active objects.

Is the signal order important?  Could we emit the delete signals first, which would update the history objects?

Nick.




------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: DB API addition; has_handle method.

Tim Lyons
Administrator
In reply to this post by prculley
prculley wrote
I'd be quite happy with Class name instead of object key for has_handle
(would avoid a conversion in _gramplet).  Only reason for object key is
that dbapi had it that way, and I try to do things in a minimal way.

Regarding the db consistency, you are right, in this case it was consistent
throughout.  However, since the signals went out after the fact (in the
wrong order), the GUI elements 'saw' a db that was inconsistent with their
signals.  And during undo/redo we do in fact end up with temporary db
inconsistencies since it uses the history.  Lots of issues here that are
somewhat interrelated...

Paul C.
I agree with Nick that we need HandleError as well as None, to help us diagnose and avoid bugs. (I do understand the difficulty with temporary db inconsistencies and sorry I don't have a suggestion as to how to deal with it).

In regard to the detail of  the details of the has_handle interfaces, I would prefer to avoid the signatures without the 'class name' in the signature (and class name or object key as a parameter), and instead use the signatures that are specific to the object type. (i.e. use the has_*_gramps_id methods).

I said in:
https://gramps-project.org/bugs/view.php?id=9541

As I understand it, the has_*_gramps_id methods for dbapi now call has_gramps_id (which has a parameter for the object of interest), which is good, because there is only one underlying implementation.

However, there are still two different styles of signature, (one with the table name in the function name, and one with the table as a parameter). This is very confusing for developers, because they have an unnecessary choice to make.

As it happens, has_gramps_id is used only in importxml.py so now is an ideal time to remove it and just have one signature style with the object name in the method name, which is the general case in other database methods.

In the case of bsddb, the two interfaces, has_*_gramps_id and has_gramps_id use different implementations! (One using maps and one doing a table get). This is a really TERRIBLE idea, because it is implicitly exposing the implementation of the method to the caller, violating all principles of information hiding.

This would be avoided by only providing one set of signatures.

Regards,
Tim.
Loading...