Python2 to python3 upgrade encoding errors


Python2 to python3 upgrade encoding errors

Nick Hall
I have now investigated bug #8360.

The database contains a mix of unicode and strings.  When a record in
python2 format is loaded into python3, unicode gets converted into a
str, but the way strings are converted depends on the encoding setting.  
If "bytes" is specified then strings are converted into bytes, otherwise
they are also converted to str.

By default strings are converted into str using the ASCII encoding. This
will cause errors if the strings contain utf-8 encoded unicode.  I see
that Benny has already explained this in another thread.
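
A minimal sketch of the behaviour described above. The two pickles are hand-built stand-ins for records written by python2 (illustrative data, not taken from a Gramps database), and the encoding argument of pickle.loads plays the role of the "encoding setting":

```python
import pickle

# Hand-built protocol 2 pickles, as python2 would write them (illustrative data).
# 'U' = SHORT_BINSTRING (a python2 str), 'X' = BINUNICODE (a python2 unicode).
PY2_STR = b"\x80\x02U\x05caf\xc3\xa9q\x00."
PY2_UNICODE = b"\x80\x02X\x05\x00\x00\x00caf\xc3\xa9q\x00."

# A python2 unicode always becomes str, whatever the encoding setting.
from_unicode = pickle.loads(PY2_UNICODE)

# A python2 str depends on the encoding setting.
as_bytes = pickle.loads(PY2_STR, encoding="bytes")  # stays bytes
as_text = pickle.loads(PY2_STR, encoding="utf-8")   # decoded to str

try:
    pickle.loads(PY2_STR)  # default encoding is ASCII
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True    # byte 0xc3 is not valid ASCII
```

Only the default (ASCII) case fails, and only for strings that contain non-ASCII bytes, which may be why the problem shows up in some fields and not others.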

So what encoding do we choose?  I would have chosen "bytes".  This would
have kept the handles as bytes and most of the other fields as str.  I
still think that converting handles into str was a bad choice.

However, the choice of "utf-8" does have an advantage.  It converts all
the old fields that we really want to be str rather than bytes.

Where do we go from here?  The easiest solution would be to convert all
records in the database using 'utf-8' encoding.  This should be done
prior to the upgrade, because at the moment the upgrade can fail.  I got
such an error when attempting to upgrade the database in bug #8360.

Ideally, I would also like to convert all handles back to bytes. This
would make the code simpler and easier to maintain.  Perhaps this should
wait until v4.2 though.

It's getting late now, so I'll think about the solution more tomorrow.


Nick.


------------------------------------------------------------------------------
Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
from Actuate! Instantly Supercharge Your Business Reports and Dashboards
with Interactivity, Sharing, Native Excel Exports, App Integration & more
Get technology previously reserved for billion-dollar corporations, FREE
http://pubads.g.doubleclick.net/gampad/clk?id=190641631&iu=/4140/ostg.clktrk
_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel

Re: Python2 to python3 upgrade encoding errors

Tim Lyons
Administrator
I appreciate that you have a more general strategy problem to solve, but while you are thinking about it, I wonder whether you could think about another point.

As I understand it, conversion (from records stored in Python 2 format to Python 3) takes place 'on-the-fly' as instances are read. This seems like a bad idea, from the point of view of complexity of code, the need to make sure that all situations are covered (also ? for new code) and in a minor way efficiency.

I wonder whether it could be arranged that conversion is done once and for all with some sort of marker to indicate that conversion has been done (e.g. changing the pythonversion.txt file from 2 or 3 to 3C to indicate that conversion had been done).

Do I understand correctly that if 'bytes' were chosen, some data (i.e. that which is strings) would be converted to bytes (rather than str)? Does this mean that such data would occupy more bytes (because it would not have the advantage of utf-8 encoding)? If so, this seems a disadvantage as it might make the DB larger.

It doesn't look to me as though a more permanent solution can be postponed to 4.2!

Regards,
Tim.

Re: Python2 to python3 upgrade encoding errors

Nick Hall
On 21/02/15 01:52, Tim Lyons wrote:
> As I understand it, conversion (from records stored in Python 2 format to
> Python 3) takes place 'on-the-fly' as instances are read. This seems like a
> bad idea, from the point of view of complexity of code, the need to make
> sure that all situations are covered (also ? for new code) and in a minor
> way efficiency.

Yes.  The best place for the conversion would be before the upgrade. At
the moment the upgrade can fail.


> I wonder whether it could be arranged that conversion is done once and for
> all with some sort of marker to indicate that conversion has been done (e.g.
> changing the pythonversion.txt file from 2 or 3 to 3C to indicate that
> conversion had been done).

That is a good idea.  I was considering creating a new file, but we
already have one.


> Do I understand correctly that if 'bytes' were chosen some data (i.e. that
> which is strings) would be converted to bytes (rather than str). Does this
> mean that such data would occupy more bytes (because it would not have the
> advantage of utf-8 encoding)? If so, this seems a disadvantage as it might
> make the DB larger.

The byte strings would be utf-8 encoded.  I don't know how a str is
pickled, but it may well also use utf-8 encoding.
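
A quick way to check this, as a sketch for CPython's pickle module with protocol 3 (the protocol number is an assumption here, not necessarily what Gramps uses). It looks for the utf-8 encoded form inside both pickles:

```python
import pickle

text = "Euriware_grève_20150205"
utf8 = text.encode("utf-8")

pickled_str = pickle.dumps(text, protocol=3)
pickled_bytes = pickle.dumps(utf8, protocol=3)

# The utf-8 encoded form appears verbatim inside both pickles, so storing
# the field as str or as utf-8 bytes carries the same payload.
str_uses_utf8 = utf8 in pickled_str
bytes_kept_raw = utf8 in pickled_bytes
```

Both checks come out true, so the choice between str and bytes should make little difference to the size of the database.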

Ideally, str should be stored as str and bytes as bytes.  At the moment
in python2 all handles are strings but other data is a mixture of
unicode and strings.  During the conversion to python3, which is done
on-the-fly, all string and unicode fields are converted to str, even the
handles.

I'm going to start writing some code now.


Nick.



Re: Python2 to python3 upgrade encoding errors

Tim Lyons
Administrator
Does this explain why upgrade fails with "too many values to unpack" from a source object? (the error is not directly "UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position <x>: ordinal not in range(128)")

Tim.

Re: Python2 to python3 upgrade encoding errors

enno
Hi Tim,
> Does this explain why upgrade fails with "too many values to unpack" from a
> source object? (the error is not directly "UnicodeDecodeError: 'ascii' codec
> can't decode byte 0xc3 in position <x>: ordinal not in range(128)")
No. The original report is incomplete. The "too many values" error
appears when you restart 4.1.1 and try to open the DB after the upgrade
has failed. See Nick's analysis at:

https://gramps-project.org/bugs/view.php?id=8360#c40417

Because of the failed upgrade, not all DB files have been updated,
meaning that some are still in 3.3 format. Because of that, the schema
upgrade is started again, but it fails, because some tables have already
been upgraded, and return more items than expected.
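
A small sketch of how a repeated upgrade step can produce that message. The record layouts and the upgrade_step function are hypothetical; real Gramps records have many more fields:

```python
# Hypothetical record layouts, much shorter than real Gramps records.
old_record = ("handle", "gramps_id", "title")
upgraded_record = ("handle", "gramps_id", "title", ["citation"])


def upgrade_step(record):
    """Upgrade a record, assuming it still has the old three-field layout."""
    handle, gramps_id, title = record
    return (handle, gramps_id, title, [])


first_pass = upgrade_step(old_record)  # works on a record in the old format

try:
    upgrade_step(upgraded_record)      # record was already upgraded
    message = None
except ValueError as err:
    message = str(err)                 # "too many values to unpack ..."
```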

When I restart 4.1.1, and try to open the offending tree, Gramps will
offer the schema upgrade again, from 15 to 17, and timestamps for a
dozen DB files will be updated, but not all. You can repeat this until
the end of time, and check DB timestamps to see what goes on, and what not.

I'm quite glad that we discovered this, because it also explains why I
saw "too many values" errors in some trees that I got from the reporter
of #8233, whose external HD I could use to investigate.

regards,

Enno



Re: Python2 to python3 upgrade encoding errors

enno
In reply to this post by Nick Hall
Nick,
> Where do we go from here?  The easiest solution would be to convert all
> records in the database using 'utf-8' encoding.  This should be done
> prior to the upgrade, because at the moment the upgrade can fail.  I got
> such an error when attempting to upgrade the database in bug #8360.
Isn't it so that most of the records are in UTF-8 already, also in
Python 2? The errors that I've seen so far concentrate on tags and
media, and don't show on person and location names, where far more users
have foreign characters.

I do agree that most elements should be treated as UTF-8, meaning all
that depend on user input, including elements that refer to the file system.

regards,

Enno



Re: Python2 to python3 upgrade encoding errors

Nick Hall
On 21/02/15 22:02, Enno Borgsteede wrote:
>> Where do we go from here?  The easiest solution would be to convert all
>> records in the database using 'utf-8' encoding.  This should be done
>> prior to the upgrade, because at the moment the upgrade can fail.  I got
>> such an error when attempting to upgrade the database in bug #8360.
> Isn't it so that most of the records are in UTF-8 already, also in
> Python 2? The errors that I've seen so far concentrate on tags and
> media, and don't show on person and location names, where far more users
> have foreign characters.

Old databases, such as the one in bug #8360, contain a mix of python2
strings and unicode.


>
> I do agree that most elements should be treated as UTF-8, meaning all
> that depend on user input, including elements that refer to the file system.

In python3, we want most string fields to be type str not bytes. Ideally
we also want all database handles to be type bytes.  I will keep them as
str for this fix, but hope to convert them to bytes for v4.2.

It is the conversion of python2 string fields to python3 str using ASCII
encoding that causes the errors.


Nick.



Re: Python2 to python3 upgrade encoding errors

Nick Hall
In reply to this post by Nick Hall
On 21/02/15 00:18, Nick Hall wrote:
> Where do we go from here?  The easiest solution would be to convert all
> records in the database using 'utf-8' encoding.  This should be done
> prior to the upgrade, because at the moment the upgrade can fail.  I got
> such an error when attempting to upgrade the database in bug #8360.

This has now been fixed.

The conversion occurs before the upgrade.  Strings are converted to str
using utf-8 encoding and records are saved in the new pickle protocol.
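
Roughly, the conversion of one record looks like the sketch below. This is not the actual Gramps code: the dict stands in for a database table, the sample pickle is hand-built, and pickle.HIGHEST_PROTOCOL is only an assumption for "the new pickle protocol".

```python
import pickle


def convert_record(raw, protocol=pickle.HIGHEST_PROTOCOL):
    """Load a python2-era pickle, decoding its str fields as utf-8, and re-pickle it."""
    record = pickle.loads(raw, encoding="utf-8")
    return pickle.dumps(record, protocol=protocol)


# Stand-in for a database table: handle -> pickled record (illustrative data).
table = {b"H1": b"\x80\x02U\x05caf\xc3\xa9q\x00."}

for key in list(table):
    table[key] = convert_record(table[key])

# After the conversion, a plain load with the default encoding works.
converted = pickle.loads(table[b"H1"])
```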

A v4.1 database that doesn't need upgrading can be converted by
exporting to Gramps XML and importing into an empty database.

I have set a record called "upgrade" in the metadata table to "Yes"
after the upgrade for future use.
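
One way such a marker could be used later, as a sketch only. The dict stands in for the metadata table and convert_once is a hypothetical helper, not something in Gramps:

```python
def convert_once(metadata, convert):
    """Run convert() unless the marker shows it has already been done."""
    if metadata.get("upgrade") == "Yes":
        return False
    convert()
    metadata["upgrade"] = "Yes"
    return True


metadata = {}  # stand-in for the metadata table
calls = []

first = convert_once(metadata, lambda: calls.append("converted"))
second = convert_once(metadata, lambda: calls.append("converted"))
```

The second call is skipped because the marker is already set, so the conversion runs once per database.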

>
> Ideally, I would also like to convert all handles back to bytes. This
> would make the code simpler and easier to maintain.  Perhaps this should
> wait until v4.2 though.

We can consider this for v4.2.  Converting the records using "bytes"
wasn't a good option for this fix.

Please test the gramps41 branch for me.


Nick.



Re: Python2 to python3 upgrade encoding errors

enno
Nick,

> On 21/02/15 00:18, Nick Hall wrote:
>> Where do we go from here?  The easiest solution would be to convert all
>> records in the database using 'utf-8' encoding.  This should be done
>> prior to the upgrade, because at the moment the upgrade can fail.  I got
>> such an error when attempting to upgrade the database in bug #8360.
> This has now been fixed.
>
> The conversion occurs before the upgrade.  Strings are converted to str
> using utf-8 encoding and records are saved in the new pickle protocol.
>
> A v4.1 database that doesn't need upgrading can be converted by
> exporting to Gramps XML and importing into an empty database.
>
> I have set a record called "upgrade" in the metadata table to "Yes"
> after the upgrade for future use.
Shortly before your message arrived, I checked out master, ran all
upgrades (bsddb, python, schema), and saw no errors anywhere. Filtering
persons and families on a tag with accented letters worked too.
>> Ideally, I would also like to convert all handles back to bytes. This
>> would make the code simpler and easier to maintain.  Perhaps this should
>> wait until v4.2 though.
> We can consider this for v4.2.  Converting the records using "bytes"
> wasn't a good option for this fix.
>
> Please test the gramps41 branch for me.
I'll try that later today. I ran check, because that goes through lots
of tables, and let it remove missing media references. That resulted in
a key error mentioning something about bytecode:

2015-02-22 16:56:50.063: WARNING: check.py: line 661:         FAIL: media object and all references to it removed
2015-02-22 16:56:50.063: WARNING: check.py: line 738:     FAIL: media object "Euriware_grève_20150205" reference to missing file "/home/syl/Album/Photos/2015/2-Février/05.02.15/photo-le-dl-g-j.jpg" found
2015-02-22 16:56:50.782: ERROR: tool.py: line 256: Failed to start tool.
Traceback (most recent call last):
   File "/home/enno/gramps-source/gramps/gui/plug/tool.py", line 252, in gui_tool
     callback = callback)
   File "/home/enno/gramps-source/gramps/plugins/tool/check.py", line 176, in __init__
     checker.cleanup_missing_photos(cli)
   File "/home/enno/gramps-source/gramps/plugins/tool/check.py", line 739, in cleanup_missing_photos
     remove_clicked()
   File "/home/enno/gramps-source/gramps/plugins/tool/check.py", line 660, in remove_clicked
     self.db.remove_object(ObjectId,self.trans)
   File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1753, in remove_object
     MEDIA_KEY)
   File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1690, in __do_remove
     txn=txn.txn)
   File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1221, in delete_primary_from_reference_map
     self.__remove_reference(main_key, transaction, txn)
   File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1293, in __remove_reference
     'Key is %s') % str(key))
gramps.gen.errors.DbError: An attempt is made to save a reference key which is partly bytecode, this is not allowed.
Key is ('cf22d3693702393f81558b9f817', b'cf4f3ffc40c48c40136b5ed2da2')

Could this be related? I can repeat the test with 4.1, both Python 2 and
3, but that'll probably be after dinner.

thanks,

Enno



Re: Python2 to python3 upgrade encoding errors

Nick Hall
On 22/02/15 16:34, Enno Borgsteede wrote:

>     File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1293, in
> __remove_reference
>       'Key is %s') % str(key))
> gramps.gen.errors.DbError: An attempt is made to save a reference key
> which is partly bytecode, this is not allowed.
> Key is ('cf22d3693702393f81558b9f817', b'cf4f3ffc40c48c40136b5ed2da2')
>
> Could this be related? I can repeat the test with 4.1, both Python 2 and
> 3, but that'll probably be after dinner.
>
No.  This is the type of error that I said that I didn't fix.

In python3, Gramps stores handles both as str and bytes.  This sometimes
causes problems when we use the wrong type by mistake.

As I said before, ideally we should always store handles as bytes.
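
To illustrate the problem, here is a sketch using the key from the error message. The dict stands in for a table keyed by bytes handles, and to_bytes_handle and is_mixed_key are hypothetical helpers, not Gramps functions:

```python
def to_bytes_handle(handle):
    """Normalise a handle to bytes (hypothetical helper, not Gramps API)."""
    if isinstance(handle, str):
        return handle.encode("utf-8")
    return handle


def is_mixed_key(key):
    """Return True when a reference key mixes str and bytes handles."""
    return len({type(part) for part in key}) > 1


key = ("cf22d3693702393f81558b9f817", b"cf4f3ffc40c48c40136b5ed2da2")
table = {b"cf22d3693702393f81558b9f817": "media record"}

missed = table.get(key[0])                  # a str never equals bytes in python3
found = table.get(to_bytes_handle(key[0]))  # lookup works once normalised
normalised = tuple(to_bytes_handle(part) for part in key)
```

With a single handle type everywhere, neither the failed lookup nor the mixed key could occur.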

What are the steps to reproduce this error?


Nick.



Re: Python2 to python3 upgrade encoding errors

enno
Nick,

> On 22/02/15 16:34, Enno Borgsteede wrote:
>>      File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1293, in
>> __remove_reference
>>        'Key is %s') % str(key))
>> gramps.gen.errors.DbError: An attempt is made to save a reference key
>> which is partly bytecode, this is not allowed.
>> Key is ('cf22d3693702393f81558b9f817', b'cf4f3ffc40c48c40136b5ed2da2')
>>
>> Could this be related? I can repeat the test with 4.1, both Python 2 and
>> 3, but that'll probably be after dinner.
>>
> No.  This is the type of error that I said that I didn't fix.
>
> In python3, Gramps stores handles both as str and bytes.  This sometimes
> causes problems when we use the wrong type by mistake.
>
> As I said before, ideally we should always store handles as bytes.
>
> What are the steps to reproduce this error?
After upgrade, run check, and when it complains about missing media, let
it remove all references. I get the same on gramps41. Upgrade itself is
perfect, no errors in terminal, person and family filters on the
accented tag work all OK.

2015-02-22 19:23:14.348: WARNING: check.py: line 662:         FAIL: media object and all references to it removed
2015-02-22 19:23:14.349: WARNING: check.py: line 734:     FAIL: media object "Euriware_grève_20150205" reference to missing file "/home/syl/Album/Photos/2015/2-Février/05.02.15/photo-le-dl-g-j.jpg" found
2015-02-22 19:23:14.586: ERROR: tool.py: line 256: Failed to start tool.
Traceback (most recent call last):
   File "/home/enno/gramps-source/gramps/gui/plug/tool.py", line 252, in gui_tool
     callback = callback)
   File "/home/enno/gramps-source/gramps/plugins/tool/check.py", line 177, in __init__
     checker.cleanup_missing_photos(cli)
   File "/home/enno/gramps-source/gramps/plugins/tool/check.py", line 735, in cleanup_missing_photos
     remove_clicked()
   File "/home/enno/gramps-source/gramps/plugins/tool/check.py", line 661, in remove_clicked
     self.db.remove_object(ObjectId,self.trans)
   File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1753, in remove_object
     MEDIA_KEY)
   File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1690, in __do_remove
     txn=txn.txn)
   File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1221, in delete_primary_from_reference_map
     self.__remove_reference(main_key, transaction, txn)
   File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1293, in __remove_reference
     'Key is %s') % str(key))
gramps.gen.errors.DbError: An attempt is made to save a reference key which is partly bytecode, this is not allowed.
Key is ('cf22d3693702393f81558b9f817', b'cf4f92fd0d02735409ab8fa4758')

This is a python3 thing indeed. Shall I file a separate report?

thanks again,

Enno



Re: Python2 to python3 upgrade encoding errors

Nick Hall
On 22/02/15 18:48, Enno Borgsteede wrote:
>     File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1293, in
> __remove_reference
>       'Key is %s') % str(key))
> gramps.gen.errors.DbError: An attempt is made to save a reference key
> which is partly bytecode, this is not allowed.
> Key is ('cf22d3693702393f81558b9f817', b'cf4f92fd0d02735409ab8fa4758')
>
> This is a python3 thing indeed. Shall I file a separate report?
>

I think that there is already a bug report for this.

Is this on the same database as the other bug?

Nick.



Re: Python2 to python3 upgrade encoding errors

enno
Nick,

> On 22/02/15 18:48, Enno Borgsteede wrote:
>>      File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1293, in
>> __remove_reference
>>        'Key is %s') % str(key))
>> gramps.gen.errors.DbError: An attempt is made to save a reference key
>> which is partly bytecode, this is not allowed.
>> Key is ('cf22d3693702393f81558b9f817', b'cf4f92fd0d02735409ab8fa4758')
>>
>> This is a python3 thing indeed. Shall I file a separate report?
>>
> I think that there is already a bug report for this.
Searching for "partly bytecode", I see no open ones.
> Is this on the same database as the other bug?
It's the LABOISNE tree, formerly attached to
https://gramps-project.org/bugs/view.php?id=8360

Enno



Re: Python2 to python3 upgrade encoding errors

Tim Lyons
Administrator
In reply to this post by Nick Hall
Nick Hall wrote
> On 21/02/15 00:18, Nick Hall wrote:
>> Where do we go from here?  The easiest solution would be to convert all
>> records in the database using 'utf-8' encoding.  This should be done
>> prior to the upgrade, because at the moment the upgrade can fail.  I got
>> such an error when attempting to upgrade the database in bug #8360.
>
> This has now been fixed.
>
> The conversion occurs before the upgrade.  Strings are converted to str
> using utf-8 encoding and records are saved in the new pickle protocol.
>
> A v4.1 database that doesn't need upgrading can be converted by
> exporting to Gramps XML and importing into an empty database.
>
> I have set a record called "upgrade" in the metadata table to "Yes"
> after the upgrade for future use.
>
>>
>> Ideally, I would also like to convert all handles back to bytes. This
>> would make the code simpler and easier to maintain.  Perhaps this should
>> wait until v4.2 though.
>
> We can consider this for v4.2.  Converting the records using "bytes"
> wasn't a good option for this fix.
>
> Please test the gramps41 branch for me.

Nice elegant solution - the brevity of the code attests to its elegance.

But why not fix the data for people who already upgraded to Python3 in 4.1? (or indeed in 4.0).

Tim.

Re: Python2 to python3 upgrade encoding errors

Nick Hall
On 23/02/15 10:23, Tim Lyons wrote:
>> Please test the gramps41 branch for me.
> Nice elegant solution - the brevity of the code attests to its elegance.

Thanks.

>
> But why not fix the data for people who already upgraded to Python3 in 4.1?
> (or indeed in 4.0).
>

An upgrade from 3.4 or 4.0 involves a schema upgrade which will apply
the fix.

Originally, I wrote the code to fix all databases running on python3
that had not already been fixed.  Then I changed my mind.

The fix is fast and safe, so we don't really need to inform the user
what is happening.   However, for databases that are very large the
extra load time may be a cause for concern.

We could provide a tool for users to fix 4.1 databases.


Nick.



Re: Python2 to python3 upgrade encoding errors

Nick Hall
In reply to this post by enno
On 22/02/15 20:51, Enno Borgsteede wrote:

>> On 22/02/15 18:48, Enno Borgsteede wrote:
>>>      File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1293, in
>>> __remove_reference
>>>        'Key is %s') % str(key))
>>> gramps.gen.errors.DbError: An attempt is made to save a reference key
>>> which is partly bytecode, this is not allowed.
>>> Key is ('cf22d3693702393f81558b9f817', b'cf4f92fd0d02735409ab8fa4758')
>>>
>>> This is a python3 thing indeed. Shall I file a separate report?
>>>
>> I think that there is already a bug report for this.
> Searching for "partly bytecode", I see no open ones.
>> Is this on the same database as the other bug?
> It's the LABOISNE tree, formerly attached to
> https://gramps-project.org/bugs/view.php?id=8360

I can't reproduce this error.


Nick.



Re: Python2 to python3 upgrade encoding errors

Ross Gammon
In reply to this post by Nick Hall
On 02/23/2015 03:41 PM, Nick Hall wrote:
> We could provide a tool for users to fix 4.1 databases.

+1

There may be many users that upgrade to 4.1.1 (Python 3) automatically
when Debian Jessie and Ubuntu Vivid are released in the coming months
(both are in freeze right now).

Alternatively, is there some good advice I can give to Debian & Ubuntu
bug reporters before they are sent to the Gramps Website for 4.1.2
(until I can backport it)?

Cheers,

Ross



Re: Python2 to python3 upgrade encoding errors

enno
In reply to this post by Nick Hall
Hi Nick,

> On 22/02/15 20:51, Enno Borgsteede wrote:
>>> On 22/02/15 18:48, Enno Borgsteede wrote:
>>>>>>       File "/home/enno/gramps-source/gramps/gen/db/write.py", line 1293, in
>>>>>> __remove_reference
>>>>>>         'Key is %s') % str(key))
>>>>>> gramps.gen.errors.DbError: An attempt is made to save a reference key
>>>>>> which is partly bytecode, this is not allowed.
>>>>>> Key is ('cf22d3693702393f81558b9f817', b'cf4f92fd0d02735409ab8fa4758')
>>>>>>
>>>>>> This is a python3 thing indeed. Shall I file a separate report?
>>>>>>
>>>> I think that there is already a bug report for this.
>> Searching for "partly bytecode", I see no open ones.
>>>> Is this on the same database as the other bug?
>> It's the LABOISNE tree, formerly attached to
>> https://gramps-project.org/bugs/view.php?id=8360
> I can't reproduce this error.
Weird. I ran another test with a fresh git clone, to make sure that any
of my local hacks don't interfere with your fix. And with that, I get
the same error on repair. When it complains about missing media, I let
it remove the reference, and set the check mark to apply this to all
missing media. I also tried without the check mark, clicking remove one
by one, and got the same error after 9 clicks, which suggests that it
doesn't apply to all media.

For further analysis, I made a .gramps backup before repair, in which I
found an object

     <object handle="_cf22d3693702393f81558b9f817" change="1423428131" id="O0019">
       <file src="/home/syl/Album/Photos/2015/2-Février/05.02.15/photo-le-dl-g-j.jpg"
             mime="image/jpeg" description="Euriware_grève_20150205"/>
       <citationref hlink="_b'cf528b9487c7d78adb286e1712e'"/>
     </object>

referencing

     <citation handle="_cf528b9487c7d78adb286e1712e" change="1424708975" id="C0007">
       <confidence>2</confidence>
       <sourceref hlink="_cf22d3623b63b7fa696cce6f2d"/>
     </citation>

where the citationref hlink is written as a bytes literal (b'...'), while
the citation handle is a plain string.
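
A value like _b'cf528b9487c7d78adb286e1712e' is what Python 3 produces when a
bytes object is passed through str() or %s formatting: the result is the repr,
including the b prefix and the quotes. Whether that is what happens in the
Gramps export code is an assumption here, not something confirmed in this
thread. The helper name handle_to_hlink below is hypothetical and only
illustrates the difference between formatting and decoding:

```python
def handle_to_hlink(handle):
    """Build an XML hlink value from a handle (hypothetical helper)."""
    # Decode bytes first; str() on bytes would embed the b'...' repr.
    if isinstance(handle, bytes):
        handle = handle.decode("utf-8")
    return "_" + handle


bytes_handle = b"cf528b9487c7d78adb286e1712e"

# Naive formatting keeps the b'' wrapper, as seen in the citationref above.
naive_hlink = "_" + str(bytes_handle)

# Decoding first gives the same form as the citation handle attribute.
safe_hlink = handle_to_hlink(bytes_handle)
```

Handles that are already str pass through handle_to_hlink unchanged, so the
same call would work for both kinds of record.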

Note that this citation handle differs from the one quoted above, although
the object handle is the same. I guess that is because the citation is
generated during the upgrade from 3.3 to 4.1. It is the handle mentioned
in my error message though:

gramps.gen.errors.DbError: An attempt is made to save a reference key
which is partly bytecode, this is not allowed.
Key is ('cf22d3693702393f81558b9f817', b'cf528b9487c7d78adb286e1712e')
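
The key in this message is a tuple with one str member and one bytes member.
As a minimal sketch of how such a key could be detected and brought to one
type (the function names are hypothetical and this is not the check in
write.py; whether handles should end up as str or bytes is the open question
in this thread):

```python
def is_mixed_key(key):
    """Return True if a reference key tuple mixes str and bytes members."""
    kinds = {type(part) for part in key}
    return bytes in kinds and str in kinds


def normalize_key(key, encoding="utf-8"):
    """Return the key with every bytes member decoded to str."""
    return tuple(
        part.decode(encoding) if isinstance(part, bytes) else part
        for part in key
    )


# The key from the error message above.
key = ("cf22d3693702393f81558b9f817", b"cf528b9487c7d78adb286e1712e")
fixed = normalize_key(key)
```

Decoding to str is chosen here only for illustration; encoding every member
to bytes instead would be the mirror image of normalize_key.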

I searched the XML for the binary prefix "b'" and found a total of 9
occurrences, all in citationref elements. Moreover, every citationref is
affected, so it looks quite consistent here.
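
For anyone who wants to repeat this search on another backup, here is a
rough sketch using only the standard library. The function name
find_bytes_hlinks and the cut-down sample document are made up for
illustration. A real Gramps XML file declares a namespace, so the reported
tags would carry a {namespace} prefix, and a compressed backup would have to
be decompressed before parsing:

```python
import xml.etree.ElementTree as ET


def find_bytes_hlinks(xml_text):
    """Return (tag, hlink) pairs whose hlink looks like a bytes repr."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        hlink = elem.get("hlink", "")
        # A healthy hlink is "_<handle>"; a broken one starts with "_b'".
        if hlink.startswith("_b'"):
            hits.append((elem.tag, hlink))
    return hits


# Cut-down sample based on the two elements quoted above (no namespace).
sample = """<database>
  <object handle="_cf22d3693702393f81558b9f817" id="O0019">
    <citationref hlink="_b'cf528b9487c7d78adb286e1712e'"/>
  </object>
  <citation handle="_cf528b9487c7d78adb286e1712e" id="C0007">
    <sourceref hlink="_cf22d3623b63b7fa696cce6f2d"/>
  </citation>
</database>"""

hits = find_bytes_hlinks(sample)
```

Counting the hits per tag would then show whether only citationref elements
are affected in a given tree.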

This is an all-in-one upgrade, for which I loaded the #8360 database in
today's gramps41 branch, downloaded as a fresh clone, so no hack of mine
is interfering here. I have no idea what would happen if I tried
step-by-step, like 3.3 - 3.4 (python 2), 3.4 - 4.0 (python 3), 4.0 - 4.1
(python 3), or one of the other possible paths to 4.1 in python 3.

regards,

Enno



Re: Python2 to python3 upgrade encoding errors

Paul Franklin-5
In reply to this post by Nick Hall
On 2/23/15, Nick Hall <[hidden email]> wrote:
> I can't reproduce this error.

I have seen it too.

I didn't bother saying so before as I thought with Enno's
report you would be able to (reproduce it and) investigate.

I still assume that is the case but if you want me to do
it again and then document my steps here, please say so.

(I vaguely recall it was with the 8258 DB, but from memory
my steps to reproduce it are the same as Enno's.  Unlike the
8360 zip, the 8258 zip does not come with images.
When I tested with 8360 I only copied the grampsdb, not
the user's entire .gramps tree.)


Re: Python2 to python3 upgrade encoding errors

Nick Hall
In reply to this post by enno
On 23/02/15 17:22, Enno Borgsteede wrote:
> This is an all-in-one upgrade, for which I loaded the #8360 database in
> todays gramps41 branch, downloaded as a fresh clone, so there is no hack
> of mine interfering here. I have no idea what would happen when I try
> step-by-step, like 3.3 - 3.4 (python 2), 3.4 - 4.0 (python 3), 4.0 - 4.1
> (python 3), or one of the other possible paths to 4.1 in python 3.

There was a bug upgrading the schema from 15 to 16 with python3.

It should be fixed now.  Please test again.


Nick.

