Re: Fwd: Re: [ 1057651 ] Non-ascii filenames not supported

Martin Hawlisch
Hi,

this bug is not fixed. The problem with fixing it is that we don't know the
codepage used to encode the filename. We could add handlers for UTF-8 and
latin-1, and maybe a handler for a locale-based encoding, but any of these
might do the wrong conversion.

$ cvs diff -u src/ReadGedcom.py
Index: src/ReadGedcom.py
===================================================================
RCS file: /cvsroot/gramps/gramps2/src/ReadGedcom.py,v
retrieving revision 1.47.2.15
diff -u -r1.47.2.15 ReadGedcom.py
--- src/ReadGedcom.py   20 May 2005 21:27:03 -0000      1.47.2.15
+++ src/ReadGedcom.py   30 May 2005 07:39:39 -0000
@@ -266,7 +266,13 @@
         self.gedsource = self.gedmap.get_from_source_tag('GEDCOM 5.5')
         self.def_src = RelLib.Source()
         fname = os.path.basename(filename).split('\\')[-1]
-        self.def_src.set_title(_("Import from %s") % unicode(fname))
+        try:
+            self.def_src.set_title(_("Import from %s") % unicode(fname))
+        except UnicodeDecodeError:
+            try:
+                self.def_src.set_title(_("Import from %s") % unicode(fname,"UTF-8"))
+            except UnicodeDecodeError:
+                self.def_src.set_title(_("Import from %s") % unicode(fname,"latin1"))
         self.dir_path = os.path.dirname(filename)
         self.localref = 0
         self.placemap = {}
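
The same fallback idea as a standalone helper, as a rough sketch only. It is
written for Python 3, where bytes.decode() plays the role of
unicode(fname, enc) in the patch above. The name decode_filename and the
candidate order are assumptions for illustration, not part of GRAMPS.

```python
import locale


def decode_filename(raw, encodings=None):
    """Return (text, encoding) for the first candidate that decodes raw.

    Hypothetical helper: the candidate order below is an assumption.
    """
    if encodings is None:
        # Strict codecs first. latin-1 goes last because it accepts every
        # byte value and therefore never raises UnicodeDecodeError.
        encodings = ["ascii", "utf-8",
                     locale.getpreferredencoding(False), "latin-1"]
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            # LookupError covers a locale codec name Python does not know.
            continue
    raise ValueError("no candidate encoding could decode %r" % (raw,))
```

Note that a successful decode does not prove the guess was right: bytes
written in one 8-bit codepage can decode without error in another, which is
the "wrong conversion" risk mentioned above.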


The next problem is the recent files implementation. We could convert the
filename to unicode by trying out several codepages, but to load the file
again later we have to convert the unicode filename back to the original
byte string, so we have to know which codepage we used. We currently cannot
store this information.
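
To illustrate what storing that information could look like, here is a
minimal Python 3 sketch. The dict layout and the names to_recent_entry and
from_recent_entry are hypothetical, not the actual recent files format.

```python
def to_recent_entry(raw, encodings=("utf-8", "latin-1")):
    """Decode raw filename bytes and remember which encoding succeeded."""
    for enc in encodings:
        try:
            return {"display": raw.decode(enc), "encoding": enc}
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding could decode %r" % (raw,))


def from_recent_entry(entry):
    """Rebuild the original byte string from a stored entry."""
    return entry["display"].encode(entry["encoding"])
```

Because the bytes are re-encoded with the same codepage that decoded them,
the original byte string comes back even if the guess was wrong for display
purposes.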

What we could do is encode all chars > 127 as, for example, some escape
sequence that can be converted back later. But this generates unreadable
filenames. Btw, the gnome/gtk file-chooser handles it this way to display
the files.
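
One possible escape scheme, sketched in Python 3 with percent-encoding from
the standard library. The choice of percent-encoding and the function names
are assumptions for illustration; nothing above fixes a particular scheme,
and this is not meant to mirror what the file-chooser does internally.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def escape_filename(raw):
    """Turn arbitrary filename bytes into a pure-ASCII string.

    Bytes > 127 become %XX sequences. "%" itself is escaped too, so the
    mapping stays reversible. "/" is kept so paths remain recognisable.
    """
    return quote_from_bytes(raw, safe="/")


def unescape_filename(escaped):
    """Recover the original bytes from an escaped filename."""
    return unquote_to_bytes(escaped)
```

For example, the latin-1 bytes b"/home/user/caf\xe9.ged" are stored as
"/home/user/caf%E9.ged" and convert back to the same byte string, without
ever needing to know the codepage.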

Cheers,
  Martin.


> --- Original Message ---
> From: Trevor <[hidden email]>
> To: Martin Hawlisch <[hidden email]>
> Subject: Fwd: Re: [ 1057651 ] Non-ascii filenames not supported
> Date: Mon, 30 May 2005 11:08:39 +1000
>
>
>
> ----------  Forwarded Message  ----------
>
> Subject: Re: [ 1057651 ] Non-ascii filenames not supported
> Date: Mon, 30 May 2005 02:17 am
> From: Don Allingham <[hidden email]>
> To: Trevor <[hidden email]>
>
> Trevor,
>
> I really don't know if this is fixed, since I don't use non-ASCII
> characters. This might be a good question for Martin Hawlisch.
>
> I'll see if I can reproduce this on my own.
>
> Don
>
> On Mon, 2005-05-30 at 01:16 +1000, Trevor wrote:
> > Don,
> >
> > I deleted a lot from this one for the email due to its length, but I hope
> > you can still remember it.  Your comments are at the bottom.  Do you know if
> > this is fixed?
> > =========================
> >  I receive the following error report:
> >  self.parse_individual()
> >  File "/usr/share/gramps/plugins/ReadGedcom.py", line 750, in parse_individual
> >  self.parse_person_object(2)
> >  File "/usr/share/gramps/plugins/ReadGedcom.py", line 953, in parse_person_object
> >  (ok,path) = self.find_file(file,self.dir_path)
> >  File "/usr/share/gramps/plugins/ReadGedcom.py", line 312, in find_file
> >  if os.path.isfile(fullname):
> >  File "/usr/lib/python2.2/posixpath.py", line 197, in isfile
> >  st = os.stat(path)
> >  UnicodeError: ASCII encoding error: ordinal not in range(128)
> > =========================
> > Date: 2004-11-01 23:43
> > Sender: dallingham
> >
> > The problem is not with the GEDCOM file, but with the file
> > name. Check the full path (which is probably something like
> > /home/user/something.ged) and see if there are any accented
> > characters (such as é or á).
> > We are trying to figure out how to detect the character set
> > of the filesystem.
> >
> > -----------------
> > Date: 2004-11-01 19:55
> > Sender: nobody
> >
> > But how do I identify those special character codes?
> > And another genealogy system I use does import the GEDCOM.
> > What can I do?
> >
> > -----------------
> > Date: 2004-11-01 02:22
> > Sender: dallingham
> >
> > This can happen if you have non-ascii characters in your
> > file name or directory name. GRAMPS does not know what
> > character encoding your file system uses (UNICODE,
> > ISO-8859-1, etc.). If it is not ascii or unicode, a problem
> > can occur.
>
> -------------------------------------------------------
>
> --
>  Regards
>       Trevor Rhodes
>

_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel