Gramps translation

Patrick Gerlier

Hi all,

I'm undertaking a general revision of the French translation because it lacks consistency and clarity. I don't blame previous contributors: given the number of localisable strings, everyone only looked at a "window" inside this haystack. It is also likely that only the new messages were attended to, while older ones were left as they were, on the assumption that strings unchanged for years had been approved.

COVID-19 leaves plenty of time to do things very carefully. I won't discuss here the issues related to the French translation per se.

I take the time to read every message, open the source file to get a wider context than the one offered by Lokalize (the translation tool of the KDE desktop) and, where needed, run short tests to see what Gramps does.

This highlighted several problems in the original English strings:

  • Some messages taken for "granted" (considered as "eternal" truths) have not been revised although Gramps has evolved, and they don't accurately describe the present state.
  • Some help/description messages are not clear at all and don't let one predict the effects of the parameters (only my tests clarified the situation). This particularly shows up with the filters and the reports. Some even say the contrary of what they do!
  • I next stumbled on limits of the POSIX gettext feature:
    • It is based only on string morphology. I mean it takes into account only the appearance, the characters composing the string, not its semantics. English can easily turn nouns into verbs or adjectives while keeping the same spelling. Example: "male" can be a noun, synonym for "man", or an adjective, synonym for "masculine".
      It does not matter in English. However, French grammar (and others as well) does not offer this versatility. Consequently, "male" must be translated as "homme" (noun) or "masculin" (adjective) depending on context. Since the string _("male") is an insertion string, you can't predict where it is used (though it is implicit in the source files) and a single translation does not fit.
      Maybe this could be solved with context strings like "Noun|male" and "Adj|male".
    • The tool that collects translatable strings into gramps.pot returns "technical" strings like "%s" which need no translation. Probably developers were too conservative in their programming. The string substituted for "%s" may be translatable, but the format specifier itself is not. There are many other less trivial cases.
      I can easily understand that "%s %s" or rather "%{a} %{b}" may need translation because the local culture prefers the order "%{b} %{a}", but I don't see the point when there is a single format code. I am not speaking here of the plural issue, which is legitimate.
  • I found translatable strings which are never displayed despite all my attempts. This seems to occur in base classes which must always be overridden. If those strings have no user-visible use, why mark them as translatable? A note to developers could indicate that they are provided as examples. The benefit would be fewer strings in the translation store.
  • There are also user-invisible strings in the DB table descriptions (column names and comments). I consider them as "internal technical". Leaving them untranslated would probably result in less confusion when there is a bug report dump (I have never seen one, so forgive me if I'm wrong), since both originator and recipient would talk about the same item.
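
As a sketch of the context idea raised for "male" in the list above: GNU gettext supports a message context (msgctxt in the .po file), and Python's standard gettext module exposes it as pgettext() from Python 3.8. The toy in-memory catalog below (DictTranslations is a hypothetical helper, not Gramps code) only illustrates the lookup; a real project would load compiled .mo files, and how Gramps itself would wire contexts is not shown here.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy catalog keyed by (context, msgid); hypothetical, for illustration."""

    def __init__(self, entries):
        super().__init__()
        self._entries = entries

    def gettext(self, message):
        # Context-free lookup; fall back to the English msgid.
        return self._entries.get((None, message), message)

    def pgettext(self, context, message):
        # Context-aware lookup; fall back to the English msgid.
        return self._entries.get((context, message), message)


fr = DictTranslations({
    ("noun", "male"): "homme",
    ("adjective", "male"): "masculin",
})

noun = fr.pgettext("noun", "male")
adjective = fr.pgettext("adjective", "male")
print(noun, adjective)
```

The same English msgid now maps to two French strings, and the calling code states which sense it needs.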

There are presently 6846 translatable strings in Gramps 5.3.2. This figure is too high for serene maintenance, not to mention translation. Many messages are not well "common-factored": they differ only in 2-3 words. This could be corrected by defining and enforcing a conventional lexicon and using (non-translatable) insertions (like "Python" and "BSDDB" in some error messages).
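
A minimal sketch of this factoring idea, using made-up messages (the template text and its French rendering are illustrative assumptions, not actual Gramps strings): one translatable template with named placeholders covers several near-identical messages, the inserted product names never enter the catalog, and the translator remains free to reorder the placeholders.

```python
# Hypothetical catalog: one translatable template instead of one msgid per case.
CATALOG_FR = {
    "%(module)s requires %(dependency)s":
        "%(dependency)s est requis par %(module)s",
}


def translate(msgid, catalog):
    """Return the translated template, or the English msgid when none exists."""
    return catalog.get(msgid, msgid)


def requirement_message(module, dependency, catalog):
    # The names are inserted after translation, so they stay untranslated.
    template = translate("%(module)s requires %(dependency)s", catalog)
    return template % {"module": module, "dependency": dependency}


english = requirement_message("Gramps", "Python", {})
french = requirement_message("Gramps", "BSDDB", CATALOG_FR)
print(english)
print(french)
```

One limitation of this design is that the template cannot inflect the inserted names, which matters in languages that decline foreign words.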

Reducing the number of messages would allow translators to concentrate on translation accuracy and fidelity.

I have gained quite a deep insight into Gramps features and possibilities (some I wouldn't have thought of) and it would be a shame if this knowledge got lost. However, I'm afraid of the implications. If I start delving into and patching the code, virtually all source files would be impacted. This means the change set in git would be huge, creating a focal point in the branches (because it would also imply some changes of code to cope with "invariant insertions"). There also remains the question of validating my understanding of the features and how I describe them, because I may be wrong.

What is your advice? Recommendations?

Patrick



_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel

Re: Gramps translation

prculley
I understand your concern and applaud your willingness to try to improve things.  Any changes made to the French translation alone seem like a reasonable work item.  I would definitely work with or have Jérôme Rapinat <[hidden email]> review your work, as he has been the most recent translator working on our French translation.  And he is a senior developer.

Regarding changes to the untranslated strings in the Gramps code: as you probably realize, any change will invalidate all the translations done in all languages for the modified string. That would not be much of a problem if we had a full set of active translators... But we don't, so the changed strings might become completely unreadable to non-English speakers for an undetermined time.

As a result, in my opinion, you should only submit changes to the source code where it is clearly wrong, not just difficult to understand. If you could find a way for the gettext utilities to accept "Noun|male" or just "male" as equivalent (with the exact match given preference when both exist in the .mo file), then I could see updating the source code to add the context string.
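
The fallback described here could be approximated at lookup time rather than inside the gettext utilities. The sketch below assumes Python 3.8 or later, where the standard gettext module provides pgettext(); translate_with_fallback and the ToyCatalog class are hypothetical names, not Gramps or gettext APIs. It prefers the context-specific entry and falls back to the context-free one, treating "pgettext() returned the msgid unchanged" as "no translation", a heuristic that misfires when a translation is legitimately identical to the English.

```python
import gettext


class ToyCatalog(gettext.NullTranslations):
    """In-memory stand-in for a loaded .mo file (illustration only)."""

    def __init__(self, entries):
        super().__init__()
        self._entries = entries

    def gettext(self, message):
        return self._entries.get((None, message), message)

    def pgettext(self, context, message):
        return self._entries.get((context, message), message)


def translate_with_fallback(translations, context, message):
    """Prefer the (context, message) entry; otherwise use the plain entry."""
    result = translations.pgettext(context, message)
    if result != message:
        return result
    return translations.gettext(message)


# An older catalog that only knows the context-free string.
old_fr = ToyCatalog({(None, "male"): "masculin"})
# A newer catalog where a translator has added the noun context.
new_fr = ToyCatalog({(None, "male"): "masculin", ("Noun", "male"): "homme"})

print(translate_with_fallback(old_fr, "Noun", "male"))
print(translate_with_fallback(new_fr, "Noun", "male"))
```

With such a lookup, adding a context in the source would not blank out languages whose catalogs have not yet been updated.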

Just my opinions, others may differ.

Paul C.

On Mon, Mar 23, 2020 at 6:08 AM Patrick Gerlier <[hidden email]> wrote:

[full quote of the original message snipped]



Re: Gramps translation

Per Starbäck
Paul Culley wrote:

> Regarding changes to the untranslated strings in the Gramps code; as
> you probably realize, any change will invalidate all the
> translations done in all languages for the modified string.  That
> would not be much of a problem if we had a full set of active
> translators...  But we don't, so that would mean that the changed
> translations might become completely unreadable to non-English
> speakers for an undetermined time.

One way would be to duplicate translation strings when a split is
made. Patrick Gerlier gave the example with "male" and "female" that
may need different translations:

>> However, French grammar
>> (and others as well) does not offer this versatility. Consequently,
>> "male" must be translated as "homme" (noun) or "masculin"
>> (adjective) depending on context. Since the string _("male") is an
>> insertion string, you can't predict where it is used (though it is
>> implicit in the source files) and a single translation does not
>> fit. Maybe this could be solved with context strings like
>> "Noun|male" and "Adj|male".

I mean that if a split is made into these two in the code it shouldn't
lead to holes in the existing translations. Whatever translation
string they have now for _("male") should be copied to *both*
positions (with a comment for translators about a possible need to fix
one of them), and the result will not be worse than it already is,
whereas it will be better in the cases where there *are* translators
on top of it. (Also it's not necessary to go through all code and fix
which of the strings they should use. As long as some parts are
changed it becomes better. For the rest, it has not got any worse.)
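
The duplication step described above might look roughly like this. It is a sketch over a plain dictionary keyed by (context, msgid), not a real .po parser, and split_msgid is a hypothetical helper name; the "Noun"/"Adj" contexts and the French string come from the example in this thread.

```python
def split_msgid(catalog, msgid, contexts):
    """Copy the existing translation of msgid to each (context, msgid) key.

    Returns the keys filled by copying, so they can be flagged with a
    comment asking translators to review them.
    """
    existing = catalog.get((None, msgid))
    copied = []
    if existing is None:
        return copied  # nothing to carry over for this language
    for context in contexts:
        key = (context, msgid)
        if key not in catalog:  # never overwrite a translator's own work
            catalog[key] = existing
            copied.append(key)
    return copied


fr = {(None, "male"): "masculin"}
to_review = split_msgid(fr, "male", ["Noun", "Adj"])
print(fr)
print(to_review)
```

After the split both contexts carry the old translation, so no language loses text, and the returned list tells translators which entries may need a different word.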

Patrick also wrote:
>> There are presently 6846 translatable strings in Gramps 5.3.2. This
>> figure is too high for serene maintenance, not speaking of
>> translation. Many messages are not well "common-factored": they
>> differ only in 2-3 words. This could be corrected in defining and
>> enforcing a conventional lexicon and using (non-translatable)
>> insertions (like "Python" and "BSDDB" in some error messages).

I think you are right that some common factors can be extracted, but I
think you need to be wary when thinking of some strings as
non-translatable. Even foreign words like Python might very well need
to be inflected. To quote the page about the Python language in
Finnish Wikipedia: "Pythonia käyttäen voi tuottaa kuvaajia (kirjasto
matplotlib) sekä vuorovaikuttaa matlabin kanssa."

/not a Gramps developer, just a random user with a view

On Mon, 23 March 2020 at 15:13, Paul Culley <[hidden email]> wrote:

> [full quote of the earlier messages snipped]



Re: Gramps translation

Patrick Gerlier
In reply to this post by prculley

Thanks Paul for your quick feedback.

On 23/03/2020 at 15:12, Paul Culley wrote:
> I understand your concern and applaud your willingness to try to improve things. Any changes made to the French translation alone seem like a reasonable work item. I would definitely work with or have Jérôme Rapinat <[hidden email]> review your work, as he has been the most recent translator working on our French translation. And he is a senior developer.

I am already in contact with Jérôme.

> Regarding changes to the untranslated strings in the Gramps code; as you probably realize, any change will invalidate all the translations done in all languages for the modified string. That would not be much of a problem if we had a full set of active translators... But we don't, so that would mean that the changed translations might become completely unreadable to non-English speakers for an undetermined time.

This is the biggest challenge. I don't want to break everything, though some items need serious rewriting to be really understandable.

A temporary solution would be for me to get involved in the on-line user's manual/wiki to make things clearer. But I'm not sure it has the required structure. I'll think it over when I'm done with my present tasks.

> As a result, in my opinion, you should only submit changes to the source code where it is clearly wrong, not just difficult to understand. If you could find a way for the gettext utilities to accept "Noun|male" or just "male" as equivalent (with the exact match given a preference when both exist in the .mo file), then I could see updating the source code to add the context string.

My problem: this is my first real contact with Python. I understand the gist of the code but I'm not fluent in all the subtleties.

> Just my opinions, others may differ.

Respectable and sensible.

> Paul C.




Re: Gramps translation

John Ralls-2
In reply to this post by prculley


> On Mar 23, 2020, at 7:12 AM, Paul Culley <[hidden email]> wrote:
>
> Regarding changes to the untranslated strings in the Gramps code; as you probably realize, any change will invalidate all the translations done in all languages for the modified string.  That would not be much of a problem if we had a full set of active translators...  But we don't, so that would mean that the changed translations might become completely unreadable to non-English speakers for an undetermined time.
>

It's not that bad: as long as the new translatable string (msgid for gettext) is close to its predecessor and in the same place, gettext will mark it as fuzzy when it merges the new gramps.pot with the po. Simply removing the fuzzy tag will restore the translation. In more extreme cases, a git diff on the po file after merging a new gramps.pot will reveal msgids that changed too much for gettext to recognize, and even someone with no knowledge of the target language can apply the old msgstr (translation) to the new msgid. Similarly, where one adds context hints to a set of msgids, one can copy the old msgstr to all of the newly contextualized msgids. It might be wrong but it's no more wrong than it was before.
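
To make the "remove the fuzzy tag" step concrete, here is a small sketch that strips the fuzzy flag from .po text. The clear_fuzzy function, the sample entries and the source path in the sample are hypothetical; the sketch handles only "#," flag lines and leaves everything else untouched, including any "#|" previous-msgid comments. It should only be run on entries that have been reviewed, since fuzzy entries are by default left out of the compiled catalog precisely because they may be wrong.

```python
def clear_fuzzy(po_text):
    """Drop the 'fuzzy' flag from every '#,' flag line in .po text."""
    out = []
    for line in po_text.splitlines():
        if line.startswith("#,"):
            flags = [flag.strip() for flag in line[2:].split(",")]
            flags = [flag for flag in flags if flag and flag != "fuzzy"]
            if not flags:
                continue  # the line carried only the fuzzy flag
            line = "#, " + ", ".join(flags)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical entries, shaped like what a merge might leave behind.
sample = '''#: some/source/file.py:10
#, fuzzy, python-format
msgid "Selected %s"
msgstr "%s choisi"

#, fuzzy
msgid "male"
msgstr "masculin"
'''

cleaned = clear_fuzzy(sample)
print(cleaned)
```

GNU gettext also ships a msgattrib tool with a --clear-fuzzy option for the same purpose, which is likely the safer choice on real files.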

One wouldn't want to do that for a msgid that was completely wrong, for the obvious reason that its translation is also very likely to be wrong. In that case one must make a judgement call: is it better for the end user to have a misleading tooltip in their own language or a correct one in English? I can't think of any case where I'd prefer the misleading tooltip.

To keep the job from being overwhelming for both the contributor and the reviewers, do one source file per commit and no more than one directory per PR.

Regards,
John Ralls


