Ian Jackson
2018-09-12 20:48:56 UTC
Hi. I hope this is a suitable list for my question. If not, please
direct me elsewhere...
I am doing the i18n for a package (src:dgit) which I think will be
useful to translate (at least, much of it). It's a Debian native
package containing mostly perl scripts.
I'm not sure of the best approach. My main questions:
1. There doesn't seem to be any standard set of Makefile machinery to
include, or anything. Do I really have to write my own make rules to
run xgettext etc. ? I looked at the debconf source package, which
seemed like it would be a good example, and it had its own rules. I
can write my own rules if that is best; they're not huge. It just
seemed a bit wheel-reinventish.
(NB that I don't want to introduce use of automake into what is
currently a simple "upstream" Makefile; if it comes to that I would
prefer just to write my own rules for this.)
1(a). While looking at the debconf source package i18n make rules I
saw it used a program `remove-potcdate' to avoid needless updates to
the pot file. Should I really copy this (trivial) script, from
src:debconf into src:dgit, along with the substance of the make
rules which invoke it ?
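
To make the idea concrete, here is a minimal sketch of what I understand
the remove-potcdate step to achieve. It is not the debconf script itself;
it is written in Python purely for illustration, and the function names
are hypothetical. The assumption is that the only line xgettext changes
on an otherwise no-op run is the POT-Creation-Date header, so two .pot
files are compared with that line stripped out.

```python
import re

# Matches the header line that is assumed to change on every xgettext run,
# e.g.  "POT-Creation-Date: 2018-09-12 20:48+0000\n"
_POTCDATE = re.compile(r'^"POT-Creation-Date:.*"\n', re.MULTILINE)


def strip_potcdate(pot_text: str) -> str:
    """Return the .pot contents with the POT-Creation-Date line removed."""
    return _POTCDATE.sub("", pot_text)


def pot_needs_update(old_text: str, new_text: str) -> bool:
    """True if the two .pot files differ in anything besides the date."""
    return strip_potcdate(old_text) != strip_potcdate(new_text)
```

A make rule could then generate the new .pot into a temporary file and
replace the existing one only when pot_needs_update() is true.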
2. I am unsure of the best layout of the .pot, .po, po4a, etc., files.
The convention I saw in src:debconf was to have a directory `po'
containing a single `debconf.pot', all the message translations
LANG.po, and the corresponding Makefile and script machinery. I
dislike the idea of mixing up files edited by translators with make
machinery, but I can tolerate it if it's conventional and would
disturb people if I did it differently.
In src:debconf I also saw po4a in use. The translations were all in
doc/man/po4a/po/LANG.po, and there was also
doc/man/po4a/add_LANG/addendum.man.LANG. This all seemed a bit ad
hoc.
Is there a standard layout ? See also my next question, which may
influence the answer to this one.
Relatedly, how do automatic translation coverage tools (we have those
I think?) deal with the variety of different possible layouts ?
3. I am not sure how to divide up my translation inputs (pot files).
My single source package generates two binary packages. The two
binary packages are rather different; they perform different roles
(although they work well together) and have different (but
overlapping) audiences.
This might reasonably influence the way the messages from the two
packages (really, the two programs) are translated. So maybe I should
have two .pot files for the two sets of messages.
But the programs share a small set of library code. The library code
does not have many messages, but there are some. These messages
should be translated only once. So if I split it up, there would have
to be *three* .pot files for messages: dgit, git-debrebase and common.
(I think I can use tools like xgettext and msgcat, with appropriate
make runes, to handle any arbitrary organisation of .pot files that I
decide on.)
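
As a rough sketch of the three-way split, assuming a hypothetical
mapping from source files to catalogues (the file names, the `po'
output directory and the function names below are illustrative only),
the xgettext invocations could be generated like this. The check that
no source file feeds two catalogues is what keeps each message
translated only once.

```python
# Hypothetical layout: which source files feed which .pot file.
CATALOGS = {
    "dgit": ["dgit"],
    "git-debrebase": ["git-debrebase"],
    "common": ["Debian/Dgit.pm"],
}


def check_disjoint(catalogs):
    """Raise ValueError if any source file is listed in two catalogues."""
    seen = {}
    for name, sources in catalogs.items():
        for src in sources:
            if src in seen:
                raise ValueError(f"{src} is in both {seen[src]} and {name}")
            seen[src] = name


def xgettext_command(catalog, sources, po_dir="po"):
    """Build the argv for one xgettext run; this does not execute it."""
    return ["xgettext", "--language=Perl", "--from-code=UTF-8",
            f"--output={po_dir}/{catalog}.pot", *sources]


def all_commands(catalogs=CATALOGS):
    """Return one xgettext argv per catalogue, in a stable order."""
    check_disjoint(catalogs)
    return [xgettext_command(name, srcs)
            for name, srcs in sorted(catalogs.items())]
```

Each returned list could be passed to subprocess.run(), or the same
three invocations could simply be written out as make rules.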
The need for splitting up is perhaps more acute for the documentation.
I will use po4a for that. (po4a has a powerful system for handling
almost arbitrarily strange layouts.)
The git-debrebase package has its own data model and conceptual model,
and its documentation is carefully written to talk about that in the
right terms. Additionally, perhaps it is useful for a translator to
know whether a string they are translating is part of a reference
manual or a tutorial.
But src:debconf does not split like this so maybe it is not useful ?
Or maybe it is even harmful because it might involve duplicating
certain "framework" parts or something ?
4. Terminology in translations.
As I say, one of the two packages has a specific conceptual model.
That has its own terminology, which is defined in a section 5 manpage.
It is important that if and when this is translated, thought is given
to what translated names to give for each of the English terms; and
that this settled terminology is then used consistently throughout all
of the documentation.
Also, the terminology appears, in some cases, as protocol elements
(which are in text and may be displayed to the user). These obviously
cannot be translated or things will break. So I think, ideally, when
the terms are defined in the section 5 manpage, the English words
should be stated alongside the translated ones.
Can I (should I) leave a note to translators about these issues ?
The relevant documents are in perl pod format.
5. Translation priority
Obviously translators are volunteers and will work on what they feel
is most important. But I think some parts are much more important to
translate than others:
These tools, particularly dgit, are useful within Debian but also,
IMO, extremely useful outside it. Different people will use it in
different ways.
This is reflected in the documentation. Some of the documentation is
aimed at users and downstreams; whereas some is aimed primarily at
Debian maintainers for whom it is less important to have translations
since much of the rest of their work has to be done in English.
Is there a sensible way to inform translators about this kind of
thing, so that they can spend their time wisely ? I think maybe I
would like to tag some documents as high, medium, or low priority, or
something.
6. Committing the .pot file
AFAICT it is conventional for the .pot file(s), generated
automatically from the source code with xgettext, to be included in
source packages, git repos, etc.
That seems odd. What is the reason for this ? Can I sensibly diverge
from this and expect translators etc. to run a build rune to get the
.pot files ?
I was surprised not to find answers to my questions in the
documentation for gettext, etc. Am I missing some best practice
guide ?
All advice and opinions gratefully appreciated.
Thanks,
Ian.
--
Ian Jackson <***@chiark.greenend.org.uk> These opinions are my own.
If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.