Re: Gramps and Ancestry hints - a proof of concept
Some bug reports and feature requests have been filed about Ancestry support.
The reporters wrote complete and clear reports, but acting on them could lead to additional
issues (#1727 #8191). For now, we may find a hack (#6941) or a quick warning (#9298),
and we might generate some new sections via the Gedcom extensions addon.
I do not use Ancestry myself and do not want to break current gedcom file format support.
So I do not know what should be done, because it sounds like a game for Ancestry
(#9298~46910)! Also, the gedcom file format is not my favourite playground.
If any expert on the "Ancestry-Gramps" round-trip, a gedcom wizard, or anyone else
could test or review the patch on #9249 too, that would be great.
On Tue 15.3.16, Tom Samstag <[hidden email]> wrote:
Subject: [Gramps-devel] Gramps and Ancestry hints - a proof of concept
To: [hidden email] Date: Tuesday, 15 March 2016, 21:52
I want to utilize Ancestry hints in my research. My research is relatively new and incomplete, and
there are likely many records I haven't yet cited. I use Gramps as my primary tree management
program. In using Ancestry hints, I'd like to be guided toward records I haven't yet used, but I
have no interest in managing a tree on Ancestry. Any new records that Ancestry helps me to find, I
will attach to my tree in Gramps in the same way I've been. So any data transfer between Gramps and
Ancestry need only be one-way; I don't need to round-trip any data out of Ancestry. I'd just like
Ancestry to be able to analyze my data and guide me, not become my platform.
So I can export a gedcom from Gramps and upload it to Ancestry. Doing so will give me hints, but
most of those hints will be noise. That's because Ancestry has no way of knowing that the citation I
have to a source titled "United States Census, 1920" is the same as a specific record in their
database. So most of the hints that Ancestry will give will be duplicates of what I already have.
Ancestry can get that information through a gedcom upload though. It uses a proprietary tag _APID
that references the database id and the record id. So if we could somehow enter that data into
Gramps and get it to export it into gedcom, we'd be good.
So one problem is where in the Gramps object hierarchy to store that info. The way that the
information is traditionally organized (e.g. through the census/forms addons) is n people in the
same event, that event having one citation, to the specific source. The APID value is distinct for
each person in the event, but each person can be in multiple
The proof of concept:
So I've created a proof of concept to tackle this problem. It consists of some changes to the gedcom
exporter, and an optional gramplet to make data entry
Each Gramps source should map to a given Ancestry database ID. So each source has an attribute with
Then, each person has attributes that reference their record number for a given database ID. For
example (attribute names are likely to change):
Source: US Census, 1920
attribute: Ancestry DBID = 6061
Citation: Pennsylvania, Allegheny County, Pittsburgh City, Pittsburgh, Ward 18, sheet 10A, family 227, Henry Watzlaf household
Event: Census event
Person: Henry George Watzlaf
attribute: Ancestry APID H:6061 =
Henry George Watzlaf will have other similar attributes for his records in other databases, and the
other people that appear in the 1920 census will have an attribute of "Ancestry APID H:6061" with
This will result in the CENS census gedcom record having a SOUR source record which contains a line
of _APID 1,6061::49733277. When this gedcom is used to create a tree on Ancestry, it will be a
reference to the Ancestry record at
http://search.ancestry.com/cgi-bin/sse.dll?indiv=1&dbid=6061&h=49733277 and the hint will no longer
be given since Ancestry will understand that I've already cited that record.
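
As a rough illustration of the mapping described above, the sketch below assembles an _APID value from a source's database ID and a person's record ID. It is not the actual exporter patch: the function names are hypothetical, the leading "1" is simply copied from the example line above without knowing what it means, and the gedcom level number is an assumption passed in by the caller.

```python
# Hypothetical helpers, not part of Gramps or of the proof-of-concept patch.

def apid_attribute_name(dbid):
    """Name of the person attribute holding the record ID for one database.

    Follows the "Ancestry APID H:<dbid>" pattern from the example above;
    the message notes that attribute names are likely to change.
    """
    return f"Ancestry APID H:{dbid}"


def build_apid(dbid, record_id):
    """Combine database ID and record ID into an _APID value.

    The "1," prefix is copied from the example "_APID 1,6061::49733277";
    its meaning is not explained in the message.
    """
    return f"1,{dbid}::{record_id}"


def gedcom_apid_line(level, dbid, record_id):
    """Format one gedcom line; the level is an assumption chosen by the caller."""
    return f"{level} _APID {build_apid(dbid, record_id)}"


# Plain dicts stand in for Gramps source and person attributes here.
source_attributes = {"Ancestry DBID": "6061"}
person_attributes = {apid_attribute_name("6061"): "49733277"}

dbid = source_attributes["Ancestry DBID"]
record_id = person_attributes[apid_attribute_name(dbid)]
print(gedcom_apid_line(3, dbid, record_id))
```

Keeping the database ID on the source and the record ID on the person means the exporter only needs to look up the attribute named after the source's database ID when it writes the citation for a given person.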
My gramplet will, for the active citation (if its source has the attribute), enumerate events cited
and list the people. It sorts them according to the order attribute used by the census/forms addon.
Each person can have a record ID entered. For convenience, you can also paste in a URL to the
record, and it pulls out the record id (from the "h"
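
The URL convenience described above could look roughly like the following sketch. It is not the gramplet's code: the function name is hypothetical, and it assumes the pasted URL carries the record ID in an "h" query parameter, as the example URL earlier in this message does.

```python
from urllib.parse import urlparse, parse_qs


def record_id_from_input(text):
    """Return a record ID from either a bare ID or a pasted record URL.

    Hypothetical helper: a string of digits is returned unchanged; otherwise
    the input is parsed as a URL and the first "h" query parameter is used.
    Returns None if no record ID can be found.
    """
    text = text.strip()
    if text.isdigit():
        return text
    query = parse_qs(urlparse(text).query)
    values = query.get("h")
    return values[0] if values else None


url = "http://search.ancestry.com/cgi-bin/sse.dll?indiv=1&dbid=6061&h=49733277"
print(record_id_from_input(url))
```

Accepting both forms in one entry field keeps the data entry step to a single paste per person.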
So I've been going through the process of generating a gedcom, uploading it to Ancestry, going
through the hints to attribute my citations, then deleting the Ancestry tree and repeating. In doing
so, using only hints (not actively finding records on Ancestry) it's succeeded at suggesting the
majority of records in my test set of sources. After these attributes, remaining hints are
either false positives or genuinely records that I didn't yet have cited in my tree.
One issue I've realized exists is if the same person appears in multiple records within the same
database. For instance, if the same person is in a marriage license as a groom and in another as the
father of the bride. I haven't tested this yet; I think the back-end will work, but there isn't
enough information for the UI to be well behaved.
So if you've read this far, thanks! I'd like any feedback you may have to offer. I'll package up the
code later tonight, but as another warning, it's still very much first-attempt quality.
Test set of sources: US census sources, PA birth and death certificates,
United States Social Security Death Index