|
About the database, I still have one issue. Perhaps it is best to decide how this issue is resolved before too much coding is done.

I started thinking about the person-source link and the fact that apparently many people link the same source multiple times to a person, because the person appears on different pages in the source and they want to keep the relevant notes separate.

I do not know if this was the intention of the designers, but the design of gramps allows multiple instances of the same source, so people use it. I see one can also attach the same photo multiple times to a person. It is my opinion that for every possible relation the programmer should strictly decide what is meant, and forbid duplicates where they are silly, like the same picture multiple times. It reduces complexity later on should the picture be deleted. So I think it should be decided what may be duplicated and what not, and what we mean by duplicates (important if things change in the future, or if reports need to know how to handle the duplicates).

One can argue that duplicate sources are not the best way. Just like repositories are created to bundle sources, it would be nice if on the source tab of a person you see a source only once, and can then have multiple references/pages within that source. It would make later retrieval for editing easier: if you have the same source 10 times and you find out who the other people on a photo in this source are, which one of the 10 sourcerefs is the page the photo is on? You start opening one after the other.

Personally I have another practical problem along the same lines. One of my sources is "Grandma's collection of death letters" (I don't think you call them death letters in English, but the Hollywood movies I saw never use the word, so I don't know it). The collection is partly scanned in (especially the old and the damaged ones). Some information about a person comes from their own death letter, but for others it comes from their wife's or child's letter. I think you see where I'm heading.
I have a person about whom I know information from the spouse's death letter. I go to source and add "Grandma's collection of death letters", and in the text I put the name of the death letter used. This does not allow easy retrieval of how I got the information, as the collection is large and has many media objects. It would be nice if I could just add the scanned picture of the wife's letter as a media object to the specific source reference of this person. I suppose similar questions might arise with the new repository object.

If this is something that might be added to gramps in the future, the database changes done now should make it easy to add this feature later, and not hinder it.

So, general question: should this be designed in, or is this too far-fetched? Does anyone see a reason one would actually want two separate person-source links?

Now the technical part, on what implementing/allowing this means for the proposed design.

From what Richard says, he goes with a design that has a unique key for a sourceref in person, and I think it's good. At present gramps allows the same source to be present multiple times, so this design does mean some extra list checking is needed: if a sourceref is deleted, the reference map entry may only be deleted if all sourcerefs to the same source are deleted.

The design suggested is:

> OK, at this level I think better in code so I have implemented what I
> think is a reasonable solution. Here goes an explanation:
>
> A single reference_map table with three keys:
>
> main key: is a tuple of
> (primary_object_handle,referenced_object_handle) and is guaranteed to
> be unique.
> secondary_key1: the primary_object_handle allows duplicates and uses BTree
> secondary_key2: the referenced_object_handle allows duplicates and
> uses BTree
>
> data: a tuple of the form: ((primary_object_class_name,
> primary_object_handle),
> (referenced_object_class_name,
> referenced_object_handle))
>
> The main key can be used to quickly check for the existence of a
> particular primary_object/referenced_object pair.
> The secondary_key1 can be used to lookup for deletion.
> The secondary_key2 can be used to lookup for search.

So the table is stored by a hash, which I also think is best. The design however does not allow a sourceref to have mediarefs of its own (well, it does, you can do anything in BSDDB remember, but the way to do it is unnatural, I mean). The following modification would allow this. I will explain it with an example.

We have source1, media1, media2, person1. source1 has the 2 media connected to it, so there are 2 mediarefs: source1media1 and source1media2. Person1 has source1 connected, so there is a sourceref person1source1. We want to allow in the future that the sourceref can have a media object coupled to it, e.g. media2 is the page of the source where person1 is mentioned, so we need a mediareference in the sourcereference: person1source1_media2.

The new table referencemap is created to do access backwards. So it contains:

key1, (key person1, key source1)
key2, (key source1, key media1)
key3, (key source1, key media2)

A search on where source1 is used with this table (the backward search) will immediately give you person1 as result. How do we make it easy to later also allow media objects connected to sourcerefs? We need to add the following to the table:

key4, (key1 = key of sourceref person1 to source1, key media2)

A search on where media2 is used with this table will immediately give you source1 and key1 as result.
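To make the example concrete, here is a minimal in-memory sketch of such a reference map. It is only an illustration under assumptions: plain Python dicts stand in for the BSDDB table and its two secondary indices, and the class name `ReferenceMap`, its methods, and the generated `keyN` strings are made up for this sketch and are not gramps code.

```python
from collections import defaultdict
from itertools import count


class ReferenceMap:
    """Hypothetical in-memory stand-in for the reference_map table."""

    def __init__(self):
        self._next_key = count(1)
        self.rows = {}                          # key -> (primary, referenced)
        self.by_primary = defaultdict(set)      # secondary index 1: lookup for deletion
        self.by_referenced = defaultdict(set)   # secondary index 2: lookup for search

    def add(self, primary, referenced):
        """Store one reference and return its own unique key."""
        key = "key%d" % next(self._next_key)
        self.rows[key] = (primary, referenced)
        self.by_primary[primary].add(key)
        self.by_referenced[referenced].add(key)
        return key

    def referrers(self, referenced):
        """Backward search: everything that points at `referenced`."""
        return sorted(self.rows[k][0] for k in self.by_referenced[referenced])

    def remove_primary(self, primary):
        """Drop every row owned by `primary`, e.g. when that object is deleted."""
        for key in self.by_primary.pop(primary, set()):
            _, referenced = self.rows.pop(key)
            self.by_referenced[referenced].discard(key)


refmap = ReferenceMap()
key1 = refmap.add("person1", "source1")   # sourceref person1 -> source1
key2 = refmap.add("source1", "media1")    # mediaref source1 -> media1
key3 = refmap.add("source1", "media2")    # mediaref source1 -> media2
key4 = refmap.add(key1, "media2")         # mediaref owned by the sourceref itself

print(refmap.referrers("source1"))  # ['person1']
print(refmap.referrers("media2"))   # ['key1', 'source1']
```

Because every row has its own key, a row can itself act as the primary object of another row, which is what key4 relies on.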
What does the above example imply? I would use as key1 not a tuple of (primary_object_handle,referenced_object_handle) but really a unique key, just like the other gramps keys. For ease this key can be kept in the sourceref. I would do this, but note that it is not needed: by combining the two secondary indices (possible in BSDDB) one can quickly find this key given only the person key and the source key.

Many things to think over, and the above can be implemented in many ways. Should the sourceref data be moved to this reference map, it would simplify the above construction, but I said that before, no ;-)

_______________________________________________
Gramps-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-devel |
|
On Fri, 2005-12-16 at 10:31 +0100, [hidden email] wrote:
> I started thinking about the person-source link and the fact that apparently
> many people link the same source multiple times to a person because the person
> appears on different pages in the source and they want to have these relevant
> notes separated.
>
> I do not know if this was the intention of the designers, but the design of
> gramps allows multiple instances of the same source so people use it. I see one
> can also attach the same photo multiple times to a person. It is my opinion
> that for every relation possible the programmer should strictly decide what is
> meant, and forbid duplicates if this is silly, like the same pictures multiple
> times. It reduces complexity later on should the picture be deleted. So I think
> it should be decided what may be duplicated and what not, and what do we mean
> with duplicates (important if things change in the future or if reports are
> made to know how to handle the duplicates)

We have a common source of information; for this example, it is a book called "All about John Smith". We get a lot of information about "John Smith" from this book. For example, it records the person's full name on page 100. It also indicates on page 300 that this same "John Smith" sometimes spelled his name as "Jon Smith".

In this case, the person's primary name would be "John Smith", and we would attach to this name a source reference to this source (the book). In this source reference, we would indicate that we found the information on page 100. We would also add the name "Jon Smith", create a link to the same source (after all, it is the same book), and indicate on this source reference that we found the information on page 300.

While the sourcerefs point to the same source, they are not identical.

So, in a nutshell, multiple source references in an object referring to the same source is something that we need to support, and is something that many users are using right now.
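As a rough illustration of the book example, the sketch below models two source references that share a source but differ in page. The `SourceRef` and `Name` dataclasses and the handle `S0001` are hypothetical names chosen for this sketch; they are not the actual gramps classes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceRef:
    """Hypothetical source reference: which source, and where in it."""
    source_handle: str
    page: str


@dataclass
class Name:
    """Hypothetical name record carrying its own source references."""
    text: str
    source_refs: list = field(default_factory=list)


book = "S0001"  # made-up handle for the book "All about John Smith"

primary = Name("John Smith", [SourceRef(book, "100")])
alternate = Name("Jon Smith", [SourceRef(book, "300")])

ref_a = primary.source_refs[0]
ref_b = alternate.source_refs[0]

print(ref_a.source_handle == ref_b.source_handle)  # True
print(ref_a == ref_b)                              # False
```

Both references lead back to the same book, yet each carries its own page, so they cannot be collapsed into a single link without losing information.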
> If this is something that might be added in the future to gramps, the database
> changes done now should make it easy to add this feature later, and not hinder
> it.

As someone who has worked in industry for over 20 years now, I have found that while this may sound good in theory, many times it is not a good idea in practice. In projects that I have worked on (both software and hardware), the subject comes up that "we may need to do this in the future". In almost every case where we have "planned" features for the future, we ended up with a large, complex, and bloated infrastructure that everyone had to deal with. And 9 times out of 10, the thing that we may have wanted to do in the future never came up, and when it did, we found that the implementation wasn't what we originally thought it might need to be. In fact, in many cases these "prepare for the future" enhancements prevented us from doing what we really needed to do in the future.

--
Don Allingham <[hidden email]> |
