|
|
Excuse me for butting in here as a mere occasional (wintertime)
ancestry buff. I know nothing of the db s/w being used on the various
platforms, but perhaps it is time to review those of the lite (or
light) variety, which may not support the facilities needed by the
larger dbs that the growing number of users are beginning to create.
As a retired IT specialist who worked alongside real db experts, I
have dredged from the back of my memory some principles of db design
for tables that are growing rapidly.
My suggestion is to look at techniques such as table indexes and
two-way (forward and backward) pointers for the individual larger
tables which are liable to see large volumes of changes.
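(As a minimal illustration of what a table index buys, here is a sketch
using Python's sqlite3 on an invented table; the table and column names
are made up and have nothing to do with Gramps' actual schema:)

    # Illustration only: an invented table, not Gramps' schema.
    import sqlite3
    import time

    con = sqlite3.connect(":memory:")
    cur = con.cursor()
    cur.execute("CREATE TABLE person (handle TEXT, surname TEXT, given TEXT)")
    cur.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        (("H%06d" % i, "Surname%d" % (i % 5000), "Given%d" % i)
         for i in range(200000)),
    )
    con.commit()

    def timed_lookup(label):
        start = time.perf_counter()
        cur.execute("SELECT given FROM person WHERE surname = ?", ("Surname4321",))
        rows = cur.fetchall()
        print("%s: %d rows in %.4f s" % (label, len(rows), time.perf_counter() - start))

    timed_lookup("without index")  # full table scan
    cur.execute("CREATE INDEX idx_person_surname ON person (surname)")
    timed_lookup("with index")     # indexed lookup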
I am sure there are plenty of texts on db tuning online, and if you
look at (for example) Oracle, which, to my memory, offered lots of such
techniques as options (I am NOT recommending Oracle at all for Gramps,
btw), it might help you select better db s/w to take Gramps forward.
Keep up the good work.
------------------------------------------------------------------------------
_______________________________________________
Gramps-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gramps-users
|
|
On Thu, 13 Aug 2015 12:13:05 +0200
mtoby < [hidden email]> wrote:
Hello mtoby,
>I know nothing of the db s/w being used on the various platforms,
>but perhaps it is time to review those of the lite (or light)
>variety, which may not support the facilities needed by the larger
>dbs that the growing number of users are beginning to create.
Such work is already being done. There's also some discussion of the
matter on the developers' list.
Although, to be fair, Tim's problems may not be due solely to poor
database scaling, but also to Gramps' need to update the screen
constantly. Work is being done on that too, and improvements are being
made, I believe.
--
Regards _
/ ) "The blindingly obvious is
/ _)rad never immediately apparent"
You don't entertain ideas you simply bore them
I Don't Like You - Stiff Little Fingers
|
|
On Thu, 13 Aug 2015 12:13:05 +0200
mtoby < [hidden email]> wrote:
Hello mtoby,
>Keep up the good work
Hear, hear.
I should add that I'm not on the dev team, but do read messages on the
developer mailing list.
--
Regards _
/ ) "The blindingly obvious is
/ _)rad never immediately apparent"
He signed up for just three years, it seemed a small amount
Tin Soldiers - Stiff Little Fingers
|
|
Hi Josip,
> Now, when playing with Doug's places test data, I also notice terrible
> lag in scrolling. It's quite unacceptable for a table with only 65K rows.
> The Firefox SQLiteManager plugin, for example, did not have such problems
> with the same database; both filling and scrolling the view are fast.
>
> I agree that we are doing something wrong; it must be that we overuse one
> of the callbacks. Or, in the gtk-2 to gtk-3 transition, one of our private
> functions now overrides gtk's own (the ones with the "do_" prefix).
I think it's overuse, i.e. fellow developers didn't expect callbacks
to be triggered for events like a row becoming visible, which happens a
lot during scrolling, or on mouse hover. You don't even have to click
for that.
In the tree view, one of these callbacks tries to find a person by
walking the tree, starting at the first (or empty) surname and scanning
all persons grouped under each surname until the person is found. The
time needed for this search grows linearly with the person's position
in the sorted tree, so the further you scroll towards Z, the longer it
takes. This happens for every person that appears on screen during
scrolling, also when their surname is collapsed I think, and each
callback reads data from the database.
We cannot override this callback, nor a few others, because the person
needs to be found to refresh the view on delete, or when a surname is
changed and the person has to be shown in another part of the tree. And
I bet the same is true for places shown in a hierarchy, like you wrote.
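(A rough sketch, with invented data shapes rather than the real Gramps
model code, of why that walk gets slower towards Z and how a
handle-to-path index would avoid it:)

    # Invented data shapes, not the real Gramps tree model.
    def find_path_by_walking(groups, wanted_handle):
        """Scan every surname group, and every person under it, until the
        wanted handle turns up; the cost grows with the person's position
        in the sorted tree, and in the real view each visit can read the db."""
        for group_row, (surname, handles) in enumerate(groups):
            for person_row, handle in enumerate(handles):
                if handle == wanted_handle:
                    return (group_row, person_row)
        return None

    def find_path_by_index(handle_to_path, wanted_handle):
        """The alternative: keep a handle -> path mapping next to the model,
        so no walk (and no db reads) is needed for the lookup."""
        return handle_to_path.get(wanted_handle)

    # tiny usage example
    groups = [("Adams", ["I0001", "I0002"]), ("Borg", ["I0003"])]
    print(find_path_by_walking(groups, "I0003"))            # (1, 0)
    print(find_path_by_index({"I0003": (1, 0)}, "I0003"))   # (1, 0)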
A faster database like SQLite will not prevent this, but since these
callbacks read from the database, scrolling will of course get somewhat
faster with one. We should not treat the symptom, though, but change the
code so that it makes better use of gtk-3.
Nick made a prototype which scrolls very well, but its view takes longer
to load because all data is loaded into memory, so that we need fewer
callbacks to read the database and the search above can be avoided.
Loading the view takes a bit longer then, and I don't know how well it
works when you have a million persons in your tree. A faster database
will very probably help in that case, though.
> Again, for example:
> https://git.gnome.org/browse/pygobject/plain/demos/gtk-demo/demos/TreeView/treemodel_large.py
> changing the number of rows (item_count) from a hundred thousand to one
> million affects only the loading time, not the scrolling.
Right, and I even tried it with 10 million, and saw no delay in
scrolling. It took a few minutes to initialize the data, and it takes
more than 500 MB of RAM, but it scrolls fast. The only thing is that
this doesn't really look like a tree view where you can expand rows,
like we can expand surnames or places.
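(For anyone who wants to reproduce that locally, here is a much smaller
sketch along the same lines as the linked demo, assuming GTK 3 and
PyGObject are installed; the row count and labels are arbitrary:)

    import gi
    gi.require_version("Gtk", "3.0")
    from gi.repository import Gtk

    # Build one flat ListStore up front, as the linked demo does;
    # raising the row count grows the loading time, not the scrolling.
    store = Gtk.ListStore(int, str)
    for i in range(100000):
        store.append([i, "row %d" % i])

    view = Gtk.TreeView(model=store)
    for col, title in enumerate(("#", "Name")):
        view.append_column(Gtk.TreeViewColumn(title, Gtk.CellRendererText(), text=col))

    scrolled = Gtk.ScrolledWindow()
    scrolled.add(view)

    win = Gtk.Window(title="flat model scrolling test")
    win.set_default_size(400, 600)
    win.add(scrolled)
    win.connect("destroy", Gtk.main_quit)
    win.show_all()
    Gtk.main()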
regards,
Enno
|
|
On 08/13/2015 07:10 AM, Doug Blank wrote:
[snip]
Will the records still be in Pickle format, or individually named
fields?
Gramps still freezes on me too often to even think of using this!!
:(
[snip]
What are XML-1.6 and JSON?
Will JSON replace XML as the backup format?
--
"Salads are only for murderers, cole slaw's a fascist regime!"
|
|
On 08/13/2015 10:28 AM, Doug Blank wrote:
[snip]
There must be something else different for the import to be 11x
faster, even in v4.2!!
--
"Salads are only for murderers, cole slaw's a fascist regime!"
|
|
On 13. 08. 2015 at 13:52, Enno Borgsteede wrote:
>> Again, for example:
>> https://git.gnome.org/browse/pygobject/plain/demos/gtk-demo/demos/TreeView/treemodel_large.py
>> changing the number of rows (item_count) from a hundred thousand to one
>> million affects only the loading time, not the scrolling.
> Right, and I even tried it with 10 million, and saw no delay in
> scrolling. It took a few minutes to initialize the data, and it takes
> more than 500 MB of RAM, but it scrolls fast. The only thing is that
> this doesn't really look like a tree view where you can expand rows,
> like we can expand surnames or places.
All I want to say is that scrolling a view is fast unless it is slowed
down by unnecessary or badly written callbacks.
The speed of displaying a view can be handled in two ways: only display
a small portion of the data, or dynamically update the view from outside
the main thread. Best is to use both together, and treeview is good for
that.
Check the code at:
https://gist.github.com/bpisoj/ef89112fa0b5083bed37
It is a simple file browser. It quickly loads the first directory level
from the home dir and is ready to use (scroll); in the meantime the
other directory levels are added in the background without blocking the
gui.
To see it work in slow motion (the gui stays normally usable), uncomment
the "time.sleep" lines in the "_thread_scanning" method.
p.s.
It is just a quick draft; it should be recursive, but I think it is good
enough to show how the new gazetteer can load and show more info.
--
Josip
|
|
On 13/08/2015 14:10, Doug Blank wrote:
This raises key questions for the developers, in that you will have
to issue guidance to users who
a. know nothing about databases or plugins;
b. have no idea how many entries they may end up with; for
example, if they accept and follow up every hint that Ancestry.com
supplies, they may end up with many distant ancestors and potential
living and very distant relatives (in my case, one of my ancestors
seems to have had a (possibly adopted) brother who was on HMS Bounty,
and who has many living descendants from Pitcairn Island still living
in that area; Marlon Brando, eat your heart out);
c. will need tools to help them measure the size of their
evolving database(s), plus easy-to-use tools to move from one backend
to another;
d. will want to know whether they can tune their Gramps setup,
through simple options explained in easy-to-understand layman's terms,
to use in-memory tables if Gramps recommends it, or by setting certain
preferences in Gramps (please bear in mind that they have no idea of
the cost to their computers, which can be on just about any platform
and any operating system);
e. if they follow guidance and are told to select a
sophisticated database because of their end-user needs, will need a
good set of user-friendly support tools, which may actually be Gramps
front ends to the complex underlying tools that do the real work.
Doug, I have no doubt at all that the above reply is clearly
intended to be helpful, in reply to my mail, but I have used
computers continuously since 1964, as a developer, in pre-sales and
post-sales support, as a product introduction manager, technical
manual reviewer, etcetera, and I understand nothing of what has been
said in terms of how to choose which backend to use. So how will a
typical Gramps user, with practically no IT knowledge, be able to
decide what to use?!
Please, everybody, my reply here is meant to be helpful to the
development team, to improve the product and make it user-friendly
to all types of user, from the person who records 100 people in one
locality up to those with a million data items or more, who share
them with distant relatives far, far away.
As I said above, it should be possible to put a Gramps front end on
the SQL tools, for all of the others who are not regular programmers
to use.
Perhaps this example may help to show where I am coming from.
Twenty-odd years ago, the IT company that I worked for was developing
a one-size-fits-all DMS database to manage corporate networks for
both voice and data; it would contain information on all nodes, of
all types, and also information on up time, down time, connections,
locations and so on.
This was in the era when relational databases were just being
developed, and the developers were trying to adapt a system suited to
one particular huge corporate entity to work for a totally different
audience as well. That was all very well for organisations that had
many mainframes, dedicated DM developers, DM sizing teams, and data
support controllers who tracked the day-to-day usage of the database
and knew when to take the DB off-line to retune and reorganise it,
but all of this was beyond the capability of companies with a single
mainframe and no DM experience at all. As such, I was in the product
introduction team which rejected its general release, because it was
not fit for the wide market it was targeted at (and within two years
a sophisticated relational database, acceptable to this market, took
its place).
So the fact that data can be populated faster or slower in a
database means nothing much to an end user, since it is (possibly) a
one-off event.
The end user is interested in:
fast responses to queries that are easy to choose and use;
the ability to modify their data easily and quickly, in a time
which is acceptable (less than 10 seconds);
the revised information appearing on the screen, again in a time
which is acceptable (less than 10 seconds);
no need to reorganise the db (or unload and reload it) every time
they modify it by adding 100 pieces of information (whether from a
serial input or a screen input).
I can, Doug, I can.
Regards
|
|
On 08/13/2015 02:58 PM, mtoby wrote:
[snip]
> Doug, I have no doubt at all that the above reply is clearly intended to
> be helpful, in reply to my mail, but I have used computers continuously
> since 1964, as a developer, in pre-sales and post-sales support, as a
> product introduction manager, technical manual reviewer, etcetera, and I
> understand nothing of what has been said in terms of how to choose which
> backend to use. So how will a typical Gramps user, with practically no IT
> knowledge, be able to decide what to use?!
Like with just about every other computer choice that the technically
illiterate make: accept the default.
The import program might also look at the size of the import file, and ask
some questions before making a suggestion.
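(Purely as a hypothetical sketch of that idea; the thresholds and
messages below are invented for illustration, not anything Gramps
actually does or plans:)

    import os

    def suggest_backend(import_path):
        """Invented heuristic: guess a sensible backend from the import file size."""
        size_mb = os.path.getsize(import_path) / (1024.0 * 1024.0)
        if size_mb < 50:
            return "default backend should be fine"
        if size_mb < 500:
            return "default backend, but expect longer imports"
        return "consider a heavier backend before importing"

    # e.g. print(suggest_backend("big_tree.gramps"))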
[snip]
> So the fact that data can be populated faster or slower in a database
> means nothing much to an end user, since it is (possibly) a one-off event.
Well, that's just not true. As someone who's been in post-sales support,
you must know that a system is useless if it takes an impractically long
time to do the initial load.
--
"Salads are only for murderers, cole slaw's a fascist regime!"
|
|
Brad Rogers wrote
Yes, that's exactly what I meant. Because, at the current rate of
excision of spurious data from your large database, it's going to take
over 1200 hours to accomplish your goal. With two copies of Gramps
open, one with the old db and one with the new, cutting and pasting
between the two may not take any longer.
Any way you look at it, this situation will be time-consuming with the resources that I have right now. I guess that's why I'm hoping for a solution. For instance, if I can get the patch to work that bypasses all the unnecessary procedures for batch removals. Or if gramps v.5 is released within a year or two, I'd consider waiting... it's not that the database doesn't work, it's just that half the data is unnecessary.
Finally, I marvel at the size of your db. I've only been at genealogy
for ten years or so, and have fewer than one thousand individuals
contained therein. To get 600k people would take me over 600 years,
based on my work rate of the last decade. And that's ignoring the laws
of diminishing returns, and the almost inevitable tree collapse.
The 600k db was mostly an accumulation of gedcoms that I imported from others (sometimes I didn't know for sure whether anyone was related, but my philosophy at the time was: if they could be related, then import it). At one time I had about 700k, but there were/are so many duplicate/triplicate people. (And I would spend about 80% of my time merging them -- and more often than not there was some source or piece of data that did not exist in one of the duplicates.)
About a year ago I exported all the people related to me from the 600k, and arrived at 158k. (Of these 158k I have really only discovered maybe 1/4 of them.) But I generated a gedcom from the 600k that I currently use as a resource, and recently I discovered a new line of relatives that was in the 600k db, so I exported it (that's where my problems all started).
I still think there is a fair bit of info in the 600k db that I've missed -- and I still like the idea of having one big database so that any potential new data is always available -- it's like having my own "ancestry.com" (but currently it's clear that gramps is not made to handle 600k databases).
BTW, I started my family history in 1994 (so about 20 years ago), and in between I did take a few long breaks from working on it.
|
|
Josip wrote
On 13. 08. 2015 at 13:52, Enno Borgsteede wrote:
>> Again, for example:
>> https://git.gnome.org/browse/pygobject/plain/demos/gtk-demo/demos/TreeView/treemodel_large.py
>> changing the number of rows (item_count) from a hundred thousand to one
>> million affects only the loading time, not the scrolling.
> Right, and I even tried it with 10 million, and saw no delay in scrolling.
> It took a few minutes to initialize the data, and it takes more than 500
> MB of RAM, but it scrolls fast. [snip]
I'd be interested to know how long it would take to initialize if an SSD were used. I've found that since I switched to an SSD, gramps starts about 10x faster... if SSDs are going to be the default storage in the future, why not build future gramps with this in mind.
tim k
|
|
On 13/08/15 21:19, Ron Johnson wrote:
> On 08/13/2015 02:58 PM, mtoby wrote:
> [snip]
>> Doug, I have no doubt at all that the above reply is clearly intended to
>> be helpful, in reply to my mail, but I have used computers continuously
>> since 1964, as a developer, in pre-sales and post-sales support, as a
>> product introduction manager, technical manual reviewer, etcetera, and I
>> understand nothing of what has been said in terms of how to choose which
>> backend to use. So how will a typical Gramps user, with practically no IT
>> knowledge, be able to decide what to use?!
> Like with just about every other computer choice that the technically
> illiterate make: accept the default.
>
--- Even the technically literate will accept the defaults in many cases,
because of the (hopefully correct) assumption that the defaults are the
'best' choices that the developers could make, and we assume that the
developers are technically literate.
I am very much in the same technically educated sphere as Mtoby, having
developed communications protocols, databases for packet switch systems
management, and all sorts of other good things.
Peter M.
|
|
On Thu, 13 Aug 2015 21:23:54 -0700 (PDT)
TJMcK < [hidden email]> wrote:
Hello TJMcK,
>all the unnecessary procedures for batch removals. Or if gramps v.5 is
>released within a year or two, I'd consider waiting... it's not that
There was notice given of a release date of 2016 not so long ago. I
can't remember whether it was here or on the dev list. Either way, it
might be worth hanging on for v5.
--
Regards _
/ ) "The blindingly obvious is
/ _)rad never immediately apparent"
Now would I say something that wasn't true?
Would I Lie To You - Eurythmics
|
|