Tuesday, April 28, 2015

Staring at Goats IV: Movement

This is part 4 of a series of posts about goat breeding data.
[part 1: introduction] [part 2: Gender] [part 3: Fertility]


Goats are born at a breeder's site, but often move to another owner. Both breeders and owners are identified as a "stable". So we can examine how goats move between breeders and owners.

Recall from part 1 our graph model (Geit=goat, Stal= stable, Eigenaar=owner, Fokker=breeder):
For each goat, we thus know the breeder and the owner through relationships between goats and stables.

The database (or export thereof) does not keep track of intermediate owners, only current owners. So for each goat we know only the breeder and the current owner.

There are four special "owners" with a special identifier:
NL001: Sold to an unregistered owner
NL002: Sold to a trader
NL003: Sold to the butcher
NL004: Dead

A simple Cypher query shows how many goats are owned by one of the special owner IDs:
match (g:GEIT)-[:OWNER]->(stal:STAL{nr:"NL001"}) return count(g)
And similarly for the codes NL002, NL003, and NL004. This results in:
NL001: 2152 goats sold to an unregistered owner
NL002: 952 goats sold to a trader 
NL003: 1776 goats sold to the butcher (1232 bucks...)
NL004: 2929 goats dead

We exclude the special NL codes from further analysis, because they dominate the results. After all, every goat eventually dies and ends up with one of the NL codes.
So let's see which ten breeders and owners exchange most goats:
match (s1:STAL)<-[:FOKKER]-(g:GEIT)-[:OWNER]->(s2:STAL) where s2.naam <> s1.naam and NOT s2.nr IN ["NL001","NL002","NL003","NL004","NL005"] and s1.nr<>"ONB1" and s2.nr<>"ONB1" return s1.nr,s1.naam,count(g),s2.nr,s2.naam order by count(g) descending limit 10

So the breeder with number NB093 has delivered 22 goats to current owner LI013.
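
For readers without a graph database at hand, the counting logic of that query can be sketched in plain Python. This is only an illustration under assumptions, not the code behind this post: the function name top_exchanges, the tuple layout and the sample rows are made up, and only the excluded codes are taken from the query above.

```python
from collections import Counter

# Owner codes the query above filters out on the owner side.
SPECIAL_OWNERS = {"NL001", "NL002", "NL003", "NL004", "NL005"}
# Code the query filters out on both the breeder and the owner side.
SKIP_EITHER_SIDE = "ONB1"

def top_exchanges(rows, limit=10):
    """Count goats per (breeder, owner) pair and return the largest counts.

    rows: iterable of (breeder_nr, breeder_name, owner_nr, owner_name),
    one tuple per goat (a hypothetical layout chosen for this sketch).
    """
    counts = Counter()
    for b_nr, b_name, o_nr, o_name in rows:
        if b_name == o_name:  # same stable name: goat did not move
            continue
        if o_nr in SPECIAL_OWNERS or SKIP_EITHER_SIDE in (b_nr, o_nr):
            continue
        counts[(b_nr, o_nr)] += 1
    # most_common sorts by count, largest first
    return counts.most_common(limit)

# Made-up sample rows, only to show the shape of input and output.
sample = [
    ("NB093", "Stal A", "LI013", "Stal B"),
    ("NB093", "Stal A", "LI013", "Stal B"),
    ("NB093", "Stal A", "NB093", "Stal A"),   # still with the breeder
    ("GE010", "Stal C", "NL003", "Butcher"),  # special code, skipped
    ("GE010", "Stal C", "UT002", "Stal D"),
]
print(top_exchanges(sample))
```

Like the query, the sketch compares stable names rather than stable numbers to decide whether a goat has moved.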

So far, no rocket science. Let's now see how goats move about between the Dutch provinces. Stable identifiers start with two letters which indicate the province: DR=Drenthe, FL=Flevoland, FR=Friesland, GE=Gelderland, GR=Groningen, LI=Limburg, NB=Noord-Brabant, NH=Noord-Holland, OV=Overijssel, UT=Utrecht, ZE=Zeeland, ZH=Zuid-Holland. Then there is IN, which stands for international, i.e., outside The Netherlands.

We now get a more complex Cypher query, where we have to do some pattern matching using regular expressions and we have to group the results by province. Here it is:
match (s1:STAL)<-[:FOKKER]-(g:GEIT)-[:OWNER]->(s2:STAL) where s1.nr=~"[A-Z]{2}[0-9]{3}" and s2.nr=~"[A-Z]{2}[0-9]{3}" with s1, g, s2, SUBSTRING(s1.nr,0,2) AS p1, SUBSTRING(s2.nr,0,2) AS p2 return p1,p2,count(g) order by p1
Note how we use the regular expression in the where clause, and how we pass the results on to group them by province using the WITH keyword. The result is a table for which the first lines are:

So, 28 goats were bred in DR and their current owner is in OV, etc. We now put this into a matrix form:
The left column identifies the breeder's province, whereas the top row identifies the current owner's province. We deliberately only included goats in the Cypher query where the breeder is not the same as the current owner. So the numbers identify goats that actually moved about.
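
The step from stable numbers to a province matrix can also be sketched in Python. This is a minimal sketch, not the code used for this post: the function name province_matrix and the sample pairs are made up, and it assumes that goats still owned by their breeder have already been left out of the input. The regular expression and the two-letter prefix are the ones from the query above.

```python
import re
from collections import Counter

# Same pattern as in the Cypher query: two capitals followed by three digits.
STABLE_NR = re.compile(r"[A-Z]{2}[0-9]{3}")

def province_matrix(pairs):
    """Turn (breeder_nr, owner_nr) pairs into a square province matrix.

    Rows are breeder provinces, columns are owner provinces.
    """
    counts = Counter()
    for breeder_nr, owner_nr in pairs:
        # fullmatch mirrors Cypher's =~, which has to match the whole string
        if STABLE_NR.fullmatch(breeder_nr) and STABLE_NR.fullmatch(owner_nr):
            counts[(breeder_nr[:2], owner_nr[:2])] += 1
    # Every province seen as breeder or owner gets a row and a column.
    provinces = sorted({p for pair in counts for p in pair})
    matrix = [[counts[(src, dst)] for dst in provinces] for src in provinces]
    return provinces, matrix

# Made-up pairs, one per goat, only to show the shape of the result.
pairs = [
    ("DR001", "OV002"),
    ("DR001", "OV003"),
    ("DR004", "DR005"),
    ("OV002", "DR001"),
    ("ONB1", "DR001"),  # does not match the pattern, so it is skipped
]
provinces, matrix = province_matrix(pairs)
print(provinces)
print(matrix)
```

Using one sorted list of provinces for both rows and columns keeps the matrix square, which is the form a chord diagram needs.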

We can turn this into a neat visual using the D3.js library.

This visual is called a chord diagram. The colour coding of the connections is such that it has the colour of the breeder's province, and points to the current owner's province. The thicker the source, the larger the number of goats.

For all provinces, most goats stay within the province, followed by moves to neighbouring provinces. Interestingly, though, no goat was moved from Zuid-Holland to Noord-Holland (at least among goats with a current owner, i.e. live goats). Movements across large distances are uncommon, and international movements are also rare (due to legislation, of course). Just a few goats have been moved from other countries back into The Netherlands.
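
To check a claim like "most goats stay within the province" against a matrix of this kind, one could compare each diagonal entry with its row total. The sketch below is an assumption-laden illustration: the function name within_province_share is made up and the numbers are invented, not taken from the goat data.

```python
def within_province_share(matrix):
    """For each breeder province (row), return the fraction of moved goats
    whose current owner is in that same province (the diagonal entry)."""
    shares = []
    for i, row in enumerate(matrix):
        total = sum(row)
        # Guard against provinces from which no goat moved at all.
        shares.append(row[i] / total if total else 0.0)
    return shares

# Invented 2x2 example: rows are breeder provinces, columns owner provinces.
example = [
    [6, 2],
    [1, 3],
]
print(within_province_share(example))
```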

The D3.js library allows interaction as well, but unfortunately I can't show that here, because this blog site does not allow such JavaScript on the page. Therefore, I created a separate page where you can play around with it, and you can view the source code of that page to see how it actually works: http://mekkerwei.be/data/goatMovement2.html
