Notes from the life of a [data] scientist

Comment on Years as coloured bars by Lauren C

Tue, 08 Aug 2017 15:10:46 +0000

I like the stacked bars the best. I don't think the final result is very intuitive for comparing the years with one another.

Comment on Years as coloured bars by nsaunders

Sun, 06 Aug 2017 12:20:24 +0000

I just grabbed that image from the web at random for a "years by bars" example, so have not given it much thought. But you make a good point: the key thing is to isolate the aspect of the data that you want to highlight and go from there.

Comment on Years as coloured bars by Significance

Sat, 05 Aug 2017 11:25:47 +0000

I think for the airline travel example, the point is probably the seasonal trend (with years as cases) rather than the interannual trend, so facet by year with bars for categories (your second-to-last plot) works better than facet by category (last plot) in that case. Or else two plots: a boxplot or similar with month as category, plus a second plot to show the longer-term trend over time.

Comment on Chart golf: the “demographic tsunami” by Tim

Wed, 26 Jul 2017 14:40:47 +0000

ggplot(ausplot, aes(x=value, y=factor(Year))) + geom_joy() + theme_minimal() looks kinda neat?

Comment on Chart golf: the “demographic tsunami” by Tim

Wed, 26 Jul 2017 14:10:47 +0000

I'm going to give this a shot with a joyplot (ggjoy).

Comment on Chart golf: the “demographic tsunami” by nsaunders

Sun, 23 Jul 2017 09:23:54 +0000

Had not thought of that, but you are probably right!

Comment on Chart golf: the “demographic tsunami” by Andrey

Fri, 21 Jul 2017 13:45:01 +0000

Most probably, the reason for putting persons aged 48–100+ into a different sheet is that the ABS had run out of columns in the .xls format. The real question here: why not upgrade to the "new" (10 years old) .xlsx?

Comment on Hyetographs, hydrographs and highcharter by Joe S

Tue, 18 Apr 2017 15:27:40 +0000

While I agree that 'it has always been done that way' is a poor reason to continue doing something, the same goes for 'these rules of data-viz are absolutes'. There are very real reasons that some fields present their data in a specific way, even if it goes against some general rules of visualization. As a hydrologist, here is my take on using the common hyeto-hydrograph.

The first question: One chart or two? A lot of people might see rainfall (a depth) and discharge (a volume) as entirely different variables, but this is not really the view in hydrology. They are just transformations of one another. Hydrologists regularly report discharge as a depth: the volume divided by the watershed area. The difference between the two can be taken to provide a water balance showing long-term water excess or shortage. Because hydrologists think about the values in this way, they will often use a single plot; both values represent water moving through a watershed. This does not rule out using two plots, but it does illustrate why hydrologists tend to use one plot for these two measurements of water.

The second question: Inverted or non-inverted axis for rainfall? Examples of rainfall plotted on both inverted and non-inverted axes are easy to find, but I believe the inverted axis is more common because it better represents the response lag between precipitation and discharge. The response lag is the time it takes for the discharge to respond to the precipitation; in other words, how long does it take for rainfall to become streamflow? Constructing figures with an inverted axis not only presents collected data, but also shows something about the physical processes that tie rainfall and discharge together. A simplistic way to think of this, which you hit upon on Twitter, is as rain 'falling' from the top of the plot. And though it is simplistic, it does actually provide a visual cue to the physical process of precipitation filling up landscape storage and overflowing into runoff. This type of plot is regularly used to compare rainfall to water table elevation, especially local groundwater in wetlands, where the representation of 'filling up storage' is even clearer: periods of lower groundwater look like bowls that get filled up by precipitation. These cues allow a hydrologist to look quickly at these plots and see the relationship between a given discharge (or water level) and precipitation, which allows them to draw extra information from the figure (what is the response lag, what were the antecedent moisture conditions, what was the impact of some land-use change, etc.).

And if we are going to use one figure (see above), then using a non-inverted axis might lead to misinterpretations by someone not familiar with the processes represented. These figures often show the response of a stream to a given rainfall event, meaning that the periods of high rainfall generally coincide with relatively low discharge. A plot with non-inverted axes could easily be misread as correlating low discharge with high precipitation.

So while the general rule may be that dual axes are not ideal and can be used to manipulate data interpretation, they can also be a tool to communicate within a field more effectively. Just like any tool, it is the user who determines whether it is used towards a constructive end.
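For readers who want to see the arithmetic behind "discharge as a depth" and "response lag", here is a minimal Python sketch. It is an illustration only, not a standard hydrological routine: the function names, the 5 km² watershed area and the hourly values are all hypothetical, and the lag is estimated crudely as the number of time steps between the rainfall peak and the discharge peak.

```python
def discharge_depth_mm(volume_m3, area_km2):
    """Convert a discharge volume (m^3) to an equivalent depth (mm) over the watershed."""
    area_m2 = area_km2 * 1_000_000
    return volume_m3 / area_m2 * 1000.0


def response_lag(rain_mm, discharge_mm):
    """Time steps between the rainfall peak and the discharge peak (a crude estimate)."""
    rain_peak = max(range(len(rain_mm)), key=rain_mm.__getitem__)
    flow_peak = max(range(len(discharge_mm)), key=discharge_mm.__getitem__)
    return flow_peak - rain_peak


# Hypothetical hourly values for a single rainfall event (made up for illustration).
rain_mm = [0.0, 2.0, 10.0, 4.0, 0.0, 0.0, 0.0, 0.0]
volume_m3 = [500, 500, 1000, 2500, 4000, 3000, 1500, 800]
area_km2 = 5.0  # assumed watershed area

# Express discharge in the same units as rainfall (mm of water over the watershed).
discharge_mm = [discharge_depth_mm(v, area_km2) for v in volume_m3]

# Peak-to-peak lag, in time steps (hours here).
lag = response_lag(rain_mm, discharge_mm)

# Simple water balance for the event: rainfall in minus discharge out.
balance_mm = sum(rain_mm) - sum(discharge_mm)
```

Once both series are in millimetres they can share one plot, and the difference between their totals is the water balance described above.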

Comment on The nhmrcData package: NHMRC funding outcomes data made tidy by nsaunders

Mon, 13 Mar 2017 22:15:21 +0000

Yes, those are required for the vignette examples to work. I guess that means they are dependencies when build_vignettes = TRUE is specified. I'll amend the documentation to mention that. Thanks for testing!

Comment on The nhmrcData package: NHMRC funding outcomes data made tidy by datakid23

Mon, 13 Mar 2017 22:04:50 +0000

There are unlisted dependencies: the installation instructions above do not work out of the box. I needed to install wordcloud and tidytext. There may be other dependencies that I already have installed.