Eben, you said (replying to my age distribution question):

> I'm not certain I understand your question about years of
> looping experience, although you may be asking why there are
> more people in the middle amount of years and less on the
> extremes, which is how almost all data looks when you take a
> survey (there is an average with most people in the middle,
> and less people are further away from average).

No, the question is more specific. First of all, of course you're right regarding the age of the loopers (for which we also see some approximately Gaussian binning), but not regarding the years of looping experience, and I'm gonna explain why.

The mathematical model works like this. Any looper i started looping at some point in time t1,i. When the survey happens at t2, they will enter t2-t1,i as the reply to that question (which also has some predefined bins, namely <1, 1-3, 3-5, 5-10 and >10, to somehow stick with the typical bins of human resource departments).

You may argue that there might be some answers from people who stopped looping at a time t3,i<t2. But I believe that those people wouldn't have answered the survey, so it's fairly reasonable to assume that all of the loopers have been looping more or less all the time (at least everyone is still looping right now, and most of them in all of, or a considerable amount of, their current work).

So what I'm looking at is: what did the distribution look like when people started looping in the past? If you model that the same number of people started looping every year, then after 18 years of looping you get the 42% value we have for the >10yrs people, but then the other bins look different (the 1-3 and 3-5 bins are about double the size of the <1 bin, and the 5-10 bin is nearly three times as big as the 3-5 bin, because these bins have different widths). More precisely, this model gives me values 5.3, 11, 11, 32 and 42, respectively.
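
For reference, the constant-rate model can be sketched in a few lines of Python. This is only an illustration: the function name `uniform_model` and the hard 18-year horizon are assumptions, and each bin's share is simply taken as bin width divided by horizon. Under exactly those assumptions the arithmetic gives about 5.6, 11.1, 11.1, 27.8 and 44.4, so the figures quoted above presumably come from a slightly different setup (another horizon, or a different handling of the bin edges).

```python
def uniform_model(horizon_years):
    """Share (in percent) of each experience bin if the same number of
    people started looping every year over `horizon_years` years."""
    # Bin edges in years of experience; the open ">10" bin is closed
    # at the assumed horizon.
    edges = [0, 1, 3, 5, 10, horizon_years]
    widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
    return [100.0 * w / horizon_years for w in widths]


shares = uniform_model(18)
print([round(s, 1) for s in shares])  # [5.6, 11.1, 11.1, 27.8, 44.4]
```

In this simple form the >10 share is (T-10)/T for a horizon of T years, so hitting exactly 42% would need T of roughly 17.2 years rather than 18.
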

So what would the number of loopers who joined each year have to look like to produce our distribution of 5 - 15 - 19 - 20 - 42? I did some simple runs with the Solver in Excel, and it looks like a distribution with the first people starting more than 20 years ago, a very slow increase from then, then some sharp peaks about 9 and 5 years ago, and a slight decline since then. However, this set of equations is underdetermined, so: with your knowledge of statistical methods, would you be able to deduce the generating function or the parameters of a surface that describes the solutions to this set of equations?

Rainer
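
On the underdetermined system above, one way to see the shape of the solution set is a small Python sketch. It assumes yearly resolution and a 25-year horizon (`HORIZON` is a made-up value, not something taken from the survey). Each equation then only says that the yearly start counts inside one bin must add up to that bin's share, and the bins do not overlap. So under these assumptions every non-negative split of a bin's share over its years is a solution, and the flat split is the minimum-norm one. The helper names are hypothetical, and this is not the Excel Solver setup.

```python
import random

SHARES = [5, 15, 19, 20, 42]       # survey percentages (they sum to 101, presumably rounding)
HORIZON = 25                       # ASSUMPTION: earliest start, in years before the survey
EDGES = [0, 1, 3, 5, 10, HORIZON]  # bin edges in years of experience


def flat_solution():
    """Minimum-norm solution: a constant start rate inside each bin."""
    rates = []
    for share, lo, hi in zip(SHARES, EDGES, EDGES[1:]):
        rates += [share / (hi - lo)] * (hi - lo)
    return rates


def random_solution(rng):
    """Another valid solution: split each bin's share at random, non-negatively."""
    rates = []
    for share, lo, hi in zip(SHARES, EDGES, EDGES[1:]):
        weights = [rng.random() for _ in range(hi - lo)]
        total = sum(weights)
        rates += [share * w / total for w in weights]
    return rates


def bin_totals(rates):
    """Re-bin yearly start counts into the survey's five bins."""
    return [sum(rates[lo:hi]) for lo, hi in zip(EDGES, EDGES[1:])]


flat = flat_solution()
other = random_solution(random.Random(0))
print([round(r, 2) for r in flat[:10]])
print([round(t, 6) for t in bin_totals(other)])
```

Both `flat` and `other` reproduce 5 - 15 - 19 - 20 - 42 exactly, which illustrates why the survey bins alone cannot pin down individual peaks: with N yearly unknowns and 5 equations there are N-5 free parameters, and any structure finer than the bin widths comes from extra assumptions (smoothness, a parametric growth curve) rather than from the data.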