Mushahid Posted December 3, 2015

So I'm doing a statistical Maths IA to investigate whether education reduces income inequality. I'm testing the correlation between the Gini coefficient, education expenditure as a percentage of a country's GDP, and the education index, for the top 50 developed countries (by HDI rank).

The problem I have right now is that the source I'm using (World Bank) is missing data for some countries in certain years. Since I'm trying to get data for the last 10 years, and different countries are missing data for different years, this is a big problem.

So the question is: would it be suitable for me to assume a linear trend over the years and estimate the values for the missing years using a linear regression equation? I know this is not completely reliable, but maybe I can use it and talk about it in my evaluation.

Earlier I was trying to use multiple sources to fill in the gaps, but my maths teacher told me not to do this, as the different sources had different calculations for the same year and country, and mixing sources would decrease the credibility of my data.

It seems the only options I have left are either using linear regression to predict the missing data or decreasing the number of years' worth of data from 10 to something like 3. Any advice appreciated! Thanks in advance.
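For anyone wanting to see what the linear-trend option would look like in practice, here is a minimal sketch in plain Python. It assumes one indicator for one country is stored as a `{year: value}` dictionary with `None` for the missing years. The function names `fit_line` and `fill_missing` and the Gini numbers are made up for illustration and are not real World Bank data. It fits an ordinary least-squares line to the observed years only and uses that line to estimate the gaps.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observed points to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fill_missing(series):
    """Return a copy of {year: value} with None entries replaced by the fitted line."""
    observed = [(year, v) for year, v in series.items() if v is not None]
    xs = [year for year, _ in observed]
    ys = [v for _, v in observed]
    slope, intercept = fit_line(xs, ys)
    return {
        year: (v if v is not None else intercept + slope * year)
        for year, v in series.items()
    }


# Hypothetical Gini values for one made-up country (illustration only).
gini = {2006: 30.0, 2007: 30.5, 2008: None, 2009: 31.5, 2010: None, 2011: 32.5}
filled = fill_missing(gini)

for year in sorted(filled):
    tag = "estimated" if gini[year] is None else "observed"
    print(year, round(filled[year], 2), tag)
```

Observed values are left untouched and only the gaps are filled, so the estimated points can be marked separately in the data table and discussed as a limitation in the evaluation. A series with fewer than two observed years cannot be fitted this way, which is why `fit_line` raises an error in that case.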