Covid 19 predictions – Dutch national data and classifiers
Much data on the spread of Covid 19 has been collected since the beginning of the epidemic in early spring 2020. On the Dutch national data site I found a data set that shows the spread of Covid 19 in the Netherlands during 2020.
Variables included are month, Dutch province, gender, age category, patient admitted to a hospital, and (possible) mortality. The public data set can be downloaded here.
I was curious whether hospitalization could be predicted from the other variables: month, Dutch province, gender and age category. So I downloaded this data set in CSV format (columns separated by ‘;’) and imported the data into a spreadsheet. For some cases hospitalization was recorded as ‘UNKNOWN’, so I excluded these cases.
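The import and filtering step was done in a spreadsheet, but it could also be scripted. The following is a minimal sketch, assuming the column names from the excerpt in this post; the function name load_known_cases is hypothetical and the row with the unknown outcome is invented purely to show the filter at work.

```python
import csv
import io

# Small stand-in for the downloaded file. The row with "Unknown" is
# invented for illustration; the other rows follow the excerpt in this post.
SAMPLE = """Month_C;Agegroup_C;Sex_C;Province_C;Outcome
7;60-69;Female;Zuid-Holland;Yes
8;20-29;Male;Zuid-Holland;No
4;70-79;Male;Noord-Brabant;Unknown
3;50-59;Male;Noord-Holland;Yes
"""

def load_known_cases(text, outcome_column="Outcome"):
    """Parse semicolon-separated text and drop rows with an unknown outcome."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [row for row in reader
            if row[outcome_column].strip().upper() != "UNKNOWN"]

cases = load_known_cases(SAMPLE)
print(len(cases), "cases kept")
```

For a real file, the same function can be fed the result of open(path, newline="", encoding="utf-8").read(); the encoding of the actual download is an assumption to check.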
The data set looks as follows:
Month_C;Agegroup_C;Sex_C;Province_C;Outcome
7;60-69;Female;Zuid-Holland;Yes
8;20-29;Male;Zuid-Holland;No
4;70-79;Male;Noord-Brabant;Yes
3;50-59;Male;Noord-Holland;Yes
3;50-59;Male;Noord-Holland;Yes
3;60-69;Male;Zuid-Holland;Yes
2;50-59;Female;Noord-Brabant;Yes
4;80-89;Male;Noord-Brabant;Yes
4;90G;Female;Gelderland;Yes
7;50-59;Male;Zuid-Holland;No
3;80-89;Male;Limburg;Yes
...
I uploaded this data set to Insight Classifiers and chose the analysis that ranks the predictive variables one by one.
First, 10-fold cross-validation yielded an accuracy of 0.855. Both outcomes (hospitalized and not hospitalized) could be predicted.
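To make the idea of 10-fold cross-validation concrete, here is a minimal sketch in plain Python. It is not the method used by Insight Classifiers, whose internals are not described here. As a stand-in classifier it uses a simple categorical naive Bayes, and the toy rows are synthetic, so the printed accuracy says nothing about the real data. All function names are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def train(rows, labels):
    """Count how often each label and each (feature, label, value) occurs."""
    label_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    distinct = defaultdict(set)          # feature index -> values seen in training
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            distinct[j].add(value)
    return label_counts, value_counts, distinct

def predict(model, row):
    """Return the label with the highest smoothed log-probability."""
    label_counts, value_counts, distinct = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total)
        for j, value in enumerate(row):
            # Add-one smoothing; the extra +1 leaves room for unseen values.
            score += math.log((value_counts[(j, label)][value] + 1)
                              / (count + len(distinct[j]) + 1))
        scores[label] = score
    return max(scores, key=scores.get)

def cross_val_accuracy(rows, labels, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, pool the hits."""
    correct = 0
    for fold in k_fold_indices(len(rows), k, seed):
        held_out = set(fold)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        model = train([rows[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct += sum(predict(model, rows[i]) == labels[i] for i in fold)
    return correct / len(rows)

# Synthetic, perfectly separable toy data (month, age group, sex).
rows = [("3", "70-79", "Male"), ("8", "20-29", "Female")] * 20
labels = ["Yes", "No"] * 20
accuracy = cross_val_accuracy(rows, labels)
print(accuracy)
```

Here every case is tested exactly once and the hits are pooled into a single accuracy. Some tools instead report the mean of the ten per-fold accuracies, which can differ slightly when the folds are not equally large.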
The feature analysis shows interesting graphical results (made by Insight Classifiers):
Most of the hospitalized patients fall in the months March and April 2020.
Also, the elderly part of the population, above 60 years, is clearly overrepresented among the hospitalized persons.
With respect to gender, about 2/3 of the hospitalized patients are male. The number of cases with an unregistered gender (Unknown) is negligible.
The southern Dutch provinces of Noord-Brabant and Limburg are clearly much more strongly associated with hospitalization than the northern provinces of the country. One may wonder why the southernmost provinces show a much higher degree of hospitalization than the northern parts of the Netherlands. One possible explanation is the widespread celebration of carnival at the end of February 2020. There are also cultural differences between ‘North’ and ‘South’ in the way social life takes place.
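The graphs themselves were produced by Insight Classifiers, but the underlying numbers are simple shares: for each variable, the fraction of hospitalized cases that falls in each category. A minimal sketch of that computation follows, using only the first five rows of the excerpt shown earlier; that is far too few to mean anything, and the function name category_shares is hypothetical.

```python
from collections import Counter

# First five rows of the excerpt in this post, as dictionaries.
COLUMNS = ["Month_C", "Agegroup_C", "Sex_C", "Province_C", "Outcome"]
RAW = [
    "7;60-69;Female;Zuid-Holland;Yes",
    "8;20-29;Male;Zuid-Holland;No",
    "4;70-79;Male;Noord-Brabant;Yes",
    "3;50-59;Male;Noord-Holland;Yes",
    "3;50-59;Male;Noord-Holland;Yes",
]
rows = [dict(zip(COLUMNS, line.split(";"))) for line in RAW]

def category_shares(rows, column, outcome="Yes", outcome_column="Outcome"):
    """Share of each category of `column` among rows with the given outcome."""
    counts = Counter(row[column] for row in rows
                     if row[outcome_column] == outcome)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

sex_shares = category_shares(rows, "Sex_C")
province_shares = category_shares(rows, "Province_C")
print(sex_shares)
print(province_shares)
```

Note that such shares are not corrected for the size of each group: a province with more inhabitants will contribute more hospitalized cases even at the same rate, so comparing them with the shares among non-hospitalized cases (outcome="No") is the more informative view.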
Not part of this study is possible comorbidity, that is, Covid 19 hospitalization co-occurring with other diseases. Severe obesity and diabetes are now well-known risk factors that contribute to a more severe course of Covid 19.
All analyses were performed with https://insight-classifiers.com.