DATA SCIENCE IN DEVELOPMENT ECONOMICS: USING CLUSTER ANALYSIS TO GENERATE A MULTIVARIATE DEVELOPMENT TAXONOMY

Date
2018-05
Publisher
University of Delaware
Abstract
This paper attempts to apply clustering techniques from data science to the economic problem of generating a country-level development taxonomy. Development taxonomies currently in use suffer from two key issues. First, the taxonomies are based on very few variables and therefore cannot properly represent something as complex and multifaceted as development. Second, the values used to discriminate groups are chosen arbitrarily. In this work, a univariate analysis is performed using the method of kernel density estimation to empirically generate a single-valued taxonomy which can be directly compared with the income group taxonomy published by the World Bank. Next, a definition of development is derived and a multivariate analysis is performed to create a comprehensive development taxonomy using two forms of k-means clustering. The univariate analysis demonstrates the superiority of a data-driven approach to single-valued taxonomy creation. Conversely, it remains inconclusive as to whether cluster analysis can create a well-defined multivariate development taxonomy.
Keywords
Mathematics and Economics, cluster analysis, multivariate development taxonomy