Visualizing development and poverty reduction in Guizhou and Yunnan, China: using a drill-down choropleth map and time-series graphs
Abstract—How do we visualize a large number of counties and show the user the net rural income earned every year from 1990 to 2007? Our approach is to present a story that helps our user understand the background of Guizhou and Yunnan before diving into the poverty indicators: net rural income at the county level and GDP at the province level. We use a choropleth map to help our user understand the terrain of these two provinces better, and we allow broad comparisons across counties as well as closer examination of GDP from 1978 to 2005.
Index Terms—Information visualization, visual analytics, poverty, China, Guizhou, Yunnan.
The problem we encountered is not new: Professor Donaldson, armed with data after years of research and fact-finding, has an important conclusion to tell the world. The visualization supporting his paper was made in Excel, and its line graphs are not the most impressive-looking by 21st-century standards.
How do we visualize data sets that are sufficiently large, spanning the 1970s to the 2000s? We do not merely want to present this information; we wish to invite users to explore the data further and to draw their own conclusions afterwards.
How do we visualize a large number of counties and show the user the net rural income earned every year from 1990 to 2007? Yunnan, the better known of the two provinces, might ring a bell because of its tourist sites. However, our difficulty lies in presenting its sixteen prefectures, which make up more than 200 counties in total (and that is just for Yunnan), without overwhelming the user with data.
Janice Chua is an undergraduate student at Singapore Management University pursuing her Bachelor in Information Systems Management. Email: [email protected]
Edward Lim is a final year undergraduate student at Singapore Management University pursuing his Bachelor in Social Sciences, majoring in Political Science and Sociology. Email: [email protected]
Ginna Divya is an undergraduate student at Singapore Management University pursuing her Bachelor in Information Systems Management. Email: [email protected]
Paper submitted on 19 November 2010.
Professor Donaldson’s enthusiasm for telling the story of Yunnan and Guizhou was just one aspect of our motivation to embark on this project. Another major aspect was humanity itself: how much do we know about the less well-off people in a country as vast as China? Do political economists (like Professor Donaldson) actually sit in air-conditioned offices and dictate which county is considered “poor” and which “not poor”? We decided to take up Dr. Hans Rosling’s mantra and discover for ourselves the disparities in China, be it between provinces or within provinces.
Our additional motivation is posterity. Professor Donaldson revealed to us that he would consider spreading the word on our work in conjunction with the publication of his upcoming book next year. This reminded us that the work we had committed ourselves to was not going to be for “show”, but could make a real impact among interested readers of his book and beyond.
There have been attempts at visualizing poverty data, especially in the United States. For example, Social Explorer focuses on using US Census data to visualize poverty (among other demographic indicators).
It is geared towards educational purposes and is made available to learning institutions on a subscription basis.
Storyboard-wise, the 2008 election results visualization created by the visualization team at the New York Times is an excellent example to learn from. It has a simple two-level zooming feature, which invites readers to explore the electoral reporting and results of the fifty states and their respective counties. It gives users an overview of the country and lets them click on any of the states. It then zooms to the chosen state and displays the cities situated there. As in our project, the reader is spared from reading through various tables, which can be tedious and complicated. The form of visualization used for the presidential election is an ideal example of the effects we were hoping to achieve.
Fig. 1. The election results visualization created by the New York Times.
The data used in the population spark lines for Guizhou and Yunnan was loaded from sparklines.js, which contained the yearly population of the two provinces over a twenty-year period. The same approach was used for the bar graph that depicts average rainfall (in millimetres) over a twelve-month period.
With these two exceptions, the rest of the data presented were directly coded into the HTML file. Therefore, the drawback for this overview is the lack of real-time data. However, we felt that this was a less serious concern since our data presented in the other tabs had a similar problem.
Fig. 2. Comparison table between the two provinces, with the use of spark lines and bar charts.
There are two main visualizations in this section. Firstly, the spatial representation of the two provinces was drawn using lines (pv.Line).
The coordinates of each county were extracted, using GIS software, from the ARC/INFO® Export format data file downloaded from SEDAC (Socioeconomic Data and Applications Center). Because the boundary data was dated December 31, 1990 for both Guizhou and Yunnan, we had to combine the data sets of certain newly created counties. For example, the data for Guiyang shixiaqu was aggregated from six counties, most of them tiny in area (Nanming qu, Yunyan qu, Huaxi qu, Wudang qu, Baiyun qu and Xiaohe qu).
This was a compromise we made in our data because we did not have access to a newer ARC/INFO® Export format data file. The combined counties represented less than 10% of our data sets.
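As an illustration only, this kind of aggregation could be sketched as below. The paper does not state how the six counties were combined, so the unweighted mean is an assumption, as are the function name `aggregateToMapUnits`, the `MERGE_INTO` lookup and the record layout; only the county names come from the text.

```javascript
// Hypothetical lookup: present-day county -> unit on the 1990 boundary map.
// Only the six Guiyang counties are named in the text.
const MERGE_INTO = new Map([
  ["Nanming qu", "Guiyang shixiaqu"],
  ["Yunyan qu", "Guiyang shixiaqu"],
  ["Huaxi qu", "Guiyang shixiaqu"],
  ["Wudang qu", "Guiyang shixiaqu"],
  ["Baiyun qu", "Guiyang shixiaqu"],
  ["Xiaohe qu", "Guiyang shixiaqu"],
]);

// records: [{ county, year, income }], one row per county and year.
// Returns one row per map unit and year, using an unweighted mean
// (an assumption; a population-weighted mean may be more appropriate).
function aggregateToMapUnits(records, mergeInto = MERGE_INTO) {
  const groups = new Map(); // "unit|year" -> running sum and count
  for (const { county, year, income } of records) {
    // Counties without an entry in the lookup keep their own name.
    const unit = mergeInto.get(county) || county;
    const key = unit + "|" + year;
    const g = groups.get(key) || { county: unit, year, sum: 0, n: 0 };
    g.sum += income;
    g.n += 1;
    groups.set(key, g);
  }
  return Array.from(groups.values(), (g) => ({
    county: g.county,
    year: g.year,
    income: g.sum / g.n,
  }));
}
```

Counties that are absent from the lookup pass through unchanged, so the same function can be run over a whole province.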
Fig. 3. Our choropleth map that shows the degree of net rural income annually in these counties.
Our second visualization is the line graph the user sees after clicking on a particular county. It provides context, letting our user see how poverty reduction and economic growth have taken place in that county.
Fig. 4. Simple line graph provides a context to how a particular county’s net rural income has changed over time.
Horizon graphs of counties
For the horizon graphs, we used data tips, which let users look up particular facts that are not present in our overview. Data tips present small, targeted chunks of context-dependent data, and the user’s attention is focused on the information shown at the mouse pointer, allowing an overview and a specific view at the same time. Creating a tooltip window also allowed us to format it as needed and yields an interactive set of data that lets the user move between the different points.
Fig. 5. Horizon graphs provide a fuss-free way to compare quickly across counties; to examine closer, mouse over allows you to monitor how the county has progressed in economic development.
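The lookup behind a data tip can be sketched as follows. This is a minimal illustration rather than the code used in the project: it assumes a linear horizontal scale, and the function name `dataTipAt`, the `{ year, income }` record layout and the label format are hypothetical.

```javascript
// series: [{ year, income }] sorted by year; width: plot width in pixels.
// Returns the data point nearest to the mouse position, plus a label.
function dataTipAt(series, mouseX, width) {
  const first = series[0].year;
  const last = series[series.length - 1].year;
  // Clamp the pointer to the plot, then invert the linear x-scale
  // to get a (fractional) year.
  const t = Math.min(Math.max(mouseX / width, 0), 1);
  const year = first + t * (last - first);
  // Pick the data point whose year is closest to the pointer.
  let best = series[0];
  for (const p of series) {
    if (Math.abs(p.year - year) < Math.abs(best.year - year)) best = p;
  }
  return { year: best.year, income: best.income, label: best.year + ": " + best.income };
}
```

Snapping to the nearest data point keeps the tip on a real observation instead of an interpolated value.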
Time-series GDP and GDP growth comparison graphs
The time-series graph has an overview of the whole period at the bottom of the visualization, containing a draggable “view port”. As the user drags this “view port” around the overview, our time-series graph shows that part of the time period. Edward Tufte called this a “condensed slowed and personalized” version of the overall visualization. The two views are synchronized so that when the “view port” moves, the detailed view (the main time-series graph) changes accordingly. If the “view port” is made smaller or larger, the magnification of the time series changes accordingly too.
Fig. 6. The time-series graphs allow the user to select a particular section of the long time period – referred to as a “view port”.
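The synchronization between the “view port” and the detailed view can be sketched as below. This is an illustration under stated assumptions, not the project's code: the overview is taken to use a linear scale over the full period, and the function name `focusFromViewport` and the `{ x, dx }` view port layout (left edge and width in pixels) are hypothetical.

```javascript
// series: [{ year, value }] sorted by year.
// viewport: { x, dx } = left edge and width of the view port, in pixels.
// overviewWidth: width of the overview strip, in pixels.
function focusFromViewport(series, viewport, overviewWidth) {
  const first = series[0].year;
  const last = series[series.length - 1].year;
  // Invert the overview's linear scale: pixel position -> year.
  const toYear = (px) => first + (px * (last - first)) / overviewWidth;
  const start = toYear(viewport.x);
  const end = toYear(viewport.x + viewport.dx);
  return {
    start,
    end,
    // A narrower view port means a higher magnification in the detailed view.
    magnification: overviewWidth / viewport.dx,
    points: series.filter((p) => p.year >= start && p.year <= end),
  };
}
```

Calling this function on every drag or resize of the view port and redrawing the main graph from `points` keeps the two views in step.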
The overview is a table that aims to show the similarities between Guizhou and Yunnan. The argument that we are establishing is that despite these similarities, their development was quite different, with Guizhou’s slow economic growth reducing rural poverty significantly and Yunnan’s spectacular economic growth leading to little poverty reduction.
The webpage that we created provides many different forms of visualization for users to study the trends in these two provinces. For example, one visualization lets the viewer study the growth in each province separately over the years 1990-2007, the period covered by the data provided by Professor Donaldson. By clicking the play icon, the user can watch the net rural income change over the years in different parts of each province.
Another aspect of our data visualization is a breakdown of the rural income of the counties in each province over the years. For example, a user who wanted to know the net rural income of Zhongdian xian/Shangrila in a particular year could go to the horizon graph of that county and find out exactly how it performed that year. The horizon graph also shows the performance of the county as a whole from 1990 to 2007, so an observer can study its growth or decline and compare it with other counties with ease. We believe that in the future, this form of visualization could be used by governments to make crucial decisions about developing each county in relation to others.
Users have commented that they feel engaged with our series of tabs; they shared how it felt like we were attempting to tell a story about the two provinces, as opposed to just dumping data visualizations on them.
Many have commented that the spatial visualization of the counties in Yunnan and Guizhou was their favorite visualization of all, and that it helped them understand how uneven development was in the two provinces.
We encountered mixed results when loading our visualizations on different browsers. For example, Chrome had the fastest loading time, but was unable to load the visualizations for our time-series graphs. Firefox 3.6 had no difficulty in loading all of our visualizations, but it had the problem of being “bloated” – especially with the map visualization when played.
One of our professors also mentioned that although our visualisations are quite good and educational, which was the aim of the entire project, the project could be improved further if one of us were to take a higher-level module called Geospatial Analytics for Business Intelligence. It would provide a detailed discussion of the principles of geovisualization, which would come in useful when dealing with data on a country as large as China.
Although the learning curve was steep, the end result at the Townhall Presentation was worth it: many curious onlookers tried out the different visualizations and left our booth impressed by the fact that data can be manipulated in so many different ways.
Our work is scheduled to be used in conjunction with Professor Donaldson’s forthcoming book Small-Scale Development, Economic Growth and Poverty Reduction in China, with Cornell University Press. The website will be circulated in the community, and if other professors show sufficient interest, our project should be capable of importing data from other provinces in China in order to provide a more accurate picture of economic development in Chinese provinces.
Our data visualization is not complete because we deliberately limited the scope of our project by selecting only key economic data. Professor Donaldson collated many other useful economic data in his research, such as data on the mining industry, the transportation infrastructure and revenue from tourism. Transforming these datasets and enriching our existing visualizations will certainly provide a fuller perspective on the political economy of the two provinces.
In addition, users have told us that the horizon graphs would be more powerful if they could be sorted by some value. The question is which criterion the many counties should be sorted by: the latest year’s net rural income in ascending or descending order, area size, or even alphabetical order.
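A minimal sketch of such a sort, covering the three criteria mentioned above, is given below. It is a suggestion rather than an implemented feature: the record layout (`name`, `area`, and an `income` array ordered by year) and the names `SORTERS` and `sortCounties` are hypothetical.

```javascript
// Net rural income in the most recent year of a county's series.
function latestIncome(county) {
  return county.income[county.income.length - 1];
}

// One comparator per sort criterion (all ascending).
const SORTERS = {
  latestIncome: (a, b) => latestIncome(a) - latestIncome(b),
  area: (a, b) => a.area - b.area,
  name: (a, b) => a.name.localeCompare(b.name),
};

// Returns a sorted copy, so the original county order stays intact.
function sortCounties(counties, by, descending = false) {
  const cmp = SORTERS[by];
  if (!cmp) throw new Error("Unknown sort key: " + by);
  const ordered = descending ? (a, b) => cmp(b, a) : cmp;
  return counties.slice().sort(ordered);
}
```

Keeping the comparators in one table means that a further criterion, such as growth over the whole period, needs only one new entry.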
Another feature that could be added to our time-series graph is annotations of milestones. Existing information could enlighten the user, such as natural disasters that occurred in a specific year and affected certain industries in that county or prefecture.
Lastly, our wish is to have our system operate on an up-to-date basis. Information (from 2005 onwards) should be updated in the database and visualized. This will certainly extend the lifespan of our system.
Over the duration of the course, we learnt to use various software and to manipulate it to present data in an attractive yet easy-to-understand manner. This project allowed us to put those skills to good use by helping a university faculty member break down large amounts of data and tell a story with it using visual analytics software.
The authors wish to thank Professor John A. Donaldson for spending his precious time explaining economic concepts and sharing his valuable data. In addition, the authors wish to thank Professor Kam Tin Seong for his careful guidance throughout this project.
About Social Explorer – Social Explorer ® Free Edition. Website, 2010. http://www.socialexplorer.com/pub/aboutus/home.aspx.
Election Results 2008, President Map – The New York Times (Tuesday, December 09, 2008). Website, 2008. http://elections.nytimes.com/2008/results/president/map.html