Here are some test records:
[22 , high , no , fair , yes]
[45 , high , no , excellent , yes]
[32 , low , yes , excellent , yes]
How would our decision tree classify these records?
Many-eyes is a data visualization tool which can be found at
How can we visualize and interpret the data that we obtained as the answer to the question in the previous part?
Now, let's go to the Many-eyes web site, load our data into the system, and visualize it.
To load your own data, you have to register with the system. Don't worry, it doesn't take much time.
After registering, log in to the system and click on "Create Visualization" in the left menu.
Then, select "upload your own dataset".
Fill in the form. You can use this text file as a sample input.
After filling in the form, click on upload. You will be directed to a page where you can select a chart type from a number of choices.
What kind of chart is suitable for your data? A bar chart, a pie chart?
Download this Excel file. It contains information about the average number of children per woman in many different countries for the years 1989 and 2009.
Before we use the data in any data mining or visualization procedure, we usually want to correct it, purge it, or even transform it into something new. For example, the dataset you downloaded has some missing values. One way to cope with them is the following: if only one number is missing (i.e. we have no statistic for that country for either 1989 or 2009), give it the value of the other year. If both are missing, do not include that country in the final dataset.
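The rule above can be sketched in a few lines of Python. This is only an illustration, not the required solution: the function name `fill_missing`, the row layout (country, 1989 value, 2009 value), the use of `None` for a missing value, and the sample rows are all assumptions made for the example and are not taken from the Excel file.

```python
from typing import List, Optional, Tuple

# Assumed row layout: (country, value for 1989, value for 2009).
# None marks a missing value; this encoding is an assumption for the sketch.
Row = Tuple[str, Optional[float], Optional[float]]


def fill_missing(rows: List[Row]) -> List[Tuple[str, float, float]]:
    """Apply the rule: copy the other year's value if one is missing,
    and drop the country if both are missing."""
    cleaned = []
    for country, v1989, v2009 in rows:
        if v1989 is None and v2009 is None:
            continue  # both years missing: leave the country out
        if v1989 is None:
            v1989 = v2009  # only 1989 missing: reuse the 2009 value
        elif v2009 is None:
            v2009 = v1989  # only 2009 missing: reuse the 1989 value
        cleaned.append((country, v1989, v2009))
    return cleaned


# Made-up sample rows, not real statistics.
sample = [
    ("A", 2.5, 2.1),
    ("B", None, 3.0),
    ("C", 4.2, None),
    ("D", None, None),
]
cleaned = fill_missing(sample)
```

With these made-up rows, country D is dropped, while B and C each end up with the same value for both years. Note that copying a value across a twenty-year gap hides any real change, so this is a simple fix rather than a careful estimate.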