Recognizing mushrooms - again
Download another data set about poisonous and edible mushrooms. The data set is made up: it consists of two numeric variables and the class. To be able to point to individual mushrooms, I assigned them names, and for fun (and to prevent you from recognizing the mushrooms you need to classify), they are in Slovenian. Some have nice names, though.
Observe the data in the Scatter plot. Color by class, label by name. Uncheck "Label only selection and subset" and "Jitter numeric value" (uncheck!!!).
There are five mushrooms with unknown class (with Cyrillic transcription for easier pronunciation): vražji goban (вражйи гобaн), koprenka (копренка), lisička (лисичка), ježek (ежек), jurček (юрчек). Which of them are, in your opinion, edible? Which would you avoid the most, and which one seems the safest? Don't use trees, just the Scatter plot and common sense.
Could you formulate the rule behind your decision? You can do it in words or mathematically or graphically, whichever is the easiest for the type of rule you invent.
I assume that your rule makes a hard decision: either edible or poisonous. Could you modify it so that it also estimates the probability?
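If it helps to see the difference between a hard decision and a probability, here is a minimal sketch of one possible rule, not the expected answer: look at the few nearest mushrooms in the plot and count their classes. The coordinates in `known`, the function names, and the choice of `k` are all made up for illustration; they are not taken from the actual data set.

```python
import math

# Hypothetical (x, y, class) points; NOT the real mushroom data.
known = [
    (1.0, 1.0, "edible"),
    (1.5, 2.0, "edible"),
    (2.0, 1.0, "edible"),
    (5.0, 5.0, "poisonous"),
    (6.0, 5.5, "poisonous"),
    (5.5, 4.0, "poisonous"),
]

def nearest(point, k):
    """Return the classes of the k known mushrooms closest to `point`."""
    ranked = sorted(known, key=lambda m: math.dist(point, m[:2]))
    return [cls for _, _, cls in ranked[:k]]

def hard_rule(point, k=3):
    """Hard decision: the majority class among the k nearest neighbours."""
    votes = nearest(point, k)
    return max(set(votes), key=votes.count)

def probability_edible(point, k=3):
    """Soft decision: the share of edible mushrooms among the k nearest."""
    votes = nearest(point, k)
    return votes.count("edible") / k

print(hard_rule((3.0, 2.5)))                # majority vote among 3 neighbours
print(probability_edible((3.0, 2.5), k=5))  # share of edible among 5 neighbours
```

The hard rule and the probability use the same neighbours; the only change is reporting the share of votes instead of the winner. Other rules (a straight line, a circle around a cluster) can be softened in a similar way, for example by how far a mushroom lies from the boundary.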
Use your imagination. There are no wrong answers (though some are unlikely to be correct :)