Cross validation
- Use the Random Data widget to create a data set with 100 data instances, 500 normally distributed random variables, and one Bernoulli variable (p = 0.5).
- Use Select Columns to designate the Bernoulli variable as the target.
- Use Test and Score to evaluate the performance of a classification tree on this data, using
- testing on training data
- cross validation
For the classification tree, set the pruning parameters to their defaults (2, 5, 100, 95), or even disable them.
Generate the random data a few times (by clicking the "Generate" button at the bottom) to get a general impression of the results in both cases.
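
The same experiment can be sketched outside the widgets. This is a rough Python equivalent that uses scikit-learn rather than Orange, so it is not the exercise's own workflow. The function name run_experiment, the seeds, the unpruned tree settings, and the choice of 10 folds are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


def run_experiment(seed, n_instances=100, n_features=500):
    """Return (training accuracy, 10-fold CV accuracy) on pure-noise data.

    Hypothetical helper: the name and defaults are not part of the exercise.
    """
    rng = np.random.default_rng(seed)
    # 500 normally distributed variables that carry no information
    X = rng.normal(size=(n_instances, n_features))
    # Bernoulli target with p = 0.5, independent of X
    y = rng.binomial(1, 0.5, size=n_instances)

    # Testing on training data: fit and score on the same instances.
    # Assumption: no pruning, i.e. scikit-learn's fully grown tree.
    tree = DecisionTreeClassifier(random_state=0)
    train_acc = tree.fit(X, y).score(X, y)

    # Cross validation: each fold is scored by a tree that never saw it.
    # Assumption: 10 folds.
    cv_acc = cross_val_score(
        DecisionTreeClassifier(random_state=0), X, y, cv=10
    ).mean()
    return train_acc, cv_acc


# Regenerate the data a few times, like clicking "Generate" repeatedly
results = [run_experiment(seed) for seed in range(5)]
for seed, (train_acc, cv_acc) in enumerate(results):
    print(f"seed {seed}: train = {train_acc:.2f}, 10-fold CV = {cv_acc:.2f}")
```

Because the target is independent of the variables, an unpruned tree can still memorize the training instances, so the training score should look perfect, while the cross-validated score should stay around chance level.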
Last modified: Monday, 7 March 2022, 4:27 PM