dm-hse
Clustering (part 2: linkages, distances)
Click the link Clustering (part 2: linkages, distances) to open the resource.