Learning
What is learning?
Supervised Learning
• Training data that has been classified
• Examples:
– Concept learning
– Decision trees
– Markov models
– Nearest neighbor
– Neural nets (in coming weeks)
• Inductive bias: limits imposed by assumptions!
– Especially what factors we choose as inputs
Rote Learning
• Store the training data
• Limitation: does not extend beyond what has been seen
• Example: concept learning
Concept Learning
• Inductive learning with generalization
• Given training data:
– tuples <a1, a2, a3, …>
– a Boolean value (the label)
– ai can be any value
– ? is used for a don't-care positive
– null is used for a don't-care negative
• A hypothesis is a tuple that is true
• hg = <?, ?, …>: most general, always true
• hs = <null, null, …>: most specific, always false
• hg ≥ hs
• The hypotheses form a partially ordered lattice
Training Method
• Use the lattice to generate the most general hypothesis consistent with the training data
• Weaknesses:
– Inconsistent data
– Data errors
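The lattice search can be run from either end of the ordering. Here is a minimal sketch of the specific-to-general direction (in the style of Find-S; the attribute values are made up, and None stands in for the null value):

    # Start from the most specific hypothesis h_s and generalize it just
    # enough to cover every positive example. '?' is don't-care positive.
    def find_s(examples):
        n = len(examples[0][0])
        h = [None] * n                       # h_s: the all-null hypothesis
        for x, positive in examples:
            if not positive:                 # negatives are not used here
                continue
            for i, value in enumerate(x):
                if h[i] is None:             # first positive: copy its value
                    h[i] = value
                elif h[i] != value:          # conflict: generalize to '?'
                    h[i] = '?'
        return tuple(h)

    data = [(('sunny', 'warm', 'normal'), True),
            (('sunny', 'warm', 'high'), True),
            (('rainy', 'cold', 'high'), False)]
    print(find_s(data))                      # ('sunny', 'warm', '?')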
Decision Trees
• ID3 algorithm
• Entropy: a measure of information
– -p(I) log2 p(I) is the entropy contribution of one element
• Entropy of the system of information:
– Entropy(S) = Σ -p(I) log2 p(I)
– p(I) = (instances of I) / (total instances)
– Computed over the outputs of the tree
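As a quick check of the formula, a sketch in Python (the 9-positive/5-negative label list is illustrative):

    import math
    from collections import Counter

    def entropy(labels):
        # Entropy(S) = sum over outputs I of -p(I) * log2(p(I))
        total = len(labels)
        return -sum((c / total) * math.log2(c / total)
                    for c in Counter(labels).values())

    print(entropy(['yes'] * 9 + ['no'] * 5))   # ~0.940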
• Gain is a measure of the effectiveness of an attribute
• Gain(S, A) = Entropy(S) - Σ_v (|Sv| / |S|) * Entropy(Sv)
– Sv: the subset of outputs where attribute A has value v
– |S|: the total number of elements in the outputs
ID3
• A greedy algorithm
• Select the attribute with the largest gain
• Iterate until done
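A hedged sketch of one greedy step, reusing the entropy() helper from the sketch above; representing examples as (attribute-dict, label) pairs is an assumption made for illustration:

    def gain(rows, attr):
        # Gain(S, A) = Entropy(S) - sum over v of (|Sv| / |S|) * Entropy(Sv)
        g = entropy([label for _, label in rows])
        for v in {x[attr] for x, _ in rows}:
            sv = [label for x, label in rows if x[attr] == v]
            g -= (len(sv) / len(rows)) * entropy(sv)
        return g

    def best_attribute(rows, attrs):
        # The greedy ID3 step: pick the attribute with the largest gain.
        return max(attrs, key=lambda a: gain(rows, a))

    rows = [({'outlook': 'sunny', 'windy': False}, 'no'),
            ({'outlook': 'sunny', 'windy': True}, 'no'),
            ({'outlook': 'rainy', 'windy': False}, 'yes'),
            ({'outlook': 'rainy', 'windy': True}, 'yes')]
    print(best_attribute(rows, ['outlook', 'windy']))   # 'outlook'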
Markov Models
• A Markov chain is a set of states
• State transitions are probabilistic
• State xi goes to state xj with probability P(xj | xi)
• This can be extended to let the probability depend on a set of past states (memory)
Example from the Text
• Given a set of words, build a Markov chain that generates similar words
• For each letter position in the words, compute the transition probability
• Use a matrix of counts: Count[from][to]
• Normalize each row by its total count
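A minimal sketch of the count-and-normalize idea; this variant conditions only on the previous letter rather than on letter position, and the word list is made up:

    import random
    from collections import defaultdict

    words = ['banana', 'bandana', 'cabana']
    counts = defaultdict(lambda: defaultdict(int))   # Count[from][to]
    for w in words:
        for a, b in zip('^' + w, w + '$'):           # '^' start, '$' end marker
            counts[a][b] += 1

    def next_letter(c):
        # Dividing a row by its total count gives P(to | from); sampling
        # r in [0, total) is equivalent to sampling from that distribution.
        row = counts[c]
        r = random.random() * sum(row.values())
        for letter, n in row.items():
            r -= n
            if r < 0:
                return letter

    def generate():
        word, c = '', '^'
        while True:
            c = next_letter(c)
            if c == '$':
                return word
            word += c

    print(generate())   # e.g. 'bana' or 'cabandana'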
Nearest Neighbor
• 1-NN:
– Represent entities as vectors
– Use a distance measure between vectors to locate the closest known entity
– Can be affected by noisy data
k-NN: better
• Use the k closest neighbors and vote
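A minimal k-NN sketch, assuming numeric feature vectors and Euclidean distance (the data points are illustrative):

    import math
    from collections import Counter

    def knn_classify(query, data, k=3):
        # data: list of (vector, label) pairs; vote among the k closest.
        neighbors = sorted(data, key=lambda vl: math.dist(vl[0], query))[:k]
        return Counter(label for _, label in neighbors).most_common(1)[0][0]

    data = [((0, 0), 'a'), ((0, 1), 'a'), ((5, 5), 'b'), ((6, 5), 'b')]
    print(knn_classify((1, 1), data))   # 'a': two of the 3 nearest are 'a'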
Other Techniques: Yet to Cover!
• Evolutionary algorithms
• Neural nets