This article describes an experimental design for one part of my research plan (on language acquisition), in which phonetic labels and visual patterns are to be associated by means of machine learning.
- Basic Ideas
- Goal: The system relates figure images with linguistic labels (phoneme strings).
- Labels represent the shapes or colors of figures.
- Labels and the shapes/colors of figures shall be learned in an unsupervised manner by presenting labels and figures together.
- The system should be able to associate labels with shapes/colors of figures after learning.
- Computer Vision: OpenCV/SimpleCV
- Machine Learning Tool: TBD
  - Clustering (unsupervised learning)
  - Ideally non-parametric with respect to the number of clusters.
  - Auto-complete-like function for learned patterns.
- Learning patterns
- Input patterns (in two modalities)
- Figure images: rectangles, circles, triangles of various sizes and colors
- Label pattern: 2D array of (phoneme position × phoneme type)
- Image processing: monochromatizing, edge detection, SIFT?
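
As an illustration of the label pattern described above (phoneme position × phoneme type), a label could be encoded as a one-hot grid. This is only a minimal sketch: the phoneme inventory `PHONEMES`, the maximum length `MAX_LEN`, and the function name `encode_label` are assumptions made for the example and are not part of the design.

```python
# Hypothetical phoneme inventory and maximum label length;
# neither is fixed by the design.
PHONEMES = ["a", "i", "u", "e", "o", "k", "s", "t", "m", "r"]
MAX_LEN = 6


def encode_label(phonemes, inventory=PHONEMES, max_len=MAX_LEN):
    """Encode a phoneme sequence as a max_len x len(inventory) one-hot grid."""
    if len(phonemes) > max_len:
        raise ValueError("label is longer than max_len")
    column = {p: i for i, p in enumerate(inventory)}
    grid = [[0] * len(inventory) for _ in range(max_len)]
    for position, phoneme in enumerate(phonemes):
        # Row = phoneme position, column = phoneme type.
        grid[position][column[phoneme]] = 1
    return grid


# Example: a four-phoneme label; unused positions stay all-zero.
pattern = encode_label(["m", "a", "r", "u"])
```

A fixed-size grid keeps every label the same shape, which makes it easy to present labels alongside image features to whichever learner is eventually chosen.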
Co-occurrence patterns of figures and labels, as well as figure shapes, colors, and the labels themselves, shall be clustered with the following configuration of learners:
[The following was modified on 2014-09-14.]
- Label learner: label candidates are selected from frequent phoneme N-grams.
- Shape learner: assign an unsupervised learner for figure shapes.
- Color learner: assign an unsupervised learner for figure colors.
Colors are learned with color label candidates as teacher signals.
- Cross-modal (label sense) learner: for a label (candidate) to have a sense, it must have a statistical correlation with a 'cluster' in the shape or color learner.
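
The label learner and the cross-modal learner could be sketched roughly as follows. The sketch rests on several assumptions that are not in the design: phonemes are written as single characters, cluster IDs such as `"shape0"` are hypothetical outputs of the shape learner, the phi coefficient stands in for the unspecified "statistical correlation", and the thresholds `min_count` and `threshold` are arbitrary.

```python
from collections import Counter
from math import sqrt


def frequent_ngrams(utterances, n_max=4, min_count=2):
    """Return phoneme N-grams (N = 1..n_max) found in at least min_count utterances."""
    counts = Counter()
    for phonemes in utterances:
        seen = set()
        for n in range(1, n_max + 1):
            for i in range(len(phonemes) - n + 1):
                seen.add(tuple(phonemes[i:i + n]))
        counts.update(seen)  # each N-gram is counted once per utterance
    return {gram for gram, c in counts.items() if c >= min_count}


def contains(phonemes, gram):
    """True if gram occurs as a contiguous subsequence of phonemes."""
    n = len(gram)
    return any(tuple(phonemes[i:i + n]) == gram
               for i in range(len(phonemes) - n + 1))


def phi(xs, ys):
    """Phi coefficient: Pearson correlation of two binary sequences."""
    n = len(xs)
    n11 = sum(1 for x, y in zip(xs, ys) if x and y)
    nx = sum(1 for x in xs if x)
    ny = sum(1 for y in ys if y)
    denom = sqrt(nx * (n - nx) * ny * (n - ny))
    return 0.0 if denom == 0 else (n * n11 - nx * ny) / denom


def label_senses(observations, candidates, threshold=0.8):
    """Map each candidate to the clusters it co-occurs with at phi >= threshold."""
    clusters = sorted({c for _, c in observations})
    senses = {}
    for gram in candidates:
        has_gram = [contains(p, gram) for p, _ in observations]
        for c in clusters:
            in_cluster = [cid == c for _, cid in observations]
            if phi(has_gram, in_cluster) >= threshold:
                senses.setdefault(gram, []).append(c)
    return senses


# Toy data: (phoneme sequence, hypothetical cluster ID from the shape learner).
observations = [
    (list("maru"), "shape0"),
    (list("akamaru"), "shape0"),
    (list("sankaku"), "shape1"),
    (list("akasankaku"), "shape1"),
]
candidates = frequent_ngrams([p for p, _ in observations])
senses = label_senses(observations, candidates)
```

In this toy data the N-gram `maru` co-occurs only with `shape0` and so acquires a sense, while `aka` is frequent but uncorrelated with either shape cluster and acquires none from the shape learner. Sub-strings of `maru` correlate equally well, so a real system would need some further criterion (for example, preferring the longest candidate).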
In any case, because learned category nodes do not hold enough information to reconstruct particular lower-level input patterns, some external mechanism should be added to visualize the associations.
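
One possible form of such an external mechanism, offered only as a sketch, is to keep a prototype (the running mean of member patterns) beside each cluster, so that a category can be shown by rendering its prototype. The example below uses simple leader-style online clustering, which also does not require the number of clusters in advance. This technique is swapped in for illustration, since the machine learning tool is still TBD; the class name `LeaderClusterer`, the `radius` parameter, and the RGB inputs are all assumptions.

```python
from math import dist  # requires Python 3.8+


class LeaderClusterer:
    """Online clustering that opens a new cluster when no prototype is close."""

    def __init__(self, radius):
        self.radius = radius   # maximum distance for joining an existing cluster
        self.prototypes = []   # running mean of each cluster's members
        self.counts = []       # number of members per cluster

    def learn(self, x):
        """Assign x to the nearest prototype within radius, else open a new cluster."""
        if self.prototypes:
            k = min(range(len(self.prototypes)),
                    key=lambda i: dist(x, self.prototypes[i]))
            if dist(x, self.prototypes[k]) <= self.radius:
                self.counts[k] += 1
                n = self.counts[k]
                # Incremental mean update of the prototype.
                self.prototypes[k] = [m + (v - m) / n
                                      for m, v in zip(self.prototypes[k], x)]
                return k
        self.prototypes.append(list(x))
        self.counts.append(1)
        return len(self.prototypes) - 1


# Toy colors as RGB triples in [0, 1]: two reddish, two bluish.
clusterer = LeaderClusterer(radius=0.3)
colors = [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 0.0, 1.0), (0.1, 0.0, 0.9)]
ids = [clusterer.learn(c) for c in colors]
```

Each entry of `clusterer.prototypes` can then be drawn as a representative color (or, with shape features, a representative figure) for the category it belongs to. The result of leader clustering depends on presentation order, which is one reason a more principled method may be preferable in the actual system.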