Most machine learning applications rely on supervised learning: learning from training samples that have been labeled by a human ‘teacher’. Unsupervised learning, by contrast, learns what it can from unlabeled training samples. What can be learned this way is the basic structure of the training data, and this information can be a useful aid to supervised learning.
In my latest iMerit blog I describe how the long-used technique of clustering has been incorporated into deep learning systems, to provide a useful starting point for supervised learning and to extrapolate what is learned from labeled training data.
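As a rough illustration of how clustering can extrapolate from a few labels, here is a minimal sketch in plain Python. It is not the method from the blog or from any particular deep learning system: the function names (`kmeans`, `propagate_labels`), the toy 2-D points, and the "cat"/"dog" labels are all hypothetical, and the simple k-means routine stands in for whatever clustering step a real system would use. The assumption is that points falling in the same cluster probably share a label, so one labeled sample per cluster can label the rest.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Very small k-means; returns one cluster index per point."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids)))
    for _ in range(iterations):
        assignments = [nearest(p, centroids) for p in points]
        new_centroids = []
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                # Move the centroid to the mean of its members.
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return [nearest(p, centroids) for p in points]


def propagate_labels(assignments, known_labels):
    """Give every point the majority label of the labeled points in its cluster."""
    votes = {}
    for index, label in known_labels.items():
        votes.setdefault(assignments[index], Counter())[label] += 1
    cluster_label = {c: counts.most_common(1)[0][0]
                     for c, counts in votes.items()}
    # Points in clusters with no labeled member get None.
    return [cluster_label.get(c) for c in assignments]


# Six unlabeled samples, of which only two were labeled by a human.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
known_labels = {0: "cat", 3: "dog"}

assignments = kmeans(points, k=2)
labels = propagate_labels(assignments, known_labels)
print(labels)
```

With these toy points, the two human-provided labels end up covering all six samples. Real data is rarely this cleanly separated, so propagated labels are better treated as a starting point for supervised training than as ground truth.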
Language is a hallmark of human intelligence, and Natural Language Processing (NLP) has long been a goal of Artificial Intelligence. The ability of early computers to process rules and look up definitions made machine translation seem right around the corner. However, language proved to be more complicated than rules and definitions.
The observation that humans use practical knowledge of the world to interpret language set off a quest to create vast databases of human knowledge to apply to NLP. But it wasn’t until deep learning became available that human-level NLP was achieved, using an approach quite unlike human language understanding.
In my latest iMerit blog I trace the path that led to modern NLP systems, which leave meaning to humans and let machines do what they are good at – finding patterns in data.
iMerit is a remarkable company of more than 4,000 people that specializes in annotating the data needed to train machine learning systems.
I am writing a series of blogs for them on various aspects of machine learning. In my latest blog I explain how ML systems embody both human intelligence and a form of machine ‘intelligence’.
Just as our biology provides the basis for human learning, human-provided ML system designs provide frameworks that enable machine learning. Through human engineering, these designs bring ML systems to the point where everything they need to ‘know’ about the world can be reflected in their parameters.
Analogous to the role of our parents and teachers, training data annotation drives the learning process toward competent action. Annotation is the crucial link between the ML system and its operational world, and accurate and complete annotation is the only way an ML system can learn to perform well.