UIMA
Developer(s) | IBM, Apache Software Foundation (since October 2006) |
---|---|
Stable release | |
Written in | Java with C++ Enablement |
Operating system | Cross-platform |
Type | Text mining, information extraction |
License | Apache License 2.0 |
Website | http://uima.apache.org/ |
UIMA (pronounced "u e ma"[2]) stands for Unstructured Information Management Architecture. It has been an OASIS standard[3] since March 2009 and is, to date, the only industry standard for content analytics. Other general frameworks used for natural language processing include the General Architecture for Text Engineering (GATE) and the Natural Language Toolkit (NLTK).[4]
UIMA, originally developed by IBM, is a component software architecture for the development, discovery, composition, and deployment of multi-modal analytics that analyze unstructured information, and for integrating those analytics with search technologies. The source code for a reference implementation of the framework was first made available on SourceForge and later on the website of the Apache Software Foundation.
One potential use of UIMA is a logistics analysis system that converts unstructured data, such as repair logs and service notes, into relational tables. Automated tools can then use these tables to detect maintenance or manufacturing problems.
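The logistics scenario can be illustrated with a minimal sketch. It does not use the UIMA API; a regular expression stands in for the analytics, and the note format, the `repairs` table, and the `extract_row` helper are all assumptions invented for this illustration.

```python
import re
import sqlite3

# Hypothetical free-text repair notes; the format is invented for illustration.
NOTES = [
    "2021-03-04 unit A17: replaced hydraulic pump after leak",
    "2021-03-09 unit B02: tightened belt, no parts used",
    "2021-03-11 unit A17: replaced hydraulic pump again",
]

# Assumed pattern: ISO date, a unit identifier, then the free-text description.
NOTE_PATTERN = re.compile(r"(\d{4}-\d{2}-\d{2}) unit (\w+): (.+)")


def extract_row(note):
    """Return (date, unit, action) for a note, or None if it does not match."""
    m = NOTE_PATTERN.match(note)
    if m is None:
        return None
    date, unit, description = m.groups()
    action = "replace" if "replaced" in description.lower() else "other"
    return date, unit, action


# Load the extracted rows into a relational table (in-memory SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repairs (date TEXT, unit TEXT, action TEXT)")
rows = [r for r in map(extract_row, NOTES) if r is not None]
conn.executemany("INSERT INTO repairs VALUES (?, ?, ?)", rows)

# Automated check: a unit with repeated replacements may indicate a recurring fault.
repeat_units = conn.execute(
    "SELECT unit, COUNT(*) FROM repairs WHERE action = 'replace' "
    "GROUP BY unit HAVING COUNT(*) > 1"
).fetchall()
```

Here `repeat_units` holds the units that were flagged. A real system would replace the regular expression with a pipeline of analysis components, but the step from free text to queryable rows would be the same.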
UIMA is also used in medical systems that analyze clinical notes, such as the Clinical Text Analysis and Knowledge Extraction System (Apache cTAKES).
Structure of UIMA
The UIMA architecture can be thought of in four dimensions:
- It specifies component interfaces in an analytics pipeline.
- It describes a set of design patterns.
- It suggests two data representations: an in-memory representation of annotations for high-performance analytics and an XML representation of annotations for integration with remote web services.
- It suggests development roles, so that tools can be used by people with diverse skills.
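The first and third points can be made concrete with a small sketch: a common component interface that every pipeline stage implements, an in-memory store of annotations, and an XML rendering of the same annotations. This is not the UIMA API or its XML format. The `Annotation`, `Document`, and `Annotator` types, the two example annotators, and the XML element names are hypothetical and chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class Annotation:
    """A typed span over the document text (hypothetical, not a UIMA class)."""
    type: str
    begin: int
    end: int


@dataclass
class Document:
    """In-memory representation: the text plus the annotations added so far."""
    text: str
    annotations: List[Annotation] = field(default_factory=list)

    def covered_text(self, a: Annotation) -> str:
        return self.text[a.begin:a.end]


class Annotator(Protocol):
    """Component interface: every pipeline stage implements process()."""
    def process(self, doc: Document) -> None: ...


class TokenAnnotator:
    """Adds one Token annotation per run of word characters."""
    def process(self, doc: Document) -> None:
        for m in re.finditer(r"\w+", doc.text):
            doc.annotations.append(Annotation("Token", m.start(), m.end()))


class NumberAnnotator:
    """Adds a Number annotation for each all-digit Token from an earlier stage."""
    def process(self, doc: Document) -> None:
        tokens = [a for a in doc.annotations if a.type == "Token"]
        for t in tokens:
            if doc.covered_text(t).isdigit():
                doc.annotations.append(Annotation("Number", t.begin, t.end))


def run_pipeline(doc: Document, annotators: List[Annotator]) -> Document:
    """Run each component in order over the shared in-memory document."""
    for annotator in annotators:
        annotator.process(doc)
    return doc


def to_xml(doc: Document) -> str:
    """Serialize text and annotations to XML, e.g. for a remote service."""
    root = ET.Element("document")
    ET.SubElement(root, "text").text = doc.text
    for a in doc.annotations:
        ET.SubElement(root, "annotation",
                      type=a.type, begin=str(a.begin), end=str(a.end))
    return ET.tostring(root, encoding="unicode")


doc = run_pipeline(Document("Replaced pump 42 on line 7"),
                   [TokenAnnotator(), NumberAnnotator()])
numbers = [doc.covered_text(a) for a in doc.annotations if a.type == "Number"]
xml_text = to_xml(doc)
```

Because every stage shares one interface and one in-memory document, a later component such as `NumberAnnotator` can build on the annotations of an earlier one without any conversion. The XML form is only produced at the boundary where the result leaves the process.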
Watson
IBM Research's Watson uses UIMA for analyzing unstructured data.[5]
See also
- Data Discovery and Query Builder
- Entity extraction
- General Architecture for Text Engineering (GATE)
- IBM Omnifind
- Languageware
- List of natural language processing toolkits
- Natural Language Toolkit (NLTK)
- OpenNLP
- Darmstadt Knowledge Processing Software Repository (DKPro)
References
- http://uima.apache.org/news.html#14%20January%202014
- UIMA Frequently Asked Questions (FAQ's). The Apache Software Foundation.
- UIMA Specification. The Apache Software Foundation.
- Natural Language Processing (NLP) Survey of Tools & Resources.
- https://blogs.apache.org/foundation/entry/apache_innovation_bolsters_ibm_s
External links
- UIMA Homepage at the Apache Software Foundation
- OASIS Unstructured Information Management Architecture (UIMA) TC