Natural Language Processing with Deep Learning
- Sponsored by: Munich Re
- Scientific Lead: Oliver Mitevski, Mariia Stepanova
- Project Lead: Dr. Ricardo Acevedo Cabra
- Term: Summer semester 2018
Named Entity Recognition (NER) is a subtask of Information Extraction (IE) that seeks to locate named entities in text and classify them into predefined categories such as persons, organizations, events, and locations. Traditionally, it has been tackled by supervised learning on a number of hand-engineered linguistic features, which requires extensive linguistic expertise. Deep learning models have been applied successfully to NER; they obviate the need for hand-engineered features while still achieving state-of-the-art performance. Semantic word embeddings such as word2vec serve as the basis (namely, the first layer) of such deep learning systems. The approach is versatile and elegant in that it can tackle many different aspects of Natural Language Understanding, such as word sense disambiguation, entity linking, and co-reference resolution. The main focus of the project is to develop an NER system and optimize its performance on standard entities; the system can later be extended to non-standard and insurance-specific entities. The implementation language is Python, using Keras with the TensorFlow backend. Participants are also free to use other deep learning libraries such as PyTorch or MXNet.
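To make "locate and classify" concrete, the following minimal sketch shows how per-token labels from a tagger could be turned into entity spans. It assumes the common BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity), which the project description does not specify. The function name bio_to_spans, the label names ORG and LOC, and the example sentence are illustrative assumptions, not part of the project.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, entity label) pairs.

    Hypothetical helper for illustration; assumes tags look like
    "B-ORG", "I-ORG", or "O".
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label = None

    def flush() -> None:
        # Emit the entity collected so far, if there is one.
        if current_label is not None:
            spans.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts: close the previous one first.
            flush()
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O", or an "I-" tag that does not match the open entity
            # (treated as outside here): close any open entity.
            flush()
            current_tokens = []
            current_label = None

    flush()  # Close an entity that runs to the end of the sentence.
    return spans


# Assumed example: tags as a trained tagger might predict them.
tokens = ["Munich", "Re", "is", "based", "in", "Munich", "."]
tags = ["B-ORG", "I-ORG", "O", "O", "O", "B-LOC", "O"]
print(bio_to_spans(tokens, tags))
```

In a deep learning system of the kind described above, the tags would come from the model's per-token predictions; the grouping step itself is independent of the library used.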
Results: The results of this project were summarised in the presentation and explained in detail in the final documentation.