10 Common NLP Terms Explained for the Text Analysis Novice

If you’re relatively new to NLP and text analysis, you have more than likely come across some fairly technical terms and acronyms that are hard to get your head around, especially if you’re relying on scientific definitions for a plain and simple explanation.

We decided to put together a list of 10 common terms in Natural Language Processing, which we’ve broken down in layman’s terms to make them easier to understand. So if you don’t know your “Bag of Words” from your LDA, we’ve got you covered.

We chose these terms because we often find ourselves explaining them to users and customers on a day-to-day basis.

Natural Language Processing (NLP) – A field of computer science, connected to Artificial Intelligence and Computational Linguistics, that focuses on interactions between computers and human language and on a machine’s ability to understand, or mimic the understanding of, human language. Examples of NLP applications include Siri and Google Now.

Information Extraction – The process of automatically extracting structured information from unstructured and/or semi-structured sources, such as text documents or web pages.
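To make the idea of turning unstructured text into structured information concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sample sentence, the field names, and the `extract_fields` helper are all hypothetical, and real systems typically go well beyond a few regular expressions.

```python
import re

# Hypothetical unstructured input; the text and field names are illustrative only.
text = "Invoice 1042 was issued on 2015-03-14 to jane@example.com for $250.00."

def extract_fields(doc):
    """Pull a few structured fields out of free text using regular expressions."""
    date = re.search(r"\b\d{4}-\d{2}-\d{2}\b", doc)        # ISO-style date
    email = re.search(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b", doc)  # simple email shape
    amount = re.search(r"\$\d+(?:\.\d{2})?", doc)           # dollar amount
    return {
        "date": date.group(0) if date else None,
        "email": email.group(0) if email else None,
        "amount": amount.group(0) if amount else None,
    }

record = extract_fields(text)
print(record)
```

The result is a dictionary, that is, a structured record that could be stored in a database row, which is the essence of the "unstructured in, structured out" definition above.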

Named Entity Recognition (NER) – The process of locating and classifying elements in text into predefined categories such as the names of people, organizations, places, monetary values, percentages, etc.
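As a rough illustration of locating and classifying elements into predefined categories, the Python sketch below uses a tiny lookup table plus two patterns. This is a toy, rule-based stand-in rather than how production NER systems work (those usually rely on statistical models); the `GAZETTEER` entries, the category labels, and the `recognize_entities` helper are hypothetical names chosen for the example.

```python
import re

# Toy gazetteer: a hypothetical lookup table, not a real NER model.
GAZETTEER = {
    "Tim Cook": "PERSON",
    "Apple": "ORGANIZATION",
    "Dublin": "PLACE",
}

# Regular expressions for categories that follow a predictable shape.
PATTERNS = {
    "MONEY": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),
    "PERCENT": re.compile(r"\d+(?:\.\d+)?%"),
}

def recognize_entities(text):
    """Return (entity_text, category) pairs in order of appearance."""
    found = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            found.append((match.start(), match.group(0), label))
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(0), label))
    # Sort by position in the text, then drop the position.
    return [(entity, label) for _, entity, label in sorted(found)]

sentence = "Tim Cook said Apple will invest $1,000 in Dublin, up 5% on last year."
entities = recognize_entities(sentence)
for entity, label in entities:
    print(f"{entity} -> {label}")
```

Each entity is paired with one of the predefined categories, mirroring the people, organizations, places, monetary values, and percentages mentioned in the definition.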

Corpus (plural: Corpora) – A usually large collection of documents that can be used to infer and validate linguistic rules, as well as to do statistical analysis and hypothesis testing.
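A very small example of the kind of statistical analysis a corpus enables is counting how often each word occurs. The sketch below assumes a made-up three-document corpus and a deliberately simple `tokenize` helper (both hypothetical); real corpora are far larger and usually need more careful tokenization.

```python
import re
from collections import Counter

# A made-up, three-document corpus used purely for illustration.
corpus = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
    "A cat and a dog met.",
]

def tokenize(doc):
    """Lowercase a document and split it into simple word tokens."""
    return re.findall(r"[a-z']+", doc.lower())

# Term frequency: total occurrences of each word across the corpus.
term_freq = Counter(tok for doc in corpus for tok in tokenize(doc))

# Document frequency: number of documents each word appears in.
doc_freq = Counter(tok for doc in corpus for tok in set(tokenize(doc)))

print(term_freq.most_common(3))
print(doc_freq["cat"])
```

Counts like these are the raw material for many of the statistical checks that a corpus makes possible.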

Sentiment Analysis – The use of Natural Language Processing techniques to extract subjective information from a piece of text. i.e.

Enjoyed this summary? Read the complete article at the source:

Continue at datasciencecentral.com →

Yves Mulkers

Yves Mulkers is the founder of 7wData and a widely followed voice in the data and AI community. He curates the 7wData and AI Beat newsletters, reaching hundreds of thousands of data and AI professionals, and writes on data strategy, analytics, AI, and the evolving data ecosystem.