NLP-FOR-HACKERS on Feedspot

Building Chatbots – Introduction

NLP-FOR-HACKERS

by bogdani

4y ago

Chatbots are a hot topic these days. In this tutorial, we’re diving into the world of chatbots and how they are built. What are chatbots? Chatbots are systems that can have a fairly complex conversation with humans. They also go by other names: Conversational Agents or Dialog Systems. As you’ve probably guessed, chatbots use a lot of Natural Language Processing techniques in order to understand the human’s requests. The Holy Grail of chatbot builders is to pass the Turing Test. This means that a human can’t tell whether they are talking to a machine or to an actual human. Although we are pretty ...

Build a POS tagger with an LSTM using Keras

NLP-FOR-HACKERS

by bogdani

4y ago

In this tutorial, we’re going to implement a POS Tagger with Keras. On this blog, we’ve already covered the theory behind POS taggers: POS Tagger with Decision Trees and POS Tagger with Conditional Random Field. Recently we also started looking at Deep Learning, using Keras, a popular Python library. You can get started with Keras in this Sentiment Analysis with Keras Tutorial. This tutorial will combine the two subjects: we’ll be building a POS tagger using Keras and a Bidirectional LSTM layer. Let’s use a corpus that’s included in NLTK: import nltk; tagged_sentences = nltk.corpus.treebank.t ...

Getting started with Keras for NLP

NLP-FOR-HACKERS

by bogdani

4y ago

In the previous tutorial on Deep Learning, we built a super simple network with NumPy. I figured that the best next step is to jump right in and build some deep learning models for text. The best way to do this at the time of writing is by using Keras. What is Keras? Keras is a deep learning framework that uses other deep learning frameworks under the hood in order to expose a beautiful, simple-to-use and fun-to-work-with high-level API. Keras can use any of these backends: TensorFlow – Google’s deep learning library; Theano – may not be further developed; CNTK – Microsoft’s dee ...

Roundup of Python NLP Libraries

NLP-FOR-HACKERS

by bogdani

4y ago

The purpose of this post is to gather the most important libraries in the Python NLP ecosystem into a single list. This list is important because Python is by far the most popular language for doing Natural Language Processing. The list is constantly updated as new libraries come into existence. In case you are looking for a list of useful corpora, check out this NLP corpora list. General Purpose (columns: Name, Functionalities, Notes, URL). NLTK: tokenization, POS, NER, classification, sentiment analysis, access to corpora. Maybe the best-known Python NLP library. Not entirely suited for production ...

Introduction to Deep Learning – Sentiment Analysis

NLP-FOR-HACKERS

by bogdani

4y ago

Deep Learning is one of those hyper-hyped subjects that everybody is talking about and everybody claims they’re doing. In certain cases, startups just need to mention they use Deep Learning and they instantly get appreciation. Deep Learning is indeed a powerful technology, but it’s not an answer to every problem. It’s also not magic, although many people make it look that way. In this post, we’ll give a gentle introduction to the subject. You’ll learn what a Neural Network is, how to train it and how to represent text features (in 2 ways). For this purpose, we’ll be using the IMDB dataset. It conta ...

Complete Guide to Word Embeddings

NLP-FOR-HACKERS

by bogdani

4y ago

Introduction: We talked briefly about word embeddings (also known as word vectors) in the spaCy tutorial. spaCy has word vectors included in its models. This tutorial will go deep into the intricacies of how to compute them and their different applications. Bag Of Words Model: In most of our tutorials so far, we’ve been using a Bag-Of-Words model. Take for example this article: Text Classification Recipe. Using the BOW model we just keep counts of the words from the vocabulary. We don’t know anything about the words’ semantics. Another drawback of the BOW model is that we work with very sparse ...
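
The Bag-Of-Words counting described in this excerpt can be illustrated with a minimal sketch. It is not taken from the original post: the helper name `bag_of_words`, the toy documents, and the whitespace tokenization are all assumptions chosen for illustration.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Return one count per vocabulary word (hypothetical helper)."""
    # Naive whitespace tokenization; a real pipeline would use a tokenizer.
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


# Toy documents, invented for this example.
docs = ["the cat sat on the mat", "the dog chased the cat"]

# The vocabulary is the sorted set of unique words across all documents.
vocabulary = sorted({word for doc in docs for word in doc.lower().split()})

# One count vector per document, aligned with the vocabulary order.
vectors = [bag_of_words(doc, vocabulary) for doc in docs]

# Fraction of zero entries: even in this tiny example some counts are zero,
# and the share grows quickly as the vocabulary gets larger.
zeros = sum(value == 0 for vector in vectors for value in vector)
sparsity = zeros / (len(vectors) * len(vocabulary))
```

Each vector records only how often a word occurs, so word order and meaning are lost, which is the limitation the excerpt points out.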

Quick Recipe: Build a POS tagger using a Conditional Random Field

NLP-FOR-HACKERS

by bogdani

4y ago

A while back I wrote a Complete guide for training your own Part-Of-Speech Tagger. If you are new to Part-Of-Speech Tagging (POS Tagging) make sure you follow that tutorial first. This article is more of an enhancement of the work done there. What is a CRF? A Conditional Random Field (CRF for short) is a discriminative sequence labelling model. It’s a fairly easy model to explain (compared to Hidden Markov Models). Basically, given: some feature extractors (feature extractors need to output real numbers); weights associated with the features (which are learned); previous labels; predict the curre ...

Complete Guide to spaCy

NLP-FOR-HACKERS

by bogdani

4y ago

Updates: 29-Apr-2018 – Fixed import in extension code (Thanks Ruben). spaCy is a relatively new framework in the Python Natural Language Processing ecosystem, but it is quickly gaining ground and will most likely become the de facto library. There are some really good reasons for its popularity. It's really FAST: written in Cython, it was specifically designed to be as fast as possible. It's really ACCURATE: spaCy's implementation of its dependency parser is one of the best-performing in the world (see It Depends: Dependency Parser Comparison Using A Web-based Evaluation Tool). Batteries included: Inde ...

Complete Guide to Topic Modeling

NLP-FOR-HACKERS

by bogdani

4y ago

What is Topic Modeling? Topic modeling, in the context of Natural Language Processing, is described as a method of uncovering hidden structure in a collection of texts. Although that is indeed true, it is also a pretty useless definition. Let’s define topic modeling in more practical terms. Definitions: C: collection of documents containing N texts. V: vocabulary (the set of unique words in the collection). Dimensionality Reduction: Topic modeling is a form of dimensionality reduction. Rather than representing a text T in its feature space as {Word_i: count(Word_i, T) for Word_i in V}, we can ...

Quick Recipe: Building Word Clouds

NLP-FOR-HACKERS

by bogdani

4y ago

What are Word Clouds? Word Clouds are a popular way of displaying how important words are in a collection of texts. Basically, the more frequent the word is, the more space it occupies in the image. One of the uses of Word Clouds is to help us get an intuition about what the collection of texts is about. Here are some classic examples of when Word Clouds can be useful: taking a quick peek at the word distribution of a collection of texts; cleaning the texts and wanting to see which frequent stopwords you want to filter out; seeing the differences in frequent words between two or more colle ...
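
The idea that more frequent words take up more space can be sketched without any plotting library. The snippet below is only an illustration under stated assumptions: the function name `word_sizes`, the font-size range, and the linear scaling rule are invented here and are not taken from the original post or from any word cloud package.

```python
from collections import Counter


def word_sizes(texts, min_size=10, max_size=60):
    """Map each word to a font size that grows with its frequency.

    Hypothetical helper: sizes scale linearly with the word's count
    relative to the most frequent word in the collection.
    """
    # Naive whitespace tokenization over the whole collection.
    counts = Counter(word for text in texts for word in text.lower().split())
    if not counts:
        return {}
    top = max(counts.values())
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


# Toy collection, invented for this example.
texts = ["nlp is fun", "nlp is hard", "nlp"]
sizes = word_sizes(texts)
```

Here "nlp" appears most often and so gets the largest size, while words that occur only once get the smallest among the words present. An actual word cloud renderer would also handle layout and stopword filtering.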
