Overview and benchmark of traditional and deep learning models in text classification

Posted on Tue. 12 June 2018 in Deep Learning • Tagged with NLP, CNN, RNN, GRU, transfer learning, deep learning, keras, neural networks, Twitter, GloVe, Bag of words, word ngrams, character ngrams


This article is an extension of a previous one I wrote when I was experimenting with sentiment analysis on Twitter data. Back then, I explored a simple model: a two-layer feed-forward neural network trained with Keras. Each input tweet was represented as a document vector, computed as a weighted average of the embeddings of the words in the tweet. The embeddings came from a word2vec model I trained from scratch on the corpus using gensim. The task was binary classification, and with this setup I achieved 79% accuracy.
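As a rough illustration of that document-vector representation, here is a minimal pure-Python sketch. The function name `tweet_vector`, the toy embeddings, and the per-word weights are all hypothetical; the weights could for instance be TF-IDF scores, and in practice the embeddings would come from the trained word2vec model rather than a hand-written dictionary.

```python
def tweet_vector(tokens, embeddings, weights, dim):
    """Weighted average of the embeddings of the known words in a tweet.

    tokens     -- list of words in the tweet
    embeddings -- dict mapping word -> list of floats (hypothetical toy data)
    weights    -- dict mapping word -> float weight (assumed, e.g. TF-IDF)
    dim        -- embedding dimension
    """
    total = [0.0] * dim
    weight_sum = 0.0
    for token in tokens:
        if token not in embeddings:
            continue  # skip out-of-vocabulary words
        w = weights.get(token, 1.0)  # default weight of 1.0 is an assumption
        for i, value in enumerate(embeddings[token]):
            total[i] += w * value
        weight_sum += w
    if weight_sum == 0:
        return total  # no known word: return the zero vector
    return [x / weight_sum for x in total]


# Toy two-dimensional embeddings, for illustration only.
embeddings = {"good": [1.0, 0.0], "movie": [0.0, 1.0]}
weights = {"good": 3.0, "movie": 1.0}
vec = tweet_vector(["good", "movie", "unknown"], embeddings, weights, dim=2)
```

The resulting fixed-size vector is what a feed-forward classifier would take as input, regardless of how many words the tweet contains.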
The goal of this post is to explore other NLP models trained on the same dataset and then benchmark their performance on a given test set. We'll go through different models, from simple ones relying on a bag-of-words representation to heavier machinery using convolutional and recurrent networks. We'll see whether we can score more than 79% accuracy!
Let's investigate!



Understanding deep Convolutional Neural Networks with a practical use-case in Tensorflow and Keras

Posted on Mon. 13 November 2017 in Deep Learning • Tagged with Deep learning, Convolutional Neural Networks, image classification, Keras, Tensorflow, AWS, GPU, Python, Kaggle


Convolutional Neural Networks (CNNs) are now the go-to approach for analyzing image data. They are specialized neural network architectures that perform extremely well on image classification. They are widely used in the computer vision industry and ship in many products: self-driving cars, photo tagging systems, face detection security cameras, etc.
The theory behind convnets is beautiful. It attempts to explain and reverse-engineer the vision process. In this article, I'll go through it and explain what CNNs are all about. I'll try to go beyond the hype you see in the mass media and provide a detailed explanation with code snippets and interpretations.
This is also a hands-on guide to setting up a dedicated deep learning machine on AWS and developing an end-to-end CNN model from scratch using Keras and TensorFlow.
By the end of this post, you should have the big picture of CNNs: how they work and how to put them into practice.

