How to mine newsfeed data and extract interactive insights in Python

Posted on Wed 15 March 2017 in NLP • Tagged with Data science, Python, tf-idf, LDA, Kmeans, NLP, Topic mining, Text Clustering, Bokeh

In this tutorial we'll dive into topic mining. We'll analyze a newsfeed dataset extracted from more than 60 sources, and show how to process it, analyze it and extract visual clusters from it. We'll be using great Python tools for interactive visualization, topic mining and text analytics.
All the code is available to you to run and test. No bullshit.
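
To give a feel for the tf-idf weighting mentioned in the tags, here is a minimal sketch in plain Python. It uses the textbook formula (term frequency times log of inverse document frequency), which is not necessarily the exact variant used in the full post, and the three headlines are made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and keep purely alphabetic tokens (a deliberately naive tokenizer).
    return [word for word in text.lower().split() if word.isalpha()]


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical headlines, not taken from the real dataset.
headlines = [
    "stocks rally as markets recover",
    "markets fall as oil prices drop",
    "team wins final after dramatic penalty shootout",
]
weights = tfidf(headlines)
```

Terms shared by several headlines (such as "markets") get a lower weight than terms unique to one headline (such as "stocks"), which is what makes the weighted vectors useful as input to clustering.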


How to score 0.8134 in Titanic Kaggle Challenge

Posted on Wed 10 August 2016 in Kaggle • Tagged with Kaggle, Titanic, Data science, Python, Solution, Tutorial

The Titanic challenge on Kaggle is a competition in which the task is to predict whether a given passenger survived or died, based on a set of variables describing them such as age, sex, or passenger class on the ship.
I have been playing with the Titanic dataset for a while, and I have recently achieved an accuracy score of 0.8134 on the public leaderboard.
As I'm writing this post, I am ranked among the top 9% of all Kagglers: more than 4,540 teams are currently competing.
This post is the opportunity to share my solution with you.
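
For readers new to the metric, the accuracy score quoted above is simply the fraction of passengers whose outcome is predicted correctly. A minimal sketch, with made-up labels that have nothing to do with the real leaderboard data:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for illustration only (1 = survived, 0 = died).
y_true = [1, 0, 0, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]
score = accuracy(y_true, y_pred)
```

Here 6 of the 8 toy predictions match, so `score` is 0.75; a score of 0.8134 means roughly 81 correct predictions out of every 100 passengers scored.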
