Machine Learning with Coffee on Feedspot

Machine Learning with Coffee is a podcast where we share ideas about Machine Learning and related areas such as artificial intelligence, business intelligence, business analytics, data mining, and Big Data. The objective is to promote a healthy discussion on the current state of this fascinating world of Machine Learning. We will be sharing our experience, sharing tricks, ...

06 How to Become a Data Scientist

by Gustavo Lujan

2y ago

We talk about what it takes to become a Data Scientist. We also discuss 4 prerequisites to meet before preparing yourself to become a Data Scientist. Finally, we recommend 3 online courses that, if mastered, will put you above 90% of all Data Scientists out there.

14 XGBoost: The Winner of Many Competitions

by Gustavo Lujan

2y ago

XGBoost is an open-source software library which has won several Machine Learning competitions on Kaggle. It is based on the principles of gradient boosting, which builds on the ideas of Leo Breiman, the creator of Random Forest. The theory behind gradient boosting was later formalized by Jerome H. Friedman. Like Random Forest, gradient boosting combines weak learners. XGBoost is an engineering implementation which includes a clever penalization of trees and a proportional shrinking of leaf nodes.
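
To make the idea of combining weak learners concrete, here is a minimal sketch of plain gradient boosting for squared error, using one-split "stumps" as the weak learners. It is not XGBoost itself: the tree penalization mentioned above is left out, and the function names, the toy data, and the `learning_rate` value are assumptions chosen for illustration.

```python
def fit_stump(xs, residuals):
    """Find the single threshold on a 1-D feature that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def boost(xs, ys, rounds=50, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds a shrunken copy of it."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, predictions


def squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions))


# Hypothetical toy data with three rough "steps".
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]

base, predictions = boost(xs, ys)
baseline_error = squared_error(ys, [base] * len(ys))
final_error = squared_error(ys, predictions)
print(f"error with mean only: {baseline_error:.3f}, after boosting: {final_error:.3f}")
```

No single stump can follow the three steps in the data, but the sum of many shrunken stumps can, which is the sense in which boosting turns weak learners into a strong one.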

13 Random Forest

by Gustavo Lujan

2y ago

Random Forest is one of the best off-the-shelf algorithms. In this episode we try to understand the intuition behind Random Forest and how it leverages the capabilities of Decision Trees by aggregating them using a very smart trick called “bagging”. Variable importance and out-of-bag error are two of the nice capabilities of Random Forest, which allow us to find the most important predictors and to estimate the generalization error, respectively.
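
The bootstrap step behind bagging can be sketched in a few lines. This is only the sampling part, not a full Random Forest: each tree would be trained on the rows drawn with replacement, and the rows that were never drawn form its out-of-bag set, which can be used to score that tree. The function name and the seed are assumptions made for this example.

```python
import random


def bootstrap_split(n_rows, rng):
    """Draw n_rows indices with replacement; rows never drawn are out-of-bag."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_rows) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows = 1000
in_bag, out_of_bag = bootstrap_split(n_rows, rng)
oob_fraction = len(out_of_bag) / n_rows
print(f"out-of-bag fraction: {oob_fraction:.3f}")
```

For large samples the out-of-bag fraction is expected to be close to 1/e, roughly 37%, because each row is missed by a single draw with probability 1 - 1/n and there are n draws.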

18 PCA: Principal Component Analysis

by Gustavo Lujan

2y ago

We discuss Principal Component Analysis as one of the most popular techniques to reduce the dimensionality of a dataset. PCA helps us be more efficient in terms of the number of variables we feed to our machine learning models.
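
As a rough illustration of dimensionality reduction, the sketch below finds the first principal component of a small dataset by power iteration on its covariance matrix and reports the share of variance that component explains. The function name, the iteration count, and the toy points are assumptions; a real project would normally rely on a numerical library instead.

```python
import math


def first_principal_component(points, iterations=100):
    """Return the top eigenvector of the covariance matrix and its share of the variance."""
    n = len(points)
    dims = len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    centered = [[p[d] - means[d] for d in range(dims)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    vector = [1.0] * dims
    for _ in range(iterations):
        product = [sum(cov[i][j] * vector[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(v * v for v in product))
        vector = [v / norm for v in product]
    # Rayleigh quotient gives the variance along the component.
    eigenvalue = sum(
        vector[i] * sum(cov[i][j] * vector[j] for j in range(dims)) for i in range(dims)
    )
    total_variance = sum(cov[d][d] for d in range(dims))
    return vector, eigenvalue / total_variance


# Hypothetical data: the second variable is roughly twice the first.
points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
component, explained_ratio = first_principal_component(points)
print(f"direction: {component}, variance explained: {explained_ratio:.4f}")
```

When two variables move together this closely, one component carries almost all of the variance, so a model could be fed that single derived variable instead of both originals.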

17 Anomaly Detection: Clustering

by Gustavo Lujan

2y ago

We present 3 clustering algorithms which will help us detect anomalies: DBSCAN, Gaussian Mixture Models and K-means. These 3 algorithms are very popular and basic but have stood the test of time. All of them have many variations which try to overcome some of the disadvantages of the original implementation.
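
One common way to use K-means for anomaly detection is to score each point by its distance to the nearest centroid. The sketch below does this on 1-D data with a plain Lloyd's iteration; the function names, starting centroids, and readings are assumptions for illustration, and it is not necessarily the approach taken in the episode.

```python
def kmeans_1d(values, centroids, iterations=20):
    """Plain Lloyd's algorithm on 1-D data, starting from the given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(cluster) / len(cluster) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids


def anomaly_scores(values, centroids):
    """Score each value by its distance to the closest centroid."""
    return [min(abs(value - c) for c in centroids) for value in values]


# Hypothetical sensor readings: two tight groups and one stray value.
readings = [1.0, 1.1, 0.9, 1.2, 5.0, 5.1, 4.9, 5.2, 9.0]
centroids = kmeans_1d(readings, centroids=[0.0, 6.0])
scores = anomaly_scores(readings, centroids)
suspect = readings[scores.index(max(scores))]
print(f"centroids: {centroids}, most anomalous reading: {suspect}")
```

Note that the stray value also drags its cluster's centroid toward itself, which is one of the known disadvantages of basic K-means that its variations try to address.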

12 Decision Trees

by Gustavo Lujan

3y ago

We talk about Decision Trees as one of the most basic statistical learning algorithms out there, one that every Data Scientist should know. Decision Trees are one of the few machine learning models that are easy to interpret, which makes them a favorite when we want to understand the logic behind a certain decision. Decision Trees naturally handle all types of variables without the need to create dummy variables, they do not require scaling or normalization, and they are also very robust against outliers.

11 Inferential Statistics

by Gustavo Lujan

3y ago

We talk about the importance of inferential statistics in Data Science. Inferential statistics is a set of techniques used to make generalizations about a population from a sample. One of the tools used in inferential statistics is hypothesis testing. In this episode we provide a couple of examples of when and why to use 1-sample t-tests and 2-sample t-tests. We also argue that the mean or average of a sample means nothing if we do not also consider the variation of the data.
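
The point about variation can be shown with a small 2-sample example. The sketch below computes Welch's t statistic, a variant of the 2-sample t-test that does not assume equal variances (an assumption on our part, since the blurb does not say which variant is used), for two pairs of samples with the same difference in means but very different spread. The sample values are made up for illustration.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic: mean difference divided by its standard error."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Both pairs differ by 1.0 in their means; only the spread changes.
tight_a = [9.9, 10.0, 10.1, 10.0, 10.0]
tight_b = [10.9, 11.0, 11.1, 11.0, 11.0]
noisy_a = [6.0, 14.0, 8.0, 12.0, 10.0]
noisy_b = [7.0, 15.0, 9.0, 13.0, 11.0]

t_tight = welch_t(tight_a, tight_b)
t_noisy = welch_t(noisy_a, noisy_b)
print(f"t with low variation: {t_tight:.2f}, t with high variation: {t_noisy:.2f}")
```

The same one-unit gap between averages gives a large t statistic when the data are tightly grouped and a small one when they are widely spread, so the difference in means alone says little.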

03 What is Machine Learning?

by Gustavo Lujan

3y ago

We provide definitions of Machine Learning and related areas such as artificial intelligence, business analytics, business intelligence and Big Data. These are not academic definitions extracted from books; they are real-world concepts as I see them. We discuss the similarities, differences and overlap among these sometimes confusing terms, which people tend to misuse.

09 Regularization to Deal with Overfitting

by Gustavo Lujan

3y ago

In this episode we talk about regularization, an effective technique to deal with overfitting by reducing the variance of the model. Two techniques are introduced: ridge regression and lasso. The latter is effectively a feature selection algorithm.
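
Why lasso behaves like feature selection while ridge does not can be seen in the simplest possible case: one feature and no intercept, where both estimators have a closed form. The sketch below assumes the objectives 0.5 * sum((y - w*x)^2) + 0.5 * penalty * w^2 for ridge and 0.5 * sum((y - w*x)^2) + penalty * |w| for lasso; the function names and data are made up for illustration.

```python
def ridge_coefficient(xs, ys, penalty):
    """Ridge shrinks the coefficient by inflating the denominator."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)


def lasso_coefficient(xs, ys, penalty):
    """Lasso soft-thresholds the correlation term, so weak signals become exactly zero."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    if abs(sum_xy) <= penalty:
        return 0.0
    shrunk = sum_xy - penalty if sum_xy > 0 else sum_xy + penalty
    return shrunk / sum_xx


xs = [1.0, 2.0, 3.0, 4.0]
strong_ys = [2.0, 4.0, 6.0, 8.0]   # clearly related to xs
weak_ys = [0.1, -0.1, 0.1, 0.0]    # almost unrelated to xs

for label, ys in (("strong", strong_ys), ("weak", weak_ys)):
    ridge = ridge_coefficient(xs, ys, penalty=1.0)
    lasso = lasso_coefficient(xs, ys, penalty=1.0)
    print(f"{label} feature: ridge={ridge:.4f}, lasso={lasso:.4f}")
```

Ridge leaves the weak feature with a small but nonzero coefficient, whereas lasso removes it entirely, which is the sense in which lasso selects features.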

02 My Personal Journey: How I Became a Data Scientist

by Gustavo Lujan

3y ago

In this episode I talk about my personal journey and how I became a Data Scientist. I start by talking about how I decided to go to college, what major to choose, and how I chose my master’s degree. I talk about my time pursuing a PhD in Engineering and the most useful classes I took related to machine learning and data science. Finally, I briefly talk about my job experience as a Data Scientist.
