Statistics and Machine Learning in Python Release 0.1, 2017 by Edouard Duchesnay, Tommy Löfstedt

Machine learning covers two main types of data analysis:

1. Exploratory analysis: unsupervised learning. Discover the structure within the data. E.g.: experience (in years in a company) and salary are correlated.
2. Predictive analysis: supervised learning. This is sometimes described as “learn from the past to predict the future”. Scenario: a company wants to detect potential future clients among a base of prospects. Retrospective data analysis: we go through data on previously prospected companies and their characteristics (size, domain, location, etc.). Some of these companies became clients, others did not. The question is: can we predict which of the new companies are more likely to become clients, based on their characteristics and on these previous observations?

In this example, the training data consists of a set of n training samples. Each sample,