AI & Analytics
Introduction to Machine Learning in PySpark
Data scientists who want to learn how to apply machine learning to big data using PySpark
Interactive classroom training combining theory with practical exercises
Basic understanding of linear regression, knowledge of programming basics (not necessarily in Python)
Bring a laptop with X2Go installed
Preferred group size:
About 10 participants per trainer
This training introduces basic machine learning concepts through logistic regression in PySpark. It covers the difference between linear and logistic regression, the algorithm underlying logistic regression, the bias-variance trade-off, and regularization. Participants then work through a Zeppelin notebook in which they apply these concepts to predict trial versus settlement outcomes in patent litigation.
Upon completion of this training, participants will be able to apply logistic regression, including regularization, to big data using PySpark.