MLlib and Machine Learning
This is the main Machine Learning (ML) guide. It provides an overview of ML capabilities in Databricks and Apache Spark, with links to other currently available guides.
MLlib is Apache Spark’s scalable machine learning library consisting of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, dimensionality reduction, and underlying optimization primitives.
Databricks recommends the following Apache Spark MLlib guides:
To use MLlib with R, refer to the SparkR documentation.
See the pages below for examples.
- Binary Classification Example
- Decision Trees Example
- Third-Party Machine Learning Integrations
- Advanced MLlib