Course outline: Apache Spark MLlib Training

spark.mllib: data types, algorithms, and utilities
        Data types
        Basic statistics
            summary statistics
            correlations
            stratified sampling
            hypothesis testing
            streaming significance testing
            random data generation
        Classification and regression
            linear models (SVMs, logistic regression, linear regression)
            naive Bayes
            decision trees
            ensembles of trees (Random Forests and Gradient-Boosted Trees)
            isotonic regression
        Collaborative filtering
            alternating least squares (ALS)
        Clustering
            k-means
            Gaussian mixture
            power iteration clustering (PIC)
            latent Dirichlet allocation (LDA)
            bisecting k-means
            streaming k-means
        Dimensionality reduction
            singular value decomposition (SVD)
            principal component analysis (PCA)
        Feature extraction and transformation
        Frequent pattern mining
            FP-growth
            association rules
            PrefixSpan
        Evaluation metrics
        PMML model export
        Optimization (developer)
            stochastic gradient descent
            limited-memory BFGS (L-BFGS)
spark.ml: high-level APIs for ML pipelines
        Overview: estimators, transformers and pipelines
        Extracting, transforming and selecting features
        Classification and regression
        Clustering
        Advanced topics
 
     
     
         
                