Implement both simple and complex analytics and machine learning models:
Perform basic data preparation – filter, split, append, merge, and duplicate, plus other text, numerical, and logical functions.
Use text mining capabilities such as text extraction, hashing, and text grouping.
Feature Extraction – word to vector, inverse document frequency, etc.
Feature Transformation – binarize, scale, normalize, etc.
Feature Selection – slice vector, chi-sq select, etc.
Develop clustering models – K-means, Bisecting K-means, Gaussian mixture, Latent Dirichlet Allocation (LDA), and others.
Develop regression models – linear, decision tree, survival, isotonic.
Develop classification models – logistic regression, random forest, gradient-boosted tree (GBT), Naive Bayes, and others.
Perform model tuning and scoring.
Build pipelines to execute a group of algorithms simultaneously.
Configure multiple sandboxes.
View and modify machine-generated source code.
Monitor real-time model execution.
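
To make the feature transformation and pipeline items above more concrete, here is a minimal sketch in plain Python. It is not the product's generated source code or API: the function names (`binarize`, `min_max_scale`, `l2_normalize`, `run_pipeline`) are hypothetical, and the sketch assumes a pipeline applies its stages one after another to a single numeric column.

```python
import math


def binarize(values, threshold):
    """Map each value to 1.0 if it is above the threshold, otherwise 0.0."""
    return [1.0 if v > threshold else 0.0 for v in values]


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def l2_normalize(values):
    """Divide the vector by its Euclidean length so it has unit norm."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        return list(values)
    return [v / norm for v in values]


def run_pipeline(values, stages):
    """Apply each stage in order, feeding each output into the next stage."""
    for stage in stages:
        values = stage(values)
    return values


# Hypothetical usage: scale a column, then binarize it at 0.4.
column = [2.0, 4.0, 6.0]
result = run_pipeline(column, [min_max_scale, lambda v: binarize(v, 0.4)])
print(result)
```

Each stage is an ordinary function from a list to a list, so stages can be reordered or swapped without changing `run_pipeline`.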