The Fundamentals of Machine Learning Using Oracle Analytics Cloud (OAC)

Oracle Analytics Cloud (OAC) is a complete platform offering that spans the analytical requirements of an enterprise from IT-governed data models to self-service exploration capabilities connected to a wide range of data sources.


One of the key components of OAC is its comprehensive yet user-friendly support for machine learning (ML): you can train models and apply them to datasets. This yields insight and predictive capabilities that go beyond regular business intelligence analysis.

This blog post provides background on machine learning in general and describes how to use those capabilities within OAC.


Machine Learning Background

Machine learning is a field of computer science that focuses on developing and evaluating algorithms that identify meaningful patterns in data. An algorithm reads a dataset, applies statistical functions to it, and returns a software model that stores the patterns it found. That model can then be applied to another dataset to predict outcomes in one of three forms: a number within a range of values, a binary true/false identifier, or a classification result drawn from a group of defined attribute values.


Oracle offers three methods of supervised machine learning out-of-the-box in the Data Visualization component of OAC. Supervised learning trains a model on a representative dataset that contains a set of predefined attributes and a target column. The three options are Numeric Prediction, Multi-Classifier (multi-class prediction), and Binary Classifier (binary prediction).

Numeric Prediction, typically referred to as regression, predicts a value within a continuous range.


Examples include:

  • Predicting store sales based on location, surrounding demographics, and nearest competition

  • Estimating what price a home will sell for based on recent sales, square footage, and age

  • Determining how many days until a customer will return to a store based on their most recent purchases or on demographics obtained from a credit card account

Common algorithms include:

  • Linear Regression

  • Decision Trees

  • Neural Networks

  • K-Nearest Neighbors
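
To make numeric prediction concrete, here is a minimal sketch of the simplest algorithm in the list above, linear regression with a single attribute, written in plain Python. It is not how OAC implements the technique. The function names (`fit_line`, `predict`) and the home-sale figures are hypothetical and exist only for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a trained (slope, intercept) model to a new attribute value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: square footage (attribute) and sale price (target).
square_feet = [1000, 1500, 2000, 2500]
sale_price = [200000, 290000, 410000, 500000]

model = fit_line(square_feet, sale_price)
estimate = predict(model, 1800)  # estimated price for an unseen 1,800 sq ft home
print(round(estimate))
```

The training call produces the model once; `predict` can then be applied to any number of new rows, which mirrors the train-then-apply split described earlier.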

Binary Prediction determines values that can only have one of two states, typically “true” or “false.”


Examples include:

  • Identifying employees who are most likely to leave a company

  • Predicting if a subscription holder will renew or not

  • Deciding if a web page belongs in a search result

Common algorithms include:

  • CART

  • Naïve Bayes

  • Neural Network

  • Random Forest
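
As a rough illustration of binary prediction, the sketch below trains a one-split "decision stump" that picks the threshold with the lowest weighted Gini impurity. This is a heavily simplified relative of CART, not CART itself and not OAC's implementation. The attribute (months since last raise), the data, and all function names are assumptions made up for the example.

```python
def gini(labels):
    """Gini impurity of a list of True/False labels."""
    if not labels:
        return 0.0
    p_true = sum(labels) / len(labels)
    return 2 * p_true * (1 - p_true)


def train_stump(values, labels):
    """Find the single threshold on one numeric attribute that best separates the labels."""
    best = None
    for threshold in sorted(set(values)):
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must put rows on both sides
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            # Each side predicts its majority label (ties go to True).
            left_pred = sum(left) * 2 >= len(left)
            right_pred = sum(right) * 2 >= len(right)
            best = (score, threshold, left_pred, right_pred)
    if best is None:
        raise ValueError("need at least two distinct attribute values")
    return best[1:]  # (threshold, left_pred, right_pred)


def predict_stump(model, value):
    """Return True or False for a new attribute value."""
    threshold, left_pred, right_pred = model
    return left_pred if value <= threshold else right_pred


# Hypothetical data: months since last raise, and whether the employee left.
months_since_raise = [2, 5, 8, 14, 20, 26]
left_company = [False, False, False, True, True, True]

stump = train_stump(months_since_raise, left_company)
print(predict_stump(stump, 10))
```

A full CART tree would apply the same split search recursively across many attributes; the stump keeps only the first split so the idea stays visible.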

Multi-Class Prediction predicts values that belong to a limited, predefined set of permissible values.

Examples include:

  • Predicting susceptibility levels for certain diseases

  • Image recognition to classify objects in a photo

  • Predicting which component of a machine will most likely fail first

Common algorithms include:

  • CART

  • Naïve Bayes

  • Neural Network

  • Random Forest
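
The sketch below illustrates multi-class prediction with a small categorical Naïve Bayes classifier using add-one (Laplace) smoothing. It is an illustrative approximation under stated assumptions, not OAC's implementation: the attributes (vibration and temperature levels), the component labels, and the function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count how often each attribute value occurs with each class label."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    feature_values = defaultdict(set)      # attribute index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, feature_counts, feature_values


def predict_naive_bayes(model, row):
    """Return the class with the highest log-probability for the given row."""
    class_counts, feature_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # class prior
        for i, value in enumerate(row):
            seen = feature_counts[(i, label)][value]
            # Add-one smoothing keeps unseen combinations from zeroing the score.
            score += math.log((seen + 1) / (count + len(feature_values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical sensor readings: (vibration level, temperature level).
readings = [
    ("high", "normal"), ("high", "normal"),
    ("low", "hot"), ("low", "hot"),
    ("low", "normal"), ("low", "normal"),
]
failed_component = ["bearing", "bearing", "motor", "motor", "belt", "belt"]

nb_model = train_naive_bayes(readings, failed_component)
print(predict_naive_bayes(nb_model, ("low", "hot")))
```

Unlike the binary case, the target column here can take any value from a fixed set of labels, and the model simply picks the most probable one.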

The Machine Learning Process

The typical process for creating and applying a machine learning model includes the following steps:

  • Formulate a question

  • Acquire data

  • Clean data

  • Train model

  • Apply model

  • Analyze results

  • Act
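
The steps above can be sketched end to end in a few lines of plain Python. This is only a toy walk-through under stated assumptions: the question (can advertising spend predict sales?), the records, the field names `ad_spend` and `sales`, and every function name are hypothetical, and in OAC these steps are carried out through the product's own tooling rather than hand-written code.

```python
def clean(records):
    """Clean data: drop rows with a missing attribute or target value."""
    return [r for r in records if r["ad_spend"] is not None and r["sales"] is not None]


def train(records):
    """Train model: fit sales = slope * ad_spend + intercept by least squares."""
    xs = [r["ad_spend"] for r in records]
    ys = [r["sales"] for r in records]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apply_model(model, records):
    """Apply model: predict the target for each row."""
    slope, intercept = model
    return [slope * r["ad_spend"] + intercept for r in records]


def mean_absolute_error(predictions, records):
    """Analyze results: average gap between predicted and actual sales."""
    return sum(abs(p - r["sales"]) for p, r in zip(predictions, records)) / len(records)


# Acquire data (hypothetical records, two of them incomplete).
raw = [
    {"ad_spend": 10, "sales": 120},
    {"ad_spend": 20, "sales": 210},
    {"ad_spend": None, "sales": 150},
    {"ad_spend": 30, "sales": 330},
    {"ad_spend": 40, "sales": 390},
    {"ad_spend": 50, "sales": None},
    {"ad_spend": 25, "sales": 260},
]

cleaned = clean(raw)
training, holdout = cleaned[:4], cleaned[4:]  # keep one row aside for checking

model = train(training)
predictions = apply_model(model, holdout)
error = mean_absolute_error(predictions, holdout)

# Act: decide whether the model is accurate enough to use.
print(f"mean absolute error on held-out rows: {error:.1f}")
```

Holding some rows back from training is what makes the "analyze results" step meaningful, since the error is measured on data the model has not seen.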

This process is represented in the following diagram, known as the CRISP-DM process (Cross-Industry Standard Process for Data Mining).