Discipline Multi-Discipline
Duration 3 Days
Delivery Mechanism Classroom

Fundamentals of Data Analytics

This three-day course introduces data analytics techniques for extracting knowledge from raw data. It teaches participants how to create data-driven models through the data mining pipeline, which consists of data exploration, data preprocessing, machine learning modeling, and model evaluation. The course combines theory with hands-on training in these techniques. After completing the course, participants should be able to build and evaluate data-driven models using machine learning.

This is a practical course: 50% of the time is dedicated to hands-on sessions using the R programming language. The hands-on sessions are based on oil and gas datasets.

  • Agenda

    Day 1

    Exploratory Data Analysis and Data Preprocessing

    • Introduction to Data Analytics
    • Exploratory Data Analysis (EDA): Visualization and Descriptive Statistics
    • Hands-on EDA
    • Data Preprocessing (including PCA)
    • Hands-on Data Preprocessing

    Objective: On the first day of the course, participants will get a bird's-eye view of the data analytics process. They will explore two of the four data analytics modules: exploratory data analysis, a necessary step for getting a feel for the data, and data preprocessing, a necessary step for cleaning and formatting the data before building machine learning models. Learning will be reinforced through hands-on training in R.

    Day 2

    Supervised Machine Learning

    • Decision Tree
    • Hands-on Decision Tree
    • Regression (Linear and Logistic)
    • Hands-on Regression
    • Model Evaluation

    Objective: On the second day of the course, participants will build data-driven models using supervised machine learning algorithms. Two representative machine learning algorithms will be covered, one for classification and the other for regression. Model evaluation metrics (e.g., confusion matrix, ROC curve, AUC) and model evaluation methods (e.g., 10-fold cross-validation) will be discussed towards the end of the day.
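
    As a rough illustration of two of the evaluation ideas named above, the sketch below counts the cells of a binary confusion matrix and generates index splits for 10-fold cross-validation. It is written in plain Python for illustration only; the course's hands-on sessions use R. The function names `confusion_matrix` and `k_fold_indices` and the label vectors are hypothetical and are not taken from the course material.

```python
import random


def confusion_matrix(actual, predicted):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1
        elif a == 0 and p == 1:
            counts["FP"] += 1
        elif a == 1 and p == 0:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Indices are shuffled once, dealt into k folds, and each fold
    serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Hypothetical true labels and model predictions, for illustration only.
actual = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

cm = confusion_matrix(actual, predicted)
accuracy = (cm["TP"] + cm["TN"]) / len(actual)
print(cm, accuracy)

# Every sample lands in exactly one test fold across the 10 splits.
splits = list(k_fold_indices(20, k=10))
print(len(splits), [len(test) for _, test in splits])
```

    In practice a model would be fitted on each training split and scored on the matching test split, with the scores averaged across the folds.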

    Day 3

    Ensemble Methods and Unsupervised Machine Learning

    • Hands-on Model Evaluation
    • Ensemble Methods (Bagging, Boosting and Random Forest)
    • Hands-on Ensemble Methods
    • Cluster Analysis (k-Means and Hierarchical)
    • Hands-on Cluster Analysis
    • Class feedback and wrap-up

    Objective: On the third day of the course, participants will be introduced to more advanced machine learning techniques: ensemble methods and unsupervised machine learning algorithms. They will be able to implement the complete data mining pipeline, including model building and model evaluation, in R.

    • Audience

      Geoscientists, engineers, IT professionals, and aspiring citizen data scientists working in the oil and gas industry who want an introduction to data analytics techniques for building data-driven models.

    • Prerequisites




    Upcoming Courses
    • Houston, TX, United States: July 21 - 23, 2020
    • Bandung, Indonesia: August 03 - 05, 2020
    • Calgary, Alberta, Canada: August 11 - 13, 2020
    • Stavanger, Norway: August 18 - 20, 2020
    ©2000-2016 NExT. All Rights Reserved.