Data Science for Non-Data Scientists
Duration (in days):
3
Description:
Most enterprises have a lot of data, but don't fully utilize the knowledge present in the data.
This course aims to help engineers and other college graduates understand the opportunities in their data and some of the best tools for processing it.
We will start with some fundamental math and statistics and move to contemporary tools and algorithms.
Objectives:
Understand the value in your data
Understand the fundamental (high-school level) math required for machine learning and basic data science
Learn how to convert domain models into useful input models for machine learning
Learn to use some contemporary tools (e.g., Spark MLlib, TensorFlow, and various Python libraries)
Learn some of the most common machine learning algorithms
Prerequisites:
Audience:
Software engineers, data engineers, software architects, and technically minded managers
Outline:
Introduction
What is data science?
What is machine learning?
What data is useful?
A few case studies that illustrate the value of data
Goals of this course
Introduction to Python
Python fundamentals
Introduction to NumPy
Data manipulation using Pandas
Visualization with Matplotlib
First example of machine learning in Python
Introduction to Computational Thinking
Optimization problems
Graph-theoretic models
Stochastic thinking
Random walks
Monte Carlo simulation
Confidence intervals
Let's talk statistics
Confidence intervals
Experimental data
Machine Learning
What is machine learning, really?
Classes of algorithms
Clustering
Classification
Neural nets
Common mistakes
Best practices
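
To give a flavor of the Monte Carlo simulation and confidence interval topics listed in the outline, here is a minimal sketch in plain Python. It is an illustration only, not taken from the course materials: the function name estimate_pi, the sample count, and the seed are all assumptions chosen for this example. It estimates pi by sampling random points in the unit square and reports an approximate 95% confidence interval based on the normal approximation for a proportion.

```python
import math
import random


def estimate_pi(n_samples, seed=0):
    """Estimate pi by sampling points uniformly in the unit square.

    Hypothetical helper for illustration; not part of any library.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        # Count points that land inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            hits += 1
    p = hits / n_samples  # fraction of points inside the quarter circle
    estimate = 4.0 * p    # quarter-circle area is pi / 4
    # Standard error of a proportion, scaled by 4; the factor 1.96
    # gives an approximate 95% interval under the normal approximation.
    half_width = 1.96 * 4.0 * math.sqrt(p * (1.0 - p) / n_samples)
    return estimate, half_width


estimate, half_width = estimate_pi(100_000)
print(f"pi is approximately {estimate:.4f} +/- {half_width:.4f}")
```

Increasing n_samples narrows the interval roughly in proportion to the square root of the sample count, which is the kind of trade-off the stochastic thinking and statistics sections are meant to explore.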