Course Outline
Week 1 Big Data concepts
- Definition of the four Vs (Velocity, Volume, Variety, Veracity)
- Limits to traditional data processing capacity
- Distributed Processing
- Statistical Analysis
- Machine Learning Analysis Types
- Data Visualization
- Distributed Processing (e.g. MapReduce)
- Introduction to the languages used
- R crash course
- Python crash course
Weeks 2 & 3 Performing Data Analysis
- Statistical Analysis
- Descriptive Statistics in Big Data sets (e.g. calculating mean)
- Inferential Statistics (estimation)
- Forecasting with Correlation and Regression models
- Time Series analysis
- Basics of Machine Learning
- Supervised vs unsupervised learning
- Classification and clustering
- Estimating the cost of specific methods
- Filter
Week 4 Natural Language Processing
- Processing text
- Understanding the meaning of text
- Automatic text generation
- Sentiment/Topic Analysis
- Computer Vision
Weeks 5 & 6 Tooling concepts
- Data storage solutions (SQL, NoSQL, hierarchical, object-oriented, document-oriented)
- Examples: MySQL, Cassandra, MongoDB, Elasticsearch, HDFS, etc.
- Choosing the right solution for the problem
- Distributed Processing
- Spark
- Machine Learning with Spark (MLlib)
- Spark SQL
- Scalability
- Public cloud (AWS, Google, etc.)
- Private cloud (OpenStack, Cloud Foundry)
- Autoscalability
Week 7 Soft Skills
- Advisory & Leadership Skills
- Making an impact: data-driven storytelling
- Understanding your audience
- Effective data presentation - getting your message across
- Influence effectiveness and change leadership
- Handling difficult situations
Exam
- End-of-programme graduation exam
Requirements
Participants should have a good grounding in mathematics, at least at high school level.
Programming skills are not required, but any programming experience will be useful.
Participants will be assessed and interviewed prior to participation in this training programme.