Course Objectives:
By the end of the course, participants will be able to:
- Design big data implementation plans and create strategies for data-driven solutions.
- Explain the challenges of big data and the limitations of traditional technologies such as Excel.
- Discuss the main challenges and advantages of the Hadoop ecosystem and other distributed big data architectures.
- Demonstrate and discuss key technologies for big data storage and compute, such as PostgreSQL and MongoDB.
- Discuss popular machine learning algorithms and the importance of ethics in data analytics and artificial intelligence.
- Deliver an architectural diagram for analytics-focused use cases.
Target Audience:
This course is ideal for data professionals, such as database administrators, system administrators, business analysts and business intelligence specialists. It is also suitable for less technically inclined management and administrative professionals seeking to understand big data strategies and technologies. Recommended prior knowledge includes experience analyzing data in Excel, knowledge of basic database technologies, and awareness of analytics-driven business initiatives.
Target Competencies:
- Big data implementation planning
- Big data analytics structures and technologies
- Ethics and integrity for big data analytics
- Big data storage and compute system implementation
- Architecture diagram design
Day 1: Storing Big Data
- What is big data and what are the 5 “V’s” of big data?
- Big data impact on technologies.
- Open source revolution.
- Key big data concepts and data types.
- Text, audio, images.
- Big data professional roles.
- Big data architectures and paradigms.
- The Hadoop Ecosystem.
- Massively parallel processing (MPP) versus distributed in-memory applications.
- Streaming data.
Day 2: Computing Big Data
- Role of cloud computing.
- Data movement risk.
- Networking and co-location.
- Big data extract, transform, load (ETL).
- MapReduce and beyond.
- Distributed compute.
- High performance clusters.
- Spark.
- Streaming: Storm, Spark Structured Streaming.
- Other big data technologies: Kafka, etc.
Day 3: Introducing Big Data Analytics and Artificial Intelligence (AI)
- Basics of data analytics: roles and objectives.
- Key math and statistics concepts.
- Supervised vs unsupervised learning.
- Key technologies and applications.
- Analytics architecture.
- Cloud vs on-premises.
- Data storage, analytics tools, Databricks and SAS Viya.
- Cloud ML & AI solutions.
- Linear Algebra 101.
- Image classification and the importance of ethics.
Day 4: Planning a Big Data Project for Analytics
- How big data projects meet organizational needs.
- Big data case studies: Netflix, LinkedIn, Facebook, Google, Orbitz, Dell and others.
- Best practices in project design.
- Assessing the current state of your organization.
- Vertical data teams and discussions.
- Considerations for big data project plans.
- Brainstorm a data-driven strategy.
- Practice designing architecture diagrams.
Day 5: Architecting Big Data Solutions
- Identifying analytical opportunities.
- Define and assess the problem.
- Describe the impact and use of data to address the problem.
- Identify potential data sources.
- Brainstorm an analytics strategy to implement.
- Storage and compute.
- Identify a cloud environment strategy.
- Brainstorm key storage systems and compute environments.
Language: English.
Place: London – UK.
Venue (TBC): Radisson Edwardian Sussex Hotel (Address: 19-25 Granville Place, Marylebone, London W1H 6PA – UK).