Designing the Big Data Warehouse
This two-day course introduces data warehouse architecture design along with data extraction, management, and loading. It also covers metadata management, dimensional modeling, data aggregation, data mining, and business intelligence. Both SQL and NoSQL databases are used.
2 days - $1,295.00
Taught by an expert Big Data instructor.
Prerequisites:
Knowledge of SQL and NoSQL is required.
Course Outline
Basic Elements of the Data Warehouse
Source System
Data Staging Area
Presentation Server
Dimensional Model
Business Process
Big Data
Apply Data Preprocessing Techniques for Cleaning, Integration, Reduction, and Transformation of Data
Data Mart/Data Warehouse
Operational Data Store (ODS)
OLAP (Online Analytical Processing)
ROLAP (Relational OLAP)
MOLAP (Multidimensional OLAP)
Project Management and Requirements
The Business Dimensional Lifecycle
Lifecycle Evolution
Lifecycle Approach
Project Planning and Management
Define and Plan the Project
Collect the Requirements
Prepare and Publish the Requirements Deliverables
Data Design
Dimensional Modeling
The Data Warehouse Bus Architecture
Basic Dimensional Modeling Techniques
Fact Tables and Dimension Tables
Foreign Keys, Primary Keys, and Surrogate Keys
Additive, Semiadditive, and Nonadditive Facts
Extended Dimension Table Designs
Many-to-Many Dimensions
Many-to-One-to-Many Traps
Extended Fact Table Designs
Build Dimensional Models
Data Warehouse Architecture
Architectural Framework
Logical Models and Physical Models
Back Room Technical Architecture
Back Room Data Stores
Back Room Services
Extract Services
Data Transformation Services
Data Loading Services
Backup and Archive Planning
Architecture for the Front Room
Front Room Data Stores
Front Room Services for Data Access
Warehouse Browsing
Access and Security Services
Activity Monitoring Services
Query Management Services
Security Management in a Data Warehouse Environment
Security: Vulnerabilities
Physical Assets
Information Assets: Data, Financial Assets, and Reputation
Software Assets
Network Threats
Security: Solutions
Routers and Firewalls
The Directory Server
Encryption
Introduction to Big Data
Defining Big Data
The four dimensions of Big Data: volume, velocity, variety, veracity
Integrating Big Data with traditional data
Storing Big Data
Overview of Big Data stores
Data models: key-value, graph, document, column-family
Hadoop Distributed File System
Processing Big Data
Integrating disparate data stores
Mapping data to the programming framework
Connecting and extracting data from storage
Transforming data for processing
Subdividing data in preparation for Hadoop MapReduce
Creating the components of Hadoop MapReduce jobs
Executing Hadoop MapReduce jobs