| Aspect | Data warehouse | Data lake |
|---|---|---|
| Definition | Centralized repository optimized for analytics, where data from different sources is stored in a structured format | Storage repository that holds vast amounts of raw data in its native formats (various structures) |
| Schema | Schema on write | Schema on read |
| Data types | Primarily structured data | Various data structures |
| Agility | Less agile due to predefined schema | More agile |
| Processing | ETL | ELT, or just load for storage purposes |
| Cost | More expensive due to optimizations for complex queries | Cost-effective for storage, but costs rise as volume increases |
| Use when | You have structured data sources and require fast, complex queries | You have a mix of data structures |
| | Data integration from different sources is essential | Scalable, cost-effective storage is needed |
| | BI and analytics | Future needs are uncertain |
| | | Advanced analytics, ML, or data discovery are goals |
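
The schema-on-write (ETL) versus schema-on-read (ELT) distinction in the comparison above can be illustrated with a minimal Python sketch. Everything here is hypothetical and for illustration only: the sample events, the field names, and the helper functions `load_schema_on_write`, `load_raw`, and `read_with_schema` are assumptions, not part of any real warehouse or lake product.

```python
import json

# Hypothetical raw events from a source system; the third record is malformed.
RAW_EVENTS = [
    '{"user_id": 1, "amount": "19.99", "country": "DE"}',
    '{"user_id": 2, "amount": "5.00", "country": "US", "coupon": "SPRING"}',
    '{"user_id": "n/a", "amount": "oops"}',
]

# Schema fixed up front (warehouse style): column name -> type converter.
WAREHOUSE_SCHEMA = {"user_id": int, "amount": float, "country": str}


def load_schema_on_write(raw_lines):
    """ETL-style load: transform and validate each record before storing it."""
    table, rejected = [], []
    for line in raw_lines:
        record = json.loads(line)
        try:
            row = {col: cast(record[col]) for col, cast in WAREHOUSE_SCHEMA.items()}
        except (KeyError, ValueError, TypeError):
            rejected.append(line)  # record does not fit the predefined schema
            continue
        table.append(row)
    return table, rejected


def load_raw(raw_lines):
    """Lake-style load: store every record as-is, in its native format."""
    return list(raw_lines)


def read_with_schema(lake, schema):
    """Schema on read: apply a schema chosen at query time, skipping misfits."""
    for line in lake:
        record = json.loads(line)
        try:
            yield {col: cast(record[col]) for col, cast in schema.items()}
        except (KeyError, ValueError, TypeError):
            continue


warehouse_rows, rejected = load_schema_on_write(RAW_EVENTS)
lake = load_raw(RAW_EVENTS)

# A later question needs the "coupon" field, which the warehouse schema dropped.
coupon_rows = list(read_with_schema(lake, {"user_id": int, "coupon": str}))
```

In this sketch the warehouse path keeps only clean, typed rows and discards fields outside its schema, which is what makes its queries predictable but less agile. The lake path keeps every record, so a new question (here, coupon usage) can still be answered later by supplying a different schema at read time.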