Quick Review (cont.)
Sorting and Hashing
- Both are memory-intensive
- Major concerns
- Merge efficiency & memory management (S)
- Hash table overflow & skew (H)
Disk Access
- Index usage
- Physical database design
Notes:
Sorting problems “solutions”:
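Merge efficiency: forecasting (compare the largest key in each input buffer to predict which run's buffer will empty first, and read ahead from that run), a large cluster size (more data moved per I/O), and using the maximum fan-in (# of runs that can be merged at once).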
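A minimal sketch of why fan-in matters, assuming the sorted runs fit in Python lists (a real external sort would stream them from disk). The function name `merge_with_fan_in` and its parameters are hypothetical, chosen only for illustration. It merges at most `fan_in` runs per step and counts how many passes over the data are needed.

```python
import heapq

def merge_with_fan_in(runs, fan_in):
    """Merge sorted runs, combining at most `fan_in` runs per merge step.

    Returns (merged_list, number_of_passes). In-memory illustration only.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    runs = [list(r) for r in runs]
    passes = 0
    while len(runs) > 1:
        # One pass: group the current runs into batches of `fan_in`
        # and merge each batch into a single longer run.
        runs = [
            list(heapq.merge(*runs[i:i + fan_in]))
            for i in range(0, len(runs), fan_in)
        ]
        passes += 1
    return (runs[0] if runs else []), passes

runs = [[1, 5], [2, 6], [3, 7], [4, 8]]
merged_2, passes_2 = merge_with_fan_in(runs, fan_in=2)  # two passes
merged_4, passes_4 = merge_with_fan_in(runs, fan_in=4)  # one pass
```

With four runs, a fan-in of 2 needs two passes while a fan-in of 4 needs one, which is the reason for using the largest fan-in that memory allows.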
Hashing problems “solutions”:
Overflow: avoidance (partition the data up front according to the fan-out (# of partition files made)) or resolution (assume overflow won't occur and fall back to partitioning when it does).
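A rough sketch of partitioning by fan-out, assuming records are hashable keys held in memory and that `capacity` stands in for the number of records a hash table can hold. The names `partition`, `partition_until_fit`, `capacity`, and `max_level` are hypothetical. Partitions that are still too large are partitioned again with a different hash salt per level.

```python
def partition(records, fan_out, level=0):
    """Split records into `fan_out` buckets using a level-salted hash."""
    parts = [[] for _ in range(fan_out)]
    for r in records:
        # Salting with `level` gives a different split at each recursion depth.
        parts[hash((level, r)) % fan_out].append(r)
    return parts

def partition_until_fit(records, fan_out, capacity, level=0, max_level=8):
    """Recursively partition until each partition holds <= capacity records.

    Stops at `max_level` because heavy skew (many copies of one key)
    cannot be split by hashing on that key.
    """
    if len(records) <= capacity or level >= max_level:
        return [records]
    result = []
    for part in partition(records, fan_out, level):
        if part:  # skip empty buckets
            result.extend(
                partition_until_fit(part, fan_out, capacity, level + 1, max_level)
            )
    return result

mixed = list(range(20)) * 3   # 20 distinct keys, 3 copies each
mixed_parts = partition_until_fit(mixed, fan_out=4, capacity=10)

skewed = [7] * 50             # one key repeated: pure skew
skewed_parts = partition_until_fit(skewed, fan_out=4, capacity=10)
```

All copies of a key always land in the same partition, so the skewed input ends up as a single oversized partition no matter how many levels are tried. That case illustrates why skew is listed as a separate concern from ordinary overflow.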