Analysis of high dimensional data

- Uniformly distributed data
  - In a $d$-dimensional hypercube ![](images/img449.png)
- Dimensions are independent
- Most data lies near the boundary
  - Excluding points within $\epsilon$ of the outer boundary, the volume of the inner hypercube is ![](images/img451.png)
    - Example: For , inside volume is ![](images/img453.png)
- Even for a small answer set, the query range on each dimension must be large
  - For selectivity of $s$ points, query range on each dimension should be ![](images/img454.png)
    - Example: For , query range is ![](images/img456.png)
- "Curse of dimensionality"
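
The two effects listed above can be checked numerically. The sketch below is illustrative only and rests on assumptions not stated in the outline: the data is uniform on the unit hypercube $[0,1]^d$, a margin of width $\epsilon$ is trimmed from every face so the inner hypercube has side $1 - 2\epsilon$ and volume $(1 - 2\epsilon)^d$, and a hypercube-shaped query returning a fraction $s$ of the points has side $s^{1/d}$. The function names and the parameter values are hypothetical and are not the values used in the examples above.

```python
import random


def inner_volume(eps: float, d: int) -> float:
    """Volume left after trimming a margin eps from every face of [0, 1]^d (assumed unit cube)."""
    return (1.0 - 2.0 * eps) ** d


def query_side(s: float, d: int) -> float:
    """Side length of an axis-aligned hypercube query covering a fraction s of [0, 1]^d."""
    return s ** (1.0 / d)


def simulated_inner_fraction(eps: float, d: int, n: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the fraction of uniform points at least eps away from every face."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        # Dimensions are independent, so each coordinate is drawn separately.
        if all(eps <= rng.random() <= 1.0 - eps for _ in range(d)):
            inside += 1
    return inside / n


if __name__ == "__main__":
    # Illustrative values only: margin 0.05 and selectivity 0.01.
    for d in (2, 10, 100):
        print(d, inner_volume(0.05, d), query_side(0.01, d))
```

With these illustrative settings, at $d = 100$ the inner hypercube holds less than 0.01% of the volume, and a query meant to return only 1% of the points must span more than 95% of every dimension.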