Estimating the selectivity of multidimensional range queries over real valued attributes has significant applications in data exploration and database query optimization. In this paper, we consider the following problem: given a table of d attributes whose domain is the real numbers and a query that specifies a range in each dimension, find a good approximation of the number of records in the table that satisfy the query. The simplest approach to tackle this problem is to assume that the attributes are independent. More accurate estimators try to capture the joint data distribution of the attributes. In databases, such estimators include the construction of multidimensional histograms, random sampling, or the wavelet transform. In statistics, kernel estimation techniques are being used. Many traditional approaches assume that attribute values come from discrete, finite domains, where different values have high frequencies. However, for many novel applications (as in temporal, spatial, and multimedia databases) attribute values come from the infinite domain of real numbers. Consequently, each value appears very infrequently, a characteristic that affects the behavior and effectiveness of the estimator. Moreover, real-life data exhibit attribute correlations that also affect the estimator. We present a new histogram technique that is designed to approximate the density of multidimensional datasets with real attributes. Our technique defines buckets of variable size and allows the buckets to overlap. The size of the cells is based on the local density of the data. The use of overlapping buckets allows a more compact approximation of the data distribution. We also show how to generalize kernel density estimators and how to apply them to the multidimensional query approximation problem. Finally, we compare the accuracy of the proposed techniques with existing techniques using real and synthetic datasets. The experimental results show that the proposed techniques behave more accurately in high dimensionalities than previous approaches.
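To make the problem statement concrete, the following is a minimal sketch of three ways to estimate the selectivity of a range query: an exact scan, the attribute independence assumption, and a plain kernel estimator over a sample. It is not the histogram technique or the generalized kernel estimator of the paper. The function names, the fixed per-dimension bandwidths, and the choice of a product Gaussian kernel are all assumptions made for illustration.

```python
import math


def exact_selectivity(rows, lo, hi):
    """Fraction of rows inside the box [lo, hi], found by a full scan."""
    inside = sum(
        all(l <= x <= h for x, l, h in zip(row, lo, hi)) for row in rows
    )
    return inside / len(rows)


def independence_selectivity(rows, lo, hi):
    """Product of per-attribute selectivities (attribute independence assumption)."""
    n = len(rows)
    estimate = 1.0
    for j, (l, h) in enumerate(zip(lo, hi)):
        estimate *= sum(l <= row[j] <= h for row in rows) / n
    return estimate


def _normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kernel_selectivity(sample, lo, hi, bandwidths):
    """Average, over the sample points, of the probability mass that a product
    Gaussian kernel centred on each point places inside the query box.
    The bandwidths are assumed to be supplied by the caller (hypothetical choice)."""
    total = 0.0
    for point in sample:
        mass = 1.0
        for x, l, h, b in zip(point, lo, hi, bandwidths):
            mass *= _normal_cdf((h - x) / b) - _normal_cdf((l - x) / b)
        total += mass
    return total / len(sample)


if __name__ == "__main__":
    # Tiny, perfectly correlated table: the second attribute equals the first.
    rows = [(0.1, 0.1), (0.2, 0.2), (0.8, 0.8), (0.9, 0.9)]
    lo, hi = (0.0, 0.5), (0.5, 1.0)  # low first attribute, high second attribute
    print(exact_selectivity(rows, lo, hi))
    print(independence_selectivity(rows, lo, hi))
    print(kernel_selectivity(rows, lo, hi, bandwidths=(0.05, 0.05)))
```

On this correlated toy table no row satisfies the query, yet the independence assumption reports a quarter of the table, because it multiplies two marginal selectivities of one half. The kernel estimate stays close to zero because each kernel is centred on an actual data point and so follows the joint distribution.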