patrick at durusau.net
Sat Jun 30 15:15:03 CEST 2012
Ran across an article that will appear in my blog soon but wanted to
share it ahead of time.
Probabilistic merging across databases.
Well, the actual title is:
> SkyQuery: An Implementation of a Parallel Probabilistic Join Engine
> for Cross-Identification of Multiple Astronomical Databases
> Multi-wavelength astronomical studies require cross-identification of
> detections of the same celestial objects in multiple catalogs based on
> spherical coordinates and other properties. Because of the large data
> volumes and spherical geometry, the symmetric N-way association of
> astronomical detections is a computationally intensive problem, even
> when sophisticated indexing schemes are used to exclude obviously
> false candidates. Legacy astronomical catalogs already contain
> detections of more than a hundred million objects while the ongoing
> and future surveys will produce catalogs of billions of objects with
> multiple detections of each at different times. The varying
> statistical error of position measurements, moving and extended
> objects, and other physical properties make it necessary to perform
> the cross-identification using a mathematically correct, proper
> Bayesian probabilistic algorithm, capable of including various priors.
> One time, pair-wise cross-identification of these large catalogs is
> not sufficient for many astronomical scenarios. Consequently, a novel
> system is necessary that can cross-identify multiple catalogs
> on-demand, efficiently and reliably. In this paper, we present our
> solution based on a cluster of commodity servers and ordinary
> relational databases. The cross-identification problems are formulated
> in a language based on SQL, but extended with special clauses. These
> special queries are partitioned spatially by coordinate ranges and
> compiled into a complex workflow of ordinary SQL queries. Workflows
> are then executed in a parallel framework using a cluster of servers
> hosting identical mirrors of the same data sets.
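As a rough illustration of the partitioning step the abstract describes, here is a minimal sketch that splits a right-ascension range into equal coordinate ranges and emits one ordinary SQL query per range. This is not SkyQuery's actual code or query syntax: the function name, the table name "catalog", and the columns obj_id, ra, and dec are all hypothetical, chosen only to make the idea concrete.

```python
def partition_queries(ra_min, ra_max, n_parts, table="catalog"):
    """Split [ra_min, ra_max) into n_parts equal right-ascension ranges
    and return one ordinary SQL query string per range.

    Table and column names are hypothetical placeholders.
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    width = (ra_max - ra_min) / n_parts
    queries = []
    for i in range(n_parts):
        lo = ra_min + i * width
        # Use ra_max exactly for the last range to avoid float drift.
        hi = ra_max if i == n_parts - 1 else ra_min + (i + 1) * width
        queries.append(
            f"SELECT obj_id, ra, dec FROM {table} "
            f"WHERE ra >= {lo:.6f} AND ra < {hi:.6f}"
        )
    return queries


if __name__ == "__main__":
    # Each query could then be handed to a different server in the cluster.
    for query in partition_queries(0.0, 360.0, 4):
        print(query)
```

A real system would also have to deal with detections that fall near a range boundary, since their counterparts in another catalog may land in the neighboring range. This sketch ignores that.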
Key sentence: "One time, pair-wise cross-identification of these large
catalogs is not sufficient for many astronomical scenarios."
I suspect that to be the case for many scenarios, not just astronomical ones.
But how would I reliably interchange the parameters for such queries?
Hope everyone is having a great weekend!
patrick at durusau.net
Former Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
Another Word For It (blog): http://tm.durusau.net