The following is the data validation report that the seller chose to provide:
Data description
Data from Gaia Data Release 3, curated for a lesson on classification of binary star systems.
Columns:
- parallax - parallax in milliarcseconds (mas)
- l - heliocentric Galactic longitude (degrees)
- b - heliocentric Galactic latitude (degrees)
- astrometric_excess_noise - excess noise of the source in the astrometric fit (mas); see the Gaia documentation for details
- parallax_error - uncertainty in parallax measurement (mas)
- phot_g_mean_mag - G-band apparent magnitude (mag)
- bp_rp - BP minus RP color index, i.e. blue-photometer minus red-photometer magnitude (mag)
- is_binary - target: 0 for a single-star system, 1 for a multi-star system
Learn more:
Gaia Primer on Non-Single Stars
Curation:
The data were queried from Gaia Data Release 3 (DR3) in four parts. First, a train set and a test set were formed via cuts on the built-in "random_index" column in Gaia DR3. Then, to make the data more balanced, additional stars were queried for both the train set and the test set, with the requirement that the non_single_star column be non-zero.
In post-processing, any non-zero value for non_single_star was converted to 1 to allow for binary classification.
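The post-processing step above can be sketched in a few lines of Python. This is a minimal illustration, not the original curation script: the function name `binarize_non_single_star`, the row-as-dictionary layout, and the sample values are all hypothetical. It only assumes that each queried row carries a numeric non_single_star value and that the final label is stored as is_binary.

```python
def binarize_non_single_star(rows):
    """Return copies of the rows with non_single_star replaced by a 0/1 is_binary label.

    Hypothetical helper: any non-zero non_single_star value becomes 1,
    and zero stays 0, as described in the curation notes.
    """
    labeled = []
    for row in rows:
        # Copy every column except the raw flag, leaving the input untouched.
        new_row = {key: value for key, value in row.items() if key != "non_single_star"}
        new_row["is_binary"] = 1 if int(row["non_single_star"]) != 0 else 0
        labeled.append(new_row)
    return labeled


# Made-up sample rows, for illustration only.
sample_rows = [
    {"parallax": 1.2, "non_single_star": 0},
    {"parallax": 0.4, "non_single_star": 1},
    {"parallax": 3.1, "non_single_star": 3},
]
labeled_rows = binarize_non_single_star(sample_rows)
print([row["is_binary"] for row in labeled_rows])  # [0, 1, 1]
```

Dropping the raw flag after labeling keeps the output columns in line with the column list given earlier, where is_binary is the only target column.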
Example query (train set, with no conditions on the non_single_star column):
SELECT gaia_source.parallax, gaia_source.l, gaia_source.b,
  gaia_source.astrometric_excess_noise, gaia_source.parallax_error,
  gaia_source.phot_g_mean_mag, gaia_source.bp_rp, gaia_source.non_single_star
FROM gaiadr3.gaia_source
WHERE gaia_source.parallax > 0
  AND gaia_source.bp_rp IS NOT NULL
  AND gaia_source.random_index BETWEEN 8000000 AND 8500000
To reproduce, use random_index between:
- Train set (with no conditions on non_single_star column) - 8e6 and 8.5e6
- Train set (required non_single_star column != 0) - 0 and 5e6
- Test set (with no conditions on non_single_star column) - 9e6 and 9.1e6
- Test set (required non_single_star column != 0) - 6e6 and 7e6
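As a rough aid for reproducing the four parts, the ranges above can be turned into query strings programmatically. The sketch below makes two assumptions that the description does not confirm: that the `parallax > 0` and `bp_rp IS NOT NULL` cuts were applied to all four parts, and that the non-zero requirement was written as `gaia_source.non_single_star != 0`. The names `PARTS` and `build_query` and the part labels are hypothetical.

```python
# Columns selected in the published query.
COLUMNS = [
    "parallax", "l", "b", "astrometric_excess_noise",
    "parallax_error", "phot_g_mean_mag", "bp_rp", "non_single_star",
]

# (random_index lower bound, upper bound, require non_single_star != 0)
PARTS = {
    "train_all": (8_000_000, 8_500_000, False),
    "train_nss": (0, 5_000_000, True),
    "test_all": (9_000_000, 9_100_000, False),
    "test_nss": (6_000_000, 7_000_000, True),
}


def build_query(lower, upper, require_nss):
    """Build an ADQL query string for one part of the data set."""
    select = ",".join(f"gaia_source.{column}" for column in COLUMNS)
    conditions = [
        "gaia_source.parallax > 0",        # assumed to apply to every part
        "gaia_source.bp_rp IS NOT NULL",   # assumed to apply to every part
        f"gaia_source.random_index BETWEEN {lower} AND {upper}",
    ]
    if require_nss:
        # Assumed form of the "non_single_star column != 0" requirement.
        conditions.append("gaia_source.non_single_star != 0")
    return f"SELECT {select} FROM gaiadr3.gaia_source WHERE " + " AND ".join(conditions)


queries = {name: build_query(*spec) for name, spec in PARTS.items()}
for name, query in queries.items():
    print(name, "->", query)
```

Each string can then be pasted into the Gaia Archive query form or submitted through whichever archive client is preferred; the sketch deliberately stops at building the text so that it does not depend on any particular client library.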
License:
The Gaia data are released under the following license (open source with attribution to ESA/Gaia/DPAC), reproduced here:
>"The Gaia data are open and free to use, provided credit is given to 'ESA/Gaia/DPAC'. In general, access to, and use of, ESA's Gaia Archive (hereafter called 'the website') constitutes acceptance of the following general terms and conditions. Neither ESA nor any other party involved in creating, producing, or delivering the website shall be liable for any direct, incidental, consequential, indirect, or punitive damages arising out of user access to, or use of, the website. The website does not guarantee the accuracy of information provided by external sources and accepts no responsibility or liability for any consequences arising from the use of such data."
All of my course materials are free to use with attribution as well, under a CC BY-NC-SA license.
