Data description
Background
For an investigation into speed disparities in internet service offers, published last week at The Markup, reporters Leon Yin and Aaron Sankin examined more than 1 million address-specific offers across dozens of US cities. To support the findings, they’ve shared the raw data gathered from ISPs’ websites, as well as tabular files that summarize each offer and attach the contextual variables used for the analysis.
Source: https://github.com/the-markup/investigation-isp
Feature information
The dataset consists of five CSV files. Each file covers a single internet service provider: AT&T has two files, and EarthLink, CenturyLink, and Verizon have one each. The following table describes the features.
column | description |
---|---|
address_full | The complete postal address of a household we searched. |
incorporated_place | The incorporated city that the address belongs to. |
major_city | The city that the address is in. |
state | The state that the address is in. |
lat | The address’s latitude. From OpenAddresses or NYC Open Data. |
lon | The address’s longitude. From OpenAddresses or NYC Open Data. |
block_group | The Census block group of the address, as of 2019. From the Census Geocoder API based on lat and lon. |
collection_datetime | The Unix timestamp at which the address was used to query the provider's website. |
provider | The internet service provider. |
speed_down | Cheapest advertised download speed for the address. |
speed_up | Cheapest advertised upload speed for the address. |
speed_unit | The unit of speed. This is always in megabits per second (Mbps). |
price | The cost in USD of the cheapest advertised internet plan for the address. |
technology | The kind of technology (fiber or non-fiber) used to serve the cheapest internet plan. |
package | The name of the cheapest internet plan. |
fastest_speed_down | The advertised download speed of the fastest package. This is usually the same as the cheapest plan if the speed_down is less than 200 Mbps. |
fastest_speed_price | The advertised price of the fastest internet package for the address. |
fn | The name of the file of API responses from which this record was parsed. Useful for troubleshooting. API responses are hosted externally in AWS S3. |
redlining_grade | The redlining grade, merged from Mapping Inequality based on the lat and lon of the address. |
race_perc_non_white | The share of people of color (not non-Hispanic White) in the address's Census block group, expressed as a proportion. Sourced from the 2019 5-year American Community Survey. |
median_household_income | The median household income in the address's Census block group. Sourced from the 2019 5-year American Community Survey. |
income_lmi | median_household_income divided by the city median household income (sourced from U.S. Census Bureau). |
income_dollars_below_median | City median household income minus the median_household_income. |
ppl_per_sq_mile | People per square mile, used as the measure of population density. Sourced from 2019 TIGER shapefiles from the U.S. Census Bureau. |
n_providers | The number of other wired competitors in the address's Census block group. Sourced from FCC Form 477. |
internet_perc_broadband | The share of the population already subscribed to broadband in the address's Census block group, expressed as a proportion. |
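
The derived columns in the table can be reproduced from the raw ones. The following is a minimal sketch, not the reporters' actual pipeline. It assumes that collection_datetime is in seconds, and it uses an invented two-row sample and a hypothetical city median income (`CITY_MEDIAN_INCOME`). The `enrich` helper and the `collected_at` key are illustrative names that do not appear in the dataset.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical two-row sample with a subset of the columns described above.
# All values are invented for illustration; none come from the real dataset.
SAMPLE_CSV = """address_full,provider,speed_down,price,collection_datetime,median_household_income
"1 Example St, Springfield",Example ISP,25,55.0,1650000000,40000
"2 Example Ave, Springfield",Example ISP,300,55.0,1650003600,90000
"""

# Assumed city median household income (the real figure comes from the U.S. Census Bureau).
CITY_MEDIAN_INCOME = 60000.0


def enrich(row, city_median):
    """Return a copy of row with a parsed timestamp and the two derived income columns."""
    income = float(row["median_household_income"])
    out = dict(row)
    # Assumption: the Unix timestamp is expressed in seconds.
    out["collected_at"] = datetime.fromtimestamp(
        float(row["collection_datetime"]), tz=timezone.utc
    )
    # income_lmi: block-group median income divided by the city median.
    out["income_lmi"] = income / city_median
    # income_dollars_below_median: city median minus block-group median.
    out["income_dollars_below_median"] = city_median - income
    return out


rows = [enrich(r, CITY_MEDIAN_INCOME) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for r in rows:
    print(
        r["address_full"],
        r["collected_at"].isoformat(),
        round(r["income_lmi"], 3),
        r["income_dollars_below_median"],
    )
```

An income_lmi below 1 (or a positive income_dollars_below_median) marks a block group whose median income is below the city median. To work with the real files, replace the in-memory sample with an opened CSV file from the repository.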
