悠悠

Taiwan Air Quality Observation Data (2018-2022)

business, computer science

85.33MB

Data ID: D17222539740033099

Published: 2024/07/29


Data Description

Updated on 2024/1/18

English Description

This dataset is a collection of air quality measurements taken at monitoring stations throughout Taiwan from 2018 to 2022. Each record contains hourly readings of pollutants and environmental factors, which makes it possible to compare air quality across regions and over time.

Initial merging and preprocessing were performed with the MergeData.py script, which combined individual station records into annual datasets. The TransformDataframe.py script then transformed these datasets into their current format for analysis. This involved pivoting the hourly measurements into one column per parameter and integrating the datetime information.
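The pivot step described above can be illustrated with a minimal pandas sketch. This is not the code of TransformDataframe.py, which is not shown here. It assumes a hypothetical raw layout with one row per station, date, and parameter, and one column per hour ("00", "01", ...). The column names `station`, `date`, and `item` and the function `to_hourly_wide` are illustrative assumptions.

```python
import pandas as pd

def to_hourly_wide(raw: pd.DataFrame) -> pd.DataFrame:
    """Pivot an assumed raw layout into one row per station and hour."""
    id_cols = ["station", "date", "item"]  # hypothetical column names
    hour_cols = [c for c in raw.columns if c not in id_cols]

    # Wide hours -> long format: one row per station, date, item, hour.
    long = raw.melt(id_vars=id_cols, value_vars=hour_cols,
                    var_name="hour", value_name="value")

    # Integrate date and hour into a single datetime column.
    long["datetime"] = (pd.to_datetime(long["date"])
                        + pd.to_timedelta(long["hour"].astype(int), unit="h"))

    # Long format -> one column per measured parameter.
    wide = long.pivot(index=["station", "datetime"],
                      columns="item", values="value")
    wide.columns.name = None
    return wide.reset_index()

# Tiny made-up example: two parameters, two hours, one station.
raw = pd.DataFrame({
    "station": ["A", "A"],
    "date": ["2018-01-01", "2018-01-01"],
    "item": ["PM2.5", "O3"],
    "00": [12.0, 30.0],
    "01": [15.0, 28.0],
})
wide = to_hourly_wide(raw)
print(wide)
```

The result has one row per station and hour, with a `datetime` column followed by one column per parameter, which is the shape the description above attributes to the published files.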

The dataset encompasses a range of measured parameters such as:

  • AMB_TEMP: Ambient Temperature
  • CH4: Methane
  • CO: Carbon Monoxide
  • NMHC: Non-Methane Hydrocarbons
  • NO, NO2, NOx: Nitrogen Oxides
  • O3: Ozone
  • PM10, PM2.5: Particulate Matter
  • RAINFALL: Rainfall
  • RH: Relative Humidity
  • SO2: Sulfur Dioxide
  • THC: Total Hydrocarbons
  • WD_HR: Wind Direction Hourly
  • WIND_DIREC: Wind Direction
  • WIND_SPEED: Wind Speed
  • WS_HR: Wind Speed Hourly

Missing values are denoted by blank entries, reflecting the unavailability of data for certain parameters at specific times. The dataset is structured to facilitate both time series analysis and cross-sectional environmental studies, serving as a resource for researchers, policymakers, and the public to understand and address air quality issues in Taiwan.

Notes on Data Values:

  • '#' indicates values marked invalid by the instrument.
  • '*' indicates values marked invalid by software.
  • 'x' indicates values marked invalid after manual checking.
  • 'A' indicates values that are invalid due to suspected instrument malfunction.
  • Blank entries represent missing values.
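One way to handle the flags listed above before analysis is sketched below with pandas. The description does not say whether a flag replaces the reading or is appended to it (for example "30#"), so this sketch treats any non-numeric cell as invalid and records a trailing flag character when one is present. The function name `clean_parameter` and the sample values are illustrative assumptions, and the columns are assumed to have been read as strings.

```python
import pandas as pd

def clean_parameter(series: pd.Series) -> pd.DataFrame:
    """Split a raw parameter column into numeric values and flag codes."""
    text = series.str.strip()

    # Anything that is not a plain number (flagged or blank) becomes NaN.
    values = pd.to_numeric(text, errors="coerce")

    # Keep the trailing flag character, if any, so the reason for the
    # invalid value is not lost ('#', '*', 'x', or 'A').
    flag = text.str.extract(r"([#*xA])$", expand=False)

    return pd.DataFrame({"value": values, "flag": flag})

# Made-up sample: a valid reading, flagged readings, and missing entries.
sample = pd.Series(["12.5", "30#", "x", "", None, "7*"], dtype=object)
cleaned = clean_parameter(sample)
print(cleaned)
```

Keeping the flag in a separate column lets an analysis distinguish readings that were never recorded (blank) from readings that were recorded and then rejected.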

Scripts Used:

  • MergeData.py: Script for merging individual station records into annual datasets.
  • TransformDataframe.py: Script for transforming the merged data into an analysis-ready format.
