The following is the data validation report that the seller has chosen to provide:
Data description
Global Hotspots of Sharks and Longline Fishing
Machine-Learning-Assisted Spatial Distribution of At-Risk Species
By [source]
About this dataset
> This dataset provides a global assessment of hotspots for shark interactions with industrial longline fisheries. It uses machine-learning techniques to identify at-risk shark species and their spatial distribution patterns, highlighting key risk areas for threatened shark populations. Parameters such as catch size, catch units, fish group, and presence/absence of species can be used to better understand which fishing activities pose a potential threat to sharks and which do not. This information can support more sustainable fishing strategies and help conserve fragile ocean ecosystems for generations to come.
How to use the dataset
> This dataset describes the spatial distribution of shark interactions with industrial longline fisheries. Researchers and conservationists can use it to identify potential risk areas for endangered shark populations in different parts of the world and to develop targeted measures to protect them.
>
> To use the dataset effectively, it helps to understand its structure and content. Each observation includes: .pred_class (predicted class of the observation), pres_abs (presence or absence of the species), catch (catch data for the species), rfmo (Regional Fisheries Management Organization), year (year of the observation), latitude/longitude (location), and a variety of environmental variables such as sea surface temperature, sea surface height, and chlorophyll-a concentration. The catch data have been transformed using various methods so that they are easier to use when developing predictive models.
>
> The dataset also includes prices associated with each observed interaction, as well as results from the machine-learning models: random forest classification and regression settings (the number of variables sampled at each split and the minimum node size) and the resulting error scores, such as mean absolute error. These results can help identify potential hotspots for future interactions between sharks and industrial longline fishing operations, which may in turn inform better policies for protecting threatened shark populations around the world.
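
As a rough illustration of the structure described above, the following Python sketch reads a few rows shaped like IOTC_ll_untuned_final_predict.csv and measures how often the predicted class agrees with the observed presence/absence. The sample rows are invented, the helper names `load_predictions` and `presence_accuracy` are hypothetical, and the assumption that `.pred_class` holds the strings "presence" and "absence" is not confirmed by the dataset description, so the mapping may need adjusting for the real file.

```python
import io

import pandas as pd

# Invented sample rows; the real file has many more columns and rows.
SAMPLE_CSV = """.pred_class,pres_abs,catch,rfmo,year,latitude,longitude
presence,True,12,IOTC,2015,-10.5,60.5
absence,False,0,IOTC,2015,-11.5,60.5
presence,False,0,IOTC,2016,-10.5,61.5
absence,False,0,IOTC,2016,-12.5,62.5
"""


def load_predictions(source):
    """Read a predictions table from a file path or a file-like object."""
    return pd.read_csv(source)


def presence_accuracy(df, positive_label="presence"):
    """Share of rows where the predicted class agrees with pres_abs.

    Assumes `.pred_class` uses `positive_label` for predicted presence and
    that `pres_abs` was parsed as a boolean column (both are assumptions;
    check the labels in the actual file).
    """
    predicted = df[".pred_class"] == positive_label
    observed = df["pres_abs"].astype(bool)
    return float((predicted == observed).mean())


df = load_predictions(io.StringIO(SAMPLE_CSV))
accuracy = presence_accuracy(df)
print(f"Agreement between prediction and observation: {accuracy:.2f}")
```

To run this against the real data, pass the path of the CSV file to `load_predictions` instead of the in-memory sample.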
Research Ideas
> - This dataset can be used to predict future patterns of shark interactions with industrial longline fisheries, as well as to identify hotspots of activity.
> - This dataset can provide valuable insight into how human activities and climate change may be impacting sharks and their environments.
> - This dataset can help provide early warnings for conservation efforts that should focus on particular areas in order to protect threatened species from unsustainable exploitation or other anthropogenic threats (e.g., habitat degradation).
Acknowledgements
> If you use this dataset in your research, please credit the original authors.
> Data Source
License
> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Columns
File: IOTC_ll_untuned_final_predict.csv
Column name | Description |
---|---|
.pred_class | Predicted class of the species (String) |
pres_abs | Presence or absence of the species in the region (Boolean) |
catch | Total catch of the species (Integer) |
rfmo | Regional Fisheries Management Organization (String) |
year | Year of the data (Integer) |
latitude | Latitude of the location (Float) |
longitude | Longitude of the location (Float) |
species_sciname | Scientific name of the species (String) |
catch_units | Units of the catch (String) |
gear_group | Type of fishing gear used (String) |
spatial_notes | Notes about the spatial distribution of the species (String) |
original_effort | Original effort of the fishing gear (Integer) |
species_commonname | Common name of the species (String) |
species_group | Group of the species (String) |
species_resolution | Resolution of the species (String) |
median_price_group | Median price of the group (Float) |
median_price_species | Median price of the species (Float) |
sdm | Species distribution model (String) |
zone | Zone of the location (String) |
location_cluster | Cluster of the location (String) |
mean_sst | Mean sea surface temperature (Float) |
median_sst | Median sea surface temperature (Float) |
min_sst | Minimum sea surface temperature (Float) |
max_sst | Maximum sea surface temperature (Float) |
sd_sst | Standard deviation of sea surface temperature (Float) |
se_sst | Standard error of sea surface temperature (Float) |
cv_sst | Coefficient of variation of sea surface temperature (Float) |
mean_chla | Mean chlorophyll-a concentration (Float) |
median_chla | Median chlorophyll-a concentration (Float) |
min_chla | Minimum chlorophyll-a concentration (Float) |
max_chla | Maximum chlorophyll-a concentration (Float) |
min_ssh | Minimum sea surface height (Float) |
max_ssh | Maximum sea surface height (Float) |
sd_ssh | Standard deviation of sea surface height (Float) |
se_ssh | Standard error of sea surface height (Float) |
cv_ssh | Coefficient of variation of sea surface height (Float) |
bycatch_total_effort_portugal_longline | Total bycatch effort of Portugal longline (Integer) |
bycatch_total_effort_spain_longline | Total bycatch effort of Spain longline (Integer) |
bycatch_total_effort_france_longline | Total bycatch effort of France longline (Integer) |
bycatch_total_effort_india_longline | Total bycatch effort of India longline (Integer) |
bycatch_total_effort_seychelles_longline | Total bycatch effort of Seychelles longline (Integer) |
bycatch_total_effort_taiwan_longline | Total bycatch effort of Taiwan longline (Integer) |
bycatch_total_effort_madagascar_longline | Total bycatch effort of Madagascar longline (Integer) |
bycatch_total_effort_mauritius_longline | Total bycatch effort of Mauritius longline (Integer) |
bycatch_total_effort_united_kingdom_longline | Total bycatch effort of United Kingdom longline (Integer) |
bycatch_total_effort_australia_longline | Total bycatch effort of Australia longline (Integer) |
bycatch_total_effort_mozambique_longline | Total bycatch effort of Mozambique longline (Integer) |
bycatch_total_effort_malaysia_longline | Total bycatch effort of Malaysia longline (Integer) |
bycatch_total_effort_indonesia_longline | Total bycatch effort of Indonesia longline (Integer) |
bycatch_total_effort_kenya_longline | Total bycatch effort of Kenya longline (Integer) |
.final_pred | Predicted class of the species (String) |
bycatch_total_effort | Total bycatch effort (Integer) |
bycatch_total_effort_china_longline | Total bycatch effort of China longline (Integer) |
bycatch_total_effort_korea_longline | Total bycatch effort of Korea longline (Integer) |
bycatch_total_effort_japan_longline | Total bycatch effort of Japan longline (Integer) |
sd_chla | Standard deviation of chlorophyll-a concentration (Float) |
se_chla | Standard error of chlorophyll-a concentration (Float) |
cv_chla | Coefficient of variation of chlorophyll-a concentration (Float) |
mean_ssh | Mean sea surface height (Float) |
median_ssh | Median sea surface height (Float) |
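
One way to turn the columns above into a simple hotspot ranking is to group records by grid cell and total the catch. The sketch below does this on invented rows; `rank_hotspots` is a hypothetical helper, and it assumes all rows share the same `catch_units`, so in practice the table should be filtered to a single unit before summing.

```python
import pandas as pd


def rank_hotspots(df, top_n=3):
    """Rank latitude/longitude cells by total catch.

    Also reports the observed presence rate and the record count per cell.
    Assumes every row uses the same catch unit.
    """
    cells = (
        df.groupby(["latitude", "longitude"], as_index=False)
        .agg(
            total_catch=("catch", "sum"),
            presence_rate=("pres_abs", "mean"),
            n_records=("catch", "size"),
        )
        .sort_values("total_catch", ascending=False)
    )
    return cells.head(top_n).reset_index(drop=True)


# Invented rows for illustration only.
sample = pd.DataFrame(
    {
        "latitude": [-10.5, -10.5, -11.5, -12.5],
        "longitude": [60.5, 60.5, 60.5, 62.5],
        "catch": [12, 8, 0, 3],
        "pres_abs": [True, True, False, True],
    }
)

hotspots = rank_hotspots(sample)
print(hotspots)
```

Ranking by `presence_rate` instead of `total_catch` gives a view that is less sensitive to differences in fishing effort between cells.
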
File: WCPFC_ll_models_others_results.csv
Column name | Description |
---|---|
environmental_value | The environmental value associated with the area. (Float) |
include_ssh | Whether or not sea surface height was included in the model. (Boolean) |
price | The price of the data. (Float) |
catch_transformation | The transformation applied to the catch data. (String) |
mtry_class | The number of variables randomly sampled at each split in the classification model. (Integer) |
min_n_class | The minimum number of observations in a node for a split to be attempted in the classification model. (Integer) |
mtry_reg | The number of variables randomly sampled at each split in the regression model. (Integer) |
min_n_reg | The minimum number of observations in a node for a split to be attempted in the regression model. (Integer) |
rmse | The root mean square error. (Float) |
rsq | The coefficient of determination. (Float) |
mae | The mean absolute error. (Float) |
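
Assuming each row of the results file describes one model configuration, the best configuration can be picked by taking the row with the lowest error. The sketch below uses invented values; `best_configuration` is a hypothetical helper, and the `catch_transformation` labels shown are placeholders, not values taken from the file.

```python
import pandas as pd


def best_configuration(results, metric="rmse"):
    """Return the row with the lowest value of the chosen error metric."""
    return results.loc[results[metric].idxmin()]


# Invented values for illustration only.
results = pd.DataFrame(
    {
        "include_ssh": [True, False, True],
        "catch_transformation": ["log", "log", "none"],
        "mtry_reg": [3, 5, 3],
        "min_n_reg": [10, 5, 20],
        "rmse": [1.8, 2.1, 2.6],
        "rsq": [0.42, 0.35, 0.20],
        "mae": [1.1, 1.3, 1.7],
    }
)

best = best_configuration(results)
print(best[["include_ssh", "mtry_reg", "min_n_reg", "rmse", "mae"]])
```

Passing `metric="mae"` selects by mean absolute error instead. For `rsq`, higher is better, so `idxmax` would be the appropriate choice there.
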
