凤凤

IDAO 2022

chemistry, energy, intermediate, cnn, image, regression


Sold: 0
50.16 MB

Data ID: D17222370017271832

Published: 2024/07/29

The following is the data verification report that the seller chose to provide:

Data description

Introduction

Two-dimensional transition metal dichalcogenides (TMDCs) are a relatively new class of materials with remarkable semiconducting, metallic, magnetic, superconducting, and optical properties. Their chemical composition is MX₂, where M is a transition metal (most commonly molybdenum or tungsten) and X is a chalcogen, usually sulfur or selenium. Atomically thin TMDCs usually contain various defects, which enrich the lattice structure and give rise to many intriguing properties. Engineered point defects in two-dimensional (2D) materials offer an attractive platform for solid-state devices that exploit tailored optoelectronic, quantum emission, and resistive properties. Naturally occurring defects are unavoidable and also contribute significantly to material properties and performance. The immense variety and complexity of possible defects make it challenging to experimentally control, probe, or understand atomic-scale defect-property relationships. The figure above shows vacancy and substitution defects in an 8x8 MoS₂ crystal lattice.

The band gap is an important physical attribute of a material, from which qualities such as electrical conductivity, catalytic power, and photo-optical properties can be derived. It is the energy difference between the valence band and the conduction band, and is closely related to the energy difference between the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO). Materials whose valence and conduction bands overlap, or whose band gap is very small, are conductors; materials with a small band gap are semiconductors; and materials with a large band gap are insulators.

The task is to predict the band gap energy of each crystal structure.

Input format

The training dataset contains 5933 crystal structures. Each structure is stored as a JSON file named with a unique identifier and contains a serialized pymatgen structure (see the pymatgen documentation for reference), which holds the crystal parameters, the Cartesian coordinates of each atom, the atom types, and other information.
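A minimal sketch of reading one structure file, assuming the JSON is the dictionary form produced by pymatgen's `Structure.as_dict()` and that the file name (without the extension) is the structure's identifier. The helper names `load_structure_dict` and `to_pymatgen` are hypothetical and not part of the dataset or of pymatgen.

```python
import json
from pathlib import Path


def load_structure_dict(path):
    """Read one structure file and return (identifier, raw dictionary)."""
    path = Path(path)
    with path.open() as f:
        data = json.load(f)
    # Assumption: the file stem is the structure's unique identifier.
    return path.stem, data


def to_pymatgen(data):
    """Convert the raw dictionary into a pymatgen Structure.

    Assumption: the dictionary follows the Structure.as_dict() layout.
    pymatgen is imported lazily so the JSON reader works without it.
    """
    from pymatgen.core import Structure
    return Structure.from_dict(data)
```

With pymatgen installed, `to_pymatgen(data)` should return an object whose `lattice`, `cart_coords`, and `species` attributes expose the crystal parameters, Cartesian coordinates, and atom types mentioned above.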

The targets are stored in a CSV file named targets.csv with two columns: the first is the unique identifier of the structure and the second is its band gap value. The train and test sets are separated into public and private folders.

The public sample contains 2966 examples. The private sample contains 2967 examples.
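The targets file can be read with the standard library alone. The sketch below relies only on column position (identifier first, band gap second), as described above. It assumes the first row is a header, which the description does not state, and the function name `read_targets` is hypothetical.

```python
import csv


def read_targets(csv_path):
    """Return a dict mapping structure identifier -> band gap value."""
    targets = {}
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # Assumption: the first row is a header; skip it.
        for row in reader:
            if len(row) < 2:
                continue  # Ignore blank or malformed rows.
            targets[row[0]] = float(row[1])
    return targets
```

The resulting dictionary can then be joined with the structure files by identifier, for example by looking up each JSON file's stem.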

Quality Metric

Energy within Threshold (EwT) is designed to measure the practical usefulness of a model for replacing density functional theory (DFT) calculations by evaluating whether the predicted energy is close to the ground truth (DFT energy). EwT is defined as the fraction of structures for which the predicted energy is within ϵ = 0.02 eV (electronvolt) of the ground truth energy.

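$$\mathrm{EwT} = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left( \left| E_{\mathrm{predicted},i} - E_{\mathrm{DFT},i} \right| < \epsilon \right)$$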

Where N is the number of samples in the dataset indexed by i.
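As an illustration of the definition, the metric can be computed in a few lines. This is a sketch, not the official scoring code. The function name `energy_within_threshold` is hypothetical, and the strict inequality follows the formula above.

```python
def energy_within_threshold(predicted, reference, eps=0.02):
    """Fraction of samples with |predicted - reference| < eps (values in eV)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one sample is required")
    # Count the samples whose absolute error falls strictly below the threshold.
    hits = sum(abs(p - r) < eps for p, r in zip(predicted, reference))
    return hits / len(predicted)
```

For example, with predictions `[1.00, 1.50, 0.30, 1.10]` and ground truth `[1.01, 1.55, 0.30, 1.20]`, the absolute errors are 0.01, 0.05, 0.00, and 0.10 eV. Two of the four fall below 0.02 eV, so EwT = 0.5.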

