AlphaNum Dataset

Abstract

The AlphaNum dataset is a collection of 108,791 grayscale images of handwritten characters and numerals, as well as special characters, each sized 24x24 pixels. The dataset is designed to support Optical Character Recognition (OCR) research and development.

For consistency, images extracted from the MNIST dataset have been color-inverted to match the grayscale aesthetics of the AlphaNum dataset.

Data Sources

In an effort to maintain uniformity, the dataset files have been resized to 24x24 pixels and recolored from white-on-black to black-on-white.
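The resizing and recoloring step described above might look roughly like the following sketch. The function name `normalize_glyph` and the synthetic 28x28 sample are illustrative assumptions; the dataset authors' actual preprocessing code is not given in this document.

```python
import numpy as np
from PIL import Image, ImageOps


def normalize_glyph(image, size=(24, 24)):
    """Convert to grayscale, resize, and invert white-on-black to black-on-white.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    gray = image.convert("L")        # force single-channel grayscale
    resized = gray.resize(size)      # uses Pillow's default resampling filter
    return ImageOps.invert(resized)  # white-on-black becomes black-on-white


# Synthetic white-on-black sample standing in for a 28x28 source image.
sample = np.zeros((28, 28), dtype=np.uint8)
sample[10:18, 10:18] = 255  # a white square on a black background
result = normalize_glyph(Image.fromarray(sample))
print(result.size)
```

After the call, the background is light and the stroke is dark, which matches the black-on-white convention the section describes.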

Dataset Structure

Instance Description

Each dataset instance contains an image of a handwritten character or numeral, paired with its corresponding ASCII label.

Data Organization

The dataset is organized into three separate .zip files: train.zip, test.zip, and validation.zip. Each ASCII symbol is housed in a dedicated folder, the name of which corresponds to the ASCII value of the symbol.

train.zip size: 55.9 MB
test.zip size: 16 MB
validation.zip size: 8.06 MB
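One way to read the folder-per-symbol layout described above is sketched below. It assumes that folder names are decimal ASCII codes (for example `65` for `A`) and that any non-numeric folder name, such as one holding the 'null' category, is kept as the label unchanged. Both points are assumptions, and `list_labeled_images` is a hypothetical helper rather than part of the dataset.

```python
import os
import tempfile

from PIL import Image


def list_labeled_images(root):
    """Return (path, label) pairs for every file under root.

    Assumption: numeric folder names are decimal ASCII codes; any other
    folder name is used as the label as-is.
    """
    samples = []
    for folder in sorted(os.listdir(root)):
        folder_path = os.path.join(root, folder)
        if not os.path.isdir(folder_path):
            continue
        label = chr(int(folder)) if folder.isdecimal() else folder
        for name in sorted(os.listdir(folder_path)):
            samples.append((os.path.join(folder_path, name), label))
    return samples


# Build a tiny stand-in for an extracted archive and list its contents.
with tempfile.TemporaryDirectory() as root:
    for folder in ("65", "48", "null"):
        os.makedirs(os.path.join(root, folder))
        Image.new("L", (24, 24), 255).save(os.path.join(root, folder, "0.png"))
    labels = [label for _, label in list_labeled_images(root)]

print(labels)
```

Keeping the listing separate from image decoding lets the same helper feed whichever training framework is in use.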

Dataset Utility

The AlphaNum dataset caters to a variety of use cases including text recognition, document processing, and machine learning tasks. It is particularly instrumental in the development, fine-tuning, and enhancement of OCR models.

Null Category Image Generation

The 'null' category comprises images generated by injecting noise to mimic randomly distributed light pixels. Training with a 'null' label lets the model learn to disregard regions of the input that contain no character. As a result, the model becomes better at recognizing letters while ignoring irrelevant parts, which improves its performance in real-world OCR tasks.

The 'null'-labelled images in this dataset were generated using the following Python script. (Note that the script is non-deterministic because no random seed is set, so running it again will most likely produce different images.)

import os

import numpy as np
from PIL import Image, ImageOps, ImageEnhance


def generate_noisy_images(num_images, image_size=(24, 24), output_dir='NoisyImages', image_format='JPEG'):
    # Create the output directory if it does not exist yet
    os.makedirs(output_dir, exist_ok=True)

    for i in range(num_images):
        # Draw a random contrast factor (normal distribution, mean 30, std 15)
        variation_scale = abs(np.random.normal(30, 15))

        # Generate random noise with reduced strength (at most 5% of full scale)
        noise = np.random.rand(image_size[0], image_size[1]) * 0.05
        noise = (noise * 255).astype(np.uint8)

        # Create a PIL image from the noise
        image = Image.fromarray(noise, mode='L')  # 'L' for grayscale

        # Invert the image so the background is light
        inverted_image = ImageOps.invert(image)

        # Enhance the contrast by the random factor drawn above
        enhancer = ImageEnhance.Contrast(inverted_image)
        contrast_enhanced_image = enhancer.enhance(variation_scale)

        # Save the image
        contrast_enhanced_image.save(os.path.join(output_dir, f'{i}.jpg'), format=image_format)


generate_noisy_images(5000)
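If repeatable output is needed, the noise step can be driven by a seeded generator. The sketch below is an assumption-laden variant, not the authors' method: `noise_array` is a hypothetical helper, the seed value is arbitrary, and it uses NumPy's `default_rng` instead of the global `np.random` functions, so it will not reproduce the images shipped in the dataset.

```python
import numpy as np


def noise_array(rng, image_size=(24, 24), strength=0.05):
    """Return low-amplitude uint8 noise drawn from the given generator.

    Hypothetical helper mirroring the noise step of the script above.
    """
    noise = rng.random(image_size) * strength  # floats in [0, strength)
    return (noise * 255).astype(np.uint8)


# Two generators created with the same (arbitrary) seed give identical noise.
first = noise_array(np.random.default_rng(42))
second = noise_array(np.random.default_rng(42))
print(np.array_equal(first, second))
```

Passing the generator in explicitly keeps the random state local to the call, which makes it easier to reproduce a single image without reseeding global state.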
