Data Description
About
This dataset lists over 215k of the most-starred GitHub projects, each with more than 167 stars. Every record carries a set of useful attributes, listed in the Columns table.
I collected this dataset using the GitHub Search API. The API returns only the first thousand results for a query, so I looped through low/high star-count pairs, each chosen so that the query `stars:{low}..{high}` returns fewer than a thousand repositories.
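The range-splitting idea described above can be sketched roughly as follows. This is a minimal illustration, not the author's actual script: the names `split_star_ranges`, `build_query`, and `count_in_range` are hypothetical, the bisection strategy is an assumption about how the low/high pairs could be found, and `count_in_range` stands in for whatever call reports how many repositories match a given star range.

```python
from typing import Callable, List, Tuple

SEARCH_CAP = 1000  # only the first thousand results of a query are retrievable


def build_query(low: int, high: int) -> str:
    """Format a star-range qualifier in the form quoted in the text."""
    return f"stars:{low}..{high}"


def split_star_ranges(
    low: int,
    high: int,
    count_in_range: Callable[[int, int], int],  # hypothetical: matches for [lo, hi]
    cap: int = SEARCH_CAP,
) -> List[Tuple[int, int]]:
    """Bisect [low, high] until every sub-range reports fewer than `cap` matches."""
    pending = [(low, high)]
    ranges: List[Tuple[int, int]] = []
    while pending:
        lo, hi = pending.pop()
        total = count_in_range(lo, hi)
        if total == 0:
            continue  # nothing to fetch in this range
        if total < cap or lo == hi:
            # A single star value cannot be split further; if it still
            # reaches the cap, its results would be truncated.
            ranges.append((lo, hi))
            continue
        mid = (lo + hi) // 2
        pending.append((mid + 1, hi))
        pending.append((lo, mid))
    return sorted(ranges)
```

Each returned pair can then be passed to `build_query` and paged through in full, since it stays under the thousand-result limit.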
The GitHub API Terms of Service apply.

> You may not use this dataset for spamming purposes, including for the purposes of selling GitHub users' personal information, such as to recruiters, headhunters, and job boards.

Columns
Column name | Description |
---|---|
Name | The name of the GitHub repository |
Description | A brief textual description that summarizes the purpose or focus of the repository |
URL | The URL or web address that links to the GitHub repository, which is a unique identifier for the repository |
Created At | The date and time when the repository was initially created on GitHub, in ISO 8601 format |
Updated At | The date and time of the most recent update or modification to the repository, in ISO 8601 format |
Homepage | The URL to the homepage or landing page associated with the repository, providing additional information or resources |
Size | The size of the repository in bytes, indicating the total storage space used by the repository's files and data |
Stars | The number of stars or likes that the repository has received from other GitHub users, indicating its popularity or interest |
Forks | The number of times the repository has been forked by other GitHub users |
Issues | The total number of open issues |
Watchers | The number of GitHub users who are "watching" or monitoring the repository for updates and changes |
Language | The primary programming language |
License | Information about the software license using a license identifier |
Topics | A list of topics or tags associated with the repository, helping users discover related projects and topics of interest |
Has Issues | A boolean value indicating whether the repository has an issue tracker enabled |
Has Projects | A boolean value indicating whether the repository uses GitHub Projects to manage and organize tasks and work items |
Has Downloads | A boolean value indicating whether the repository offers downloadable files or assets to users |
Has Wiki | A boolean value indicating whether the repository has an associated wiki with additional documentation and information |
Has Pages | A boolean value indicating whether the repository has GitHub Pages enabled, allowing the creation of a website associated with the repository |
Has Discussions | A boolean value indicating whether the repository has GitHub Discussions enabled, allowing community discussions and collaboration |
Is Fork | A boolean value indicating whether the repository is a fork of another repository |
Is Archived | A boolean value indicating whether the repository is archived. Archived repositories are typically read-only and are no longer actively maintained |
Is Template | A boolean value indicating whether the repository is set up as a template |
Default Branch | The name of the default branch |
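As a rough illustration of how these columns might be typed when reading the data, the sketch below converts the counts, ISO 8601 timestamps, and boolean flags from text. It assumes the data ships as a CSV whose headers match the column names above and that booleans are stored as the text `True`/`False` (in any case); neither detail is confirmed here. The helper names and the sample row are made up for the example.

```python
import csv
import io
from datetime import datetime

INT_COLUMNS = ["Size", "Stars", "Forks", "Issues", "Watchers"]
DATE_COLUMNS = ["Created At", "Updated At"]
BOOL_COLUMNS = [
    "Has Issues", "Has Projects", "Has Downloads", "Has Wiki", "Has Pages",
    "Has Discussions", "Is Fork", "Is Archived", "Is Template",
]


def parse_bool(text: str) -> bool:
    # Assumption: booleans are stored as "True"/"False" in any letter case.
    return text.strip().lower() == "true"


def parse_timestamp(text: str) -> datetime:
    # Map a trailing "Z" to an explicit UTC offset so that older Python
    # versions of datetime.fromisoformat accept the value.
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def parse_row(row: dict) -> dict:
    """Return a copy of `row` with known columns converted from text."""
    parsed = dict(row)
    for name in INT_COLUMNS:
        if row.get(name):
            parsed[name] = int(row[name])
    for name in DATE_COLUMNS:
        if row.get(name):
            parsed[name] = parse_timestamp(row[name])
    for name in BOOL_COLUMNS:
        if row.get(name):
            parsed[name] = parse_bool(row[name])
    return parsed


# Made-up sample row, for illustration only (not taken from the dataset).
sample = io.StringIO(
    "Name,Stars,Created At,Is Fork\n"
    "example-repo,168,2020-01-02T03:04:05Z,False\n"
)
rows = [parse_row(r) for r in csv.DictReader(sample)]
```

Columns that are absent or empty are left untouched, so the same helper works on a subset of the columns.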
Validation Report
The following is the data validation report that the seller has chosen to provide:

Most Popular Github Repositories (Projects)
67.02MB