Data Description
About Dataset
Context
Over the past decade, internet users have gained access to vast amounts of information online, but this has come with a growing number of deceptive articles designed to advertise or promote a product or ideology. In addition, hidden sponsored news articles have become more prevalent in recent years as news organizations have shifted their business strategies to account for developments in technology and content consumption. For this reason, a system for detecting these deceptive practices is more important than ever.
Content
This dataset consists of articles that were tagged by users as having a "promotional tone" (promotional.csv) and of articles that were tagged as "good articles" (good.csv).
Each promotional article can have multiple labels (descriptions quoted from the Wikipedia tags):
- advert - "This article contains content that is written like an advertisement."
- coi - "A major contributor to this article appears to have a close connection with its subject."
- fanpov - "This article may be written from a fan's point of view, rather than a neutral point of view."
- pr - "This article reads like a press release or a news article or is largely based on routine coverage or sensationalism."
- resume - "This biographical article is written like a résumé."
The "good articles" are articles that were deemed "well written, contain factually accurate and verifiable information, are broad in coverage, neutral in point of view, stable, and illustrated."
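As a rough illustration of how the two files might be combined for a promotional-versus-good classification task, here is a minimal sketch using only the Python standard library. The column layout is an assumption, not something stated in this description: it supposes that promotional.csv has a `text` column plus one 0/1 column per label, and that good.csv has a `text` column. The helper names (`read_promotional`, `read_good`, `build_binary_dataset`) are hypothetical and exist only for this example.

```python
import csv
import io

# Label names come from the tag list above; the column layout is an assumption.
LABELS = ["advert", "coi", "fanpov", "pr", "resume"]


def read_promotional(handle):
    """Yield (text, active_labels) pairs from a promotional.csv-style file.

    Assumes a 'text' column plus one 0/1 column per label (hypothetical schema).
    """
    for row in csv.DictReader(handle):
        # Treat a missing or empty label cell as "not tagged".
        active = [name for name in LABELS if (row.get(name) or "0").strip() == "1"]
        yield row["text"], active


def read_good(handle):
    """Yield (text, []) pairs from a good.csv-style file (assumes a 'text' column)."""
    for row in csv.DictReader(handle):
        yield row["text"], []


def build_binary_dataset(promo_handle, good_handle):
    """Combine both files into (text, is_promotional, labels) tuples."""
    data = [(text, 1, labels) for text, labels in read_promotional(promo_handle)]
    data += [(text, 0, labels) for text, labels in read_good(good_handle)]
    return data


# Tiny in-memory stand-ins for the real files, so the sketch runs on its own.
promo_sample = io.StringIO(
    "text,advert,coi,fanpov,pr,resume\n"
    "Buy now!,1,0,0,1,0\n"
)
good_sample = io.StringIO("text\nA neutral summary.\n")

dataset = build_binary_dataset(promo_sample, good_sample)
```

With the real files, the in-memory samples would be replaced by handles such as `open("promotional.csv", newline="", encoding="utf-8")`, after checking the actual header row against the assumed column names. Because an article can carry several tags at once, the sketch keeps the labels as a list rather than collapsing them into a single class.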
