The Last of Us Reviews


video games, nlp, text mining, text

￥20

Sold: 0

14.44MB

Data ID: D17171585160254876

Published: 2024/05/31

Context

The spread of COVID-19 worried all of us 😷. In that sense, a zombie pandemic has always been a popular topic. It is certainly a horrible way to end our existence, so these stories were very violent and the characters were simply trying to survive. That's great; however, in this century many projects have added another facet: the social and psychological consequences for the characters in that world.

That's how we got here. The Last of Us is a masterpiece of the video game industry, and many experts, critics and websites agree. Its story is built precisely on that hopeless, post-apocalyptic situation. One strong point was its exploration of this type of event. Another, no less important, was the gameplay and the interactions. The game won many awards and was perhaps a pioneer in its category 🙌. You can find the reasons for its success in the section reviews_g1 and then draw insights for similar future games.

The following year a DLC was released: Left Behind. It serves as a prologue to the events of the original game, with Ellie as the main character. In this way, the character and her actions are better understood. The game was well received. You can analyze it in the section reviews_lb and identify the reviews about Ellie and her friendship. 😄

Finally, The Last of Us Part II (and the reason I wanted to create this dataset). Its reviews are sharply divided 🤔. It's amazing to see such high divergence. Personally, I like this game too; it has incredible graphics and is very realistic. But I understand the other point of view. You surely know some of the reasons, such as the inconsistency in character decisions or the changes from the trailers. Other reasons exist, though, and you can analyze them in depth in the section reviews_g2 and, if possible, propose a predictive model. In this case you can start here.

Now a series will be released. We all hope it will be a success 🎉🎉

Content

This Kaggle dataset contains information scraped from Metacritic using Scrapy and BeautifulSoup. More information about the web scraping is available in the accompanying GitHub repository. The dataset contains 3 main sections: The Last of Us Part II, The Last of Us, and The Last of Us: Left Behind, each containing two types of files: users and critics.

The collection methodology is explained below:
- The sample: The scraped reviews are the most recommended reviews. In one case it was possible to download all reviews, but in the others it was not (technically it is possible, but it is not good to abuse web scraping on a website). However, the retrieved information is sufficient for further analysis. Across the 6 files, there are a total of 40000 observations and 8 variables. Have fun!
- Set of items: Game users and/or fans of the sequel (or critics). Some may be bots, but that is just a hypothesis. Also, user reviews far outnumber critic reviews.
- Set of variables: All user data contains the following variables.

Variable	Description
Id	The nickname of the game user. It is a unique value.
Review	The text of the user's review
Type_review	Some reviews are long or contain spoilers. Expanded marks those; normal is the rest.
Views	Number of views of the review
Votes	Number of votes the review received
Date	Date when the review was published
Language	Language used in the review
Score	Score given by the user. This is the target.

Critic data contains only Id, Review, Date and Score.
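As a rough sketch of how the two layouts could be checked when loading, the snippet below reads user and critic rows and fails early if a listed column is missing. It assumes the files are CSV with headers matching the variable names in the table; the `load_reviews` helper and the in-memory sample rows (names, dates, scores) are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Column names as listed in the variable table (assumed to match the file headers).
USER_COLUMNS = ["Id", "Review", "Type_review", "Views", "Votes", "Date", "Language", "Score"]
CRITIC_COLUMNS = ["Id", "Review", "Date", "Score"]


def load_reviews(handle, expected_columns):
    """Read CSV rows as dicts and raise if an expected column is missing."""
    reader = csv.DictReader(handle)
    missing = [c for c in expected_columns if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


# Made-up sample rows standing in for real files (values are illustrative only).
sample_users = io.StringIO(
    "Id,Review,Type_review,Views,Votes,Date,Language,Score\n"
    "player_one,Great story and characters,normal,10,8,2020-06-19,English,10\n"
    "player_two,Did not enjoy the pacing,expanded,5,1,2020-06-20,English,3\n"
)
sample_critics = io.StringIO(
    "Id,Review,Date,Score\n"
    "some_outlet,A landmark sequel,2020-06-12,95\n"
)

users = load_reviews(sample_users, USER_COLUMNS)
critics = load_reviews(sample_critics, CRITIC_COLUMNS)
print(len(users), "user rows;", len(critics), "critic rows")
```

With a real file, the `io.StringIO` object would be replaced by `open(path, newline="", encoding="utf-8")`. Note that `csv.DictReader` returns every value as a string, so Score, Views and Votes need converting before any numeric work.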

An update: I created new files, the ones whose names end in u. These files are duplicates of the originals; I only added two new variables:

Variable	Description
Platform	The set now contains information about PS3 and PS4 reviews
Split	For modeling and the tasks

P.S. 1: Please check out the tasks. If you are interested, please propose a notebook 😊. If the dataset is not enough and you think more variables are needed, please let me know in the discussions.

P.S. 2: The Id is no longer unique in tables with the Platform variable. It is in fact a gamer ID, and the same gamer can write a review on both platforms.
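Because Id repeats in the files with a Platform column, it may help to check which key identifies a row before joining or deduplicating. The sketch below counts repeated Ids, tests whether the (Id, Platform) pair is unique, and groups rows by Split. The rows and the literal values `"ps3"`, `"ps4"`, `"train"` and `"test"` are assumptions for illustration; the source does not state how these columns are encoded, nor that (Id, Platform) is unique in the real files.

```python
from collections import Counter, defaultdict

# Illustrative rows only; the Platform and Split values are assumed encodings.
rows = [
    {"Id": "player_one", "Platform": "ps3", "Split": "train", "Score": "10"},
    {"Id": "player_one", "Platform": "ps4", "Split": "train", "Score": "9"},
    {"Id": "player_two", "Platform": "ps4", "Split": "test", "Score": "3"},
]

# Id alone can repeat, so list the Ids that appear more than once.
id_counts = Counter(row["Id"] for row in rows)
repeated_ids = sorted(i for i, n in id_counts.items() if n > 1)

# Test whether (Id, Platform) works as a composite key for these rows.
pair_counts = Counter((row["Id"], row["Platform"]) for row in rows)
pairs_unique = all(n == 1 for n in pair_counts.values())

# Group rows by the Split column so each partition can be used separately.
by_split = defaultdict(list)
for row in rows:
    by_split[row["Split"]].append(row)

print("repeated ids:", repeated_ids)
print("(Id, Platform) unique:", pairs_unique)
print({name: len(group) for name, group in by_split.items()})
```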

Usage

- Text classification: The main task for this type of dataset. Vectorize the reviews and define a predictive model.
- Identify the strong and weak points of each game.
- Compare the games: Which is preferred? On which points? Why is one game better than another?
- Dimensionality reduction: Detect similar words and then cluster the reviews.

P.S. Important: use discretion. Some reviews are disrespectful, violent and difficult to read 😅. And obviously they contain spoilers.
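For the text classification task, one possible starting point is a bag-of-words representation with a small multinomial Naive Bayes classifier. The sketch below is written in plain Python so it runs without extra packages; in practice a library such as scikit-learn would be the usual choice. The toy reviews, the `label_from_score` helper and its threshold of 5 are assumptions for illustration, not part of the dataset or its tasks.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_from_score(score, threshold=5):
    """Binarize a numeric score; the threshold of 5 is an arbitrary assumption."""
    return "positive" if score > threshold else "negative"


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        self.class_counts = Counter(labels)      # number of documents per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokenize(text):
                if token in self.vocab:  # skip words never seen in training
                    score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy (review, score) pairs standing in for rows of a user file.
train = [
    ("amazing story and beautiful graphics", 10),
    ("a masterpiece, great characters", 9),
    ("terrible pacing and boring plot", 2),
    ("awful character decisions, boring", 1),
]
texts = [text for text, _ in train]
labels = [label_from_score(score) for _, score in train]
model = NaiveBayes().fit(texts, labels)

print(model.predict("great graphics and amazing characters"))
print(model.predict("boring and terrible"))
```

Turning Score into two classes is only one way to frame the target; predicting the raw score, or using more than two classes, are equally valid choices.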

Acknowledgements

Thanks to Kaggle and its community. In general, thanks to the learners and teachers in machine learning, deep learning and computer vision.

Inspiration

Natural language processing is a great tool. One application I'm interested in is detecting bullying messages on any social network. I know that many notebooks and papers exist, but I'd like to build a bot that detects all possible cases, and surely they exist!
