鱼泪

True Car Listings 2017 Project

united states, business, automobiles and vehicles

6

Sold: 0
95.24MB

Data ID: D17220618693047978

Published: 2024/07/27

The following is the data verification report the seller chose to provide:

Data Description

Context

This project is my first database creation. Using real-life TrueCar.com listing data, scraped and posted publicly by another Kaggle user, I set out on my own to build, preprocess, and scrutinize the data, first by designing a schema for a PostgreSQL 13 database and running several queries based on self-assigned questions. Using Jupyter Notebook, I then ran the data through Python's pandas and scikit-learn packages for basic regression analysis. Finally, I created a dashboard in Tableau Public for helpful visualizations.

Content

The dataset keeps every column of the original and adds one: Region. The original columns are id, price, year, mileage, city, state, vin, make, and model. Adding the Region column was a self-assigned SQL task: after the original file was loaded into PostgreSQL, I created a new table, "Regions", in the database. This data is used to visualize sales across six regions of the U.S.: Pacific, Rockies, Southwest, Midwest, Southeast, and Northeast. City and State were also combined into a new column so that data can be viewed by unique city in cases where several cities share the same name (e.g. Pasadena, Arlington, etc.).
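The Region lookup and the combined City/State key can be sketched in plain Python. This is only an illustration, not the SQL used in the project: the `STATE_TO_REGION` mapping is a hypothetical, partial assignment (the description does not say which states fall in which region), the `add_region_and_city_state` helper and the `region` / `city_state` field names are made up for this sketch, and the sample rows are invented rather than taken from the dataset.

```python
# Hypothetical, partial state-to-region lookup. The description names the six
# regions but not which states belong to each, so these assignments are assumptions.
STATE_TO_REGION = {
    "CA": "Pacific",
    "CO": "Rockies",
    "TX": "Southwest",
    "IL": "Midwest",
    "FL": "Southeast",
    "VA": "Southeast",
    "NY": "Northeast",
}


def add_region_and_city_state(listing):
    """Return a copy of a listing dict with 'region' and 'city_state' added."""
    state = listing["state"].strip().upper()
    city = listing["city"].strip()
    enriched = dict(listing)
    # States missing from the lookup are flagged rather than guessed.
    enriched["region"] = STATE_TO_REGION.get(state, "Unknown")
    # Combining city and state separates same-named cities in different states.
    enriched["city_state"] = f"{city}, {state}"
    return enriched


# Illustrative rows only (not taken from the dataset).
listings = [
    {"city": "Pasadena", "state": "CA"},
    {"city": "Pasadena", "state": "TX"},
    {"city": "Arlington", "state": "VA"},
]
enriched = [add_region_and_city_state(row) for row in listings]

for row in enriched:
    print(row["city_state"], "->", row["region"])
```

With the combined key, the two Pasadena rows stay distinct, which a grouping on city name alone would not achieve.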

PostgreSQL | See my Database Creation Notes here.
Python | See my notebook for performing simple analysis.
Tableau | A dashboard can be found in my Tableau Public profile.

Acknowledgements

The dataset utilizes a .csv file extracted from www.TrueCar.com, scraped by Kaggle user Evan Payne (https://www.kaggle.com/jpayne/852k-used-car-listings/data?select=tc20171021.csv).
