Data Description
Context
When a data scientist wishes to include a categorical variable with more than two levels in a multiple regression prediction model, additional steps are needed to ensure that the results are interpretable. These steps include recoding the categorical variable into a number of separate, dichotomous variables. This recoding is called "dummy coding." For the rest of the chapter to make sense, some specific topics related to multiple regression are reviewed first.
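As a minimal sketch of dummy coding, the Python below recodes a categorical variable with k levels into k - 1 dichotomous (0/1) variables. The function name dummy_code, the choice of reference level, and the department data are hypothetical and used only for illustration.

```python
def dummy_code(values, reference=None):
    """Recode a categorical variable with k levels into k - 1 0/1 variables.

    The reference level is coded as all zeros. By default the first level
    in sorted order is used, which is an arbitrary choice for this sketch.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} not found in data")
    # One dichotomous variable per non-reference level.
    kept = [lvl for lvl in levels if lvl != reference]
    rows = [[1 if v == lvl else 0 for lvl in kept] for v in values]
    return kept, rows


# Hypothetical example data: a variable with three levels.
department = ["sales", "tech", "admin", "tech", "sales"]
columns, coded = dummy_code(department, reference="admin")
# columns is ["sales", "tech"]; the "admin" rows are coded [0, 0].
```

Only k - 1 variables are created because the remaining level is already identified by zeros on all of them; adding a k-th variable would make the predictors linearly dependent once an intercept is included.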
Content
Regression is a linear transformation of the X variables such that the sum of squared deviations between the observed and predicted values of Y is minimized. The prediction of Y is accomplished by the following equation:
Y'i = b0 + b1X1i + b2X2i + ... + bkXki
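Here Y'i is the predicted value of Y for case i, and X1i through Xki are that case's values on the k predictor variables. The "b" values are called regression weights, with b0 serving as the intercept, and are computed in a way that minimizes the sum of squared deviations.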
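One way to see how the "b" values can be computed is the sketch below, which solves the normal equations in plain Python. It is an illustration under stated assumptions, not a production routine: the names fit_least_squares and predict and the small data set are hypothetical, and a statistics library would normally be used instead.

```python
def fit_least_squares(X, y):
    """Return weights [b0, b1, ..., bk] minimizing sum((y_i - y'_i)^2).

    Solves the normal equations (A^T A) b = A^T y, where A is X with a
    leading column of ones for the intercept b0.
    """
    A = [[1.0] + [float(v) for v in row] for row in X]
    n = len(A[0])
    # Build the augmented matrix [A^T A | A^T y].
    M = [[sum(r[i] * r[j] for r in A) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(A, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def predict(weights, row):
    """Apply Y' = b0 + b1*X1 + ... + bk*Xk to one case."""
    return weights[0] + sum(b * x for b, x in zip(weights[1:], row))


# Hypothetical data: two dummy-coded predictors for a three-level variable.
# Rows coded [0, 0] belong to the reference level.
X = [[0, 0], [0, 0], [1, 0], [1, 0], [0, 1], [0, 1]]
y = [10.0, 12.0, 15.0, 17.0, 20.0, 24.0]
b0, b1, b2 = fit_least_squares(X, y)
```

In this example the group means of Y are 11, 16, and 22. Up to floating-point rounding, b0 comes out as 11 (the reference group mean), and b1 and b2 come out as 5 and 11 (each remaining group's mean minus the reference group mean), which is what makes dummy-coded weights interpretable.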
Inspiration
How can categorical variables with more than two levels be used in a regression prediction model?
