Yannick Kurvers, 8008701
3c)
Log transformations are applied to linearise non-linear relationships between variables, so that
the model better reflects how ratings work in practice. For the number of positive and negative ratings, the
marginal impact is not constant: an additional rating has a larger effect for vendors with few ratings
than for vendors with thousands of ratings. After the logarithmic transformation, the marginal effect of
an additional rating decreases as the number of ratings increases, which captures this dynamic.
Because ln(0) is undefined, 1 is added to each rating count before taking the logarithm. This
also allows sellers without ratings to be included in the analysis, with a transformed value of ln(1) = 0.
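The diminishing marginal effect and the role of the added 1 can be illustrated with a small sketch. The rating counts below are made up for illustration and are not taken from the dataset; the function names are hypothetical.

```python
import math


def log1p_rating(count: int) -> float:
    """Return ln(count + 1), which is defined even when count is 0."""
    return math.log1p(count)


def marginal_gain(count: int) -> float:
    """Change in the transformed value caused by one additional rating."""
    return log1p_rating(count + 1) - log1p_rating(count)


# Illustrative rating counts (assumed values, not from the dataset).
for n in (0, 10, 1000):
    print(n, round(log1p_rating(n), 4), round(marginal_gain(n), 4))
```

A seller with no ratings gets the value 0, and the gain from one extra rating shrinks as the count grows.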
The sales price (the dependent variable) is also logged. This is intended to model
multiplicative changes: a change in the independent variables has a proportional effect on price, not
an absolute effect. This is particularly useful in analyses of economic data, where relative changes are
often more important than absolute differences.
Together, these transformations give a more accurate and more intuitive picture of how ratings and other
factors affect the selling price.
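Written out, the specification described above might look as follows. This is a sketch: the raw variable names price, sepos and seneg are assumed from the names ln_sepos and ln_seneg, and the vector x collects the remaining control variables.

```latex
\ln(\text{price}_i) = \beta_0
  + \beta_1 \ln(\text{sepos}_i + 1)
  + \beta_2 \ln(\text{seneg}_i + 1)
  + \mathbf{x}_i^{\top}\boldsymbol{\gamma}
  + \varepsilon_i ,
\qquad
\frac{\partial \ln(\text{price}_i)}{\partial \ln(\text{sepos}_i + 1)} = \beta_1 .
```

Under this form, the coefficients on the logged rating counts can be read as elasticities of price with respect to the number of ratings plus one.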
3e)
The results of our regression analysis are most similar to the results in Table 3 (OLS 1) of
Przepiorka (2013). Most of the coefficients are similar in direction and significance, e.g. the variables
list, paypal and the categorical variables cap and format. The share of explained variance (R² =
0.6246) in our analysis is close to the R² of 0.67 in the paper. The small difference in R² may be due to
differences in how we prepared the data.
Notable differences can be seen for ln_sepos (positive ratings) and ln_seneg
(negative ratings). In our analysis, the coefficient of ln_sepos is not significant and close to zero
(-0.0003), while in the paper it is positive and significant (0.078). Likewise, the
coefficient of ln_seneg in our analysis is not significant (0.005), while in the paper it is negative and
significant (-0.065).
These differences can possibly be explained by differences in the dataset, the way outliers
were treated, or other differences in model specification. Despite these differences, the overall
results show that our analysis is broadly consistent with the findings of Przepiorka (2013).
Table 1. OLS Regression Results Model 1
Variable                 Coefficient   (SE)
_Cons                    2.755***
ln_sepos                 -0.0003       (0.0039)
ln_seneg                 0.0054        (0.0049)
List                     0.147***      (0.013)
Seisid                   -0.232***     (0.019)
Sehasme                  -0.132***     (0.011)
Paypal                   -0.078***     (0.019)
Ptrans                   0.183***      (0.014)
Cap_1                    -1.361***     (0.068)
Cap_2                    -5.179***     (0.033)
Cap_3                    -2.562***     (0.024)
Cap_4                    -1.797***     (0.016)
Cap_5                    -0.842***     (0.009)
Format_2                 0.040***      (0.009)
Format_3                 0.626***      (0.014)
Brand_2                  -0.176***     (0.014)
Brand_3                  -0.173***     (0.015)
Cond_1                   0.731***      (0.036)
Cond_3                   0.164***      (0.014)
Cond_4                   -0.088***     (0.022)
Seorig_1                 -0.072***     (0.016)
Seorig_2                 0.666***      (0.019)
Seorig_4                 0.027         (0.026)
Seorig_5                 -0.133***     (0.021)
Seorig_6                 -0.148***     (0.026)
Seorig_7                 0.345***      (0.025)
Seorig_8                 0.181***      (0.035)
Seorig_9                 0.135***      (0.033)
Seorig_10                0.425***      (0.047)
Tusbr                    0.817***      (0.061)
Tcase                    0.817***      (0.061)
Tadap                    -0.994***     (0.036)
Number of observations   35,456
R²                       0.6246
*** p < 0.01
4a)
Our regression results show a coefficient of -0.0003 for ln_sepos. This suggests that the number
of positive ratings has almost no effect on the logarithmic selling price. Moreover, the effect is not
statistically significant (p = 0.933), so we cannot conclude that there is any relationship between
positive ratings and price.
For negative ratings, the regression shows a coefficient of 0.005 for ln_seneg. Taken at face
value, this would mean that an increase in negative ratings has a small positive effect on the
logarithmic selling price. However, this effect is also not significant (p = 0.265), so we cannot draw
reliable conclusions about the relationship between negative ratings and price.
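The significance statements above can be roughly checked from the coefficients and standard errors in Table 1. The sketch below uses a normal approximation, which seems reasonable with 35,456 observations. Because it starts from rounded table values, its p-values will differ slightly from the ones reported by the regression software. The function name is hypothetical.

```python
import math


def two_sided_p(coef: float, se: float) -> float:
    """Two-sided p-value for H0: coefficient = 0, using a normal approximation."""
    z = abs(coef / se)
    # P(|Z| > z) for a standard normal Z equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# Coefficients and standard errors as reported in Table 1 (rounded values).
estimates = {
    "ln_sepos": (-0.0003, 0.0039),
    "ln_seneg": (0.0054, 0.0049),
}

for name, (coef, se) in estimates.items():
    p = two_sided_p(coef, se)
    print(f"{name}: z = {coef / se:.2f}, p = {p:.3f}, significant at 1%: {p < 0.01}")
```

Both test statistics are far below the critical value for the 1% level, in line with the absence of stars in Table 1.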
Because both the price and the rating counts are logged, the coefficients can be read as
elasticities. The coefficient of ln_sepos is -0.0003, meaning that a 1% increase in positive ratings is
associated with a negligible change of about -0.0003% in the sales price. This effect is extremely small
and not significant, so it has little meaning in practical terms.
The coefficient of ln_seneg is 0.005, suggesting that a 1% increase in negative ratings is
associated with a slight increase of about 0.005% in the selling price. Again, this effect is not
significant and hence cannot be reliably interpreted.
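The percentage interpretation can be made concrete with a short sketch. It assumes the log-log form described in 3c, so the regressor is strictly the rating count plus one. The function name is hypothetical, and the coefficients are the rounded values from Table 1.

```python
def pct_change_in_price(elasticity: float, pct_increase: float) -> float:
    """Percentage change in price implied by a log-log coefficient.

    With both sides logged, scaling the regressor by (1 + g) scales the
    price by (1 + g) ** elasticity.
    """
    growth = 1.0 + pct_increase / 100.0
    return (growth ** elasticity - 1.0) * 100.0


# Coefficients as reported in Table 1 (rounded values).
B_SEPOS = -0.0003
B_SENEG = 0.0054

for label, beta in (("ln_sepos", B_SEPOS), ("ln_seneg", B_SENEG)):
    # Implied price change for a 1% increase and for a doubling of the regressor.
    print(label, pct_change_in_price(beta, 1.0), pct_change_in_price(beta, 100.0))
```

Even a doubling of the rating count implies a price change of well under 1% for both coefficients, which underlines how small the estimated effects are.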