Statistical Inference Used Car Project

MAT10251 Statistical Analysis – Project Part B: Statistical Inference and Regression Analysis

Assessment Overview and Purpose

The MAT10251 Project Part B serves as the summative application of statistical inference, simple linear regression, and multiple linear regression. Utilizing the Oz-Price-Watch dataset, you are required to transition from the descriptive statistics performed in Part A to predictive and inferential modeling. This task assesses your capacity to communicate complex quantitative results to a non-technical audience while maintaining rigorous statistical appendices.

Task 1: Appendices for Statistical Inference and Regression

Your appendices must document the mechanical process of your analysis. Use the Excel Data Analysis ToolPak for all computations. Each question requires defined variables, stated hypotheses ($H_0$ and $H_a$), and a clear decision rule based on your chosen level of significance ($\alpha$).

Question 1: Population Mean Estimation (Topic 5)

Estimate the population mean price of 2016 and 2017 vehicles from your specific sample. Construct a confidence interval (typically 95%) to give Oz-Price-Watch a reliable range for the average cost of two- and three-year-old cars. Filter your data to include only the relevant model years before running Descriptive Statistics or the t-distribution functions.
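
The assessment requires Excel, so the following is only a cross-checking sketch in Python of the filter-then-estimate steps described above. The listings are hypothetical values, not Oz-Price-Watch data, and the function name `mean_confidence_interval` is an illustrative assumption. The t critical value is passed in by hand (for example, as read from Excel's T.INV.2T) because the Python standard library has no t-distribution.

```python
import math
import statistics

def mean_confidence_interval(prices, t_crit):
    """Return (lower, upper) for the population mean, given a t critical value."""
    n = len(prices)
    sample_mean = statistics.mean(prices)
    # statistics.stdev uses the n - 1 divisor (sample standard deviation).
    std_error = statistics.stdev(prices) / math.sqrt(n)
    margin = t_crit * std_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical (model year, price) listings, NOT taken from the real dataset.
listings = [(2015, 14500), (2016, 18900), (2016, 21500), (2017, 23900),
            (2017, 25500), (2018, 30500), (2016, 19900), (2017, 22900)]

# Keep only the 2016 and 2017 model years before estimating.
prices = [price for year, price in listings if year in (2016, 2017)]

# n = 6, so df = 5; 2.571 is the rounded two-tailed 5% critical value for df = 5.
lower, upper = mean_confidence_interval(prices, t_crit=2.571)
print(f"95% CI for mean price: ${lower:,.0f} to ${upper:,.0f}")
```

With a different sample size the critical value must be looked up again for the new degrees of freedom (n - 1).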

Question 2: Hypothesis Testing for Proportions (Topic 6)

Evaluate the claim that white cars constitute more than 30% of the market. This requires a one-sample z-test for proportions with $H_0: p \le 0.30$ and $H_a: p > 0.30$, where $p$ is the population proportion of white cars. A p-value below $\alpha$ supports the claim, which suggests that restricting a search to white vehicles does not significantly limit buyer choice, as they represent a substantial market share.
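
As a hedged illustration of the one-sample z-test for a proportion, the sketch below computes the test statistic and the upper-tail p-value in Python. The counts (78 white cars out of 200) are hypothetical, and the function name `proportion_z_test` is an assumption made for this example; your Excel template remains the required method.

```python
import math
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Upper-tail z-test of H0: p <= p0 against Ha: p > p0."""
    p_hat = successes / n
    # The standard error uses the hypothesised proportion p0, not p_hat.
    std_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / std_error
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value

# Hypothetical sample: 78 white cars among 200 listings.
z, p_value = proportion_z_test(successes=78, n=200, p0=0.30)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"z = {z:.3f}, p-value = {p_value:.4f}, decision: {decision}")
```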

Question 3: Comparison of Means (Topic 7)

Perform an independent samples t-test to determine if a price discrepancy exists between private sellers and used car dealers. You must first test for the equality of variances (F-test) to choose the correct t-test template (Equal vs. Unequal variances). This analysis informs buyers whether the “dealer premium” is statistically significant.
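
The two-step procedure above (variance check first, then the matching t-test) can be sketched as follows. This is a minimal Python illustration with hypothetical prices; the function names are assumptions, and only the test statistics and degrees of freedom are computed. Critical values or p-values would still come from Excel or a statistical table.

```python
import math
import statistics

def f_ratio(a, b):
    """Variance ratio with the larger sample variance on top, plus (df1, df2)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    if var_a >= var_b:
        return var_a / var_b, len(a) - 1, len(b) - 1
    return var_b / var_a, len(b) - 1, len(a) - 1

def pooled_t(a, b):
    """t statistic and df assuming equal population variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """t statistic and Welch-Satterthwaite df assuming unequal variances."""
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1))
    return t, df

# Hypothetical prices, NOT taken from the real dataset.
dealer = [21000, 23500, 22000, 24500, 23000]
private = [19000, 20500, 18500, 21000, 20000]

f_stat, df1, df2 = f_ratio(dealer, private)
print(f"F = {f_stat:.3f} on ({df1}, {df2}) df")
# Compare F with a critical value (for example from Excel's F.INV.RT),
# then report whichever t-test variant the comparison supports.
print("Equal variances:   t = %.3f, df = %d" % pooled_t(dealer, private))
print("Unequal variances: t = %.3f, df = %.2f" % welch_t(dealer, private))
```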

Questions 4 and 5: Predictive Modeling (Topics 8 & 9)

Develop both a Simple Linear Regression (SLR) and a Multiple Linear Regression (MLR) model. In Question 4, select either Age or Odometer as the independent variable ($X$) to predict Price ($Y$). In Question 5, expand this to include Age, Odometer, and Transmission. Note that Transmission must be coded as a dummy variable (e.g., Manual = 0, Automatic = 1) before Excel can process the regression.
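
To illustrate the dummy coding and the two model shapes, here is a minimal Python sketch that fits least-squares coefficients by solving the normal equations. The listings are hypothetical, the helper `fit_ols` is an assumption written for this example (it is not what Excel's Regression tool runs internally), and odometer readings are expressed in thousands of kilometres purely for readability.

```python
def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical listings: (age in years, odometer in '000 km, transmission, price).
cars = [(2, 35, "Automatic", 27500), (3, 60, "Manual", 22900),
        (5, 80, "Automatic", 19500), (7, 140, "Manual", 11900),
        (4, 50, "Automatic", 23500), (8, 120, "Manual", 10500)]

prices = [price for *_, price in cars]

# Question 4 shape: Price predicted from Age alone.
slr = fit_ols([(age,) for age, *_ in cars], prices)

# Question 5 shape: Age, Odometer, and Transmission coded Manual = 0, Automatic = 1.
mlr_rows = [(age, odo, 1 if trans == "Automatic" else 0) for age, odo, trans, _ in cars]
mlr = fit_ols(mlr_rows, prices)

print("SLR [intercept, age]:", [round(b, 2) for b in slr])
print("MLR [intercept, age, odometer, automatic]:", [round(b, 2) for b in mlr])
```

The dummy coefficient is read as the average price difference for an automatic relative to a manual, holding age and odometer fixed.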

Task 2: Written Report Components

The report must translate the technical findings of Task 1 into a professional narrative for Oz-Price-Watch. Focus on the practical implications of the gradients and coefficients. For instance, explain the “slope” of the regression line as the average dollar depreciation for every additional year of the car’s age.
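
As a small illustration of that slope interpretation, the sketch below uses hypothetical coefficients (not results from the dataset) to show that predictions one year apart differ by exactly the slope.

```python
# Hypothetical fitted line: Price = intercept + slope * Age (values are assumptions).
intercept = 32000.0
slope = -2100.0

def predict_price(age_years):
    """Predicted listing price for a car of the given age."""
    return intercept + slope * age_years

# The gap between predictions for consecutive ages equals the slope.
change = predict_price(4) - predict_price(3)
print(f"Each additional year of age changes the predicted price by ${change:,.0f}.")
```

In the report, a negative slope of this kind would be described as an average loss in value per year of age rather than quoted as a coefficient.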

Submission Requirements

  • Length: 500 to 1,100 words (report component).
  • Format: Single Word document containing coversheets, report, and appendices.
  • Evidence: Integrated Excel outputs (tables and plots) within the appendices.

| Section | Key Deliverable | Weighting |
|---|---|---|
| Statistical Appendices | Hypothesis tests, regression equations, Excel tables | 38 marks |
| Written Report | Non-statistical interpretation and conclusions | 12 marks |

Sample Answer Pool

Regression analysis indicates a statistically significant relationship between the age of a used vehicle and its listing price within the Oz-Price-Watch dataset. Calculations for the 2016 and 2017 models provide a 95% confidence interval that allows for precise estimation of the population mean across specified regions. Multiple linear regression outputs suggest that while odometer readings remain primary predictors of value, the inclusion of transmission type as a dummy variable improves the coefficient of determination. Results from the hypothesis test on car color proportions indicate whether the market share of white vehicles exceeds the 30% threshold, above which restricting a search to white cars would not substantially limit consumer choice. Findings from the t-test for independent samples highlight whether a significant price gap exists between private sellers and commercial dealerships. Scholarly work by Yan and Zhao (2022) confirms that multi-variable models provide the necessary robustness for pricing assets in volatile secondary markets (https://doi.org/10.3390/axioms11100512).

Applying multiple linear regression to automotive datasets allows researchers to isolate the specific impact of mechanical variables while filtering out market noise. High coefficient of determination ($R^2$) values in these models signify that age and mileage account for the vast majority of price variance in the 2024–2026 market cycle. Industry data suggests that the transition toward hybrid models is shifting traditional depreciation curves, which makes the analysis of internal combustion vehicles a vital baseline for modern statistical comparisons.
