Shreecharana24/restaurant_data_analysis

Restaurant Tips Analysis

A statistical and visual analysis of restaurant tipping behavior using Python, Pandas, NumPy, SciPy, and Matplotlib.

This project explores patterns in tipping habits, outliers, gender differences, and the relationship between bill amount and tip amount.

Project Overview

This analysis covers:

  • Dataset cleaning
  • Descriptive statistics
  • Exploratory visualizations
  • Outlier detection using IQR
  • Confidence interval estimation
  • Gender-based tip comparison
  • Correlation & regression analysis
  • Scatter plot with regression line

Files

restaurant-tips-analysis/
│
├── analysis.ipynb (or analysis.py)    # Analysis code
├── Set 17 - restaurant tips.csv       # Dataset
└── README.md

1. Dataset Overview and Cleaning

  • Cleaning actions: removed duplicate rows and dropped rows with missing values in total_bill or tip.
  • Result:
    • Original rows: 244
    • After cleaning: 221
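
The cleaning step described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name clean_tips and the tiny in-memory frame are made up for the example, and only the column names total_bill, tip, and size come from this README.

```python
import numpy as np
import pandas as pd


def clean_tips(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows, then rows missing total_bill or tip."""
    deduped = df.drop_duplicates()
    return deduped.dropna(subset=["total_bill", "tip"]).reset_index(drop=True)


# Made-up rows standing in for the CSV (NOT the project's data).
raw = pd.DataFrame(
    {
        "total_bill": [16.99, 16.99, 10.34, np.nan, 21.01],
        "tip": [1.01, 1.01, 1.66, 3.50, np.nan],
        "size": [2, 2, 3, 2, 3],
    }
)
cleaned = clean_tips(raw)
print(f"Original rows: {len(raw)}, after cleaning: {len(cleaned)}")
```

With the real dataset, raw would presumably come from pd.read_csv("Set 17 - restaurant tips.csv") instead of the hand-built frame.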

2. Basic Descriptive Statistics

Computed for: total_bill, tip, size

Metric     total_bill     tip    size
Mean            20.97    3.31    2.57
Median          17.92    3.00    2.00
Std Dev         11.06    2.32    0.94
Min              3.07    1.00    1
Max             73.40   18.00    6
Range           70.33   17.00    5
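
A table like the one above can be produced with a few pandas reductions. The sketch below is illustrative only: describe_columns is a hypothetical helper, the demo frame is made up, and it assumes the standard deviation is the sample version (ddof=1, the pandas default), which this README does not state.

```python
import pandas as pd


def describe_columns(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Return mean, median, std dev, min, max and range for each column."""
    sub = df[columns]
    summary = pd.DataFrame(
        {
            "Mean": sub.mean(),
            "Median": sub.median(),
            "Std Dev": sub.std(),  # assumption: sample std (ddof=1)
            "Min": sub.min(),
            "Max": sub.max(),
        }
    )
    summary["Range"] = summary["Max"] - summary["Min"]
    return summary.T  # metrics as rows, variables as columns


# Made-up values for illustration (NOT the project's data).
demo = pd.DataFrame(
    {
        "total_bill": [10.0, 20.0, 30.0],
        "tip": [1.0, 2.0, 3.0],
        "size": [2, 2, 4],
    }
)
stats_table = describe_columns(demo, ["total_bill", "tip", "size"])
print(stats_table.round(2))
```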

3. Visual Exploration

Plots created using Matplotlib:

  • Histogram of total_bill
  • Histogram of tip
  • Boxplot of tips by day
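
The three plots listed above might be drawn along these lines. This is a sketch under assumptions: the column name day is a guess based on the "tips by day" wording, the helper plot_overview and the demo rows are made up, and the bin count and figure size are arbitrary choices.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_overview(df: pd.DataFrame):
    """Draw two histograms and a boxplot of tips grouped by day."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))

    axes[0].hist(df["total_bill"], bins=20, edgecolor="black")
    axes[0].set_title("Histogram of total_bill")
    axes[0].set_xlabel("Total bill ($)")
    axes[0].set_ylabel("Count")

    axes[1].hist(df["tip"], bins=20, edgecolor="black")
    axes[1].set_title("Histogram of tip")
    axes[1].set_xlabel("Tip ($)")
    axes[1].set_ylabel("Count")

    # One box per day; "day" is an assumed column name.
    days = sorted(df["day"].unique())
    groups = [df.loc[df["day"] == d, "tip"].to_numpy() for d in days]
    axes[2].boxplot(groups)
    axes[2].set_xticks(list(range(1, len(days) + 1)))
    axes[2].set_xticklabels(days)
    axes[2].set_title("Tips by day")
    axes[2].set_ylabel("Tip ($)")

    fig.tight_layout()
    return fig


# Made-up rows for illustration (NOT the project's data).
demo = pd.DataFrame(
    {
        "total_bill": [12.5, 18.0, 25.4, 31.2, 9.8, 22.1],
        "tip": [2.0, 3.0, 4.5, 5.0, 1.5, 3.5],
        "day": ["Fri", "Sat", "Sat", "Fri", "Fri", "Sat"],
    }
)
fig = plot_overview(demo)
```

Calling plt.show() afterwards displays the figure interactively, and fig.savefig("overview.png") writes it to a file instead.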

4. Outlier Detection (IQR Method)

Rule: a value is an outlier if it lies below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR, where IQR = Q3 − Q1.

  • Total bill outliers: 12
  • Tip outliers: 12
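
The IQR rule can be sketched in a few lines of pandas. The function name iqr_outliers and the sample values are made up for illustration; quartiles here use pandas' default linear interpolation, which may or may not match what the analysis used.

```python
import pandas as pd


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values below Q1 - k*IQR or above Q3 + k*IQR."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]


# Made-up tips for illustration (NOT the project's data).
tips = pd.Series([1.0, 2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 18.0])
outliers = iqr_outliers(tips)
print(f"Tip outliers: {len(outliers)} -> {outliers.tolist()}")
```

For these sample values, Q1 = 2.375 and Q3 = 3.625, so the fences are 0.5 and 5.5 and only the 18.0 tip is flagged.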

5. 95% Confidence Interval for Mean Tip

Uses the t-distribution to estimate the population mean tip from the sample.

  • Mean tip: $3.31
  • 95% CI: ($3.00, $3.62)
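
As a rough check, the interval can be recomputed from the summary statistics reported in this README. The sketch assumes n = 221 (the row count after cleaning) and that 2.32 is the sample standard deviation of tip; mean_ci is a hypothetical helper, not a function from the project.

```python
import math

from scipy import stats


def mean_ci(mean: float, std: float, n: int, confidence: float = 0.95):
    """t-based confidence interval for a mean from summary statistics."""
    standard_error = std / math.sqrt(n)
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    margin = t_crit * standard_error
    return mean - margin, mean + margin


# Values taken from the tables above; n = 221 is an assumption.
low, high = mean_ci(mean=3.31, std=2.32, n=221)
print(f"95% CI: (${low:.2f}, ${high:.2f})")
```

Under those assumptions the result rounds to the same ($3.00, $3.62) interval reported above. On raw data, scipy.stats.sem can supply the standard error directly.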

6. Gender Comparison of Tips

Performed Levene's test (equality of variances), followed by an independent two-sample t-test.

  • Sample Size: Male (131), Female (81)
  • Avg Tip: Male ($3.42), Female ($3.03)
  • Variances equal? Yes (p = 0.2672)
  • Difference significant? No (t = 1.2339, p = 0.2186)

Conclusion: The difference in mean tip between the male and female groups is not statistically significant (p = 0.2186).
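
The two-step comparison could look something like the sketch below. It assumes a 0.05 significance level and SciPy's default Levene variant (centered on the median), neither of which is stated in this README; compare_groups and the sample arrays are made up for illustration.

```python
from scipy import stats


def compare_groups(a, b, alpha: float = 0.05) -> dict:
    """Levene's test for equal variances, then an independent t-test."""
    levene_stat, levene_p = stats.levene(a, b)
    # Fall back to Welch's t-test if the variances look unequal.
    equal_var = bool(levene_p > alpha)
    t_stat, t_p = stats.ttest_ind(a, b, equal_var=equal_var)
    return {
        "levene_p": float(levene_p),
        "equal_var": equal_var,
        "t_stat": float(t_stat),
        "t_p": float(t_p),
        "significant": bool(t_p < alpha),
    }


# Made-up tips for illustration (NOT the project's data).
male_tips = [3.0, 3.5, 2.0, 4.0, 5.0, 2.5]
female_tips = [2.5, 3.0, 3.5, 2.0, 4.5, 3.0]
result = compare_groups(male_tips, female_tips)
print(f"Variances equal? {result['equal_var']} (p = {result['levene_p']:.4f})")
print(f"Difference significant? {result['significant']} "
      f"(t = {result['t_stat']:.4f}, p = {result['t_p']:.4f})")
```

With a DataFrame, the two groups would be selected by the gender column first, for example df.loc[df["sex"] == "Male", "tip"], where the column name sex is an assumption.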

7. Correlation & Regression

  • Pearson correlation: 0.774
    • Interpretation: Strong positive relationship (Larger bills → larger tips).
  • Regression Model: Tip = -0.105 + 0.163 × Total Bill
  • R²: 0.600
    • Interpretation: 60% of variance in tip is explained by the bill amount.
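
A simple linear fit like the one reported above can be obtained with scipy.stats.linregress. The sketch below is illustrative: fit_tip_model and predict_tip are hypothetical helpers, the sample data is made up, and the default coefficients in predict_tip are the ones quoted in this section.

```python
from scipy import stats


def fit_tip_model(bills, tips) -> dict:
    """Fit tip = intercept + slope * total_bill by least squares."""
    fit = stats.linregress(bills, tips)
    return {
        "r": fit.rvalue,  # Pearson correlation
        "intercept": fit.intercept,
        "slope": fit.slope,
        "r_squared": fit.rvalue ** 2,  # share of variance explained
    }


def predict_tip(total_bill: float, intercept: float = -0.105,
                slope: float = 0.163) -> float:
    """Apply the fitted line; defaults are the coefficients reported above."""
    return intercept + slope * total_bill


# Made-up data for illustration (NOT the project's data).
bills = [10.0, 20.0, 30.0, 40.0]
tips = [1.5, 3.0, 5.0, 6.5]
model = fit_tip_model(bills, tips)
print(f"r = {model['r']:.3f}, R² = {model['r_squared']:.3f}")
print(f"Tip = {model['intercept']:.3f} + {model['slope']:.3f} × Total Bill")
print(f"Reported model, $20 bill: {predict_tip(20.0):.3f}")
```

For simple regression with one predictor, R² is the square of the Pearson correlation, which is consistent with the reported values (0.774² ≈ 0.60).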

8. Scatter Plot with Regression Line

Scatter plot showing the relationship between bill and tip, with fitted regression line:

  • Positive linear trend
  • Clear upward slope
  • Some scatter around the line, but a strong overall pattern
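
One way to draw such a plot is sketched below, using np.polyfit for the line. The helper plot_bill_vs_tip, the sample values, and the styling choices are assumptions made for the example rather than the project's own plotting code.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_bill_vs_tip(bills, tips):
    """Scatter tips against bills and overlay a least-squares line."""
    x = np.asarray(bills, dtype=float)
    y = np.asarray(tips, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 fit: highest power first

    fig, ax = plt.subplots(figsize=(7, 5))
    ax.scatter(x, y, alpha=0.6, label="Observed tips")
    x_line = np.array([x.min(), x.max()])
    ax.plot(
        x_line,
        intercept + slope * x_line,
        color="red",
        label=f"Tip = {intercept:.3f} + {slope:.3f} × Total Bill",
    )
    ax.set_xlabel("Total bill ($)")
    ax.set_ylabel("Tip ($)")
    ax.set_title("Total bill vs. tip")
    ax.legend()
    return fig, ax


# Made-up data for illustration (NOT the project's data).
fig, ax = plot_bill_vs_tip(
    [10.0, 15.5, 20.0, 27.3, 34.8, 48.2],
    [1.5, 2.5, 3.0, 4.0, 5.5, 7.0],
)
```

As with any Matplotlib figure, plt.show() displays it and fig.savefig(...) saves it.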

Requirements

  • pandas
  • numpy
  • matplotlib
  • scipy

Install dependencies using pip:

pip install pandas numpy matplotlib scipy

How to Run

Run the script via terminal:

python analysis.py

Or, if the analysis is in notebook form, open analysis.ipynb in Jupyter.
