View the Project on GitHub alyssrod/recipes-and-ratings-analysis
This project analyzes the Recipes and Ratings dataset to explore how recipe characteristics (e.g., prep time, ingredient count) influence calories. I built a regression model that predicts calorie content based solely on features a user would see before cooking.
Name: Alyssa Rodriguez
Email: alyssrod@umich.edu
This histogram shows that most recipes fall between 100–800 kcal, though there are some high outliers, likely rich desserts or large dishes.
This scatterplot suggests a mild positive relationship: recipes with longer prep times tend to have slightly higher calories.
This table groups recipes by number of ingredients and shows the average calorie content of each group. The trend suggests that recipes with more ingredients generally have more calories.
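A grouped average like the one described above can be sketched in a few lines. The (n_ingredients, calories) pairs below are made up for illustration and are not taken from the dataset; the variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up (n_ingredients, calories) pairs, for illustration only.
recipes = [(3, 180.0), (3, 220.0), (8, 410.0), (8, 390.0), (12, 650.0)]

# Collect the calorie values observed for each ingredient count.
calories_by_count = defaultdict(list)
for n_ingredients, calories in recipes:
    calories_by_count[n_ingredients].append(calories)

# Average calories per ingredient count, ordered by ingredient count.
average_calories = {n: mean(vals) for n, vals in sorted(calories_by_count.items())}
```

With a pandas DataFrame, the same summary would normally be a groupby on n_ingredients followed by a mean of calories.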
I treated this as a regression problem: calories are a continuous value with meaningful differences (e.g., 300 vs 900 kcal).
Target Variable: calories
Metric: RMSE (Root Mean Squared Error)
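As a reminder of what the metric measures, here is a minimal sketch of RMSE in plain Python. The calorie values are made up for illustration; the helper name rmse is not from the project.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up calorie values, for illustration only.
actual = [300.0, 900.0, 450.0]
predicted = [350.0, 800.0, 450.0]
error = rmse(actual, predicted)  # sqrt((50**2 + 100**2 + 0**2) / 3)
```

Because the errors are squared before averaging, the single 100 kcal miss contributes four times as much as the 50 kcal miss, and the result is in the same units as the target (kcal).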
I used only features known at the time of prediction: prep time, ingredient count, tag length, etc.
Features: minutes, n_steps, n_ingredients, tag_count, step_length_avg
Engineered features: minutes_per_step, ingredient_density
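The write-up does not give formulas for minutes_per_step and ingredient_density, so the sketch below assumes the most literal reading of the names: both are ratios over n_steps. Treat these definitions, and the helper name add_derived_features, as assumptions rather than the project's actual code.

```python
def add_derived_features(recipe):
    """Return a copy of a recipe record with two ratio features added.

    Assumed definitions (not stated in the write-up):
      minutes_per_step   = minutes / n_steps
      ingredient_density = n_ingredients / n_steps
    """
    n_steps = max(recipe["n_steps"], 1)  # guard against division by zero
    enriched = dict(recipe)
    enriched["minutes_per_step"] = recipe["minutes"] / n_steps
    enriched["ingredient_density"] = recipe["n_ingredients"] / n_steps
    return enriched

# Made-up recipe record, for illustration only.
example = add_derived_features({"minutes": 40, "n_steps": 8, "n_ingredients": 12})
```

Both ratios use only values available before cooking, which is consistent with the time-of-prediction constraint described above.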
Hyperparameters: max_depth, n_estimators, min_samples_split
This scatterplot compares the final model’s predictions to the true calorie values. Points clustered near the diagonal line indicate accurate predictions.
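One way the three hyperparameters named above (max_depth, n_estimators, min_samples_split) could be tuned is a cross-validated grid search scored by RMSE. The sketch below assumes scikit-learn and a RandomForestRegressor; the write-up does not name the model, the candidate values are hypothetical, and the data is synthetic rather than the recipes dataset.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the recipe features and calories (not the real dataset).
X, y = make_regression(n_samples=200, n_features=7, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical candidate values; the source names only the hyperparameters.
param_grid = {
    "max_depth": [4, 8, None],
    "n_estimators": [25, 50],
    "min_samples_split": [2, 10],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),  # assumed model, not confirmed by the source
    param_grid,
    scoring="neg_root_mean_squared_error",  # scikit-learn maximizes scores, so RMSE is negated
    cv=3,
)
search.fit(X_train, y_train)

best_params = search.best_params_
cv_rmse = -search.best_score_              # flip the sign back to a positive RMSE
test_predictions = search.predict(X_test)  # values for a predicted-vs-actual plot
```

Plotting test_predictions against y_test with a diagonal reference line would give a predicted-vs-actual scatterplot of the kind described above.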