View the Project on GitHub alyssrod/recipes-and-ratings-analysis
This project analyzes the Recipes and Ratings dataset to explore how recipe characteristics (e.g., prep time, ingredient count) influence calories. I built a regression model that predicts calorie content based solely on features a user would see before cooking.
Name: Alyssa Rodriguez
Email: alyssrod@umich.edu

This histogram shows that most recipes fall between 100–800 kcal, though there are some high outliers, likely rich desserts or large dishes.

This scatterplot suggests a mild positive relationship: recipes with longer prep times tend to have slightly higher calories.

This table groups recipes by number of ingredients and shows the average calorie content of each group. The trend suggests that recipes with more ingredients generally have more calories.
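The grouping behind a table like this can be sketched in a few lines of plain Python. The sample rows below are hypothetical values chosen for illustration, not figures from the dataset, and the helper name `average_calories_by_ingredients` is an assumption rather than the project's actual code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (n_ingredients, calories) rows; NOT taken from the real dataset.
recipes = [
    (5, 210.0),
    (5, 250.0),
    (9, 400.0),
    (9, 460.0),
    (12, 610.0),
]


def average_calories_by_ingredients(rows):
    """Return {n_ingredients: mean calories}, ordered by ingredient count."""
    groups = defaultdict(list)
    for n_ingredients, calories in rows:
        groups[n_ingredients].append(calories)
    return {n: mean(values) for n, values in sorted(groups.items())}


summary = average_calories_by_ingredients(recipes)
for n, avg in summary.items():
    print(f"{n:>2} ingredients: {avg:.1f} kcal on average")
```

With the data in a pandas DataFrame `df` that has `n_ingredients` and `calories` columns (an assumption about the layout), the same aggregation would be `df.groupby("n_ingredients")["calories"].mean()`.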
I treated this as a regression problem: calories are a continuous value with meaningful differences (e.g., 300 vs 900 kcal).
Target Variable: calories
Metric: RMSE (Root Mean Squared Error)
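RMSE is the square root of the mean squared difference between predicted and true values, so it is expressed in the same unit as the target (kcal here) and penalizes large misses more heavily than small ones. A minimal sketch in plain Python follows; the sample values are made up for illustration and the function name `rmse` is an assumption, not the project's code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical calorie values, not results from the project.
actual = [300.0, 900.0, 450.0]
predicted = [350.0, 800.0, 450.0]

score = rmse(actual, predicted)
print(f"RMSE: {score:.2f} kcal")
```

In this toy example the single 100 kcal miss contributes four times as much to the squared error as the 50 kcal miss, which is the sensitivity to large errors that RMSE is known for.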
I used only features known at the “time of prediction”: prep time, ingredient count, tag length, etc.
Features: minutes, n_steps, n_ingredients, tag_count, step_length_avg
Derived features: minutes_per_step, ingredient_density
Hyperparameters: max_depth, n_estimators, min_samples_split
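The write-up does not define the derived features, so the sketch below shows one plausible reading of `minutes_per_step` only: total minutes divided by the number of steps. That definition, the zero-step fallback, and the sample recipe are all assumptions for illustration. `ingredient_density` is left out because its formula is not stated.

```python
def minutes_per_step(minutes, n_steps):
    """Assumed definition: total minutes divided by the number of steps."""
    if n_steps <= 0:
        # Assumption: a recipe with no recorded steps gets 0.0 instead of
        # raising ZeroDivisionError.
        return 0.0
    return minutes / n_steps


# Hypothetical recipe record, not a row from the dataset.
recipe = {"minutes": 45, "n_steps": 9}
recipe["minutes_per_step"] = minutes_per_step(recipe["minutes"], recipe["n_steps"])
print(recipe)
```

A ratio like this depends only on `minutes` and `n_steps`, both visible before cooking, so it stays within the "time of prediction" constraint.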
This scatterplot compares the final model’s predictions to the true calorie values. Points clustered near the diagonal line indicate accurate predictions.