Genetic Algorithms: Modeling "Ghost Kitchen" Menu with PyGAD
In this notebook we'll design the optimal menu for a delivery-only "ghost kitchen." The goal is to create a focused, highly profitable menu that targets a specific local demographic (e.g., "healthy office lunches," "late-night student comfort food") by finding the best combination of dishes, prices, and ingredients.
Main Challenge: High Combinatorial ComplexitySearching through combinations of cuisines, hundreds of possible dishes, price points, and ingredient sourcing options. The relationships are complex (e.g., overlapping ingredients reduce waste and cost).
Our Objective: Maximize the estimated profit margin.Problem Definition
Chromosome
A chromosome is a menu. It could be represented as a list of Dish IDs, where each dish has pre-defined attributes like food cost, prep time, and cuisine type. The GA would select the best subset of dishes.
[1, 0, 0, 1, ..., 1, 0]
Fitness Evaluation
The fitness function calculates the potential profit of a given menu, rewarding ingredient overlap (to reduce costs) and penalizing menus that don't align with a target cuisine.
Core Metric: Estimated Profit ()
This formula calculates the profitability of the menu based on the profit margin and popularity of each dish.
Bonus 1: Ingredient Overlap ()
This formula rewards menus that efficiently reuse ingredients, which helps to reduce waste and operational costs.
Bonus 2: Cuisine Cohesion ()
This formula rewards menus that have a focused culinary theme, making them more appealing and coherent to customers.
Penalty: Menu Size ()
This formula applies a penalty if the menu size deviates from a predefined target, preventing menus that are either too small or too large and complex.
Final Fitnes Function
This is the master formula that combines the core profit metric, the bonuses for smart design, and the penalty for size deviation to produce a single score for the GA to optimize.
Data Mining
We can scrape data from platforms like DoorDash or Uber Eats for a specific zip code to analyze the existing "supply" of restaurants and menu prices. We can use public census data to model the local demographics. Food costs can be estimated from online wholesale suppliers.
For this prototype we'll use a prepared CSV with a 20 positions:from IPython.display import display, Markdown
printdf = lambda df: display(Markdown(df.to_markdown()))
import pandas as pd
df = pd.read_csv('./01_02_dishes.csv')
printdf(
df.head(2)
)
id | name | price | food_cost | cuisine | ingredients | prep_time | popularity | |
---|---|---|---|---|---|---|---|---|
0 | 0 | Margherita Pizza | 12 | 3 | Italian | dough;tomato;cheese | 8 | 90 |
1 | 1 | Pepperoni Pizza | 14 | 4 | Italian | dough;tomato;cheese;pepperoni | 9 | 95 |
In this dataset, we need to normalize the "ingridients":
df['ingredients'] = df['ingredients'].str.split(';')
printdf(
df.head(2)
)
id | name | price | food_cost | cuisine | ingredients | prep_time | popularity | |
---|---|---|---|---|---|---|---|---|
0 | 0 | Margherita Pizza | 12 | 3 | Italian | ['dough', 'tomato', 'cheese'] | 8 | 90 |
1 | 1 | Pepperoni Pizza | 14 | 4 | Italian | ['dough', 'tomato', 'cheese', 'pepperoni'] | 9 | 95 |
Implementing the Fitness Function
We'll use PyGAD - a python library for building the genetic algorithm and optimizing ML algorithms:
! pip install -q pygad
First we'll define some parameters for our GA:
TARGET_MENU_SIZE = 8 # The menu size we're looking for.
WEIGHT_INGREDIENT = 3000 # Weight for Ingridient Bonus function.
WEIGHT_COHESION = 5000 # Weight for Cohesion Bonus function.
PENALTY_SIZE = 1000 # Penalty mutiplier for bad menu size
Now the fitness function implementation:
def fitness_func(ga_instance, solution, solution_idx):
menu_df = df[solution == 1]
if menu_df.empty:
return -99999
num_dishes = len(menu_df)
# Calculate Core Profit (Vectorized)
profit_per_dish = (menu_df['price'] - menu_df['food_cost']) * menu_df['popularity']
total_profit = profit_per_dish.sum()
# Calculate Ingredient Overlap Bonus
all_ingredients = menu_df['ingredients'].explode().tolist()
if not all_ingredients or len(all_ingredients) == 0:
ingredient_bonus = 0
else:
unique_ingredients = set(all_ingredients)
overlap_ratio = 1 - (len(unique_ingredients) / len(all_ingredients))
ingredient_bonus = WEIGHT_INGREDIENT * overlap_ratio
# Calculate Cuisine Cohesion Bonus
if menu_df.empty:
cohesion_bonus = 0
else:
dominant_cuisine_count = menu_df['cuisine'].value_counts().iloc[0]
cohesion_ratio = dominant_cuisine_count / num_dishes
cohesion_bonus = WEIGHT_COHESION * (cohesion_ratio ** 2)
# Calculate Menu Size Penalty
size_deviation = abs(num_dishes - TARGET_MENU_SIZE)
size_penalty = PENALTY_SIZE * size_deviation
# Final Fitness Calculation
fitness = total_profit + ingredient_bonus + cohesion_bonus - size_penalty
return fitness
Running GA
Next we'll create a GA instance; Ref: PyGAD Docs
import pygad
ga_instance = pygad.GA(
num_generations=100,
num_parents_mating=10,
fitness_func=fitness_func,
sol_per_pop=50,
num_genes=len(df),
gene_space=[0, 1],
parent_selection_type="sss",
crossover_type="single_point",
mutation_type="random",
mutation_percent_genes=5,
keep_parents=1
)
ga_instance.run()
Interpret The Results
solution, solution_fitness, solution_idx = ga_instance.best_solution()
print(f"Fitness value of the best solution = {solution_fitness:.2f}\n")
print("--- Best Menu Found ---\n")
best_menu_df = df[solution == 1]
for index, dish in best_menu_df.iterrows():
print(f"- {dish['name']} (${dish['price']}) | Cuisine: {dish['cuisine']}\n")
print(f"Number of Items: {len(best_menu_df)}\n")
profit_score = ((best_menu_df['price'] - best_menu_df['food_cost']) * best_menu_df['popularity']).sum()
print(f"Estimated Profit Score: {profit_score:.2f}\n")
Fitness value of the best solution = 9834.12
--- Best Menu Found ---
-
Lasagna ($16) | Cuisine: Italian
-
Classic Burger ($15) | Cuisine: American
-
BBQ Ribs ($20) | Cuisine: American
-
Chicken Wings ($13) | Cuisine: American
-
Mac & Cheese ($11) | Cuisine: American
-
Onion Rings ($7) | Cuisine: American
-
Veggie Burger ($14) | Cuisine: American
-
Fries ($5) | Cuisine: American
Number of Items: 8
Estimated Profit Score: 6006.00
ga_instance.plot_fitness()
/home/anton/.pyenv/versions/rnd/lib/python3.13/site-packages/pygad/visualize/plot.py:120: UserWarning: No artists with labels found to put in legend. Note that artists whose label start with an underscore are ignored when legend() is called with no argument. matplt.legend()