Skip to content

Genetic Algorithms: Modeling "Ghost Kitchen" Menu with PyGAD

DOI

In this notebook we'll design the optimal menu for a delivery-only "ghost kitchen." The goal is to create a focused, highly profitable menu that targets a specific local demographic (e.g., "healthy office lunches," "late-night student comfort food") by finding the best combination of dishes, prices, and ingredients.

Main Challenge: High Combinatorial Complexity

Searching through combinations of cuisines, hundreds of possible dishes, price points, and ingredient sourcing options. The relationships are complex (e.g., overlapping ingredients reduce waste and cost).

Our Objective: Maximize the estimated profit margin.

Problem Definition

Chromosome
A chromosome is a menu. It could be represented as a list of Dish IDs, where each dish has pre-defined attributes like food cost, prep time, and cuisine type. The GA would select the best subset of dishes.

[1, 0, 0, 1, ..., 1, 0]

Fitness Evaluation
The fitness function calculates the potential profit of a given menu, rewarding ingredient overlap (to reduce costs) and penalizing menus that don't align with a target cuisine.

Core Metric: Estimated Profit (PP)

This formula calculates the profitability of the menu based on the profit margin and popularity of each dish.

P=i=1n(si(priceicosti)popularityi)P = \sum_{i=1}^{n} (s_i \cdot (\text{price}_i - \text{cost}_i) \cdot \text{popularity}_i)

Bonus 1: Ingredient Overlap (BingB_ing)
This formula rewards menus that efficiently reuse ingredients, which helps to reduce waste and operational costs.

Bing=Wing(1count(unique_ingredients)count(total_ingredients))B_{ing} = W_{ing} \cdot \left(1 - \frac{\text{count}(\text{unique\_ingredients})}{\text{count}(\text{total\_ingredients})}\right)

Bonus 2: Cuisine Cohesion (BcuiB_cui)
This formula rewards menus that have a focused culinary theme, making them more appealing and coherent to customers.

Bcui=Wcui(count(dishes_in_dominant_cuisine)count(total_dishes))2B_{cui} = W_{cui} \cdot \left(\frac{\text{count}(\text{dishes\_in\_dominant\_cuisine})}{\text{count}(\text{total\_dishes})}\right)^2

Penalty: Menu Size (FsizeF_size)
This formula applies a penalty if the menu size deviates from a predefined target, preventing menus that are either too small or too large and complex.

Fsize=Psizecount(total_dishes)target_sizeF_{size} = P_{size} \cdot |\text{count}(\text{total\_dishes}) - \text{target\_size}|

Final Fitnes Function
This is the master formula that combines the core profit metric, the bonuses for smart design, and the penalty for size deviation to produce a single score for the GA to optimize.

Fitness=P+Bing+BcuiFsize\text{Fitness} = P + B_{ing} + B_{cui} - F_{size}

Data Mining

We can scrape data from platforms like DoorDash or Uber Eats for a specific zip code to analyze the existing "supply" of restaurants and menu prices. We can use public census data to model the local demographics. Food costs can be estimated from online wholesale suppliers.

For this prototype we'll use a prepared CSV with a 20 positions:
from IPython.display import display, Markdown
printdf = lambda df: display(Markdown(df.to_markdown()))
 
import pandas as pd
 
df = pd.read_csv('./01_02_dishes.csv')
printdf(
  df.head(2)
)
idnamepricefood_costcuisineingredientsprep_timepopularity
00Margherita Pizza123Italiandough;tomato;cheese890
11Pepperoni Pizza144Italiandough;tomato;cheese;pepperoni995

In this dataset, we need to normalize the "ingridients":

df['ingredients'] = df['ingredients'].str.split(';')
printdf(
  df.head(2)
)
idnamepricefood_costcuisineingredientsprep_timepopularity
00Margherita Pizza123Italian['dough', 'tomato', 'cheese']890
11Pepperoni Pizza144Italian['dough', 'tomato', 'cheese', 'pepperoni']995

Implementing the Fitness Function

We'll use PyGAD - a python library for building the genetic algorithm and optimizing ML algorithms:

! pip install -q pygad

First we'll define some parameters for our GA:

TARGET_MENU_SIZE = 8       # The menu size we're looking for.
WEIGHT_INGREDIENT = 3000   # Weight for Ingridient Bonus function.
WEIGHT_COHESION = 5000     # Weight for Cohesion Bonus function.
PENALTY_SIZE = 1000        # Penalty mutiplier for bad menu size

Now the fitness function implementation:

def fitness_func(ga_instance, solution, solution_idx):
    menu_df = df[solution == 1]
    
    if menu_df.empty:
        return -99999
        
    num_dishes = len(menu_df)
 
    # Calculate Core Profit (Vectorized)
    profit_per_dish = (menu_df['price'] - menu_df['food_cost']) * menu_df['popularity']
    total_profit = profit_per_dish.sum()
 
    # Calculate Ingredient Overlap Bonus
    all_ingredients = menu_df['ingredients'].explode().tolist()
    if not all_ingredients or len(all_ingredients) == 0:
        ingredient_bonus = 0
    else:
        unique_ingredients = set(all_ingredients)
        overlap_ratio = 1 - (len(unique_ingredients) / len(all_ingredients))
        ingredient_bonus = WEIGHT_INGREDIENT * overlap_ratio
 
    # Calculate Cuisine Cohesion Bonus
    if menu_df.empty:
        cohesion_bonus = 0
    else:
        dominant_cuisine_count = menu_df['cuisine'].value_counts().iloc[0]
        cohesion_ratio = dominant_cuisine_count / num_dishes
        cohesion_bonus = WEIGHT_COHESION * (cohesion_ratio ** 2)
 
    # Calculate Menu Size Penalty
    size_deviation = abs(num_dishes - TARGET_MENU_SIZE)
    size_penalty = PENALTY_SIZE * size_deviation
 
    # Final Fitness Calculation
    fitness = total_profit + ingredient_bonus + cohesion_bonus - size_penalty
    
    return fitness

Running GA

Next we'll create a GA instance; Ref: PyGAD Docs

import pygad
 
ga_instance = pygad.GA(
  num_generations=100,
  num_parents_mating=10,
  fitness_func=fitness_func,
  sol_per_pop=50,
  num_genes=len(df),
  gene_space=[0, 1],
  parent_selection_type="sss",
  crossover_type="single_point",
  mutation_type="random",
  mutation_percent_genes=5,
  keep_parents=1
)
ga_instance.run()

Interpret The Results

solution, solution_fitness, solution_idx = ga_instance.best_solution()
print(f"Fitness value of the best solution = {solution_fitness:.2f}\n")
print("--- Best Menu Found ---\n")
 
best_menu_df = df[solution == 1]
 
for index, dish in best_menu_df.iterrows():
    print(f"- {dish['name']} (${dish['price']}) | Cuisine: {dish['cuisine']}\n")
 
print(f"Number of Items: {len(best_menu_df)}\n")
profit_score = ((best_menu_df['price'] - best_menu_df['food_cost']) * best_menu_df['popularity']).sum()
print(f"Estimated Profit Score: {profit_score:.2f}\n")

Fitness value of the best solution = 9834.12

--- Best Menu Found ---

  • Lasagna ($16) | Cuisine: Italian

  • Classic Burger ($15) | Cuisine: American

  • BBQ Ribs ($20) | Cuisine: American

  • Chicken Wings ($13) | Cuisine: American

  • Mac & Cheese ($11) | Cuisine: American

  • Onion Rings ($7) | Cuisine: American

  • Veggie Burger ($14) | Cuisine: American

  • Fries ($5) | Cuisine: American

Number of Items: 8

Estimated Profit Score: 6006.00

ga_instance.plot_fitness()

/home/anton/.pyenv/versions/rnd/lib/python3.13/site-packages/pygad/visualize/plot.py:120: UserWarning: No artists with labels found to put in legend. Note that artists whose label start with an underscore are ignored when legend() is called with no argument. matplt.legend()

png

png

Anton Nesterov © 2025 | vski·science