Appendix · descriptive statistics

The data behind the directory.

Two views, side by side: the full population of 1,601,155 candidate pairs (everything the embedding can produce), and the curated 9,304 pairs surfaced as pages. The curated set is a deliberate editorial slice — these numbers tell you how it differs from the underlying population, and where the biases come from.

Ingredients

1,790

Pairs scanned

1,601,155

Ranked & curated

9,304

Flavor modes

543
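The pair count follows from the vocabulary size if every unordered pair of distinct ingredients counts as a candidate. That is an assumption about how the population was built, but the figures above agree with it. A quick check:

```python
import math

N_INGREDIENTS = 1_790   # vocabulary size reported above
N_CURATED = 9_304       # curated pairs reported above

# Unordered pairs of distinct ingredients: n * (n - 1) / 2
n_pairs = math.comb(N_INGREDIENTS, 2)

# Fraction of all candidate pairs that the site surfaces as pages
curated_share = N_CURATED / n_pairs

print(n_pairs)                        # 1601155
print(f"{100 * curated_share:.2f}%")  # 0.58%
```

So the curated set is a little over half a percent of what the embedding can produce.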

Source corpus: the Epicure embeddings were trained on 4,135,189 recipes across 8 languages (en, zh, ru, vi, es, tr, id, de). The language mix is uneven, and it shows in the data: Asian ingredients appear in dense co-occurrence (cooc) clusters more often than European or Latin American ones.

Where pairs sit, before any curation

Quadrant distribution — full population (1.6M pairs)

Across all candidate pairs, 99.04% are neutral (no strong signal either way). The "interesting" pairs (classic, complementary, substitute) together account for just 0.96%, or 15,378 pairs. The curated set below draws most of its pairs from these.

Quadrant	Pairs
Classic	2,912
Complementary	9,222
Substitute	3,244
Neutral	1,585,777
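The percentages quoted for the population can be recomputed from the four counts. A minimal sketch, using only the numbers listed above:

```python
# Population quadrant counts as listed above
population = {
    "classic": 2_912,
    "complementary": 9_222,
    "substitute": 3_244,
    "neutral": 1_585_777,
}

total = sum(population.values())
shares = {q: 100 * n / total for q, n in population.items()}

# Everything that is not neutral counts as "interesting"
interesting = 100 - shares["neutral"]

print(total)                          # 1601155
print(f"{shares['neutral']:.2f}%")    # 99.04%
print(f"{interesting:.2f}%")          # 0.96%
```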

Quadrant distribution — curated 9,304 (what the site exposes)

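After stratified ranking, the curated set deliberately over-represents classic and complementary pairs (and, with a smaller absolute count, substitutes), while neutral pairs drop from 99.04% of the population to 16.2% of the curated set. This is an editorial choice, not a property of the data.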

Quadrant	Pairs
Classic	2,551
Complementary	3,407
Substitute	1,837
Neutral	1,509
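One way to quantify the over-representation is to divide each quadrant's share of the curated set by its share of the population. The name "enrichment" is used here for illustration only; the page does not define such a metric.

```python
population = {"classic": 2_912, "complementary": 9_222,
              "substitute": 3_244, "neutral": 1_585_777}
curated = {"classic": 2_551, "complementary": 3_407,
           "substitute": 1_837, "neutral": 1_509}

def share(counts):
    """Fraction of the set that falls in each quadrant."""
    total = sum(counts.values())
    return {q: n / total for q, n in counts.items()}

pop_share = share(population)
cur_share = share(curated)

# Values above 1 mean the quadrant is over-represented after curation
enrichment = {q: cur_share[q] / pop_share[q] for q in population}

for q, factor in enrichment.items():
    print(f"{q:14s} {100 * cur_share[q]:5.1f}% of curated, factor {factor:.2f}")
```

On these counts classic pairs come out roughly 150 times more common in the curated set than in the population, while neutral pairs shrink to about a sixth of their population share.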

Per-sibling score summary

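Min / median / p95 / p99 / max for each cosine measure. The Population row is the ground truth; the Curated row shows how aggressive the cut is: each curated median sits above the corresponding population p95. The population minimum is negative for all three measures. Most users never see negative scores because the curated set sits far above the population median, although a few negative Blended and Tastes-like scores do survive the cut (curated minima -0.031 and -0.084).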

Sibling	Set	Min	Median	p95	p99	Max
Cooks with cooc	Population	-0.180	0.091	0.221	0.289	0.642
Cooks with cooc	Curated	0.040	0.336	0.415	0.465	0.642
Blended core	Population	-0.088	0.336	0.531	0.636	0.931
Blended core	Curated	-0.031	0.625	0.795	0.843	0.931
Tastes like chem	Population	-0.139	0.107	0.234	0.335	0.810
Tastes like chem	Curated	-0.084	0.381	0.568	0.654	0.810
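Each row of the table is a five-number summary of one score column. The sketch below shows one way to compute such a row with the standard library. The linear-interpolation percentile is an assumption, since the page does not say which percentile convention it uses, and the scores are synthetic.

```python
import random

def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("empty input")
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def summarize(scores):
    """Return the min / median / p95 / p99 / max row for one score column."""
    s = sorted(scores)
    return {
        "min": s[0],
        "median": percentile(s, 50),
        "p95": percentile(s, 95),
        "p99": percentile(s, 99),
        "max": s[-1],
    }

# Synthetic scores for illustration only, not the real cosine values
rng = random.Random(0)
fake_scores = [rng.gauss(0.09, 0.07) for _ in range(10_000)]
row = summarize(fake_scores)
print({name: round(value, 3) for name, value in row.items()})
```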

The shape of the curated subset

2,000 of the curated pairs plotted on (cooks-with, tastes-like). Sample is deterministic across builds. Dashed lines mark the quadrant thresholds (cooc ≥ 0.30, chem ≥ 0.40). The full 1.6M population would be one dense blob near the origin — this is the editorial selection on top.
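The page does not say how the 2,000-pair sample is kept stable across builds. A common approach, sketched here as an assumption, is to sort the input and draw from a privately seeded generator. The function name and seed are hypothetical.

```python
import random

def deterministic_sample(pairs, k=2_000, seed=0):
    """Return the same k pairs on every run, whatever order the input arrives in."""
    ordered = sorted(pairs)        # fix the order before sampling
    rng = random.Random(seed)      # private generator, unaffected by global state
    return rng.sample(ordered, min(k, len(ordered)))

# Toy stand-in for the curated pairs
pairs = [(f"ing_{i}", f"ing_{j}") for i in range(60) for j in range(i + 1, 60)]

a = deterministic_sample(pairs, k=100)
b = deterministic_sample(list(reversed(pairs)), k=100)
print(a == b)  # True
```

Sorting first matters: without it, a change in upstream ordering would change the sample even with a fixed seed.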

Score distributions — population (top row) vs curated (bottom row)

Same x-axis range in each column so you can read the curation cut directly. Each sibling has its own range because the geometries differ. Out-of-range counts: cooc 0 · core 0 · chem 0.
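A shared x-axis range means both panels in a column use the same bin edges, and anything outside the range is tallied separately. A minimal sketch of that binning follows; the bin count of 50 is an assumption, and the inputs are just the table's summary values for "Cooks with", not real score columns.

```python
def histogram(scores, lo, hi, n_bins=50):
    """Count scores into n_bins equal-width bins on [lo, hi]; tally the rest."""
    counts = [0] * n_bins
    out_of_range = 0
    width = (hi - lo) / n_bins
    for s in scores:
        if s < lo or s > hi:
            out_of_range += 1
            continue
        idx = min(int((s - lo) / width), n_bins - 1)  # s == hi lands in the last bin
        counts[idx] += 1
    return counts, out_of_range

COOC_RANGE = (-0.20, 0.65)  # range quoted for the "Cooks with" panels

population_values = [-0.180, 0.091, 0.221, 0.289, 0.642]
curated_values = [0.040, 0.336, 0.415, 0.465, 0.642]

pop_counts, pop_out = histogram(population_values, *COOC_RANGE)
cur_counts, cur_out = histogram(curated_values, *COOC_RANGE)
print(pop_out, cur_out)  # 0 0
```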

Cooks with — population

All 1,601,155 pairs. Range [-0.20, 0.65].

Cooks with — curated

9,304 pairs. Same range, so heights compare directly.

Blended — population

All 1,601,155 pairs. Range [-0.10, 0.95].

Blended — curated

9,304 pairs. Same range, so heights compare directly.

Tastes like — population

All 1,601,155 pairs. Range [-0.15, 0.85].

Tastes like — curated

9,304 pairs. Same range, so heights compare directly.

Most-present ingredients in the curated set

Number of curated pairs each ingredient appears in. NOT a measure of 'culinary centrality' — it's a measure of which ingredients survived the stratified top-9,304 cut. Asian ingredients dominate because the source corpus has dense cooc clusters around Asian dishes; reranking the curation tier would shift this list.

The 543 flavor modes

Mode sizes

Median 81 members; range 13–254.

Largest modes

Broad themes the embedding settled on.

Smallest modes

Tight clusters — often near-perfect substitution sets.

Modes by property axis

The paper labels each mode with an axis: either a continuous flavor property (sweet_score, nova_level, fatty_score…) or one of the emergent factor axes (F_0, F_3, …). Count per property tells you which axes were most prolific in the clustering.

Axis	Modes
cf_savory	20
nova_level	18
cf_minty	17
F_8	17
bitter_score	15
fatty_score	15
F_0	15
F_3	15
F_9	15
F_10	15
F_13	15
sweet_score	14
pungent_score	14
F_7	14
F_11	14
F_12	14
cf_sweet	13
cf_balsamic	13
cf_citrus	13
cf_woody	13
F_2	13
F_4	13
umami_score	12
usda_fiber_g	12
F_1	12
cf_meaty	11
sour_score	11
F_6	11
F_14	11
cf_earthy	10
usda_protein_g	10
usda_caloric_density	10
fg_Vegetable	10
F_15	10
F_5	10
fg_Pantry	9
fg_Beverage	9
fg_Spice	9
fg_Dairy	9
usda_sugars_g	8
fg_Grain	8
F_19	8
fg_Fruit	6
F_18	5
F_17	4
F_16	3
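Reading each entry above as an axis name followed by its mode count, the counts add up to the 543 modes, and they can be rolled up by name prefix. The grouping into families is a hypothetical convenience based on the prefixes, not a taxonomy taken from the paper.

```python
from collections import Counter

# Mode counts per axis, copied from the list above
MODES_PER_AXIS = {
    "cf_savory": 20, "nova_level": 18, "cf_minty": 17, "F_8": 17,
    "bitter_score": 15, "fatty_score": 15, "F_0": 15, "F_3": 15,
    "F_9": 15, "F_10": 15, "F_13": 15, "sweet_score": 14,
    "pungent_score": 14, "F_7": 14, "F_11": 14, "F_12": 14,
    "cf_sweet": 13, "cf_balsamic": 13, "cf_citrus": 13, "cf_woody": 13,
    "F_2": 13, "F_4": 13, "umami_score": 12, "usda_fiber_g": 12,
    "F_1": 12, "cf_meaty": 11, "sour_score": 11, "F_6": 11, "F_14": 11,
    "cf_earthy": 10, "usda_protein_g": 10, "usda_caloric_density": 10,
    "fg_Vegetable": 10, "F_15": 10, "F_5": 10, "fg_Pantry": 9,
    "fg_Beverage": 9, "fg_Spice": 9, "fg_Dairy": 9, "usda_sugars_g": 8,
    "fg_Grain": 8, "F_19": 8, "fg_Fruit": 6, "F_18": 5, "F_17": 4,
    "F_16": 3,
}

def family(axis):
    """Group an axis by its name prefix (illustrative grouping only)."""
    for prefix in ("F_", "cf_", "fg_", "usda_"):
        if axis.startswith(prefix):
            return prefix.rstrip("_")
    return "score" if axis.endswith("_score") else "other"

by_family = Counter()
for axis, n_modes in MODES_PER_AXIS.items():
    by_family[family(axis)] += n_modes

print(sum(MODES_PER_AXIS.values()))  # 543
print(by_family.most_common())
```

On this grouping the emergent factor axes (F_0 to F_19) account for 234 of the 543 modes.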

Does the embedding get culinary canon right?

Hand-curated list of obvious classic pairings. Each row shows the actual cosine scores and the quadrant the model assigns. Look for surprises — these reveal where the embedding's recipe corpus is weak.

Canon pair	Cooks with	Tastes like	Blended	Verdict
Tomato + Basil	0.32	0.15	0.47	Complementary
Lemon + Garlic	0.26	0.13	0.45	Neutral
Beef + Rosemary	0.22	0.06	0.39	Neutral
Chocolate + Strawberry	0.31	0.32	0.38	Complementary
Peanut + Chocolate	0.15	0.18	0.35	Neutral
Apple + Cinnamon	0.39	0.22	0.46	Complementary
Coffee + Cream	0.34	0.34	0.31	Complementary
Bacon + Egg	0.11	0.24	0.44	Neutral
Lime + Cilantro	Not in vocabulary
Honey + Thyme	0.26	0.10	0.31	Neutral
Tomato + Mozzarella cheese	0.27	0.23	0.51	Neutral
Chicken + Lemon	0.14	0.09	0.28	Neutral
Pork + Apple	0.10	0.10	0.36	Neutral
Miso + Ginger	0.18	0.22	0.46	Neutral
Soy sauce + Ginger	0.47	0.23	0.55	Complementary
Vanilla + Chocolate	0.49	0.40	0.51	Complementary
Mint + Lamb	0.25	0.18	0.42	Neutral
Fish + Lemon	0.22	0.13	0.42	Neutral
Onion + Garlic	0.45	0.53	0.73	Classic
Olive oil + Garlic	0.40	0.28	0.69	Complementary
Dill + Salmon	0.27	0.17	0.39	Neutral
Caraway + Rye flour	Not in vocabulary
Fennel + Sausage	0.13	0.13	0.44	Neutral
Maple syrup + Bacon	0.16	0.22	0.49	Neutral
Banana + Peanut	0.20	0.06	0.34	Neutral

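The corpus is heavily East-Asian-weighted: 1,549,034 East Asian recipes, which is 67.5% of the recipes assigned a region in the cuisine table further down and roughly 37% of the full 4,135,189-recipe corpus. Western European classics like lemon+garlic and beef+rosemary score weaker than intuition expects. This is the corpus bias showing through, not a model failure.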
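The Verdict column can be reproduced from the two thresholds quoted on this page (cooc ≥ 0.30, chem ≥ 0.40). The sketch below assumes that "Substitute" means high chem with low cooc; the table has no such row to confirm it, so treat that branch as an inference.

```python
COOC_MIN = 0.30  # "cooks with" threshold quoted on this page
CHEM_MIN = 0.40  # "tastes like" threshold quoted on this page

def quadrant(cooc, chem):
    """Assign a quadrant from the two cosine scores."""
    high_cooc = cooc >= COOC_MIN
    high_chem = chem >= CHEM_MIN
    if high_cooc and high_chem:
        return "Classic"
    if high_cooc:
        return "Complementary"
    if high_chem:
        return "Substitute"  # assumed mapping, see lead-in
    return "Neutral"

# (cooks with, tastes like) for a few rows of the canon table
canon = {
    "Tomato + Basil": (0.32, 0.15),
    "Lemon + Garlic": (0.26, 0.13),
    "Onion + Garlic": (0.45, 0.53),
    "Soy sauce + Ginger": (0.47, 0.23),
}

verdicts = {pair: quadrant(*scores) for pair, scores in canon.items()}
print(verdicts)
```

The table shows scores rounded to two decimals, so a row displayed exactly at a threshold (Vanilla + Chocolate, chem 0.40, verdict Complementary) can land on either side once the unrounded value is used.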

How much do the three siblings agree?

Pearson r over the full 1.6M pair population. Higher = the two siblings say similar things about pair strength. Lower = the three-lens framing has real signal.

Pair	Pearson r	Reading
Cooks ↔ Tastes	0.541	Moderate overlap
Cooks ↔ Blended	0.542	Moderate overlap
Tastes ↔ Blended	0.619	Moderate overlap

All three correlate moderately (0.54–0.62). They don't collapse into one another, which justifies showing all three on every pair page. Blended (Core) is the strongest correlate of each of the other two, as expected for a measure designed to blend the signals, although for Cooks the margin over Tastes is only 0.001 (0.542 vs 0.541).
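For reference, Pearson r between two score columns is the covariance divided by the product of the standard deviations. A minimal standard-library sketch follows; the toy inputs are the first six rows of the canon table, so the result is illustrative and unrelated to the population figures.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Cooks-with and tastes-like scores from the first six canon rows
cooc = [0.32, 0.26, 0.22, 0.31, 0.15, 0.39]
chem = [0.15, 0.13, 0.06, 0.32, 0.18, 0.22]

r = pearson_r(cooc, chem)
print(round(r, 3))
```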

Outlier pairs in the curated set

Highest Cooc — most-co-occurring

Often near-substitutes: things that always appear together because they're regional cousins.

Highest Chem — most aromatic overlap

The strongest aroma-chemistry matches in the curated set.

Pair	Chem
Bourbon + Whiskey	0.81
Armagnac + Brandy	0.79
Longan + Red date	0.79
Pecorino cheese + Romano cheese	0.78
Cream of chicken soup + Cream of mushroom soup	0.77

Biggest surprise — siblings disagree most

Largest |cooc − chem| gap. Where one lens says yes and the other says no.
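Ranking by the |cooc − chem| gap is a one-line sort. The rows below come from the canon table and serve only as sample input; they are not the actual outliers of the curated set.

```python
# (pair, cooks with, tastes like), taken from the canon table as sample input
scored = [
    ("Soy sauce + Ginger", 0.47, 0.23),
    ("Vanilla + Chocolate", 0.49, 0.40),
    ("Olive oil + Garlic", 0.40, 0.28),
    ("Onion + Garlic", 0.45, 0.53),
]

def disagreement(row):
    """Absolute gap between the two lenses for one pair."""
    _, cooc, chem = row
    return abs(cooc - chem)

# Largest gap first
ranked = sorted(scored, key=disagreement, reverse=True)

for pair, cooc, chem in ranked:
    print(f"{pair:22s} gap {abs(cooc - chem):.2f}")
```

Using the absolute value treats "cooked together but chemically distant" and "chemically close but rarely cooked together" alike; sorting on the signed difference would separate the two cases.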

Recipe corpus by cuisine — the bias source

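The Epicure corpus is heavily skewed: East Asian recipes make up 67.5% of the 2,295,673 recipes assigned a region below, and roughly 37% of the full 4,135,189-recipe source corpus. This shapes every cooc-based statistic on this page. The Share column is relative to the recipes in the table.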

Region	Recipes	Share	Modes	Traditions
East Asian	1,549,034	67.5%	77	Chinese, Korean
Western Atlantic	198,086	8.6%	7	American, British, German, Scandinavian
Mediterranean	164,107	7.1%	41	Italian, French, Iberian, Greek, Levantine, North African, Turkish
Eastern European	154,479	6.7%	1	Russian, Ukrainian, Polish, Hungarian, Georgian
Southeast Asian	107,964	4.7%	6	Thai, Vietnamese, Filipino, Indonesian, Malay
South Asian	47,462	2.1%	8	Indian, Pakistani, Sri Lankan, Bangladeshi
Latin American	40,618	1.8%	14	Mexican, Caribbean, Brazilian, Peruvian, Colombian
Japanese	33,923	1.5%	2	Japanese
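The Share column is consistent with shares of the table's own total rather than of the 4,135,189-recipe corpus. A quick check, assuming the eight regions listed are the only ones counted:

```python
# Recipe counts per region, copied from the table above
recipes = {
    "East Asian": 1_549_034,
    "Western Atlantic": 198_086,
    "Mediterranean": 164_107,
    "Eastern European": 154_479,
    "Southeast Asian": 107_964,
    "South Asian": 47_462,
    "Latin American": 40_618,
    "Japanese": 33_923,
}
CORPUS_TOTAL = 4_135_189  # training corpus size quoted earlier on the page

table_total = sum(recipes.values())
share_of_table = {r: round(100 * n / table_total, 1) for r, n in recipes.items()}
east_asian_of_corpus = 100 * recipes["East Asian"] / CORPUS_TOTAL

print(table_total)                     # 2295673
print(share_of_table["East Asian"])    # 67.5
print(round(east_asian_of_corpus, 1))  # 37.5
```

The two denominators give very different headline numbers, so any East Asian share quoted elsewhere should state which one it uses.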

Vocabulary by ingredient type

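Rule-based classifier (regex over canonical names). 736 of 1,790 ingredients (41%) didn't match any category and fall into "other". The heuristics are conservative, so real coverage is better than this number suggests: many "other" entries belong to a category that the patterns simply failed to match.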

Category	Ingredients
other	736
vegetable	152
fruit	119
herb spice	118
seafood	86
sauce condiment	82
grain starch	80
cheese	64
wine spirit	63
oil fat	53
dairy egg	47
sweetener	41
meat	33
nut seed	29
vinegar acid	25
beverage	25
poultry	25
broth stock	6
bean legume	6
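A regex-over-names classifier of the kind described can be sketched in a few lines. The patterns below are hypothetical and cover only a handful of categories; the site's actual rules are not shown on this page.

```python
import re

# Hypothetical patterns for illustration; first match wins
RULES = [
    ("cheese", re.compile(r"\bcheese\b")),
    ("wine spirit", re.compile(r"\b(wine|whiskey|bourbon|brandy|armagnac)\b")),
    ("oil fat", re.compile(r"\b(oil|butter|lard)\b")),
    ("poultry", re.compile(r"\b(chicken|duck|turkey)\b")),
]

def classify(name):
    """Return the first category whose pattern matches, else 'other'."""
    lowered = name.lower()
    for category, pattern in RULES:
        if pattern.search(lowered):
            return category
    return "other"

for name in ["Pecorino cheese", "Bourbon", "Olive oil", "Longan"]:
    print(name, "->", classify(name))
```

Anything the patterns do not name falls through to "other", which is why a conservative rule set leaves a large unmatched bucket. Loosening the patterns trades that for false matches: this sketch would file "Cream of chicken soup" under poultry.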

Pairs per ingredient in the curated set

Distribution of how many curated pairs each ingredient appears in. Of 1,790 ingredients, 1,671 appear in at least one curated pair; 119 are absent from the curated set (they exist in the vocab but no pair survived the cut).
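Counting how many curated pairs each ingredient appears in, and which vocabulary entries appear in none, needs only a counter. The vocabulary and pairs below are toy data.

```python
from collections import Counter

def pairs_per_ingredient(curated_pairs):
    """Count the curated pairs each ingredient takes part in."""
    counts = Counter()
    for a, b in curated_pairs:
        counts[a] += 1
        counts[b] += 1
    return counts

# Toy data for illustration
vocab = ["onion", "garlic", "olive oil", "bourbon", "whiskey", "longan"]
curated_pairs = [
    ("onion", "garlic"),
    ("olive oil", "garlic"),
    ("bourbon", "whiskey"),
]

counts = pairs_per_ingredient(curated_pairs)

# Ingredients in the vocabulary with no surviving pair
absent = [ing for ing in vocab if counts[ing] == 0]

print(counts.most_common(1))  # [('garlic', 2)]
print(absent)                 # ['longan']
```

The same counter, sorted with `most_common`, gives the "most-present ingredients" ranking.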

Mode cohesion — how tight are the clusters?

Mean intra-mode cosine across all member pairs. Higher = members really do cluster together. Range: 0.10 (loosest single mode) to 0.61 (tightest); median 0.20.
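Mean intra-mode cosine is the average cosine similarity over every unordered pair of members in a mode. A minimal sketch with toy two-dimensional vectors; the real embeddings and their dimensionality are not given here.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cohesion(vectors):
    """Mean cosine over all unordered member pairs of one mode."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("a mode needs at least two members")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Toy modes: one tight, one spread out
tight = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]]
loose = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

print(round(cohesion(tight), 3), round(cohesion(loose), 3))
```

A mode with n members contributes n(n − 1)/2 pairs, so the largest mode (254 members) averages over 32,131 cosines.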

Tightest modes

Members are very similar in the mode's own sibling.

Loosest modes

Members are nominally clustered but spread out — broader themes or noisier groupings.

What these numbers don't tell you

Recipe-language bias: the corpus is multilingual but uneven. Chinese and other Asian recipes are over-represented in the dense cooc clusters, which inflates Asian ingredients in every cooc-driven stat above.
Ingredient-cut bias: the vocab uses base entries (chicken, beef, pork) without cut-level subdivisions. "Chicken breast" and "chicken thigh" both fold into "chicken." Cuisines with fine-grained cut vocabulary lose detail.
Chemistry coverage: Chem is from FlavorDB, which catalogs aroma compounds for ~1,500 ingredients globally. Anything outside FlavorDB has a sparse Chem profile — particularly fermented and processed ingredients.
No quantitative recipe weight: the cooc graph treats "1 cup of garlic" and "1 clove of garlic" identically. It counts co-presence, not relative importance.
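Quadrant thresholds are pragmatic: cooc ≥ 0.30 and chem ≥ 0.40 were calibrated empirically (described as ~70th percentiles per axis), not derived from theory. Measured against the full population they are far stricter than that: both lie above the population p99 (0.289 for cooc, 0.335 for chem). Adjusting them reshapes every quadrant count.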
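The "no quantitative recipe weight" caveat can be made concrete. In the sketch below, co-presence is counted per recipe from the set of ingredients, so quantities never enter the tally. How the site turns such counts into cosine scores is not described here, and the recipe format is hypothetical.

```python
from collections import Counter
from itertools import combinations

def co_presence(recipes):
    """Count, for each ingredient pair, the recipes that contain both."""
    counts = Counter()
    for ingredients in recipes:
        present = sorted(set(ingredients))  # quantities and repeats are dropped
        counts.update(combinations(present, 2))
    return counts

# Hypothetical recipes mapping ingredient -> quantity
recipes = [
    {"garlic": "1 cup", "onion": "2", "olive oil": "1 tbsp"},
    {"garlic": "1 clove", "onion": "1"},
]

counts = co_presence(recipes)
print(counts[("garlic", "onion")])  # 2
```

A cup of garlic and a single clove each add exactly one to the garlic and onion count.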

Last regenerated 5/29/2026