Altered macronutrient composition and genetics influence the complex transcriptional network associated with adiposity in the Collaborative Cross

Obesity is a serious disease with a complex etiology characterized by overaccumulation of adiposity resulting in detrimental health outcomes. Given the liver’s critical role in the biological processes that attenuate adiposity accumulation, elucidating the influence of genetics and dietary patterns on hepatic gene expression is fundamental for improving methods of obesity prevention and treatment. To determine how genetics and diet impact obesity development, mice from 22 strains of the genetically diverse recombinant inbred Collaborative Cross (CC) mouse panel were challenged to either a high-protein or high-fat high-sucrose diet, followed by extensive phenotyping and analysis of hepatic gene expression. Over 1000 genes differentially expressed by perturbed dietary macronutrient composition were enriched for biological processes related to metabolic pathways. Additionally, over 9000 genes were differentially expressed by strain and enriched for biological process involved in cell adhesion and signaling. Weighted gene co-expression network analysis identified multiple gene clusters (modules) associated with body fat % whose average expression levels were influenced by both dietary macronutrient composition and genetics. Each module was enriched for distinct types of biological functions. Genetic background affected hepatic gene expression in the CC overall, but diet macronutrient differences also altered expression of a specific subset of genes. Changes in macronutrient composition altered gene expression related to metabolic processes, while genetic background heavily influenced a broad range of cellular functions and processes irrespective of adiposity. Understanding the individual role of macronutrient composition, genetics, and their interaction is critical to developing therapeutic strategies and policy recommendations for precision nutrition.

metabolic syndrome, type 2 diabetes, and certain types of cancer [97]. The simplest definition of obesity is excessive adiposity resulting from the chronic imbalance between energy intake and expenditure. The underlying mechanisms involved in maintaining energy balance are complex and regulated by numerous factors such as genetic background [3,52,82], metabolism [19,84,90], gut microbiome [36,57,91], and environmental factors such as diet in the context of overfeeding [11-13, 76, 81]. Additionally, the specific interaction of dietary macronutrients and the endocrine system, in particular insulin response and signaling, has a critical role in the etiology of obesity [53]. Differences in dietary macronutrient composition can influence substrate utilization; specifically, rapidly digestible carbohydrates may interact with insulin and other hormones to increase fat accumulation relative to other macronutrients.
In addition to the complex interactions between adipose tissue, the central nervous system, nutrients, and hormones that regulate energy balance [3,25], the liver also influences the susceptibility to obesity, given its major role in the metabolism and processing of macronutrients including glycogenolysis, production of triglycerides, lipogenesis, and the synthesis of amino acids, cholesterol, and lipoproteins [75,93]. Obesity in turn can induce the pathological response of insulin resistance in the liver, which results in an impaired ability of insulin to decrease glucose output from the liver while continuing to stimulate lipogenesis; this disruption of appropriate carbohydrate and lipid metabolism is thought to contribute to some of the health complications associated with obesity like metabolic syndrome and cardiovascular disease. Adipokines such as adiponectin, adipocyte dysfunction, metabolism, and circulating metabolite levels affect hepatic gene expression [21,56], which regulates the mechanisms involved in lipid processing, determination of metabolic rate, and other physiological processes associated with energy imbalance [46,93]. Furthermore, an individual's inherent genetic architecture and specific environmental exposures such as diet also shape hepatic gene expression [31,41,80]. Given that the liver regulates so many biological processes related to obesity development, elucidating the effects of genetic architecture and diet on hepatic gene expression is necessary to understand the mechanisms underlying susceptibility to obesity and development of effective prevention and treatment regimes.
Modern molecular biology techniques have revolutionized our ability to detect changes in gene expression [50,74], which allows one to infer potential candidate genes and pathways underlying metabolic dysfunction [16,33]. Identification of genes and pathways that determine susceptibility to obesity facilitates the understanding of the underlying mechanisms behind the development of obesity, which is instrumental to determining effective methods of prevention and treatment. Simultaneous to the advances in high-throughput assessment of gene expression, a novel population of mice has been developed. Derived from elaborate intercrosses of eight founder mouse strains [7,35,89], the CC is a large recombinant inbred mouse population with tremendous genetic diversity and genetic contribution from five classically inbred strains, A/J, C57BL/6J (B6), 129S1/SvImJ (129), NOD/ ShiLtJ (NOD), and NZO/HILtJ (NZO), and three wildderived strains, CAST/EiJ (CAST), PWK/PhJ (PWK), and WSB/EiJ (WSB) [9,64,79,85]. The genetic and phenotypic diversity of the CC is of similar scale to the human population [86] and provides an opportunity to address the complex interactions between genetics and dietary macronutrient composition that affect hepatic gene expression. The ability to utilize multiple replicates of individual CC strains allows for more precise delineation between confounding environmental influences and dietary effects within the context of a known genetic architecture.
Previously, we examined the effects of diet and genetic background on adiposity and other obesityrelated traits [101]. In the current study, we focus on the effects of macronutrient composition and strain (genetic background) on hepatic gene expression and relate these to phenotypic traits and biological functions. To find potential candidate genes or functional pathways underlying metabolic dysfunction regulated by diet in a genetically diverse population, we administered a challenge of either high-protein (HP) or highfat high-sucrose (HS) diet to 22 strains of mice from the Collaborative Cross (CC) mouse panel for 8 weeks and performed microarray gene expression analysis of 11,542 genes using high-quality RNA from liver tissue, in addition to extensive phenotyping. To ascertain the expression of genes (mRNA) associated with adiposity, determine which genes were differentially expressed by dietary macronutrients and genetic strain, and identify groups of related genes affected by genetic background and/or diet in the liver, we examined hepatic gene expression levels and related them to phenotypes using one analyses pipeline centered around linear models for microarray (limma) and a separate analyses pipeline focused on weighted gene co-expression network analysis (WGCNA) (see Supplementary Fig. 1, Additional file 1), which facilitated exploration of gene expression from two perspectives: for individual genes using the limma approach and for groups of genes using the network approach.

Differential gene expression analysis identified 1344 genes responsive to differences in dietary macronutrient composition
Both genetics and environmental factors such as diet are critical determinants of obesity. Although genetics have a stronger effect on susceptibility to developing obesity than diet alone [10,29], the role of diet as an environmental factor that influences gene expression is still important, since changes in dietary patterns can help mitigate the degree of obesity that develops by altering gene expression levels. To assess which genes' expression levels are affected by diet, differential gene expression analysis was performed using the R package limma (linear models for microarray) on liver gene expression data. Comparing the HS diet to the HP diet revealed 1344 genes that were differentially expressed by diet (p adj < 0.05, Supplementary Table 3, Additional file 2) with the top 20 most significant hits showing patterns of expression clustering by diet ( Fig. 2A), where 16 genes showed increased expression and 4 genes showing decreased expression in mice fed the HP diet relative to the HS diet, though expression patterns exhibited some degree of inter-strain variation depending on the gene and strain. The opposite patterns of expression for these genes were shown in mice fed the HS diet, i.e., genes that showed increased expression in mice fed the HP diet had decreased levels of expression in mice fed the HS diet ( Fig. 2A). The expression levels of 389 differentially expressed genes (DEGs) by diet were significantly correlated with body fat % (p < 0.05), including Irs2 and Pik3r1.
The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and gene ontology (GO) enrichment analyses identified 20 significantly overrepresented KEGG pathways and 187 significantly overrepresented GO terms for DEGs by diet (Fig. 2B-E; see  Supplementary Table 4, Additional file 2), with varying degrees of gene richness defined by the number of up-or downregulated DEGs found belonging to each KEGG pathway or GO term out of the total number of genes that comprise each KEGG pathway or GO term. The most significantly overrepresented KEGG pathways identified were metabolic pathways, oxidative phosphorylation, and biosynthesis of amino acids (p adj ≤ 5.05 × 10 −8 ). In terms of each GO term category, 105 GO biological processes, 45 GO cellular components, and 37 GO molecular functions were significantly overrepresented (p adj < 0.05), with the top 10 most significantly overrepresented GO terms in DEGs by diet shown in Fig. 2C-E. The majority of enrichment terms were related to metabolism of a wide variety of substrates with numerous enrichments of mitochondrial cellular components (see Supplementary Table 4, Additional file 2).

Genetic architecture perturbed global hepatic gene expression to a greater extent than macronutrient composition
Genetics is clearly an important factor affecting susceptibility to metabolic dysfunction. We tested the role of genetics in regulatory gene expression by performing Fig. 1 Top 30 genes with expression levels most significantly correlated with body fat %. Multiple biweight midcorrelations (bicor) and their corresponding Student correlation p-values were calculated between phenotypic data and microarray liver gene expression data to properly take into account the actual number of observations when determining which genes' expression levels were correlated with post-diet phenotypes of interest. The top 15 genes whose expression is most significantly positively correlated with body fat % (bicor ≥ 0.410, p ≤ 2.53 × 10 −6 ) and top 15 genes whose expression is most significantly negatively correlated with body fat % (bicor ≤ −0.466, p ≤ 5.42 × 10 −8 ) are shown. With the exception of insulin and glucose/insulin ratio, most of the top 30 genes' expression most significantly correlated with body fat % were not significantly correlated with circulating analytes but were significantly correlated with metabolic (energy regulation) traits. Genes are ordered on the y-axis in descending order of bicor with the strongest positive correlation at the top and the strongest negative correlation at the bottom. Scale indicates bicor value with color darkness as indicator of correlation strength. †Indicates genes that are also differentially expressed by diet; all 30 genes were found to be differentially expressed by strain. *Indicates genes found to be associated with at least one obesity-related trait in humans according to the GWAS catalog. Annotation for all genes with expression significantly correlated with body fat % are shown in Supplementary  (Fig. 3A). Unlike the inter-strain variation of expression patterns for diet DEGs, expression patterns were consistent across diets for strain DEGs. DEGs by CC strain showed similar levels of expression within each CC strain regardless of the diet fed. One-thousand onehundred thirty-one DEGs by CC strain were also differentially expressed by diet (such as Irs2 and Pik3r1), and 2367 of DEGs by CC strain were correlated with body fat % (nominal p < 0.05), including Ide, Insig1, Irs2, Jak1, and Pik3r1. Interestingly, additional genes encoding proteins crucial to insulin signaling [2,5,6,30,60,100] were differentially expressed by strain but not diet, specifically high mobility group AT-hook 1 (Hmga1), insulin-induced gene 2 (Insig2), and insulin receptor substrate 1 (Irs1) (Supplementary Table 5, Additional file 2).
KEGG pathway and GO enrichment analyses identified fewer overrepresented KEGG pathways and GO terms for genes differentially expressed by CC strain than diet. For strain DEGs, 13 significantly overrepresented KEGG pathways and 163 significantly overrepresented GO terms were identified (p adj < 0.05, Fig. 3B-E; see Supplementary Table 6, Additional file 2), with varying degrees of gene richness. The most significantly overrepresented KEGG pathways identified were cell adhesion molecules (CAMs), ECM-receptor interaction, and focal adhesion (p adj ≤ 2.6 × 10 −3 ), which are pathways important to cell signaling and structural binding between cells. For each GO term category, 95 GO biological processes, 44 GO cellular components, and 24 GO molecular functions were significantly overrepresented in strain DEGs (p adj < 0.05), with the top 10 most significantly overrepresented GO terms in DEGs by strain shown in Fig. 3C-E. In contrast to the enrichments for diet DEGs, very few of the enrichments for strain DEGs were related to metabolism. Instead, most enrichment terms were related to basal biological functions such as cell or tissue motility, cell division, tissue development, and substrate binding; most cellular compartment enrichments were derivatives of the cell membrane as opposed to the mitochondria (Supplementary Table 6, Additional file 2).
A query of the GWAS catalog identified DEGs in the CC that were associated with obesity-related traits in humans We were next interested in identifying clinically important genes that are suspected of causing underlying complex traits in humans to provide context for our findings relative to human obesity. Using the GWAS catalog to guide our search, we found that 15.8% of the genes expressed in the liver in this study (1819/11,542) have been previously found to be associated with obesity traits in humans [4]. Of these 1819 genes expressed in the livers of the CC mice that were also found associated with obesity traits in humans, greater than 85% (1570/1819) were found to be diet DEGs, strain DEGs, or significantly correlated with body fat % in this CC study. Using the CC as a model for obesity, we identified over 1500 genes expressed in the liver whose expression levels were either under genetic regulation, influenced by diet, or correlated with body fat %, which were also clinically important in humans.
Of the 1344 genes differentially expressed by diet, 214 genes were found to be associated with obesity traits in humans according to the GWAS database; 65 of these 214 genes were also significantly correlated with body fat % in the CC ( Fig. 4A; see Supplementary Table 7, Additional file 2). Out of 9436 genes differentially expressed by CC strain, 1516 genes were found to be associated with obesity traits in humans according to the GWAS database, including Hmga1 and Irs1; 431 of these 1516 genes were also significantly correlated with body fat % in the CC ( Fig. 4B; see Supplementary Table 8, Additional file 2). By intersecting our lists of genes across multiple analyses, we found 434 differentially expressed genes with expression levels correlated with body fat % in the CC that were associated with obesity traits in humans ( Fig. 4C; see (See figure on next page.) Fig. 2 Expression patterns and enrichment of diet DEGs. A The top 20 most significant (BH-adjusted p ≤ 2.37 × 10 −8 ) diet DE genes' average Z scores of median robust multi-array average (RMA) normalized gene expression for each CC strain on either the high-protein (HP) or high-fat high-sucrose (HS) diet shown ordered from top to bottom by level of gene expression on the HP diet (highest to lowest). The genes' average Z scores for each CC strain and diet are clustered by Euclidean distance on the x-axis. ‡Denotes genes also differentially expressed by strain. *Indicates genes with human homologs found in the GWAS catalog to be associated with at least one obesity-related trait. Annotation and limma results are shown for all diet DEGs in Supplementary Table 3, Additional file 2. Limma analysis of microarray data revealed genes differentially expressed by diet showing significant enrichment (p adj < 0.05) for B KEGG (20 total), C GO biological pathways (105 total), D GO cellular components (45 total), and E GO molecular functions (37 total). Pathways are ordered from top to bottom by significance (highest to lowest) and colored by gene richness. The top 10 enrichments for each ontology category were all upregulated on the HP diet, except for the GO cellular component and "integral component of membrane, " which was downregulated. All significant enrichment terms and enrichment analysis results are shown in Supplementary Supplementary Tables 7 and 8, Additional file 2), with three genes exclusively differentially expressed by diet, 369 genes exclusively differentially expressed by strain (e.g., Ide), and 62 genes differentially expressed by both diet and strain (e.g., Pik3r1).

Differences in diet macronutrient composition had mild effects on broad sense heritability (H 2 ) estimates for gene expression levels
To quantify the degree to which genetic variation influences variation in gene expression levels, we calculated broad sense heritability (H 2 ) for the 11,542 genes used for differential gene expression analysis, which estimates the proportion of phenotypic variation attributed to differences between genetic variation [20]. Using hepatic gene expression as the observed "phenotype" in this study, H 2 was estimated by calculating the intraclass correlation (r I ) and coefficient of genetic determination (g 2 ) from the between-and within-strain mean square values (MSB and MSW, respectively) derived from linear models. The proportion of variation accounted for by differences between strains can be approximated by r I in general, while the calculation of g 2 takes into consideration the additive genetic variance that doubles during inbreeding [18,20,47]. Estimates of H 2 based on g 2 calculated using MSB and MSW derived from the "full" additive linear models for the 11,542 genes expressed in the liver used for differential gene expression analysis ranged from −0.056 to 0.983 with a median g 2 of 0.173. To assess whether differences in macronutrient composition ("diet environment") influenced H 2 by DEG status, r I and g 2 summary statistics were calculated for all expressed genes, diet DEGs, and strain DEGs (Table 1); g 2 for diet DEGs ranged from −0.044 to 0.735 with a median of 0.195, while g 2 for strain DEGs ranged from 0.045 to 0.983 with a median of 0.211. For diet-specific g 2 , the minimum g 2 values were slightly less than 0, implying that the variation in expression levels for these genes was greater within strains than between strains, but maximum g 2 and median g 2 values were similar both across diets and DEG status. Overall, the distributions of g 2 specifically for the HP and HS diets did not differ significantly neither by the Mann-Whitney test (W = 67,447,080, p = 0.098) nor the Kolmogorov-Smirnov test (D = 0.017, p = 0.074), demonstrating that the proportion of variation in gene expression levels attributed to genetic variation stays relatively constant despite differences in macronutrient composition.
To quantify the proportion of the total gene expression variation that is accounted for by differences between diet, we next calculated the diet intraclass correlation (ICC) using the diet MSB and MSW values derived from the "full" additive linear models and then calculated summary statistics by DEG status group, i.e., all expressed genes, strain DEGs, and diet DEGs ( Table 1). Diet ICC for all expressed genes ranged from −0.017 to 0.799 with a median diet ICC of 0.015. Similarly, diet ICC for strain DEGs ranged from −0.017 to 0.787 with a median of 0.019. Though the maximum diet ICC for diet DEGs (diet ICC = 0.799) was similar to the diet ICC maximum values for all expressed genes and strain DEGs (Table 1), the diet DEGs' minimum (diet ICC = 0.099) and median (diet ICC = 0.235) estimates were slightly higher, confirming that the proportion of gene expression variation explained by diet differences was modestly increased for diet DEGs.
To investigate the degree to which gene × environmental (diet) effects mediate variation in gene expression relative to genetics and environment, additional linear mixed model analyses with strain, diet, and strain × diet interactions all as random effects were performed for each gene to estimate the relative heritable variation that can be attributed to strain, diet, and strain × diet effects. From the results of these models, we calculated the variance for each of these terms and found that the proportion of heritable variation for gene expression attributed to strain × diet interactions on average was small (2.6%) and remained the same regardless of DEG status ( Table 2). For all genes used in differential expression analysis, the largest proportion of heritable variation for gene expression can be attributed to genetic background (strain) on average (30.3%),

Fig. 3
Expression patterns and enrichment of transcripts differentially expressed by CC strain. A The top 20 most significant (BH-adjusted p ≤ 2.631 × 10 −56 ) strain DE genes' average Z scores of median robust multi-array average (RMA) normalized, gene expression for each CC strain on either the high-protein (HP) or high-fat high-sucrose (HS) diet shown. Gene average RMA Z scores for each CC strain and diet are clustered according to Euclidean distance by CC strain and diet on the x-axis and by gene on the y-axis. The human homolog of Gdpd3 was found in the GWAS catalog to be associated with at least one obesity-related trait. Annotation and limma results are shown for all strain DEGs in Supplementary Table 5, Additional file 2. Limma analysis of microarray data revealed genes differentially expressed by strain showing significant enrichment (p adj < 0.05) for B KEGG (13 total), C GO biological pathways (95 total), D GO cellular components (44 total), and E GO molecular functions (24 total). Pathways are ordered from top to bottom by significance (highest to lowest) and colored by gene richness. The top 10 enrichments for each ontology category were all upregulated on the HP diet, except for the linoleic acid metabolism KEGG pathway, and GO molecular functions "monooxygenase activity" and "oxidoreductase activity, acting on paired donors…, " which were downregulated. All significant enrichment terms and enrichment analysis results are shown in Supplementary Table 6 while the proportions of heritable variation for gene expression attributed to diet (3.9%) and strain × diet interactions (2.6%) were much smaller. As expected, the proportion of heritable variation for gene expression attributed to diet was increased in diet DEGs (18.7%), and the proportion of heritable variation for gene expression attributed to strain was increased in strain DEGs (36.0%).

Transcriptional co-expression network analysis identified key modules associated with adiposity
Because polygenic obesity is a complex physiological trait, we used a gene co-expression network approach to characterize the effects of strain and diet on expression of groups of related genes in addition to assessment of genes individually, specifically weighted gene co-expression network analysis (WGCNA). WGCNA determines which genes have similar expression profiles DEGs in the CC were associated with obesity-related traits in humans. Comparisons of differentially expressed genes, genes with expression levels significantly correlated with body fat % (BF%), and genes previously found to be associated with obesity-related traits in the GWAS catalog revealed A the number of genes differentially expressed by diet that also had expression levels significantly correlated with body fat % and associated with obesity traits in humans (65), B the number of genes differentially expressed by CC strain that also had expression levels significantly correlated with body fat % and associated with obesity traits in humans (431), and C the number of genes that fall under all four categories (62). Gene annotation, body fat % correlations, limma statistics, and a subset of related GWAS annotation are shown for the 65 diet DEGs in Supplementary using a clustering method based on correlations of gene expression, which identifies the network modules (groups of related genes); measures derived from gene expression correlations influence the strength of connections between genes within the network, where the highly interconnected genes that form modules may be components of biological pathways, helping to bridge the effects of individual genes and resulting phenotypes [45,106,107]. Taking a global approach to elucidate the relationship between gene expression and emergent phenotypes, WGCNA was performed using the 11,542 genes expressed in the liver and identified 13 clusters of genes (modules) each assigned an arbitrary color, where the Table 1 Heritability estimate and diet intraclass correlation summary statistics for all expressed genes and DEGs Post-diet heritability estimates were calculated from linear models including strain, diet, and week as covariates (r I or g 2 "full") for gene expression of the 11,542 expressed genes used in limma differential gene expression analysis. Diet-specific estimations of broad sense heritability were also calculated accordingly for gene expression levels represented by intraclass correlations (r I ) and coefficients of genetic determination (g 2 ) for each trait using the MSB and MSW for strain derived from linear models with strain and week as covariates using only data from each experimental diet per model as indicated to assess how different diet "environments" affect heritability. The intraclass correlation for diet (Diet ICC), which is the proportion of the total phenotypic variation that is accounted for by differences between diet, was calculated to compare the proportion of variation in gene expression attributed to diet in general or genetics. Summary statistics were calculated for each group of genes after heritability estimates, and diet ICC were obtained. g 2 accounts for the additive genetic variance that doubles during inbreeding and may be a more appropriate estimate for broad sense heritability in this study. However, both r I and g 2 values are presented to facilitate comparisons with other findings in the literature  number of genes contained in each module ranged from 42 to 3319 (Fig. 5A, Table 3; see Supplementary Table 9, Additional file 2 and Supplementary Fig. 2, Additional file 1) with varying degrees of connectivity between genes (see Supplementary Fig. 3, Additional file 1 for an example). The percentage of genes significantly correlated with body fat % (15.1-69.0%), and the percentage of DEGs by diet (0-49.5%) showed a wide range of variation in gene numbers across modules, but the percentage of DEGs by CC strain remained consistently high (> 69%) for all modules (Table 3, Fig. 5B); the consistently high presence of strain DEGs in all modules compared to the lower percentage and variation of diet DEGs between modules suggest a stronger effect of CC strain than diet on gene expression. Of the DEGs with expression levels correlated with body fat % and associated with obesity-related traits in humans, the three diet DEGs were each assigned to different modules (black, blue, and pink); the range of strain DEGs per module was 1-106, with the turquoise module containing the highest number of strain DEGs (Table 4). Per module, the range of DEGs differentially expressed by both diet and strain with expression levels correlated with body fat % and also associated with obesity-related traits in humans was 0-19, where most modules contained at least one DEG and yellow contained the most DEGs (Table 4). After establishing the modules, module eigengenes (MEs) were calculated to estimate the average expression profiles of each module, and Spearman's correlations were performed between MEs and phenotype data from all mice to determine the relationships between the modules and measured phenotypic traits, revealing significant correlations between the pink, yellow, salmon, tan, red, and magenta modules and body fat % ( Fig. 5C; see Supplementary Table 10, Additional file 2). Concurrent with ME × phenotype data correlations, modules that were significantly correlated with body fat % had relatively higher percentages of individual genes whose expression levels were significantly correlated with body fat %.  Because multiple modules were associated with clinical phenotypes (Fig. 5C), we performed enrichment analysis to determine potential mechanisms underlying these associations. Module enrichment varied widely (Table 5), from no enrichments at all (tan) to 419 total enrichments (brown). Figure 6A-D shows the top enrichments for each module if present. Of the modules that were significantly correlated with body fat % in the CC, the tan module showed no enrichments, the pink module showed enrichment for the RNA binding GO molecular function (GO: 0003723) (p adj = 0.042), the salmon module showed enrichment for the regulation of angiogenesis (GO: 0045765) (p adj = 0.009) and cGMP metabolic process GO biological processes (GO: 0046068) (p adj = 0.046), and the magenta, red, and yellow modules showed multiple enrichments for GO biological processes, GO Fig. 4, Additional file 1); genes assigned to the red module were significantly enriched for GO terms and KEGG pathways involved in steroid, cholesterol, and fatty acid biosynthesis/metabolism ( Supplementary Fig. 5, Additional file 1); and genes found in the yellow module were significantly enriched for a variety of functions in terms of GO terms and KEGG pathways, such as photoperiodism, transcription regulation, insulin signaling, and more (Supplementary Fig. 6, Additional file 1).

Both diet macronutrient composition and genetic background affected expression of modules containing homologs associated with obesity in humans
The magenta, red, and yellow modules were enriched for biological pathways and correlated with body fat % (Figs. 5C and 6E-G; Supplementary Figs. 4-6, Additional file 1). To determine whether these modules contained DEGs in the CC associated with obesity in humans, the lists of genes assigned to each module were intersected with the list of genes previously found to be associated with obesity traits in humans in the GWAS catalog (Supplementary Table 9, Additional file 2), with examples for these modules shown in Table 6. By intersecting our results across different analyses, DEGs important to obesity in humans were found in biologically relevant modules associated with body fat % in the CC, where the DEG distribution across modules highlighted the larger contribution of differential expression by strain over diet.
After finding that gene modules were correlated with body fat % and contained DEGs, we ascertained whether the average gene expression profile of these modules defined by their ME first principal components (PC1) differed by diet and/or strain. Wilcoxon ranked-sum test of the PC1 between mice fed the HP and HS diets for each module (Fig. 7A-E) revealed significant differences by diet for the yellow, red, magenta, pink, and tan modules (p < 0.01), but not the salmon module (p > 0.1). Interestingly, when the Kruskal-Wallis test was performed to determine whether PC1 differed by strain for each module (Fig. 7F-H), PC1 significantly differed by strain  for the yellow, red, magenta, and salmon modules (all p ≤ 8.1 × 10 −4 ), but not the pink nor tan modules. Of the modules with MEs significantly correlated with body fat %, the yellow, red, and magenta modules exhibited differences by both macronutrient composition and CC strain.
Relating module MEs and body fat %, Spearman's correlations performed between MEs and body fat % for the yellow, red, and magenta modules using data from all samples revealed a significant negative correlation between body fat % and the yellow module (rho = −0.28, p = 0.0016) and significant positive correlations between body fat % and the magenta (rho = 0.19, p = 0.037) and red (rho = 0.27, p = 0.0027) modules (Fig. 6E-G). Given the many enrichments in biological pathways found and significant differences in MEs by diet and CC strain for these three modules, Spearman's correlations were performed between MEs and body fat % by diet for each module to determine whether the relationship between MEs and body fat % remained consistent across different diets for enriched modules. The correlation between expression of the yellow module and body fat % was significant and negative for the HS diet only, while the correlation between expression of the magenta module and body fat % was significant and positive for the HS diet only ( Supplementary Fig. 7, Additional file 1). Unlike the yellow and magenta modules where the correlations between MEs and body fat % were only significant for the HS diet, the correlation between the red ME and body fat % remained significant and consistently positive for both diets (Supplementary Fig. 7, Additional file 1). In summary, Spearman's correlations performed between MEs and body fat % by diet for biologically relevant modules illustrated alterations in the direction and magnitude of associations between module MEs and body fat % depending on diet for the yellow and magenta modules, in contrast to the red module where the direction and magnitude of associations between module MEs and body fat % for the red module remained consistent regardless of diet, demonstrating the modules' different responses to diet.

Discussion
Obesity is a complex and heterogeneous disease whose development is caused by numerous biological factors, particularly genetics, diet, and gene expression. Though long established that obesity results from a chronic imbalance between energy intake and expenditure at a fundamental level, our understanding of exactly how diet and genetics interact to influence gene expression and how gene expression regulates the development of obesity remains to be fully elucidated. Because the liver regulates metabolism of macronutrients, cholesterol, and triglycerides, we measured hepatic gene expression in the CC to gain insight of how diet and genetic background impact obesity and related obesity-related traits. Correlations performed between hepatic gene expression levels and post-diet phenotype data revealed 2552 genes whose expression levels were significantly correlated with body fat % in the CC, some which were negatively correlated such as ApoM and Fmo3, while others were positively Table 6 DEGs assigned to enriched modules associated with obesity traits in humans Multiple DEGs in the CC assigned to enriched modules were associated with obesity traits in humans in the GWAS catalog. The number of DEGs for the magenta, red, and yellow modules identified by WGCNA illustrates the larger contribution of differential expression by strain over diet. Examples of genes with human homologs associated with obesity traits are shown for each module, where a denotes genes that are significantly correlated with body fat % in the CC  correlated such as Aldh1a1 and Adipor2. ApoM encodes a membrane-bound apolipoprotein associated with highdensity lipoproteins, low-density lipoproteins, and triglyceride-rich lipoproteins; secreted through the plasma membrane, alipoprotein M is involved in lipid transport [99]. In the mouse, leptin the "satiety" hormone and leptin receptor are essential for expression of ApoM, but excess concentrations of leptin inhibited ApoM mRNA expression in a dose-dependent manner in the human hepatoma cell line HepG2, suggesting that leptin may mediate ApoM expression [55]. Although FMO3 is more well-known for its role in preventing trimethylaminuria (fishy odor syndrome) in humans [92], FMO3 also functions as a drug-metabolizing enzyme to catalyze the NADPH-dependent oxygenation of various molecules including therapeutic drugs and dietary compounds [65]. Intriguingly, studies in the mouse have suggested additional roles for FMO3 in health and disease, such as modulating cholesterol metabolism [96], glucose, and lipid homeostasis [78], and as a target for downregulation by insulin [58]. Since adipocyte secretion of leptin and insulin occurs in proportion with the volume of adipose tissue under "normal" circumstances, this may partially explain the negative correlations between body fat % and expression of ApoM and Fmo3.

Magenta module
In the current study, the hepatic gene expression levels of Aldh1a1 and Adipor2 were positively correlated with body fat %. Aldh1a1 encodes the protein aldehyde dehydrogenase 1 family, member A1 (ALDH1A1), also known as retinaldehyde dehydrogenase 1 (RALDH1), which is a prominent enzyme in the oxidative pathway of alcohol metabolism. However, various studies in mice have shown that ALDH1A1 also modulates hepatic gluconeogenesis and lipid metabolism through its role in retinoid metabolism [39], and upregulation of ALDH1A1 is associated with reduced adiponectin expression in adipose tissue after high-fat diet feeding [44]. Furthermore, mice without ALDH1A1 are resistant to diet-induced obesity, and inhibition of ALDH1A1 in mice suppresses weight gain [27,28], which is consistent with our finding and illustrates the potential for ALDH1A1 as a drug target for obesity prevention or treatment. Adipor2 encodes adiponectin receptor 2 which interacts with adiponectin to mediate fatty acid oxidation and glucose uptake [103]. An agonist of adiponectin receptor 2, the adipokine adiponectin, is inversely correlated with body fat mass and visceral adiposity in humans, though the mechanisms of how adiponectin's interactions with its receptors to elicit antidiabetic, anti-atherogenic, and anti-inflammatory effects are not fully understood [63].
After confirming the relationship between expression of genes related to obesity and body fat % in the CC, we investigated the effects of genetic background (strain) and diet on hepatic gene expression levels. Similar to adiposity and the obesity-related traits examined in our previous study [101], genetic background had a far stronger effect on hepatic gene expression than diet, as shown by the overwhelmingly larger number of significant DEGs by strain (9436) compared to the number of DEGs by diet (1344). Interestingly, gene expression of 28.9% of diet DEGs was significantly correlated with adiposity (389/1344) compared to 25% of strain DEGs (2367/9436). Of the top 20 most significant diet DEGs identified in the CC, carbamoyl-phosphate synthase 1 (Cps1), isovaleryl-CoA dehydrogenase (Ivd), neuropilin 1 (Nrp1), and pyruvate kinase L/R (Pklr) were previously found to be associated with obesity traits in humans [38,51,69,72,108], but only one of the top 20 most significant strain DEGs was associated with at least one obesity trait in humans, namely glycerophosphodiester phosphodiesterase domain containing 3 (Gdpd3) [108].
Gene enrichment analysis of DEGs revealed different trends between DEGs by diet compared to strain. DEGs by diet showed enrichment for KEGG pathways and Gene Ontology (GO) biological processes related to numerous types of metabolism, amino acid synthesis, and nonalcoholic fatty liver disease, whereas DEGs by strain showed enrichment for cell function pathways, type 1 diabetes, and fatty acid metabolism. Like KEGG pathway enrichment, GO term enrichment for cellular components and molecular functions also showed distinct differences between DEGs by diet compared to strain; DEGs by diet showed enrichment for multiple cellular components related to the mitochondrion, endoplasmic reticulum, and cell membrane, while DEGs by strain showed enrichment for cellular components related to the cell membrane, extracellular components, and cell surface. In terms of molecular functions, DEGs by diet showed enrichment for metabolism and binding for nutrients and small molecules such as cofactor binding, vitamin B6 binding, catalytic activity, and electron transfer activity, while DEGs by strain showed enrichment for binding related to general cell and tissue functions, such as extracellular matrix, collagen, signaling receptor, and fibronectin binding. The culmination of our results suggests that generally, diet alters gene expression for "acute" metabolic processes sensitive to environmental changes, but genetic background more heavily influences overall "essential" cellular function.
Having identified genes with expression strongly influenced by diet or strain, we used the GWAS catalog as a guide to highlight clinically important genes found in our study by determining which DEGs may be most relevant to obesity-related traits in humans. The comparison between DEGs in the CC and genes in the GWAS catalog revealed that 65 diet DEGs and 431 strain DEGs correlated with body fat % in the CC have previously been identified as associated with obesity-related traits such as body fat distribution, BMI, waist-hip ratio, weight, and fat body mass in humans. One caveat regarding the number of DEGs in the CC found to be associated with obesity-related traits in humans is that our study focused only on gene expression in the liver of CC mice, while the genes listed in the GWAS catalog associated with obesity traits include candidates found in multiple tissue types; thus, including gene expression from additional tissue type such as brain or adipose tissue could yield additional candidate genes. Nonetheless, we identified genes expressed in the liver whose expression levels were either under genetic regulation, influenced by diet, or correlated with body fat %, which were also clinically important in humans using the CC panel as a model for obesity, which enabled the use of genetic "replicates" with high genetic diversity so that the results from this study are additive in scope.
Using the between-and within-strain mean square values derived from linear models, we calculated H 2 estimates to quantify the degree to which genetic variation affects hepatic gene expression level variation. For the 11,542 genes included in our analysis, the range of coefficient of genetic determination (g 2 ) was broad as expected (g 2 = −0.056-0.983), but the median was lower than anticipated (g 2 = 0.173) given the strong effect of strain on the expression of most genes. Median H 2 estimates by DEG status increased slightly but not drastically (diet DEG g 2 = 0.195, strain DEG g 2 = 0.211), while H 2 estimates remained similar, suggesting that differences in macronutrient composition did not have a large impact on hepatic gene expression in this study. Upon examination of the relative heritable variation that can be attributed to strain, diet, and strain × diet effects for all genes, the largest proportion of heritable variation for gene expression can be attributed to genetic background (strain) on average (30.3%), while the proportions of heritable variation for gene expression attributed to diet (3.9%) and strain × diet interactions (2.6%) were much smaller, which reaffirms the strong effect of strain on gene expression relative to diet and strain × diet effects. However, one caveat of these approximations is that increasing the sample size would provide a better estimation of the relative heritable variation since the number of mice per strain per diet is relatively low, so the estimation of strain × diet effect may not be precise.
Since obesity is a complex trait regulated by multiple genes, we used a gene co-expression network approach including the 11,542 expressed genes to find groups of genes that are similarly regulated by diet or strain and identified 13 gene modules comprised of a wide number of genes from 42 to 3319. Consistent with our DEG analyses, all modules were comprised largely of genes that were strain DEGs (> 69%), while the proportion of diet DEGs (0-49.5%) and genes with expression significantly correlated with body fat % (15.1-69.0%) varied much more widely, illustrating the variable effect of diet on gene expression compared to genetic background. Spearman's correlation of the MEs for identified modules with phenotypic data revealed six modules related to body fat %: tan, pink, salmon, magenta, red, and yellow. The MEs for all of these modules differed significantly by diet, except for the salmon module, suggesting that differences in diet macronutrient composition induce changes in gene expression for entire groups of genes. Similar to diet, the MEs for most of the modules also differed significantly by strain, except for the pink and tan modules. However, it is important to note that the ME variation within each strain appeared much higher for these two modules than the magenta, red, and salmon modules, an observation shown through the ability of utilizing genetic "replicates" with high genotypic and phenotypic diversity that is inherent to the CC; in fact, increasing the number of "replicates" would enhance the ability to find significant strain-by-diet differences. Thus, we show that both diet and strain may strongly affect hepatic gene expression, and that the CC can be used to interrogate the sources of inter-individual variation that underlies the variable response to diet observed in humans and mice.
Enrichment analysis performed using the lists of genes assigned to each module allowed us to assess which modules identified in the CC may be most biologically relevant to obesity and human health. Of the six modules whose MEs were significantly correlated with body fat %, the number of enrichment terms were few to none for the salmon, pink, and tan modules, but the magenta, red, and yellow modules were significantly enriched for numerous functional pathways, biological processes, and/or diseases. For example, the magenta module was enriched for pathways related to endoplasmic reticulum (ER) function and contained 163 genes total, with 16 strain DEGs and five DEGs by both diet and strain associated with at least one obesity trait in humans. Two DEGs associated with obesity in humans from the magenta module that merit further study are stress-associated endoplasmic reticulum protein 1 (Serp1) and UDP-glucose glycoprotein glucosyltransferase 1 (Uggt1). Serp1 participates in the metabolism of proteins in the ER by protecting target proteins against degradation [102] and was differentially expressed by strain in the CC. Similarly, Uggt1 encodes the enzyme UDP-glucose:glycoprotein glucosyltransferase (UGT), which is also located in the lumen of the ER and provides quality control for protein transport by selectively enabling misfolded glycoproteins to rebind calnexin, resulting in either the proper folding of the glycoprotein or exposure to degradation enzymes if proper folding fails to occur [15]; Uggt1 was differentially expressed by both diet and strain in the CC. Studies have demonstrated that hepatic ER stress induced by obesity can lead to the development of hepatic insulin resistance and gluconeogenesis, likely through the activation of the JNK pathway [40,62,105]. Our findings reaffirm the association between obesity and alterations in hepatic gene expression related to ER function, suggest potential candidate genes for future study in relation to patient screening for diabetes risk, and provide a link between diet, five hepatic ER genes, obesity, and insulin resistance.
Focusing on nine major genes pivotal to insulin signaling expressed in the liver of CC mice, the expression levels of six genes were significantly correlated with body fat % (Ide, Insig1, Insr, Irs2, Jak1, and Pik3r1), while six genes were only differentially expressed by strain (Hmga1, Ide, Insig1, Insig2, Irs1, and Jak1) and two genes were differentially expressed by both strain and diet (Irs2 and Pik3r1). Although all nine genes except Jak1 were assigned to a module in our network analysis, only Insig1, Insig2, and Irs2 were found in the enriched modules correlated with body fat % (magenta, red, or yellow). Assigned to the red module, Insig1 (insulin-induced gene 1) illustrates one pathway that insulin signaling regulates to alter lipid metabolism in both mice and humans [61]. In the livers of transgenic mice, overexpression of the INSIG1 protein reduces insulin-stimulated lipogenesis by inhibiting processing of sterol regulatory element-binding proteins (SREBPs) in the ER, membrane-bound transcription factors that activate lipid synthesis [17]. In humans, INSIG1 variants have been shown to influence obesity-related hypertriglyceridemia [83]. Two genes crucial to insulin signaling that were assigned to the yellow module were Insig2 (insulin-induced gene 2) and Irs2 (insulin receptor substrate 2). Similar to Insig1, Insig2 obstructs processing of SREBPs by binding to SREBP cleavage-activating protein in the ER, which results in blockage of cholesterol synthesis [100]. Genetic variants in INSIG2 (rs75666605) have been associated with severe obesity in a North Indian human population [68] and increased blood pressure and triglyceride levels in Brazilian obese patients [60]. Differentially expressed by both strain and diet, IRS2 is a vital mediator of insulin signaling since it acts as an immediate downstream substrate of insulin receptors and activates a cascade of serine-protein kinases to modulate numerous metabolic processes [2,14]. In mice, conditional knockout of Irs2 led to increased appetite and insulin resistance that progressed to diabetes [48] and lower levels of thyroid hormones [34]. In summary, our findings help explain the influences of genetic background and dietary macronutrient composition on clinically significant genes involved in insulin response relative to obesity development.
For future studies, investigating the transcriptome and epigenome of both adipose tissue and hepatic tissue together would further clarify the genetic and dietary mechanisms that drive the cross talk between tissue types to modulate energy balance and insulin response in the context of obesity development. If possible, integrating microbiome data would provide yet another "piece of the puzzle" for the elucidation of how genetic and environmental factors interact in the development of obesity. Nonetheless, our findings show that both variation in genetic background and diet can strongly influence hepatic gene expression of both individual genes and groups of related genes relevant to obesity.

Conclusions
This study determined the relationship between genetics and macronutrient composition on hepatic gene expression relative to obesity. To relate adiposity and obesityrelated traits to hepatic gene expression, correlations were performed using phenotype data and microarray data, revealing 2552 genes whose expression levels were significantly correlated with adiposity. In general, the effect of strain was much stronger than diet on hepatic gene expression as demonstrated by differential gene expression analysis which found over 9000 genes differentially expressed by strain compared to 1344 genes differentially expressed by diet. Interestingly, diet differentially expressed genes (DEGs) were enriched for many biological pathways associated with substrate metabolism, whereas strain DEGs were enriched for pathways less sensitive to environmental perturbations. Because common obesity is caused by multiple genes, weighted gene co-expression network analysis (WGCNA) was performed to identify clusters of related genes grouped into "modules. " Multiple gene modules were found that differed in average expression by both diet and strain, where three of the gene modules were correlated with adiposity and enriched for biological pathways related to obesity development. By combining all the analyses above and searching in the genome-wide association studies (GWAS) catalog, the list of obesity candidate genes found via GWAS in humans can be narrowed down to increase the success of future functional validations studies. Furthermore, we demonstrated that both strain and diet influence expression of individual genes as well as the expression for groups of related genes. By integrating phenotype data into our analysis, we found both individual genes and gene modules expressed in the liver that were related to adiposity and other clinical traits. This work sheds light on one way that genetic background and diet influence adiposity, where the identification of genes expressed in the liver related to adiposity provides concrete preliminary suggestions of specific "intermediary" mechanisms that bridge genetics and diet with obesity such as insulin signaling, which may be validated in future studies and contribute to the field of precision nutrition.

Animals, husbandry, diets, and phenotyping
Details on the origin, housing, husbandry, treatment of the CC mice, diet compositions, and phenotyping have been described previously [101]. Briefly, female mice from 22 CC strains (total n = 204) were obtained from the UNC Systems Genetics Core Facility in 2016 and put on either a high-protein (n = 102) or high-fat highsucrose (n = 102) diet for 8 weeks followed by analysis of body composition, metabolic rate, and physical activity. After 8 weeks on experimental diets, mice were euthanized following a 4-h fast for the collection of blood and liver tissue. Subsequently, circulating cholesterol, triglyceride (TG), glucose, albumin, creatinine, urea/BUN, aspartate transaminase (AST), and alanine transaminase (ALT) levels were quantified using the Cobas Integra 400 Plus (Roche Diagnostics, Indianapolis, IN), according to manufacturer's instructions. Circulating insulin was measured using an ultrasensitive mouse insulin ELISA (ALPCO Diagnostics, Salem, NH) per manufacturer's instructions. Trimethylamine N-oxide (TMAO), choline, betaine, and carnitine were quantified using liquid chromatography-mass spectrometry (LC-MS) methods as described with modifications [95]. Metabolic health scores were calculated using measurements of several metabolic risk factors (circulating glucose, insulin, glucose/insulin ratio, cholesterol, triglycerides, and body fat %) to approximate overall metabolic health [101].

Microarray analysis for identification of gene expression levels associated with post-diet traits and differentially expressed genes in liver tissue
Methods of RNA extraction from livers and evaluation of RNA integrity were performed as previously described [8]. Randomly selecting 3 mice per stain per diet for microarray analysis, high-quality RNA was available from livers of 127 of the 204 CC mice and hybridized to Affymetrix Mouse Gene 2.1 ST 96-Array Plate using the GeneTitan Affymetrix instrument (Affymetrix, Inc., Santa Clara, CA) according to standard manufacturer's protocol. The robust multiarray average (RMA) method was used to estimate normalized expression levels of transcripts (median polish and sketch-quantile normalization) using the affy R package [23]. The quality of sample arrays was then assessed using the R package arrayQualityMetrics [37] for outlier detection using 3 methods: distance between arrays/principal component analysis, computation of the Kolmogorov-Smirnov statistic K a between each array's intensity distribution and the intensity distribution of the pooled data to compare individual array intensity to the intensity of all arrays, and computing Hoeffding's statistic D a to check individual array quality. Sample arrays identified as outliers by all three methods were removed, i.e., a sample array was removed if all three methods indicated that it was an outlier, leaving 123 out of 127 arrays for analysis (Supplementary Table 12, Additional file 2).
Probes and transcript cluster IDs (TC IDs) were first filtered as described [70], resulting in the total number of 24,004 unique probes post-filter corresponding to 23,626 genes. Next, TC IDs were kept for analysis if their median expression was above the mean of all TC ID medians or if their median expression was above the mean of all TC ID medians in over 12.5% of samples, based on the assumption that by chance, one of the 8 founders may be contributing low/no expression alleles. For TC IDs associated with the same gene, the TC ID with the highest expression was selected to represent that gene so that each gene was represented by a unique TC ID for analysis, resulting in 11,542 TC IDs (genes) used for differential gene expression analysis and correlations between gene expression levels and phenotype data.
After filtering TC IDs and arrays for quality, calculations of multiple biweight midcorrelations (bicor) and their corresponding Student correlation p-values were performed for the unique TC IDs corresponding to 11,542 genes using the bicorAndPvalue function from the weighted gene co-expression network analysis (WGCNA) R package [45] to ascertain which genes' expression in the liver was correlated with post-diet traits. Next, differential gene expression analysis was performed using the linear models for microarray analysis (limma) R package version 3.6.1 [73] and methods described [66] to find genes that were significantly differentially expressed by diet or CC strain. Genes with a Benjamini-Hochberg (BH)-adjusted p-value < 0.05 were designated as differentially expressed (DE). The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and gene ontology (GO) enrichment analyses were performed using the kegga and goana functions in limma for differentially expressed genes with the false discovery rate (FDR) cutoff set to 0.05.

Broad-sense heritability estimates and diet intraclass correlations of hepatic gene expression levels
Broad-sense heritability (H 2 ) estimates and the intraclass correlations (ICC) for diet were calculated as described previously [101] for the 11,542 genes used in limma analysis to assess the degree of influence on gene expression variation from genetics (strain) and diet, respectively. H 2 was estimated by calculating the intraclass correlation (r I ) and the coefficient of genetic determination (g 2 ) using mean square between (MSB) strains and mean square within (MSW) strains values derived from linear regression analysis [20]. The following linear models were fit using the lm function and implementing Satterthwaite approximations on the output of lm as described [54] to obtain MSB and MSW values for r I and g 2 calculations: (1) a "full" additive model with strain, diet, and week (mouse "batch") as variables fitted with gene expression data from both experimental diets, (2) a "HP" additive model including strain and week as variables fitted with gene expression data from only mice fed the HP diet, and (3) a "HS" additive model including strain and week as variables fitted with gene expression data from only mice fed the HS diet. H 2 estimates derived from models fitted with data from all mice post-diet compare the contribution of genetics (strain) and diet overall to heritable gene expression level variance, while diet-specific H 2 estimates were calculated to discern differences in heritability affected by differences in macronutrient composition. The diet ICCs were calculated using the mean square between (MSB) diets and mean square within (MSW) diets derived from the "full" additive linear model described above. Additional linear mixed model analyses with strain, diet, and strain × diet interactions as all random effects were performed for each gene to estimate the relative heritable variation in gene expression that can be attributed to strain, diet, and strain × diet effects.

Weighted gene co-expression network analysis (WGCNA)
The WGCNA R package was used to identify modules for the 11,542 expressed genes used in microarray analysis of differentially expressed genes since complex traits often result from changes in expression of multiple genes. Expression data from the 123 non-outlier sample arrays were used to detect modules, which are groups of highly correlated genes with similar connection strengths [24,106]. The soft threshold was chosen by running the pickSoftThreshold function to determine the best fit to a scale-free topology, and beta was set to 5 because it was the lowest power value where the R 2 value crossed the 0.9 threshold for approximate scale-free topology and connectivity measures implicated the possibility of finding highly connected genes. The blockwiseModules function was run to construct the unsigned network in one block, calculate an adjacency matrix with Pearson correlations, calculate the topological overlap matrix (TOM) using the signed method, cluster genes using the default average linkage hierarchical clustering, and establish modules by the dynamic hybrid tree cut method [45]. Next, the mergeCloseModules function was used to merge closely related and highly correlated modules. Module eigengenes were calculated, and Spearman's correlations were performed between module eigengenes and measured phenotypes. KEGG pathway enrichment and gene ontology analyses were performed on genes within each module using Enrichr as described [70] to see which modules contained genes associated with biological function or diseases. Cytoscape [77] was used to generate a visualization of the relationship between genes within a module, using the magenta module as an example.

Human GWAS catalog analysis
Entries in the EMBL-EBI Human GWAS catalog v1.0.2 accessed in 2021 were indexed to matching mouse genes [4] to compare the DEGs found in the CC with homologous genes in humans. Human gene symbols from the "MAPPED_GENE" catalog column (described here: https:// www. ebi. ac. uk/ gwas/ docs/ metho ds/ curat ion) were matched against mouse gene symbols after case normalization, white space removal, and, in the case of multiple mapped genes, delimiter separation.

Additional statistical analyses
All statistical analyses were performed in R (v.3.6.1) [71]. Diet or strain effects on module eigengenes were assessed using the two-group Mann-Whitney U (Wilcoxon rank) test or Kruskal-Wallis statistical test, respectively. The Mann-Whitney U (Wilcoxon rank) test and Kolmogorov-Smirnov test were performed to test whether the distributions of diet-specific H 2 estimates (g 2 ) differed significantly. In general, p-values were adjusted using the Benjamini-Hochberg (BH) method where indicated.

Supplementary Information
The online version contains supplementary material available at https:// doi. org/ 10. 1186/ s12263-022-00714-x.  Table 2. Significant correlations between phenotypes and the top 30 genes with expression levels most strongly correlated with post-diet BF%. Supplementary Table 3. All genes with significant differential expression by diet (1,344