Distribution of variants in multiple vitamin D-related loci (DHCR7/NADSYN1, GC, CYP2R1, CYP11A1, CYP24A1, VDR, RXRα and RXRγ) varies between European, East-Asian and Sub-Saharan African-ancestry populations

Background The frequency of vitamin D-associated gene variants appears to reflect changes in the long-term ultraviolet B radiation (UVB) environment, indicating that interactions exist between UVB exposure, the primary determinant of vitamin D status, and genetic disposition. Such interactions could have health implications, as UVB could modulate the impact of vitamin D genetic variants identified as disease risk factors. However, the current understanding of how vitamin D variants differ between populations from disparate UVB environments is limited, with previous work examining only a small pool of variants and restricted populations.
Methods Genotypic data for 46 variants within multiple vitamin D-related loci (DHCR7/NADSYN1, GC, CYP2R1, CYP11A1, CYP27A1, CYP24A1, VDR, RXRα and RXRγ) were collated from 60 sample sets (2633 subjects) of European, East Asian and Sub-Saharan African origin via the NCBI 1000 Genomes Browser and ALFRED (Allele Frequency Database), with the aim of examining patterns in the distribution of vitamin D-associated variants across these geographic areas.
Results The frequency of all examined genetic variants differed between populations of European, East Asian and Sub-Saharan African ancestry. Changes in the distribution of variants in CYP2R1, CYP11A1, CYP24A1, RXRα and RXRγ genes between these populations are novel findings. The distribution of several variants reflected changes in the UVB environment of the population's ancestry. However, multiple variants displayed population-specific patterns in frequency that appear not to relate to UVB changes.
Conclusions The reported population differences in vitamin D-related variants provide insight into the extent to which activity of the vitamin D system can differ between cohorts due to genetic variance, with potential consequences for future dietary recommendations and disease outcomes.


Introduction
Ultraviolet B radiation (UVB; 290-320 nm) exposure is the primary factor influencing vitamin D status in humans, with environmental UVB levels varying considerably by latitude and season. Furthermore, vitamin D status is modulated by variance in vitamin D-associated genes [1,2], with key genes relating to the production (DHCR7/NADSYN1), binding and transport (GC), metabolism (CYP2R1, CYP27A1, CYP27B1, CYP11A1 and CYP24A1), and activation of vitamin D (VDR and RXRα, RXRβ, RXRγ) [3]. Both UVB exposure and vitamin D-associated single nucleotide polymorphisms (SNPs) are risk factors for vitamin D insufficiency and many related diseases, such as cardiovascular disease, infectious diseases and cancers [1,4,5].
The impacts of UVB and vitamin D-related genetics are not merely additive, but may also be interactive. Indeed, there is evidence that the frequency of SNPs in vitamin D-associated genes reflects changes in UVB environment [6][7][8][9]. These findings indicate that the functionality of the vitamin D system varies between individuals of differing ethnicities or UVB environments. Genetic differences between populations may also modify vitamin D's influence on related disease risk [1,4], warranting further investigation in this area given the current lack of convincing evidence around vitamin D's roles in many diseases [10]. However, despite an abundance of research into vitamin D-related variants, studies focusing on how the distribution of such variants differs between geographic populations are limited.
The relationship between vitamin D-associated SNPs and skin pigmentation is an important consideration regarding differences between geographically defined populations. Skin pigmentation is an apparent adaptation to differing UVB environments, with darker-pigmented populations originating in areas of high UVB, and lighter-pigmented populations in lower UVB areas [11][12][13]. However, the genetic architecture underlying skin pigmentation differs even between populations exposed to similar UVB regimes. A key example is that parts of Europe and East Asia share similar UVB conditions, yet lighter skin phenotypes in these populations evolved independently, via different genetic adaptations [14,15].
Similar geographic patterns may exist in vitamin D-associated SNPs. Both vitamin D and skin pigmentation pathways respond to changes in UVB. Importantly, the vitamin D hypothesis proposes that the reduction of skin pigmentation in early humans migrating out of Africa to areas of lower UVB occurred to facilitate vitamin D production [11,12]. This hypothesis is based on the UVB-induced synthesis of vitamin D being dependent on skin pigmentation levels, with competition for UVB absorption existing between pigments and the vitamin D cholesterol precursor. Consequently, lighter-skinned individuals can synthesise up to 30 times more vitamin D than darker-skinned individuals following identical UVB exposure [16].
Our current understanding of how variation in vitamin D-associated genes differs between global populations is limited. Notably, there has been a significant focus on examining vitamin D genetics in Europeans [17][18][19], with little attention given to other global populations. Therefore, in the present study, a more comprehensive approach has been taken: genotypic data for variants within multiple vitamin D-related genes were collated from 60 sample sets (2633 subjects) of European, East Asian and Sub-Saharan African origin to examine potential patterns in the geographic distribution of vitamin D-associated SNPs.

Validation of European, East Asian and Sub-Saharan African groups with skin pigmentation SNPs
The mean allelic frequencies of SLC24A5 rs1426654, SLC45A2 rs16891982 and OCA2 rs1800414 in derived geographic groups did not deviate from previously reported frequencies in populations of European (EUR), East Asian (EAS) and Sub-Saharan African (AFR) ancestry [20,21]. The frequencies of rs1426654 and rs16891982 were highest in EUR (0.99 and 0.91, respectively). Conversely, rs1426654 and rs16891982 were near absent in EAS and AFR (mean frequencies 0.00-0.08; Table 1). rs1800414 was present exclusively in the EAS group (mean frequency 0.59).
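The group means reported above can be illustrated with a minimal sketch. It assumes that each sample set's allele frequency is derived from diploid genotype counts and that a group's mean is the unweighted average across its sample sets; the text does not state the weighting used, and the function names and genotype counts below are hypothetical illustrations, not study data.

```python
from statistics import mean


def allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Frequency of the alternate allele from diploid genotype counts."""
    n_subjects = n_ref_hom + n_het + n_alt_hom
    if n_subjects == 0:
        raise ValueError("no genotyped subjects")
    # Each subject carries two alleles; heterozygotes carry one alternate allele.
    return (n_het + 2 * n_alt_hom) / (2 * n_subjects)


def group_mean_frequencies(sample_sets):
    """Unweighted mean of sample-set frequencies within each geographic group.

    Assumption: every sample set counts equally, regardless of its size.
    """
    by_group = {}
    for group, counts in sample_sets:
        by_group.setdefault(group, []).append(allele_frequency(*counts))
    return {group: mean(freqs) for group, freqs in by_group.items()}


# Hypothetical genotype counts (ref/ref, ref/alt, alt/alt); NOT data from this study.
example_sets = [
    ("EUR", (0, 2, 98)),
    ("EUR", (1, 0, 99)),
    ("AFR", (95, 5, 0)),
]
means = group_mean_frequencies(example_sets)
```

An unweighted mean gives small and large sample sets equal influence; weighting by the number of subjects is a reasonable alternative and would change the group means whenever sample sets differ in size.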

Annual UVB levels in European, East Asian and Sub-Saharan African sample set areas
Global mean annual UVB levels and sample set locations are shown in Fig. 1, with the highest mean annual UVB levels found in AFR locations, followed by EAS and EUR sample set locations, as expected (82.2 vs. 48.1 vs. 18.4 Mw/m²/nm, respectively). Intergroup comparisons found significant differences between all geographic areas for annual UVB levels (p < 0.001).

Distribution of vitamin D production/transport-related variants (NADSYN1/DHCR7 and GC) across European, East Asian and Sub-Saharan African groups
Sixteen variants in genes involved in vitamin D production (NADSYN1/DHCR7) and transport (GC) were examined, eight within the NADSYN1/DHCR7 loci and eight within GC (Table 2).
The frequencies of all NADSYN1/DHCR7 variants varied by geographic group (p < 0.0001, r² 0.59-0.87). Patterns of distribution varied by SNP (Table 2). For NADSYN1/DHCR7 variants rs11603330, rs7944926 and rs3794060, allelic frequency differed between all geographic groups, with their distribution coinciding with changes in environmental UVB. rs7944926 increased in areas of increased environmental UVB (i.e. frequency highest in AFR, lowest in EUR), whilst rs11603330 and rs3794060 decreased with increased UVB levels (i.e. frequency lowest in AFR, highest in EUR).
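As a rough illustration of how a group effect and its effect size could be obtained from sample-set frequencies, the sketch below computes a one-way ANOVA F statistic and r² (the share of total variance explained by geographic group). The statistical test behind the reported p and r² values is not described in this section, so the choice of ANOVA, the function name and the input frequencies are all assumptions made for illustration only.

```python
def one_way_anova(groups):
    """Return (F statistic, r squared) for a dict mapping group -> list of values.

    Assumes at least two groups and more observations than groups.
    """
    all_values = [v for values in groups.values() for v in values]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Total sum of squares: spread of every observation around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = ss_total - ss_between
    r_squared = ss_between / ss_total
    f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    return f_stat, r_squared


# Hypothetical sample-set frequencies for one variant; NOT data from this study.
freqs = {
    "EUR": [0.80, 0.85, 0.90],
    "EAS": [0.45, 0.50, 0.55],
    "AFR": [0.15, 0.20, 0.25],
}
f_stat, r_squared = one_way_anova(freqs)
```

A p-value would follow from the F distribution with (k - 1, N - k) degrees of freedom. Because allele frequencies are bounded proportions, a non-parametric test or a transformation may be preferable in practice.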
The frequencies of four other NADSYN1/DHCR7 variants (rs3750997, rs1790325, rs7928249 and rs12800438) differed in EUR compared to EAS and AFR. rs3750997, rs7928249 and rs12800438 frequencies were increased in EAS and AFR compared to EUR, with the inverse relationship observed for rs1790325. Another NADSYN1/DHCR7 variant, rs12280295, was near absent in EUR and EAS (mean frequencies of 0.00), with higher frequency in AFR (0.23). Considering these distribution patterns together, there was no apparent trend for NADSYN1/DHCR7 polymorphisms to be higher in one geographic region over another.
The allelic frequency of all examined GC variants varied by geographic group (p < 0.0001, r² 0.64-0.94). The largest effect was observed for rs705117 (p < 0.0001, r² 0.94), with the frequency of this variant differing between all geographic regions, and decreasing in geographic areas of increasing UVB (EUR 0.84, EAS 0.50 and AFR 0.17). Interestingly, five other GC variants followed this distribution pattern (rs7041, rs222047, rs222016, rs222020 and rs843006). Another GC variant, rs4364228, had reduced frequencies in EUR (0.09) and EAS (0.12) compared to AFR (0.45), and a further variant, rs3737549, was shown to be absent in the EUR group (0.00), but increasingly present in EAS and AFR (Table 2). Considered together, frequencies of examined GC variants were the highest in either EUR or AFR groups, with high frequencies in EAS uncommon.

Distribution of variants in vitamin D metabolism genes (CYP11A1, CYP24A1, CYP27A1 and CYP2R1) across European, East Asian and Sub-Saharan African groups
Fourteen cytochrome P450 (CYP) variants fit the inclusion criteria (two in CYP11A1, five each in CYP24A1 and CYP27A1 and two in CYP2R1). Allelic frequency of all 14 variants varied by geographic group (p < 0.0001; Table 3). Two CYP11A1 variants varied in frequency by geographic group (rs11632698 and rs2073475; p < 0.0001, r² 0.86 and 0.88, respectively) but displayed different distribution patterns across geographic groups. The distribution of CYP11A1 rs2073475 coincided with increasing UVB (EUR 0.16, EAS 0.45 and AFR 0.58). CYP11A1 rs11632698 frequency significantly differed in EUR compared to EAS and AFR (mean frequency of 0.57 in EUR and 0.20 in EAS and AFR).
Two of the five examined CYP27A1 variants, rs691414 and rs692290, appeared to be fixed in EUR and EAS (mean allelic frequencies of 1.00). Conversely, frequencies were significantly reduced in AFR (rs691414, 0.78; rs692290, 0.60). These variants had the largest effect sizes of the examined CYP27A1 variants (p < 0.0001; rs691414, r² 0.89; rs692290, r² 0.96). The remaining examined CYP27A1 variants displayed differing patterns in allelic frequency. rs7568196 had low frequencies in EAS and AFR (0.02-0.22), with increased frequency in EUR (0.40). Frequencies of rs13013510 and rs4674338 were significantly different in all geographic groups, with the highest frequency for rs13013510 reported in AFR (0.65), and in EAS for rs4674338 (0.93). Interestingly, despite the differing distribution patterns observed for CYP27A1 variants, there was a trend for frequencies of these variants to be the highest in EUR and EAS over AFR.
The frequencies of CYP2R1 variants (rs16930625 and rs11023374) differed by geographic group (p < 0.0001; rs16930625, r² 0.41; rs11023374, r² 0.79), although there was no trend for CYP2R1 variants to be higher in one geographic region over others. rs16930625 had low frequencies in all groups (0.06-0.21), but was higher in AFR compared to EUR. rs11023374 had a lower frequency in EAS and AFR (0.01-0.11) compared to EUR (0.28).
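The distinction drawn in these results between variants that track UVB and variants with population-specific patterns can be sketched as a simple rule on the three group means. The rule below is an illustrative assumption rather than the study's method: the function name and the 0.05 tolerance are hypothetical, and the study's own classification rests on significance tests between groups. The frequencies are the group means quoted in the text, with the unlabelled 0.58 for rs2073475 read as the AFR value.

```python
def classify_pattern(eur, eas, afr, tol=0.05):
    """Label a variant by how its group means track UVB (EUR < EAS < AFR).

    tol is a hypothetical minimum difference between adjacent groups; it is
    not a threshold taken from the study.
    """
    if eas - eur > tol and afr - eas > tol:
        return "increases with UVB"
    if eur - eas > tol and eas - afr > tol:
        return "decreases with UVB"
    return "population-specific"


# Group mean frequencies (EUR, EAS, AFR) as quoted in the text above.
reported = {
    "GC rs705117": (0.84, 0.50, 0.17),
    "CYP11A1 rs2073475": (0.16, 0.45, 0.58),  # 0.58 assumed to be AFR
    "CYP11A1 rs11632698": (0.57, 0.20, 0.20),
}
labels = {name: classify_pattern(*f) for name, f in reported.items()}
```

Under this rule a variant such as rs11632698, where EUR differs but EAS and AFR are equal, falls into the population-specific class because its frequency does not change stepwise with UVB.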

Distribution of variants in genes relating to vitamin D activity (VDR, RXRα and RXRγ) across European, East Asian and Sub-Saharan African groups
Sixteen variants in vitamin D-related nuclear receptor genes were examined (five VDR, seven RXRα and four RXRγ; Table 4).
Six of the seven examined RXRα variants varied by the examined geographic groups (rs1805343, rs1805352, rs10881582, rs3118571, rs731516 and rs7040434; p < 0.0001; r² 0.95-0.99). Interestingly, these six RXRα variants followed the same distribution pattern, with differences in AFR when compared to EUR and EAS. For five variants (rs1805343, rs1805352, rs10881582, rs3118571 and rs731516), the allelic frequency was reduced in AFR compared to EAS and EUR. Notably, RXRα rs731516 was fixed in EUR and EAS (mean frequency of 1.00), with reduced frequency in AFR (0.59). rs7040434 was absent in EUR and EAS (0.00) but not AFR (0.53; r² 0.99).
There was no trend for examined VDR and RXRγ variants to be higher in specific geographic groups, although frequencies of examined RXRα variants appeared to be the highest in either EUR or EAS. However, genotypic

Discussion
This study demonstrates that variant frequency in multiple vitamin D-associated genes (VDR, RXRα, RXRγ, GC, CYP2R1, CYP27A1, CYP24A1, CYP11A1 and DHCR7/NADSYN1) varies by environmental UVB and ancestry. For many SNPs, frequency followed a trend to either decrease or increase in geographic regions of increasing environmental UVB. However, several SNPs displayed a population-specific pattern that cannot be explained by changes in UVB levels alone. This provides insights into the extent to which vitamin D regulation differs by cohort, and may have consequences for public health recommendations and disease outcomes.
The reported geographic patterns in the frequency of SNPs in CYP genes and RXRα are novel findings. Whilst such variants have been examined previously in differing cohorts, how their distribution differs by ancestry has not been highlighted. CYP2R1 and CYP27A1 enzymatically activate vitamin D, and formation of the excretory form is enzymatically regulated by CYP24A1. CYP11A1 is highly expressed in the skin and represents an important alternative vitamin D metabolism pathway [3,22]. As such, genetic variance in these pathways may influence vitamin D status and homeostasis.
Multiple RXRα variants displayed similar frequencies in EUR and EAS populations, potentially related to a broad reduction in UVB in Europe and East Asia compared to Sub-Saharan Africa. RXRs are the most common subunits forming heterodimers with VDR, but little is known about the influence of RXR variants on vitamin D activity [23]. Expression of the RXRα subtype is particularly high in skin, and therefore SNPs could be of functional relevance to UVB-induced vitamin D activity [24,25]. However, other UVB-related roles of retinoids and vitamin A derivatives in the skin should be considered, including involvement in circadian rhythm and photoprotection [26].
DHCR7/NADSYN1, VDR, RXRγ, CYP2R1, CYP24A1 and CYP11A1 variants did not display clear patterns of geographic distribution, likely reflecting diverse functional consequences. However, the majority of examined variants reside within introns or untranslated regions. Therefore, linkage disequilibrium of these variants with nearby functional variants needs to be considered.
It was hypothesised that selection of vitamin D-related SNPs would parallel geographic selection for skin pigmentation. The reported associations support this and indicate that vitamin D SNPs display population-specific patterns, with genetic differences observed between populations which did not reflect increases and/or decreases in ancestral UVB environments. These population-specific patterns could coincide with migration patterns, as in the case of variants underlying skin pigmentation [14,15], and support a link between vitamin D and the evolution of lighter skin, with further examination of this association warranted. Notably, evidence of positive selection for DHCR7/NADSYN1 variants has been reported; however, evidence of selection was not found for other examined vitamin D-related genes (CYP2R1 and GC), possibly due to selection taking place at an earlier time than examined, and/or in other vitamin D-associated genes, such as CYP27B1, CYP24A1 or VDR.
Many of the reported associations support previously reported frequency patterns in GC, VDR and DHCR7/NADSYN1 variants [6,7,27,28]. GC rs7041 is a genetic determinant of vitamin D status, with a negative association between frequency and latitude reported [28,29]. Here, similar latitudinal/UVB clines for several additional GC variants were observed. Of these, rs705117 and rs222020 have been linked to vitamin D status [30,31]. Latitudinal clines in VDR SNPs have been observed, although these associations were limited to the Africa-Europe axis [6][7][8]. Potential latitudinal clines exist for several VDR variants examined here along this axis, but not when considering the East Asian populations. Several examined DHCR7/NADSYN1 variants (rs12800438, rs7944926, rs3794060, rs12280295) are part of a large haplotype block previously noted to have high frequency in Europeans and North East Asians [27]. Here, multiple additional variants in this locus were identified that differed in frequency between populations and may be functionally relevant.
Strengths of this study include the collation of numerous cohorts from three genetically distinct populations exposed to differing UVB regimes and the simultaneous examination of multiple vitamin D-associated variants. However, the analysis was limited by data availability. Furthermore, the inclusion of multiple cohorts from the same area (e.g. multiple Italian and Han cohorts) might have resulted in over-representation of sub-populations in derived geographic groups.
These data are interesting from a human evolution perspective but also have relevance for public health recommendations and understanding disease risk. Vitamin D insufficiency is more likely in darker-skinned individuals, attributed to diminished synthesis of the vitamin due to pigmentation [5,32,33]. However, variants displaying apparent interethnic differences in frequency may also contribute to population differences in vitamin D status, and therefore current global and national dietary recommendations for this vitamin may not meet the needs of all populations equally. Further, numerous SNPs in vitamin D pathways have been identified as risk factors for multiple adverse health conditions [1,4]. Given that variant frequency appears to vary by ancestry, disease risk factors could be population specific. A further possibility is that risks conferred by vitamin D SNPs may change depending on environmental factors, such as UVB exposure; these concepts require further examination.

Conclusions
This study reports population differences for gene variants within multiple vitamin D-related loci that have not been explored previously. A key finding was that the frequencies of many of these vitamin D variants are population-specific and do not reflect changes in ancestral UVB environments. These population differences provide insight into the extent to which vitamin D metabolism and activity may vary between populations of different ancestry via genetic variance in numerous vitamin D-related genes. Given that multiple SNPs within the examined loci have been identified as disease risk factors, further examination of the identified gene variants displaying interethnic differences in frequency, and of their potential relevance to disease outcomes, is warranted.