Study subjects
The participants of this study were users of a direct-to-consumer (DTC) genetic testing service in Japan, “Health Data Lab,” provided by Yahoo! Japan Corporation (Tokyo, Japan) and Genequest Inc. (Tokyo, Japan). The participants were aged ≥ 18 years and were asked to complete internet-based questionnaires covering sociodemographic factors, lifestyle habits, and medical history at the time of enrollment. All participants gave written informed consent for the general use of their genotype and questionnaire data. After the participants were informed of this study’s purpose, additional agreement was obtained, with the opportunity to opt out. Of the 12,621 participants, 1 opted out and was excluded from this study. We obtained approval from the Ethics Committee of Genequest Inc. This study was conducted according to the principles expressed in the Declaration of Helsinki.
Fish intake frequency
The questionnaires included a question about fish intake frequency: “How frequently do you eat fish (raw fish, boiled fish, grilled fish, etc.)?” The response options comprised eight categories: “hardly eat,” “1 to 3 times per month,” “1 to 2 times per week,” “3 to 4 times per week,” “5 to 6 times per week,” “once per day,” “twice per day,” and “≥ 3 times per day.” We converted the categories into a continuous variable representing intake frequency per week: “hardly eat” was coded as 0.0, “1 to 3 times per month” as 0.46, “1 to 2 times per week” as 1.5, “3 to 4 times per week” as 3.5, “5 to 6 times per week” as 5.5, “once per day” as 7.0, “twice per day” as 14.0, and “≥ 3 times per day” as 21.0. Considering that a year has 365.25 days, the intake frequency for the answer “1 to 3 times per month” was calculated as \( \frac{1+3}{2}\times 12\div 365.25\times 7=0.46 \) times per week.
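The coding scheme above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study; the names `monthly_range_to_weekly`, `FISH_INTAKE_PER_WEEK`, and `code_fish_intake` are hypothetical, and only the “1 to 3 times per month” value is derived from the stated formula, while the other values are copied directly from the text.

```python
DAYS_PER_YEAR = 365.25  # year length stated in the text


def monthly_range_to_weekly(low: float, high: float) -> float:
    """Convert a 'low to high times per month' answer to times per week.

    Uses the midpoint of the range, scales it to a year (12 months),
    then to a single day, and finally to a 7-day week.
    """
    per_month = (low + high) / 2
    return per_month * 12 / DAYS_PER_YEAR * 7


# Category -> intake frequency per week (values as reported in the text).
FISH_INTAKE_PER_WEEK = {
    "hardly eat": 0.0,
    "1 to 3 times per month": round(monthly_range_to_weekly(1, 3), 2),
    "1 to 2 times per week": 1.5,
    "3 to 4 times per week": 3.5,
    "5 to 6 times per week": 5.5,
    "once per day": 7.0,
    "twice per day": 14.0,
    "≥ 3 times per day": 21.0,
}


def code_fish_intake(answer: str) -> float:
    """Return the weekly intake frequency for a questionnaire answer."""
    return FISH_INTAKE_PER_WEEK[answer]
```

For example, `code_fish_intake("3 to 4 times per week")` returns the midpoint value 3.5 listed in the text.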
Adjustment variables
In the present study, the association analysis was adjusted for age and sex. In some analyses, study region, alcohol consumption, and/or alcohol drinking frequency were additionally included as adjustment variables. Age, sex, and study region were obtained from the questionnaire. Alcohol drinking frequency was defined according to the question, “How frequently do you drink alcohol?” with seven categories: “hardly drink” (coded as 0.0), “less than 1 time per month” (coded as 0.12), “1 to 3 times per month” (coded as 0.46), “1 to 2 times per week” (coded as 1.5), “3 to 4 times per week” (coded as 3.5), “5 to 6 times per week” (coded as 5.5), and “every day” (coded as 7.0). Alcohol consumption was calculated by summing the amount of alcohol in grams per day obtained from beer, red wine, white wine, highballs/cocktails, rice wine, and distilled spirits.
DNA sampling, genotyping, and quality control
The Oragene DNA (OG-500) Collection Kit (DNA Genotek Inc., Ottawa, Ontario, Canada) was used for the collection, stabilization, and transportation of saliva samples. Genomic DNA was extracted from saliva by genotyping technology according to the manufacturer’s instructions. The participants were genotyped using two platforms: the HumanCore-12 + Custom BeadChip (Illumina Inc., San Diego, CA, USA) containing 302,072 markers and the HumanCore-24 + Custom BeadChip (Illumina) containing 309,725 markers. In this study, we used 296,675 markers included in both genotyping platforms.
After excluding subjects who did not live in Japan, 12,603 participants remained. According to a previous study [5], we divided our study participants into eight regional groups: “Hokkaido,” “Tohoku,” “Kanto-Koshinetsu,” “Tokai-Hokuriku,” “Kinki,” “Chugoku-Shikoku,” “Kyushu,” and “Okinawa.” We then applied the quality control and association analysis procedures described below for each regional group.
Single nucleotide polymorphism (SNP) markers with low call rates (< 0.95), low Hardy–Weinberg equilibrium exact test P values (< 1 × 10⁻⁶), or low minor allele frequencies (MAFs; < 0.01) were filtered out. Subjects whose sex information was inconsistent between genotype and questionnaire, who had a low call rate (< 0.95), or who had an estimated non-Japanese ancestry [5, 17] were excluded. In addition, we excluded one member of each pair of close relatives identified by the identity-by-descent method (PI_HAT > 0.1875), as in previous studies [8, 18, 19]. These quality control procedures were performed using PLINK [20, 21] (version 1.90b3.42) and Eigensoft [17] (version 6.1.3) software.
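For illustration only, the marker thresholds and the relatedness exclusion can be expressed as the following Python sketch. The study itself performed these steps with PLINK and Eigensoft; the `MarkerStats` class and both function names are hypothetical, and the rule for which member of a related pair to drop (here, the second one listed) is an assumption, since the text does not specify it. The sex-consistency and ancestry checks are not covered.

```python
from dataclasses import dataclass


@dataclass
class MarkerStats:
    call_rate: float  # fraction of samples with a non-missing genotype
    hwe_p: float      # Hardy-Weinberg equilibrium exact test P value
    maf: float        # minor allele frequency


def marker_passes_qc(m: MarkerStats,
                     min_call_rate: float = 0.95,
                     min_hwe_p: float = 1e-6,
                     min_maf: float = 0.01) -> bool:
    """Keep a marker only if it clears all three thresholds from the text."""
    return (m.call_rate >= min_call_rate
            and m.hwe_p >= min_hwe_p
            and m.maf >= min_maf)


def drop_one_per_related_pair(samples, pairs, max_pi_hat=0.1875):
    """Remove one member of each pair with PI_HAT above the cutoff.

    `pairs` holds (sample_a, sample_b, pi_hat) tuples. Assumption: the
    second member is dropped unless one member is already removed.
    """
    removed = set()
    for a, b, pi_hat in pairs:
        if pi_hat > max_pi_hat and a not in removed and b not in removed:
            removed.add(b)
    return [s for s in samples if s not in removed]
```

With samples `["s1", "s2", "s3"]` and pairs `[("s1", "s2", 0.5), ("s2", "s3", 0.05)]`, only the first pair exceeds the cutoff, so `s2` is dropped and `s1` and `s3` are retained.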
Genome-wide association and meta-analysis
For each regional group, the association between each variant and fish intake frequency was tested by a linear regression model with adjustment for age, sex, and population stratification (5 principal components). Inflation of test statistics due to confounding from population stratification was assessed by calculating the inflation factor (λ), which is defined as the median of the observed test statistic divided by the median of the expected test statistic [17]. An inflation factor near 1.0 (i.e., 0.95–1.05) indicates that confounding from population stratification has been well adjusted for. This genome-wide association analysis was performed using PLINK software.
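The inflation factor defined above can be computed as in the following minimal Python sketch. It assumes that the test statistic is a 1-degree-of-freedom chi-square statistic recovered from two-sided P values, which the text does not state explicitly; the function name `inflation_factor` is hypothetical, and the study obtained its results with PLINK rather than with this code.

```python
from statistics import NormalDist, median


def inflation_factor(p_values):
    """Genomic inflation factor (lambda) from two-sided P values.

    Each P value (0 < p <= 1) is converted to a 1-df chi-square
    statistic via the squared standard-normal quantile. Lambda is the
    median observed statistic divided by the median of the chi-square
    distribution with 1 degree of freedom (about 0.455).
    """
    nd = NormalDist()
    # Lower-tail quantile of p/2; squaring removes the sign.
    observed = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    # Median of a 1-df chi-square: squared 75th percentile of N(0, 1).
    expected_median = nd.inv_cdf(0.75) ** 2
    return median(observed) / expected_median
```

Under the null hypothesis, P values are uniformly distributed and the function returns a value close to 1.0; values above the 0.95–1.05 range mentioned in the text would suggest residual confounding.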
Summary statistics (beta coefficients and their standard errors) from the eight regional groups were meta-analyzed using a fixed-effect model with inverse-variance weighting in METAL software [22] (version 2011-03-25). Variants with a meta-analysis P value < 5 × 10⁻⁸ were considered genome-wide significant (GWS), and those with a P value < 1 × 10⁻⁵ were considered to have suggestive significance. Except for the 12q24 locus, variants achieving suggestive significance within 500 kb were grouped into a single locus, and for each locus, the lead variant was defined as the variant with the lowest P value in that locus. For the 12q24 locus, variants within 2000 kb were treated as a single locus because the locus shows long-range linkage disequilibrium (LD). We checked the LD (R²) in the Japanese population between the lead variant and the other variants using LDlink [6]. We confirmed that significant variants (P < 1 × 10⁻⁵) were in moderate or high LD (R² > 0.3) with the lead variant of each locus.
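As a worked illustration of fixed-effect, inverse-variance-weighted meta-analysis, the following Python sketch combines per-group beta coefficients and standard errors for one variant. It implements the standard textbook formulas and is not the METAL source code; the function name `fixed_effect_meta` and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def fixed_effect_meta(betas, standard_errors):
    """Combine per-group estimates with inverse-variance weights.

    Each group is weighted by 1 / SE^2. Returns the pooled beta, its
    standard error, and a two-sided P value from the normal
    approximation to the Z statistic.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = 2.0 * NormalDist().cdf(-abs(z))
    return pooled_beta, pooled_se, p_value


# Hypothetical example: two groups with equal precision.
beta, se, p = fixed_effect_meta([0.1, 0.3], [0.1, 0.1])
```

Because the two hypothetical groups have equal standard errors, the pooled beta is their simple average, and the pooled standard error is smaller than either input by a factor of √2.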