JIA-2018-09
YANG Hai-long et al. Journal of Integrative Agriculture 2018, 17(9): 1972–1978

multiple considerations (Pritchard et al. 2009; Porras-Hurtado et al. 2013). Another well-known tool is PLINK, which offers high computational performance and substantial analysis features (Purcell et al. 2007). Compared with the programs mentioned above, SNPhylo has its own specific advantages, including better accessibility for huge amounts of data and a highly automated data import process that needs no obscure data pre-handling (Table 1). However, the original SNPhylo manual did not detail how to import files containing non-numeric chromosomes, or how to plot a detailed phylogenetic tree from huge SNP data sets with MEGA and Adobe Illustrator. Therefore, the simple method developed in this study provides a more elaborate and better way to parse genotyping data, which can meet the increasing demand for analyzing and visualizing the huge SNP data sets generated by large-scale genotype sequencing.

Interestingly, apart from SNPhylo, the output files from Structure and PLINK can also be subjected to subsequent analysis (Table 1). For example, Structure output files can be plotted and presented with the Cluster Matching and Permutation Program (CLUMPP) (Jakobsson and Rosenberg 2007), distruct (Rosenberg 2004), Cluster Markov Packager Across K (CLUMPAK) (Kopelman et al. 2015), and Structure Harvester (Earl 2012). PLINK results can be displayed with gPLINK (zzz.bwh.harvard.edu/plink/gplink.shtml) and Haploview (Barrett et al. 2005; Barrett 2009). However, these methods still have their own limitations.

The first limitation is that the raw data file must be converted into the required format. For Structure, SNP data must be converted into the Structure input format with Genetic Analysis in Excel (GenAlEx) (Peakall and Smouse 2006), xmfa2struct (www.xavierdidelot.xtreemhost.com/clonalframe.htm), or Clustal X/W (Larkin et al. 2007).
The input format of PLINK is a PED/MAP file, which also requires other tools or PLINK's own commands, such as the recode option, to convert SNP data. By contrast, our first step used only Excel to substitute extra chromosome numbers for scaffold numbers, making maximum use of the SNP data generated from crop plants such as maize inbreds.

The second limitation is the complicated subsequent analysis, such as plotting. In Structure, output files such as indivq, popq, names, languages, and perm, as well as parameter settings such as the K value, the number of individuals (NUMINDS), and the number of pre-defined populations (NUMPOPS), are necessary for plotting population structure. PLINK can stratify population structure with its own mds-plot command, which is believed to be slightly worse than PCA at correcting for population structure in some genome-wide association analyses (Wang et al. 2009), and the results can then be plotted with R scripts. However,

Fig. 2 Final step: edit the output file with Adobe Illustrator. The red group is tropical_subtropical, the green group is stiff stalk (SS), the blue group is non-stiff stalk (NSS), the purple group is sweet corn, the orange group is popcorn, and the out-of-group is mixed. The close-up box is a magnified part of the stiff stalk (SS) group.
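The substitution of extra chromosome numbers for scaffold names, which this study carried out in Excel, could also be scripted. The following is only a minimal sketch under stated assumptions, not the procedure used in the study: it assumes a tab-delimited, VCF-like file whose first column holds the chromosome label, header lines that start with `#`, and 10 real chromosomes (as in maize), so that scaffolds are numbered from 11. The function name `renumber_chromosomes` and the parameter `first_extra` are hypothetical. Whether all scaffolds should share one extra number or each receive its own is not specified in the text; the sketch gives each distinct scaffold its own number.

```python
def renumber_chromosomes(lines, first_extra=11):
    """Replace non-numeric chromosome labels in column 1 with extra numbers.

    Assumptions (hypothetical, for illustration only): records are
    tab-delimited, the chromosome label is the first field, and header
    lines start with '#'.
    """
    mapping = {}    # original label -> assigned numeric label
    converted = []
    for raw in lines:
        line = raw.rstrip("\n")
        # Pass header and blank lines through unchanged.
        if not line.strip() or line.startswith("#"):
            converted.append(line)
            continue
        fields = line.split("\t")
        chrom = fields[0]
        if not chrom.isdigit():
            # Assign the next unused number the first time a label is seen.
            if chrom not in mapping:
                mapping[chrom] = str(first_extra + len(mapping))
            fields[0] = mapping[chrom]
        converted.append("\t".join(fields))
    return converted, mapping


# Made-up records, used only to show the behaviour of the sketch.
example = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "1\t1050\tsnp1\tA\tG",
    "scaffold_27\t88\tsnp2\tC\tT",
    "10\t4200\tsnp3\tG\tA",
    "scaffold_27\t310\tsnp4\tT\tC",
    "scaffold_112\t15\tsnp5\tA\tC",
]
converted, mapping = renumber_chromosomes(example)
for row in converted:
    print(row)
```

The returned `mapping` records which scaffold received which number, so the original labels can be restored after the tree has been built.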