Table 3 Results from the two-population model simulations

From: Detecting individual ancestry in the human genome

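| Variable | sNMF (R2) | Admixture (R2) | fastStructure (R2) |
|---|---|---|---|
| **Sampling depth, n1, n2** | | | |
| 8 | 99.92 | 100 | 39.56 |
| 10 | 99.83 | 100 | 34.03 |
| 20 | 99.87 | 100 | 100 |
| 40 | 99.81 | 100 | 100 |
| 100 | 99.74 | 100 | 100 |
| **Uneven sampling, n1** | | | |
| 8 | 98.94 | 99.45 | 98.59 |
| 10 | 99.43 | 99.78 | 99.32 |
| 20 | 99.61 | 100 | 92.21 |
| 40 | 99.67 | 100 | 100 |
| 100 | 99.74 | 100 | 100 |
| **Sequencing depth, nsnps** | | | |
| 10 | 3.13 | 0.65 | 18.51 |
| 50 | 66.56 | 75.54 | 74.42 |
| 100 | 85.33 | 92.95 | 91.89 |
| 500 | 96.78 | 99.87 | 99.93 |
| 1,000 | 98.62 | 99.99 | 100 |
| 5,000 | 99.74 | 100 | 100 |
| **Population size, theta** | | | |
| 1 | 99.73 | 100 | 100 |
| 2 | 99.74 | 100 | 100 |
| 5 | 99.74 | 100 | 100 |
| 10 | 99.72 | 100 | 100 |
| **Effective population size, N2** | | | |
| 100 | 99.98 | 100 | 100 |
| 2,500 | 99.94 | 100 | 100 |
| 7,500 | 99.82 | 100 | 100 |
| 10,000 | 99.74 | 100 | 100 |
| **Divergence time (Fst), T/(4 N1)** | | | |
| 0.000075 | 0.54 | 0.38 | 0.01 |
| 0.00025 | 0.24 | 0.03 | 0 |
| 0.00125 | 6.19 | 0.03 | 0.24 |
| 0.0025 | 69.36 | 95.28 | 0.53 |
| 0.0125 | 98.36 | 100 | 100 |
| 0.05 | 99.74 | 100 | 100 |
| **Constant migration rate, 4 Nm** | | | |
| 0.1 | 99.77 | 100 | 100 |
| 1 | 99.78 | 100 | 100 |
| 5 | 99.56 | 100 | 100 |
| 10 | 99.15 | 99.99 | 100 |
| 50 | 93.95 | 99.98 | 33.3 |
| 100 | 41.61 | 94.06 | 0.56 |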

  1. Using ms [75], we simulated two populations that split t generations ago and have evolved independently since. See Table 1 for default parameters. Each simulation comprises 1,000 independent regions of 2 kb, from which one SNP per region is sampled at random. Each parameter set was replicated ten times. For each algorithm, the ancestry proportions estimated in the different runs were aligned with CLUMPP [44] to the expected ancestry matrix, which encodes the true population labels. Starting from the default values, the demographic parameters were then varied one at a time to illustrate their impact on the estimates. We report the coefficient of determination, expressed as a percentage, which can be understood as the share of the true outcome that the estimates recover.
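
As an illustration of the scoring step described in the footnote, the sketch below aligns the columns of an estimated ancestry matrix to the true one and then computes a coefficient of determination in percent. It rests on two assumptions that are not taken from the source: the alignment is a brute-force search over column permutations that minimises squared error (a stand-in for CLUMPP, not its actual algorithm), and R2 is computed as the squared Pearson correlation between the flattened matrices. The function names `r_squared_percent` and `align_and_score` are hypothetical.

```python
from itertools import permutations


def r_squared_percent(truth, estimate):
    """Squared Pearson correlation of two equal-length sequences, in percent.

    Assumption: the source does not state which R2 formula was used.
    """
    n = len(truth)
    mean_t = sum(truth) / n
    mean_e = sum(estimate) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(truth, estimate))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    if var_t == 0 or var_e == 0:
        return 0.0  # correlation is undefined; report no agreement
    return 100.0 * cov * cov / (var_t * var_e)


def align_and_score(true_q, est_q):
    """Return (r2_percent, permutation) for the best column alignment.

    true_q and est_q are lists of rows, one row of K ancestry
    proportions per individual. Cluster labels are arbitrary, so the
    columns of est_q are permuted to minimise the squared error before
    scoring. This exhaustive search is only practical for small K.
    """
    k = len(true_q[0])
    best_perm, best_sse = None, float("inf")
    for perm in permutations(range(k)):
        sse = sum(
            (row_e[j] - row_t[i]) ** 2
            for row_t, row_e in zip(true_q, est_q)
            for i, j in enumerate(perm)
        )
        if sse < best_sse:
            best_perm, best_sse = perm, sse
    flat_true = [v for row in true_q for v in row]
    flat_est = [row[j] for row in est_q for j in best_perm]
    return r_squared_percent(flat_true, flat_est), best_perm


if __name__ == "__main__":
    # Toy data: two individuals per population, cluster labels swapped.
    true_q = [[1, 0], [1, 0], [0, 1], [0, 1]]
    est_q = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]]
    r2, perm = align_and_score(true_q, est_q)
    print(f"best column order: {perm}, R2 = {r2:.2f}%")
```

With only two clusters, the squared correlation is the same for either column order, which is why the alignment here is chosen by squared error rather than by the score itself.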