Brief description of the dataset
Thousands of short DNA sequences were generated in silico, and the
melting temperature (Tm) of each was calculated by five different
approaches. Several comparative measures were then used to assess
the differences and similarities among the calculated melting
temperatures. The approaches were the basic melting temperature
calculation (denoted bas), the salt-adjusted melting temperature
calculation (denoted sal), and the nearest-neighbor thermodynamic
method with each of the three most commonly used parameter sets:
the Breslauer table (Breslauer et al., 1986), denoted here Th1
(nearest-neighbor thermodynamic 1); the SantaLucia table
(SantaLucia et al., 1996), denoted Th2 (nearest-neighbor
thermodynamic 2); and the Sugimoto table (Sugimoto et al., 1996),
denoted Th3 (nearest-neighbor thermodynamic 3).
The mathematical expressions used to calculate the melting
temperatures are described in detail here. Sequence length was
restricted to 16 to 30 nucleotides, the range most commonly used
for PCR primer design and for in situ synthesized oligonucleotide
microarrays. For each length, ten CG-content classes were defined,
ranging from 0 to 100%, thus covering the complete CG-content
range. Finally, 2,000 DNA sequences were randomly generated for
each combination of length and CG-content class. For each
sequence, the melting temperatures were calculated with the five
methods described above, and several comparisons were carried out
(see below).
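The sampling scheme described above can be sketched in Python as
follows. This is only an illustration, not the procedure actually
used to build the dataset: the function names, the seed, and in
particular the way each CG-content class is mapped to a single
target percentage (ten evenly spaced values from 0 to 100) are
assumptions made for the example.

```python
import random


def random_sequence(length, gc_count, rng):
    """Return a random DNA sequence with exactly gc_count G/C bases."""
    gc = [rng.choice("GC") for _ in range(gc_count)]
    at = [rng.choice("AT") for _ in range(length - gc_count)]
    bases = gc + at
    rng.shuffle(bases)  # randomize the positions of the bases
    return "".join(bases)


def generate_grid(lengths=range(16, 31), n_classes=10, per_cell=2000, seed=0):
    """Generate per_cell random sequences for every (length, class) pair.

    ASSUMPTION: class k is represented by one target CG percentage,
    100 * k / (n_classes - 1), rounded to a whole number of bases.
    The original class boundaries are not stated in the text.
    """
    rng = random.Random(seed)
    grid = {}
    for length in lengths:
        for k in range(n_classes):
            gc_count = round(length * k / (n_classes - 1))
            grid[(length, k)] = [
                random_sequence(length, gc_count, rng) for _ in range(per_cell)
            ]
    return grid
```

Keeping the CG count fixed within each cell matches the statement
that length and CG-content are identical at each grid point; a
binned definition of the classes would need a different mapping.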
|
Comparative measures
Several measures of similarity between the Tm values reported by
any two methods were used in this work. All comparisons were made
within a fixed oligonucleotide length and CG-content class, each
involving 2,000 oligonucleotide sequences. The calculated measures
are the maximal observed absolute difference (MaxAD), the minimal
observed absolute difference (MinAD), the average absolute
difference (AveAD), the standard deviation of the absolute
differences (DevAD), the correlation coefficient (CC), and the
percentage of cases in which the absolute difference between the
calculated Tm values was less than or equal to 5 (Per5C) and 3
(Per3C) degrees Celsius. The correlation coefficient was
calculated only between thermodynamic methods, because the other
two methods have zero variance for oligonucleotides of fixed
length and CG-content: they depend only on these two variables,
which were identical at each grid point where the analysis was
carried out, so a correlation involving them is undefined.
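As an illustration, the measures listed above could be computed for
one pair of methods at one grid point roughly as follows. This is a
minimal sketch, not the code used in this work: the function name
compare_methods is hypothetical, the correlation is assumed to be
Pearson's, and the use of the sample (n - 1) standard deviation for
DevAD is also an assumption, since the text does not say which
estimator was used.

```python
import math
import statistics


def compare_methods(tm_a, tm_b):
    """Compare two equally long lists of Tm values (degrees Celsius)."""
    diffs = [abs(a - b) for a, b in zip(tm_a, tm_b)]
    n = len(diffs)
    measures = {
        "MaxAD": max(diffs),
        "MinAD": min(diffs),
        "AveAD": statistics.fmean(diffs),
        "DevAD": statistics.stdev(diffs),  # ASSUMPTION: sample std. dev.
        "Per5C": 100 * sum(d <= 5 for d in diffs) / n,
        "Per3C": 100 * sum(d <= 3 for d in diffs) / n,
    }
    # Pearson correlation; undefined when either method is constant,
    # as happens for bas and sal at a fixed length and CG-content.
    if len(set(tm_a)) == 1 or len(set(tm_b)) == 1:
        measures["CC"] = None
    else:
        mean_a = statistics.fmean(tm_a)
        mean_b = statistics.fmean(tm_b)
        cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(tm_a, tm_b))
        var_a = sum((a - mean_a) ** 2 for a in tm_a)
        var_b = sum((b - mean_b) ** 2 for b in tm_b)
        measures["CC"] = cov / math.sqrt(var_a * var_b)
    return measures
```

Returning None for CC when one input is constant mirrors the reason
given above for restricting the correlation to the thermodynamic
methods.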
|
|
Breslauer,K.J.,
Frank,R., Blöcker,H. and Marky,L.A. (1986) Predicting
DNA duplex stability from the base sequence. Proc.
Natl. Acad. Sci. USA, 83, 3746-3750.
SantaLucia,J.,Jr,
Allawi,H.T. and Seneviratne,P.A. (1996) Improved nearest-neighbor
parameters for predicting DNA duplex stability. Biochemistry,
35, 3555-3562.
Sugimoto,N.,
Nakano,S., Yoneyama,M. and Honda,K. (1996) Improved thermodynamic
parameters and helix initiation factor to predict stability
of DNA duplexes. Nucleic Acids Res., 24, 4501-4505.