Identification of Transcription Factor Binding Sites
Using Local Markov Models (submitted)


Haiyan Huang*  Ming-Chih Kao*  Xianghong Zhou
Jun S Liu  Wing H. Wong


Transcription factor binding sites (TFBSs), often short and degenerate, are computationally difficult to identify without being overwhelmed by false-positive calls. Because the structure of the genomic sequence is heterogeneous, we develop a Local Markov Model (LMM) to assign probabilistic significance to each TFBS candidate with respect to its local sequence context. We show that the p-value for a TFBS candidate under the LMM can be computed exactly and efficiently. We apply the LMM to large-scale human binding site sequences in situ and find that, compared to currently popular methods, LMM analysis can reduce false-positive errors by more than 50% without compromising sensitivity.
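
To make the idea of scoring a candidate "with respect to its local sequence context" concrete, the following is a minimal sketch and not the authors' software or their exact p-value computation. It assumes a simple order-k Markov background estimated only from the flanks around a candidate site, and reports the log-probability of the candidate under that local background. The function names, the flank size, and the pseudocount are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def fit_local_markov(segments, order=1, pseudocount=1.0):
    """Estimate order-k transition probabilities from local context segments.

    Hypothetical helper: counts are pooled over the segments, and no
    transition is counted across a segment boundary.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for segment in segments:
        for i in range(order, len(segment)):
            counts[segment[i - order:i]][segment[i]] += 1.0

    def prob(prefix, base):
        # Additive smoothing so unseen transitions keep nonzero probability.
        row = counts.get(prefix, {})
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        return (row.get(base, 0.0) + pseudocount) / total

    return prob


def candidate_log_prob(seq, start, length, flank=50, order=1):
    """Log-probability of seq[start:start+length] under a background
    Markov model fit to the flanks on either side of the candidate."""
    left = seq[max(0, start - flank):start]
    right = seq[start + length:start + length + flank]
    prob = fit_local_markov([left, right], order=order)

    total = 0.0
    for i in range(start, start + length):
        if i < order:
            # Not enough preceding bases for a full prefix: assume uniform.
            total += math.log(1.0 / len(ALPHABET))
        else:
            total += math.log(prob(seq[i - order:i], seq[i]))
    return total


# Toy sequence: an AT-rich region with a GC-rich insert in the middle.
seq = "AT" * 30 + "GGGGGG" + "AT" * 30
log_p_insert = candidate_log_prob(seq, start=60, length=6)
log_p_typical = candidate_log_prob(seq, start=20, length=6)
print(log_p_insert, log_p_typical)
```

In this toy sequence the `GGGGGG` insert receives a much lower log-probability than an `ATATAT` stretch, because the background is estimated from AT-rich flanks. A global background fit to a whole genome would not capture that contrast, which is the motivation the abstract gives for a local model. Turning such a score into an exact p-value is the part addressed by the paper and is not attempted here.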


Online Supplement

  1. The derivation of the probability generating function for the 3rd-order Markov case.


Software


To request the software for the Local Markov Model, please enter your email below:

What is your name?

Where are you from?

E-mail address?