Transcription factor discovery using support vector machines and heterogeneous data

José F. Barbe; Ahmed H. Tewfik; Arkady B. Khodursky

doi:10.1109/GENSIPS.2007.4365812

Synthetic Biology and Biotechnology Division

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this work we analyze the suitability of expression and sequence data for discovery of co-regulatory relationships using Support Vector Machines. In addition, we try to assess the possibility of improving such results by heterogeneous data fusion and by estimating a probability of a correct classification. As shown in other studies, we have found that transcription co-expression is a good estimator for genetic co-regulation. We also have found some evidence that operator site sequence motifs can be used to estimate coregulation, but the kernels used for feature extraction did not achieve classification rates comparable to expression data. Finally, the additional information provided by combining sequence and expression data can be exploited to estimate the probability of correct classification.

Original language	English (US)
Title of host publication	5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07
DOIs	https://doi.org/10.1109/GENSIPS.2007.4365812
State	Published - 2007
Event	5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07 - Tuusula, Finland. Duration: Jun 10 2007 → Jun 12 2007

Publication series

Name	GENSIPS'07 - 5th IEEE International Workshop on Genomic Signal Processing and Statistics

Other

Other	5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07
Country/Territory	Finland
City	Tuusula
Period	6/10/07 → 6/12/07

Cite this

Barbe, J. F., Tewfik, A. H., & Khodursky, A. B. (2007). Transcription factor discovery using support vector machines and heterogeneous data. In 5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07 Article 4365812 (GENSIPS'07 - 5th IEEE International Workshop on Genomic Signal Processing and Statistics). https://doi.org/10.1109/GENSIPS.2007.4365812

Transcription factor discovery using support vector machines and heterogeneous data. / Barbe, José F.; Tewfik, Ahmed H.; Khodursky, Arkady B.
5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07. 2007. 4365812 (GENSIPS'07 - 5th IEEE International Workshop on Genomic Signal Processing and Statistics).

Barbe, JF, Tewfik, AH & Khodursky, AB 2007, Transcription factor discovery using support vector machines and heterogeneous data. in 5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07., 4365812, GENSIPS'07 - 5th IEEE International Workshop on Genomic Signal Processing and Statistics, 5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07, Tuusula, Finland, 6/10/07. https://doi.org/10.1109/GENSIPS.2007.4365812

@inproceedings{96b7684e699d4fd096cd2fda20d28f5d,
  title = "Transcription factor discovery using support vector machines and heterogeneous data",
  abstract = "In this work we analyze the suitability of expression and sequence data for discovery of co-regulatory relationships using Support Vector Machines. In addition, we try to assess the possibility of improving such results by heterogeneous data fusion and by estimating a probability of a correct classification. As shown in other studies, we have found that transcription co-expression is a good estimator for genetic co-regulation. We also have found some evidence that operator site sequence motifs can be used to estimate coregulation, but the kernels used for feature extraction did not achieve classification rates comparable to expression data. Finally, the additional information provided by combining sequence and expression data can be exploited to estimate the probability of correct classification.",
  author = "Barbe, {Jos{\'e} F.} and Tewfik, {Ahmed H.} and Khodursky, {Arkady B.}",
  year = "2007",
  doi = "10.1109/GENSIPS.2007.4365812",
  language = "English (US)",
  isbn = "1424409993",
  series = "GENSIPS'07 - 5th IEEE International Workshop on Genomic Signal Processing and Statistics",
  booktitle = "5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07",
  note = "5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07 ; Conference date: 10-06-2007 Through 12-06-2007",
}

TY  - GEN
T1  - Transcription factor discovery using support vector machines and heterogeneous data
AU  - Barbe, José F.
AU  - Tewfik, Ahmed H.
AU  - Khodursky, Arkady B.
PY  - 2007
Y1  - 2007
N2  - In this work we analyze the suitability of expression and sequence data for discovery of co-regulatory relationships using Support Vector Machines. In addition, we try to assess the possibility of improving such results by heterogeneous data fusion and by estimating a probability of a correct classification. As shown in other studies, we have found that transcription co-expression is a good estimator for genetic co-regulation. We also have found some evidence that operator site sequence motifs can be used to estimate coregulation, but the kernels used for feature extraction did not achieve classification rates comparable to expression data. Finally, the additional information provided by combining sequence and expression data can be exploited to estimate the probability of correct classification.
AB  - In this work we analyze the suitability of expression and sequence data for discovery of co-regulatory relationships using Support Vector Machines. In addition, we try to assess the possibility of improving such results by heterogeneous data fusion and by estimating a probability of a correct classification. As shown in other studies, we have found that transcription co-expression is a good estimator for genetic co-regulation. We also have found some evidence that operator site sequence motifs can be used to estimate coregulation, but the kernels used for feature extraction did not achieve classification rates comparable to expression data. Finally, the additional information provided by combining sequence and expression data can be exploited to estimate the probability of correct classification.
UR  - http://www.scopus.com/inward/record.url?scp=47049110864&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=47049110864&partnerID=8YFLogxK
U2  - 10.1109/GENSIPS.2007.4365812
DO  - 10.1109/GENSIPS.2007.4365812
M3  - Conference contribution
AN  - SCOPUS:47049110864
SN  - 1424409993
SN  - 9781424409990
T3  - GENSIPS'07 - 5th IEEE International Workshop on Genomic Signal Processing and Statistics
BT  - 5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07
T2  - 5th IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS'07
Y2  - 10 June 2007 through 12 June 2007
ER  -
