Inferring graph grammars by detecting overlap in frequent subgraphs

Jacek P. Kukluk; Lawrence B. Holder; Diane J. Cook

Inferring graph grammars by detecting overlap in frequent subgraphs

Jacek P. Kukluk; Lawrence B. Holder; Diane J. Cook

International Journal of Applied Mathematics and Computer Science (2008)

Volume: 18, Issue: 2, page 241-250
ISSN: 1641-876X

Access Full Article

top

Access to full text

Full (PDF)

Abstract

top

In this paper we study the inference of node and edge replacement graph grammars. We search for frequent subgraphs and then check for an overlap among the instances of the subgraphs in the input graph. If the subgraphs overlap by one node, we propose a node replacement graph grammar production. If the subgraphs overlap by two nodes or two nodes and an edge, we propose an edge replacement graph grammar production. We can also infer a hierarchy of productions by compressing portions of a graph described by a production and then inferring new productions on the compressed graph. We validate the approach in experiments where we generate graphs from known grammars and measure how well the approach infers the original grammar from the generated graph. We show graph grammars found in biological molecules, biological networks, and analyze learning curves of the algorithm.

How to cite

top

MLA
BibTeX
RIS

Jacek P. Kukluk, Lawrence B. Holder, and Diane J. Cook. "Inferring graph grammars by detecting overlap in frequent subgraphs." International Journal of Applied Mathematics and Computer Science 18.2 (2008): 241-250. <http://eudml.org/doc/207881>.

@article{JacekP2008,
abstract = {In this paper we study the inference of node and edge replacement graph grammars. We search for frequent subgraphs and then check for an overlap among the instances of the subgraphs in the input graph. If the subgraphs overlap by one node, we propose a node replacement graph grammar production. If the subgraphs overlap by two nodes or two nodes and an edge, we propose an edge replacement graph grammar production. We can also infer a hierarchy of productions by compressing portions of a graph described by a production and then inferring new productions on the compressed graph. We validate the approach in experiments where we generate graphs from known grammars and measure how well the approach infers the original grammar from the generated graph. We show graph grammars found in biological molecules, biological networks, and analyze learning curves of the algorithm.},
author = {Jacek P. Kukluk, Lawrence B. Holder, Diane J. Cook},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {grammar induction; graph grammars; graph mining; multi-relational data mining},
language = {eng},
number = {2},
pages = {241-250},
title = {Inferring graph grammars by detecting overlap in frequent subgraphs},
url = {http://eudml.org/doc/207881},
volume = {18},
year = {2008},
}

TY - JOUR
AU - Jacek P. Kukluk
AU - Lawrence B. Holder
AU - Diane J. Cook
TI - Inferring graph grammars by detecting overlap in frequent subgraphs
JO - International Journal of Applied Mathematics and Computer Science
PY - 2008
VL - 18
IS - 2
SP - 241
EP - 250
AB - In this paper we study the inference of node and edge replacement graph grammars. We search for frequent subgraphs and then check for an overlap among the instances of the subgraphs in the input graph. If the subgraphs overlap by one node, we propose a node replacement graph grammar production. If the subgraphs overlap by two nodes or two nodes and an edge, we propose an edge replacement graph grammar production. We can also infer a hierarchy of productions by compressing portions of a graph described by a production and then inferring new productions on the compressed graph. We validate the approach in experiments where we generate graphs from known grammars and measure how well the approach infers the original grammar from the generated graph. We show graph grammars found in biological molecules, biological networks, and analyze learning curves of the algorithm.
LA - eng
KW - grammar induction; graph grammars; graph mining; multi-relational data mining
UR - http://eudml.org/doc/207881
ER -

References

top

Ates K., Kukluk J., Holder L., Cook D. and Zhang K. (2006). Graph grammar induction on structural data for visual programming, Proceedings of the Conference Tools with Artificial Intelligence, Washington DC, USA, pp. 232-242.
Chomsky N. (1956). Three models of language, IRE Transactions on Information Theory 2(3): 113-24. Zbl0156.25401
Cook D. and Holder L. (1994). Substructure discovery using minimum description length and background knowledge, Journal of Artificial Intelligence Research 1: 231-255.
Cook D. and Holder L. (2000). Graph-based data mining, IEEE Intelligent Systems 15(2): 32-41.
Jeltsch E. and Kreowski H. (1990). Grammatical inference based on hyperedge replacement. Graph-Grammars, Lecture Notes in Computer Science 532: 461-474. Zbl0768.68082
Jonyer I., Holder L. and Cook C. (2002). Concept formation using graph grammars, Proceedings of the KDD Workshop on Multi-Relational Data Mining, Edmonton, Alberta, Canada, pp. 71-79.
Jonyer I., Holder L. and Cook D. (2004). MDL-based contextfree graph grammar induction and applications, International Journal of Artificial Intelligence Tools, 13(1): 65-79.
Kanehisa M., Goto S., Hattori M., Aoki-Kinoshita K.F., Itoh M., Kawashima S., Katayama T., Araki M. and Hirakawa M. (2006). From genomics to chemical genomics: New developments in KEGG, Nucleic Acids Res. 34: D354-357.
Kukluk J., Holder L. and Cook D. (2006). Inference of node replacement recursive graph grammars, Proceedings of the 6-th SIAM International Conference on Data Mining, Washington, USA, pp. 544-548.
Kukluk J., Hun You C., Holder L. and Cook D. (2007). Learning node replacement graph grammars in metabolic pathways, International Conference on Bioinformatics & Computational Biology, (BIOCOMP'07), Las Vegas, NV, USA, pp. 44-50.
Kuramochi M. and Karypis G. (2001). Frequent subgraph discovery, Proceedings of the IEEE 2001 International Conference on Data Mining (ICDM '01), San Jose, CA, USA, pp. 313-320.
Neidle S. (Ed.) (1999). Oxford Handbook of Nucleic Acid Structure, Oxford, Oxford University Press.
Nevill-Manning G. and Witten H. (1997). Identifying hierarchical structure in sequences: A linear-time algorithm, Journal of Artificial Intelligence Research, 7: 67-82. Zbl0894.68072
Phan A., Kuryavyi V., Ma J., Faure A., Andreola M. and Patel D. (2005). An interlocked dimeric parallel-stranded DNA quadruplex: A potent inhibitor of HIV-1 integrase, Proceedings of the National Academy of Sciences 102(3): 634-639.
Oates T., Doshi S. and Huang F. (2003). Estimating maximum likelihood parameters for stochastic context-free graph grammars, Lecture Notes in Artificial Intelligence 2835: 281-298. Zbl1263.68059
Rissanen J. (1989). Stochastic Complexity in Statistical Inquiry. World Scientific Company. Zbl0800.68508
Yan X. and Han J., gSpan (2002): Graph-based substructure pattern mining, Proceedings of the IEEE International Conference on Data Mining, Maebashi City, Japan, pp. 721-724.

Citations in EuDML Documents

top

Barbara Strug, Anna Paszyńska, Maciej Paszyński, Ewa Grabska, Using a graph grammar system in the finite element method

NotesEmbed ?

top

You must be logged in to post comments.

To embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.

Language to use for this widget.

Only the controls for the widget will be shown in your chosen language. Notes will be shown in their authored language.

Number of notes per page

Tells the widget how many notes to show per page. You can cycle through additional notes using the next and previous controls.

Note: Best practice suggests putting the JavaScript code just before the closing </body> tag.