
Mining Complex Data

Networks

Applications 
        Social networks
            FOAF ("friend of a friend"), e.g. Facebook
            Organization networks
            Socioeconomic networks
        Content networks
            Web
            Bibliographic networks
        Transportation networks
            Telecommunication networks, e.g. Internet
            Distribution networks, e.g. power grid
        Biological networks
            Ecological networks
            Genomic networks: gene-protein interactions
            Proteomic networks: protein-protein interactions
            Biological pathways
        Software networks

Network Properties 
    Scale
    Evolution
    Distribution
    Interaction through connections

Network Structure
    Connectivity
    Diameter ("small worlds")
    Degree distribution
    Clustering coefficient
+ Affiliation networks
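
The structural measures listed above (degree distribution, clustering coefficient, diameter) can be sketched on a toy graph. This is a minimal illustration, assuming an undirected, connected graph stored as a dictionary of neighbor sets; the function names and the `toy` graph are made up for this example.

```python
from collections import Counter, deque

def degree_distribution(adj):
    """Map each degree k to the number of nodes that have degree k."""
    return Counter(len(neighbors) for neighbors in adj.values())

def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbors of `node` that are linked to each other."""
    neighbors = sorted(adj[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if neighbors[j] in adj[neighbors[i]])
    return 2.0 * links / (k * (k - 1))

def eccentricity(adj, source):
    """Largest shortest-path distance from `source`, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def diameter(adj):
    """Longest shortest path in the graph (assumes the graph is connected)."""
    return max(eccentricity(adj, node) for node in adj)

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(dict(degree_distribution(toy)))
print(clustering_coefficient(toy, "c"))
print(diameter(toy))
```

Running one breadth-first search per node is fine for small graphs; large networks usually call for sampling or approximation instead.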

Network Models
    Random-Graph Models
        Erdős-Rényi
    Growing Random Models 
        Watts-Strogatz (small-world networks)
        Barabási-Albert (scale-free networks)
    Hierarchical/Modular Networks
    Strategic Network Formation 
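
Two of the models listed above can be sketched as edge-list generators. This is a simplified illustration, not a reference implementation: the first function draws every possible edge independently with probability `p` (the G(n, p) variant of the Erdős-Rényi model), and the second grows a graph by preferential attachment in the spirit of the Barabási-Albert model, starting from a small clique (an assumption of this sketch). Function and parameter names are illustrative.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, rng):
    """G(n, p): include each of the n*(n-1)/2 possible edges with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

def barabasi_albert(n, m, rng):
    """Grow a graph to n nodes; each new node links to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    # Seed graph (an assumption of this sketch): a clique on m + 1 nodes.
    edges = list(combinations(range(m + 1), 2))
    # Every node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        for old in sorted(chosen):
            edges.append((old, new))
            endpoints.extend((old, new))
    return edges

rng = random.Random(42)
print(len(erdos_renyi(100, 0.05, rng)))
print(len(barabasi_albert(100, 2, rng)))
```

Comparing the degree distributions of the two outputs is a quick way to see the difference between a random graph and a scale-free one.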

Network Dynamics
    Robustness/Vulnerability 
    Diffusion through Networks
        Epidemics
        Social contagion (ideas, news, fads...)
    Search on Networks
    Social Influence Models
    Networked Markets
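
Diffusion through a network (an epidemic, or an idea spreading by social contagion) is often illustrated with a susceptible-infected-recovered (SIR) process. The sketch below is one simple discrete-time variant, chosen here for illustration: at each step an infected node infects each susceptible neighbor with probability `beta` and then recovers with probability `gamma`. All names and parameters are assumptions of this example.

```python
import random

def sir_step(adj, state, beta, gamma, rng):
    """Advance the epidemic by one synchronous time step."""
    new_state = dict(state)
    for node, status in state.items():
        if status != "I":
            continue
        for neighbor in adj[node]:
            if state[neighbor] == "S" and rng.random() < beta:
                new_state[neighbor] = "I"
        if rng.random() < gamma:
            new_state[node] = "R"
    return new_state

def simulate_sir(adj, seeds, beta, gamma, rng, max_steps=1000):
    """Run until no node is infected (or max_steps); return the list of states."""
    state = {node: ("I" if node in seeds else "S") for node in adj}
    history = [state]
    for _ in range(max_steps):
        if "I" not in state.values():
            break
        state = sir_step(adj, state, beta, gamma, rng)
        history.append(state)
    return history

# Hypothetical toy network: a path 0 - 1 - 2 - 3, with node 0 initially infected.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
history = simulate_sir(path, {0}, beta=0.5, gamma=0.5, rng=random.Random(1))
print(len(history) - 1, "steps;", history[-1])
```

Removing high-degree nodes from `adj` before running the simulation gives a rough feel for how network structure affects robustness and spread.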



Graph Mining

Graph patterns
    Graph matching
        Canonicalization
    Frequent subgraphs
        Algorithms
            Beam search: SUBDUE
            Inductive logic programming: WARMR
            Apriori-like: AGM, FSG, "disjoint paths"
            FPGrowth-like: gSpan, Gaston, CloseGraph
        Constraints...
    Applications
        Indexing, e.g. GraphGrep, Grace, gIndex
        Information retrieval, e.g. Grafil
        Metadata mining: Schema mapping, schema discovery, schema reformulation 
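
Canonicalization and support counting, two building blocks of the pattern-mining methods above, can be sketched on tiny labeled graphs. This is a brute-force illustration and not how gSpan, FSG or the other cited systems actually work: the canonical form tries every node ordering (feasible only for very small graphs), and the support count covers only single-edge patterns, which would be the first level of an Apriori-like search. The graph encoding (a list of node labels plus a list of undirected edges) is an assumption of this example.

```python
from collections import Counter
from itertools import permutations

def canonical_form(labels, edges):
    """Smallest (labels, adjacency) code over all node orderings.
    Two simple undirected labeled graphs are isomorphic iff their codes match."""
    n = len(labels)
    edge_set = {frozenset(edge) for edge in edges}
    best = None
    for perm in permutations(range(n)):
        label_part = tuple(labels[v] for v in perm)
        adj_part = tuple(
            1 if frozenset((perm[i], perm[j])) in edge_set else 0
            for i in range(n) for j in range(i + 1, n)
        )
        code = (label_part, adj_part)
        if best is None or code < best:
            best = code
    return best

def frequent_edges(database, min_support):
    """Single-edge patterns (pairs of end labels) that occur in at least
    min_support graphs of the database."""
    support = Counter()
    for labels, edges in database:
        # A set, so each pattern counts once per graph.
        patterns = {tuple(sorted((labels[u], labels[v]))) for u, v in edges}
        support.update(patterns)
    return {p: s for p, s in support.items() if s >= min_support}

g1 = (["C", "C", "O"], [(0, 1), (1, 2)])  # C-C-O
g2 = (["O", "C", "C"], [(0, 1), (1, 2)])  # O-C-C: the same graph, renumbered
g3 = (["C", "O", "C"], [(0, 1), (1, 2)])  # C-O-C: a different graph

print(canonical_form(*g1) == canonical_form(*g2))
print(canonical_form(*g1) == canonical_form(*g3))
print(frequent_edges([g1, g2, g3], min_support=3))
```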

Ranking
    PageRank (Brin & Page, 1998; Google)
    HITS (Kleinberg, 1998)
    Block-level link analysis (Cai, 2004)
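
The two classic link-analysis rankings can be sketched with plain power iteration. This is a minimal illustration, assuming a directed graph stored as a dictionary that maps every node to the list of nodes it links to (every link target must also be a key). The damping factor 0.85 is the commonly quoted default, and rank held by nodes without out-links is spread uniformly; both choices, like all names here, are assumptions of this sketch.

```python
from math import sqrt

def pagerank(links, damping=0.85, iterations=100):
    """Power iteration for PageRank; the returned scores sum to 1."""
    nodes = list(links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is redistributed uniformly.
        dangling = sum(rank[u] for u in nodes if not links[u])
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            for v in links[u]:
                new_rank[v] += damping * rank[u] / len(links[u])
        rank = new_rank
    return rank

def normalize(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = sqrt(sum(s * s for s in scores.values())) or 1.0
    return {u: s / norm for u, s in scores.items()}

def hits(links, iterations=50):
    """HITS: authorities are pointed to by good hubs, hubs point to good authorities."""
    hub = {u: 1.0 for u in links}
    auth = {u: 1.0 for u in links}
    for _ in range(iterations):
        auth = {u: 0.0 for u in links}
        for u in links:
            for v in links[u]:
                auth[v] += hub[u]
        auth = normalize(auth)
        hub = normalize({u: sum(auth[v] for v in links[u]) for u in links})
    return hub, auth

# Hypothetical toy web: pages a, b and d all link to c; c links back to a.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(web)
hub, auth = hits(web)
print(max(ranks, key=ranks.get), max(auth, key=auth.get), max(hub, key=hub.get))
```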

Community detection (clustering)
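
Many community detection methods score a candidate partition by its modularity, i.e. how many more edges fall inside communities than would be expected at random given the node degrees. Modularity is not named in the outline above; it is brought in here as one common quality measure, and the sketch below only evaluates a given partition rather than searching for the best one. The edge-list encoding and all names are assumptions of this example.

```python
def modularity(edges, community):
    """Q = sum over communities c of [ L_c / m - (d_c / (2 m))^2 ], where
    m is the number of edges, L_c the number of edges inside c, and
    d_c the total degree of the nodes in c (undirected, unweighted graph)."""
    m = len(edges)
    internal = {}
    degree_sum = {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2.0 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_groups = {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
one_group = {node: "all" for node in range(6)}

print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

The split into two triangles scores higher than lumping every node together, which matches the intuition that the bridge edge separates two communities.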

Classification
    Object classification (e.g. multirelational classification)
    Link prediction (predict whether a link exists between two entities, based on attributes and other observed links)
        Link cardinality estimation: Predicting the number of links to an object. 
    Entity resolution (a.k.a. deduplication, reference reconciliation, co-reference resolution, object consolidation)
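
Link prediction from observed links alone can be sketched with a neighborhood-overlap score. The example below ranks unlinked node pairs by the Jaccard coefficient of their neighbor sets, one of several simple topological scores; it ignores node attributes, which a fuller predictor would also use. The graph and the function name are assumptions of this example.

```python
from itertools import combinations

def jaccard_scores(adj):
    """Score every unlinked pair (u, v) by |N(u) & N(v)| / |N(u) | N(v)|."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        union = adj[u] | adj[v]
        scores[(u, v)] = len(adj[u] & adj[v]) / len(union) if union else 0.0
    return scores

# Hypothetical toy network, stored as a dictionary of neighbor sets.
friends = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

scores = jaccard_scores(friends)
for pair in sorted(scores, key=scores.get, reverse=True):
    print(pair, round(scores[pair], 3))
```

The highest-scoring pair is the one sharing the largest fraction of neighbors, so it is the most plausible missing link under this heuristic.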



Bibliography

- Diane J. Cook & Lawrence B. Holder (editors): Mining Graph Data. Wiley, 2007. ISBN 0-471-73190-0
- Jiawei Han & Micheline Kamber: Data Mining: Concepts and Techniques [2nd edition], Chapter 9. Morgan Kaufmann, 2006. ISBN 1-55860-901-3
- Pang-Ning Tan, Michael Steinbach & Vipin Kumar: Introduction to Data Mining, Section 7.5. Addison-Wesley, 2006. ISBN 0-321-32136-7



Fernando Berzal,
Apr 20, 2011, 4:20 AM