Algorithm Selection Guide
This guide helps you choose the right algorithms for your multilayer network analysis tasks.
Community Detection
Choose based on your network characteristics and requirements:
Louvain Algorithm
Best for:
Large networks (10,000+ nodes)
Fast approximate results
Non-overlapping communities
Modularity optimization
Complexity: \(O(n \\log n)\) where \(n\) is the number of nodes
Usage:
from py3plex.algorithms.community_detection import community_louvain
communities = community_louvain.best_partition(network.core_network)
Pros:
Very fast on large networks
Well-established and widely used
BSD license (commercial-friendly)
Cons:
Non-deterministic (random initialization)
Cannot find overlapping communities
Resolution limit issues
Infomap Algorithm
Best for:
Flow-based community structure
Overlapping communities
Hierarchical community detection
Information-theoretic optimization
Complexity: \(O(m \\log n)\) where \(m\) is edges, \(n\) is nodes
Usage:
from py3plex.algorithms.community_detection import community_wrapper
communities = community_wrapper.infomap_communities(
network.core_network,
binary_path="/path/to/infomap"
)
Pros:
Can detect overlapping communities
Flow-based approach (natural for many applications)
Hierarchical structure
Cons:
Requires external binary or AGPLv3 code
Slower than Louvain
AGPLv3 license (viral copyleft)
Label Propagation
Best for:
Semi-supervised community detection
Known seed communities
Very large sparse networks
Linear-time approximate results
Complexity: \(O(m)\) linear in number of edges
Usage:
from py3plex.algorithms.community_detection import label_propagation
communities = label_propagation.propagate(
network.core_network,
seed_labels={'node1': 0, 'node2': 1}
)
Pros:
Very fast (linear time)
Can incorporate prior knowledge
MIT license
Cons:
Non-deterministic
Sensitive to initialization
May not converge
Multilayer Modularity
Best for:
True multilayer community detection
Cross-layer community structure
Layer coupling analysis
Research applications
Complexity: \(O(n^2)\) for supra-adjacency construction + Louvain complexity
Usage:
from py3plex.algorithms.community_detection import multilayer_modularity as mlm
supra_adj = network.get_supra_adjacency_matrix()
communities = mlm.multilayer_louvain(supra_adj)
Pros:
Accounts for multilayer structure
Implements Mucha et al. (2010) algorithm
Configurable inter-layer coupling
Cons:
More computationally expensive
Requires careful parameter tuning
May not scale to very large networks
Centrality Measures
Degree Centrality
Best for:
Quick importance ranking
Local connectivity measure
Well-connected node identification
Complexity: \(O(n + m)\)
Variants:
Layer-specific degree
Supra-adjacency degree
Overlapping degree (sum across layers)
Usage:
from py3plex.algorithms.multilayer_algorithms.centrality import MultilayerCentrality
calc = MultilayerCentrality(network)
degrees = calc.overlapping_degree_centrality()
Betweenness Centrality
Best for:
Bridge node identification
Information flow bottlenecks
Critical path analysis
Complexity: \(O(nm)\) for unweighted, \(O(nm + n^2 \\log n)\) for weighted
Usage:
calc = MultilayerCentrality(network)
betweenness = calc.multilayer_betweenness_centrality()
Warning: Computationally expensive for large networks (>1000 nodes)
Closeness Centrality
Best for:
Broadcasting efficiency
Average distance to other nodes
Central position in network
Complexity: \(O(nm)\)
Usage:
closeness = calc.multilayer_closeness_centrality()
Eigenvector Centrality
Best for:
Influence measure (connections to important nodes)
Recursive importance
Prestige ranking
Complexity: \(O(m)\) per iteration, usually converges quickly
Usage:
eigenvector = calc.multiplex_eigenvector_centrality()
PageRank
Best for:
Directed networks
Web-like structures
Random walk centrality
Complexity: \(O(m)\) per iteration
Usage:
pagerank = calc.pagerank_centrality(alpha=0.85)
Versatility Centrality
Best for:
Multilayer-specific importance
Cross-layer influence
Multi-context analysis
Complexity: \(O(m \\times L)\) where \(L\) is number of layers
Usage:
from py3plex.algorithms.statistics import multilayer_statistics as mls
versatility = mls.versatility_centrality(network, centrality_type='degree')
Network Statistics
Basic Statistics
Fast operations (suitable for large networks):
Node count: \(O(n)\)
Edge count: \(O(m)\)
Degree distribution: \(O(n + m)\)
Density: \(O(1)\) if counts are known
network.basic_stats()
layers = network.get_layers()
num_nodes = len(network.get_nodes())
Multilayer Statistics
Layer density - \(O(m_\\alpha)\) per layer:
from py3plex.algorithms.statistics import multilayer_statistics as mls
density = mls.layer_density(network, 'layer1')
Inter-layer correlation - \(O(n)\):
correlation = mls.inter_layer_degree_correlation(network, 'layer1', 'layer2')
Edge overlap - \(O(m)\):
overlap = mls.edge_overlap(network, 'layer1', 'layer2')
Versatility centrality - \(O(m \\times L)\):
versatility = mls.versatility_centrality(network, centrality_type='betweenness')
Path-based Measures
Warning: These are computationally expensive for large networks
Diameter - \(O(nm)\) or \(O(n^2 \\log n)\)
Average shortest path - \(O(nm)\) or \(O(n^2 \\log n)\)
Betweenness centrality - \(O(nm)\) or \(O(nm + n^2 \\log n)\)
Recommendation: Limit to networks with <5,000 nodes for interactive analysis
Node Embeddings
Node2Vec
Best for:
Feature learning for downstream tasks
Link prediction
Node classification
Similarity computation
Complexity: \(O(r \\times l \\times |V|)\) where \(r\) is walks per node, \(l\) is walk length
Parameters:
dimensions
: Embedding dimensions (default: 128)walk_length
: Length of random walks (default: 80)num_walks
: Number of walks per node (default: 10)p
: Return parameter (default: 1.0)q
: In-out parameter (default: 1.0)
Usage:
from py3plex.wrappers import node2vec_embedding
embeddings = node2vec_embedding.generate_embeddings(
network.core_network,
dimensions=128,
walk_length=80,
num_walks=10,
p=1.0,
q=1.0
)
Parameter tuning:
Low
p
(<1): Encourages backtracking (local exploration)High
p
(>1): Discourages backtrackingLow
q
(<1): Encourages outward exploration (BFS-like)High
q
(>1): Encourages staying local (DFS-like)
DeepWalk
Best for:
Same as Node2Vec but simpler (uniform random walks)
Faster than Node2Vec
Complexity: \(O(r \\times l \\times |V|)\)
Usage:
# DeepWalk is Node2Vec with p=1, q=1
embeddings = node2vec_embedding.generate_embeddings(
network.core_network,
p=1.0,
q=1.0
)
Visualization
Diagonal Projection Plot
Best for:
Large multilayer networks (>1000 nodes)
Clear layer separation
Publication-ready figures
Scalability: Handles 10,000+ nodes
Usage:
from py3plex.visualization.multilayer import draw_multilayer_default
draw_multilayer_default([network], display=True, layout_style='diagonal')
Force-Directed Layout
Best for:
Small to medium networks (<1000 nodes)
Interactive exploration
Natural-looking layouts
Scalability: Practical up to ~5,000 nodes
Usage:
from py3plex.visualization.multilayer import hairball_plot
hairball_plot(network.core_network, layout_algorithm='force')
Matrix Visualization
Best for:
Pattern recognition
Very large networks
Quantitative analysis
Scalability: Handles 100,000+ nodes (limited by memory)
Usage:
import matplotlib.pyplot as plt
adj = network.get_supra_adjacency_matrix()
plt.imshow(adj.toarray() if hasattr(adj, 'toarray') else adj, cmap='viridis')
plt.colorbar()
plt.show()
Performance Guidelines
Network Size Categories
- Small (< 1,000 nodes):
All algorithms are fast
Can use expensive path-based measures
Interactive visualization works well
- Medium (1,000 - 10,000 nodes):
Use Louvain over Infomap
Avoid full betweenness centrality
Use diagonal projection for visualization
- Large (10,000 - 100,000 nodes):
Use degree-based measures
Louvain or label propagation only
Matrix visualizations or sampling
- Very Large (> 100,000 nodes):
Degree, PageRank only
Label propagation for communities
Sparse matrix operations essential
Memory Considerations
Supra-adjacency matrix:
Sparse representation: \(O(m)\) memory
Dense representation: \(O(n^2)\) memory (avoid for large networks)
Always use sparse matrices for networks with >1,000 nodes
# Ensure sparse representation
supra_adj = network.get_supra_adjacency_matrix(sparse=True)
Algorithm Comparison Table
Algorithm |
Complexity |
Scalability |
License |
Best Use Case |
---|---|---|---|---|
Louvain |
O(n log n) |
Large |
BSD |
Fast community detection |
Infomap |
O(m log n) |
Medium |
AGPLv3 |
Flow-based communities |
Label Prop |
O(m) |
Very Large |
MIT |
Semi-supervised, very fast |
Degree Centrality |
O(n+m) |
Very Large |
MIT |
Quick importance |
Betweenness |
O(nm) |
Small |
MIT |
Critical paths |
PageRank |
O(m) per iter |
Large |
MIT |
Directed networks |
Node2Vec |
O(r×l×n) |
Medium |
MIT |
Feature learning |
Citation Guidelines
When using specific algorithms in publications, cite the original papers:
- Louvain:
Blondel, V. D., et al. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics.
- Infomap:
Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. PNAS, 105(4), 1118-1123.
- Node2Vec:
Grover, A., & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. KDD, 855-864.
- Multilayer Modularity:
Mucha, P. J., et al. (2010). Community structure in time-dependent, multiscale, and multiplex networks. Science, 328(5980), 876-878.
For complete references with DOIs, see the ALGORITHM_CITATIONS.md file in the repository.
Next Steps
Community Detection Tutorial - Community detection tutorial
Multilayer Network Centrality Measures - Centrality measures tutorial
Performance and Scalability - Performance optimization guide
See ALGORITHM_CITATIONS.md for full citation list