Community Detection
Finding groups that span multiple layers of interaction.
DSL Tip: Filter by Communities
After detecting communities, use DSL to analyze them:
from py3plex.core import multinet
from py3plex.dsl import Q, L
from py3plex.algorithms.community_detection import community_louvain
# Step 1: Detect communities
communities = community_louvain.best_partition(network.core_network)
# Step 2: Store community IDs as node attributes
for node, comm_id in communities.items():
    network.core_network.nodes[node]['community'] = comm_id
# Step 3: Find high-degree nodes in specific community
result = (
    Q.nodes()
    .where(community=0)  # Filter by community ID
    .compute("degree", "betweenness_centrality")
    .order_by("-degree")
    .limit(10)
    .execute(network)
)
# Step 4: Export for further analysis
df = result.to_pandas()
df.to_csv("community_hubs.csv", index=False)
# Example output:
# id degree betweenness_centrality
# (Alice, social) 3 0.667
# (Bob, social) 2 0.000
# (Charlie, social) 2 0.000
Combine traditional algorithms with DSL queries for powerful workflows!
Networks are rarely homogeneous. People cluster into social groups. Proteins form functional modules. Cities organize into regional hubs. Community detection finds these natural groupings—but in multilayer networks, the question is more subtle: do communities exist within layers, across layers, or both?
This chapter shows you how to detect communities in multilayer networks, how to tune algorithm parameters for your specific domain, and how to interpret results that may differ from single-layer community detection.
Overview
Community detection identifies groups of nodes that are more densely connected to each other than to the rest of the network. For multilayer networks, communities can span multiple layers, accounting for both intra-layer and inter-layer structure.
Key insight: A person who is moderately connected across multiple social platforms (layers) may be more central to a cross-platform community than someone who is highly connected on just one platform. Multilayer community detection captures this.
Supported Algorithms
py3plex provides several community detection algorithms:
Louvain — Fast modularity optimization (recommended for most use cases)
Infomap — Flow-based community detection (requires external binary)
Label Propagation — Semi-supervised approach with known seed communities
Multilayer Modularity — True multilayer community detection (Mucha et al. 2010)
Louvain Algorithm
Basic Usage
Louvain is fast and a good default for large networks:
from py3plex.core import multinet
from py3plex.algorithms.community_detection import community_louvain
# Create or load network
network = multinet.multi_layer_network()
network.load_network("data.graphml", input_type="graphml")
# Detect communities using Louvain
communities = community_louvain.best_partition(network.core_network)
# Print results
for node, community_id in communities.items():
    print(f"Node {node} -> Community {community_id}")
Parameters
# With custom resolution parameter
communities = community_louvain.best_partition(
    network.core_network,
    resolution=1.0  # Higher = more communities, Lower = fewer communities
)
Resolution parameter:
resolution=1.0 - Standard modularity
resolution>1.0 - More, smaller communities
resolution<1.0 - Fewer, larger communities
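To see why the resolution value changes the preferred partition, here is a minimal pure-Python sketch of modularity with a resolution term. It is an illustration only, not the code path used by `community_louvain`; the function name `modularity` and the toy graph are assumptions made for this example.

```python
from collections import defaultdict

def modularity(edges, partition, resolution=1.0):
    """Modularity with a resolution parameter for an undirected, unweighted graph.

    Q = sum over communities c of [ L_c / m - resolution * (d_c / (2m))**2 ]
    where L_c counts intra-community edges and d_c is the total degree in c.
    """
    m = len(edges)
    intra = defaultdict(int)   # L_c
    degree = defaultdict(int)  # d_c
    for u, v in edges:
        degree[partition[u]] += 1
        degree[partition[v]] += 1
        if partition[u] == partition[v]:
            intra[partition[u]] += 1
    return sum(intra[c] / m - resolution * (degree[c] / (2 * m)) ** 2 for c in degree)

# Toy graph: two triangles joined by a single bridge edge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # one community per triangle
merged = {node: 0 for node in range(6)}        # everything in one community

for gamma in (0.2, 1.0, 2.0):
    q_split = modularity(edges, split, gamma)
    q_merged = modularity(edges, merged, gamma)
    print(f"resolution={gamma}: split={q_split:.3f}, merged={q_merged:.3f}")
```

On this toy graph the merged partition scores higher at resolution 0.2, while the two-triangle split scores higher at 1.0 and 2.0, which matches the rule of thumb above.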
Advantages
Very fast: roughly \(O(n \log n)\) in practice
Scales to millions of nodes
BSD license (commercial-friendly)
Well-established and widely used
Disadvantages
Non-deterministic (random initialization)
Cannot find overlapping communities
Resolution limit issues
Infomap Algorithm
Basic Usage
Flow-based approach for detecting communities:
from py3plex.algorithms.community_detection import community_wrapper
# Detect communities using Infomap
communities = community_wrapper.infomap_communities(
    network.core_network,
    binary_path="/path/to/infomap"  # Path to Infomap binary
)
With Hierarchical Structure
# Get hierarchical community structure
hierarchical_communities = community_wrapper.infomap_communities(
    network.core_network,
    binary_path="/path/to/infomap",
    hierarchical=True
)
Advantages
Can detect overlapping communities
Flow-based (natural for many applications)
Hierarchical structure
Information-theoretic foundation
Disadvantages
Requires external binary
AGPLv3 license (strong copyleft; can be problematic for commercial use)
Slower than Louvain
Label Propagation
Semi-Supervised Detection
Use when you have some known community memberships:
from py3plex.algorithms.community_detection import label_propagation
# Provide seed labels for some nodes
seed_labels = {
    'node1': 0,
    'node2': 0,
    'node3': 1,
    'node4': 1
}
# Propagate labels to unlabeled nodes
communities = label_propagation.propagate(
    network.core_network,
    seed_labels=seed_labels,
    max_iter=100
)
Fully Unsupervised
# Without seed labels (random initialization)
communities = label_propagation.propagate(
    network.core_network,
    max_iter=100
)
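The idea behind seeded label propagation can be sketched in a few lines of plain Python. This is a simplified illustration, not the `label_propagation.propagate` implementation: the function name `propagate_labels`, the adjacency-dict input, and the deterministic tie-breaking rule are assumptions chosen to keep the example reproducible.

```python
from collections import Counter

def propagate_labels(adjacency, seed_labels, max_iter=100):
    """Semi-supervised label propagation on an undirected graph.

    adjacency: dict mapping node -> list of neighbours.
    seed_labels: dict of known labels; these are never overwritten.
    Ties are broken by picking the smallest label, so results are deterministic.
    """
    labels = dict(seed_labels)
    for _ in range(max_iter):
        changed = False
        for node in sorted(adjacency):
            if node in seed_labels:
                continue
            counts = Counter(labels[n] for n in adjacency[node] if n in labels)
            if not counts:
                continue  # no labelled neighbour yet
            top = max(counts.values())
            best = min(label for label, c in counts.items() if c == top)
            if labels.get(node) != best:
                labels[node] = best
                changed = True
        if not changed:
            break  # converged before max_iter
    return labels

# Two triangles (0, 1, 2) and (3, 4, 5) joined by the edge (2, 3)
adjacency = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
seeds = {0: 0, 1: 0, 4: 1, 5: 1}
communities = propagate_labels(adjacency, seeds)
print(communities)
```

Here the two unlabeled nodes adopt the majority label of their triangle. With fewer seeds the outcome depends heavily on tie-breaking, which reflects the sensitivity to initialization listed below.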
Advantages
Very fast: \(O(m)\) linear in edges
Can incorporate prior knowledge
MIT license
Simple and interpretable
Disadvantages
Non-deterministic
Sensitive to initialization
May not converge
Lower quality than Louvain/Infomap
Multilayer Modularity
True Multilayer Detection
Accounts for multilayer structure following Mucha et al. (2010):
from py3plex.algorithms.community_detection import multilayer_modularity as mlm
# Get supra-adjacency matrix
supra_adj = network.get_supra_adjacency_matrix(sparse=True)
# Detect communities with multilayer modularity
communities = mlm.multilayer_louvain(
    supra_adj,
    gamma=1.0,  # Resolution parameter
    omega=1.0   # Inter-layer coupling strength
)
Parameter Tuning
# Emphasize layer-specific structure
communities = mlm.multilayer_louvain(
    supra_adj,
    gamma=1.0,
    omega=0.1  # Low coupling = layer-specific communities
)
# Emphasize cross-layer structure
communities = mlm.multilayer_louvain(
    supra_adj,
    gamma=1.0,
    omega=10.0  # High coupling = cross-layer communities
)
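To show where omega enters, the sketch below builds a small dense supra-adjacency matrix by hand. It assumes a multiplex network in which every layer shares the same node set and every pair of layers is coupled; the function name `supra_adjacency` and the row ordering are illustrative and may differ from what `get_supra_adjacency_matrix` returns.

```python
def supra_adjacency(layers, nodes, omega):
    """Build a dense supra-adjacency matrix for a multiplex network.

    layers: list of undirected edge lists, one per layer, over the same nodes.
    The row/column index of node i in layer a is a * len(nodes) + i.
    Copies of the same node in different layers are coupled with weight omega.
    """
    n, n_layers = len(nodes), len(layers)
    index = {node: i for i, node in enumerate(nodes)}
    size = n * n_layers
    matrix = [[0.0] * size for _ in range(size)]
    # Intra-layer edges fill the diagonal blocks
    for a, edges in enumerate(layers):
        for u, v in edges:
            i, j = a * n + index[u], a * n + index[v]
            matrix[i][j] = matrix[j][i] = 1.0
    # Inter-layer coupling fills the off-diagonal blocks
    for a in range(n_layers):
        for b in range(a + 1, n_layers):
            for i in range(n):
                matrix[a * n + i][b * n + i] = omega
                matrix[b * n + i][a * n + i] = omega
    return matrix

nodes = ["A", "B", "C"]
layers = [[("A", "B")], [("B", "C")]]
supra = supra_adjacency(layers, nodes, omega=0.5)
for row in supra:
    print(row)
```

Raising omega strengthens only the off-diagonal blocks, which is why high values pull the copies of a node into the same community.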
Mathematical Formulation
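Multilayer modularity is defined as:
\[Q = \frac{1}{2\mu} \sum_{ij\alpha\beta} \left[ \left( A_{ij}^{[\alpha]} - \gamma_{\alpha} P_{ij}^{[\alpha]} \right) \delta_{\alpha\beta} + \delta_{ij}\, \omega_{\alpha\beta} \right] \delta(g_{i}^{[\alpha]}, g_{j}^{[\beta]})\]
Where:
\(\mu\) is the total edge weight of the multilayer network, including inter-layer couplings
\(\delta_{\alpha\beta}\) and \(\delta_{ij}\) are Kronecker deltas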
\(A_{ij}^{[\alpha]}\) is the adjacency matrix of layer \(\alpha\)
\(P_{ij}^{[\alpha]}\) is the null model (e.g., configuration model)
\(\gamma_{\alpha}\) is the resolution parameter for layer \(\alpha\)
\(\omega_{\alpha\beta}\) is the coupling strength between layers
\(\delta(g_{i}^{[\alpha]}, g_{j}^{[\beta]})\) is 1 if nodes are in the same community, 0 otherwise
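The terms above can be made concrete with a small worked example. The sketch below evaluates the multilayer modularity of Mucha et al. (2010) for a given partition. It assumes a multiplex network with a shared node set, a single gamma and omega for all layers, a configuration-model null model, and at least one edge per layer. The function name and the `(node, layer_index)` key format are illustrative, not py3plex's API.

```python
def multilayer_modularity(layers, nodes, partition, gamma=1.0, omega=1.0):
    """Evaluate multilayer modularity for a multiplex network.

    layers: list of undirected edge lists over the same nodes.
    partition: dict mapping (node, layer_index) -> community label.
    Null model per layer: P_ij = k_i * k_j / (2 * m_layer).
    """
    n_layers = len(layers)
    total = 0.0
    two_mu = 0.0
    for a, edges in enumerate(layers):
        degree = {node: 0 for node in nodes}
        adjacent = set()
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
            adjacent.add((u, v))
            adjacent.add((v, u))
        two_m = sum(degree.values())
        two_mu += two_m
        # Intra-layer term: ordered pairs (i, j) in the same community
        for i in nodes:
            for j in nodes:
                if partition[(i, a)] == partition[(j, a)]:
                    a_ij = 1.0 if (i, j) in adjacent else 0.0
                    total += a_ij - gamma * degree[i] * degree[j] / two_m
    # Inter-layer term: each node is coupled to its copies in other layers
    for i in nodes:
        for a in range(n_layers):
            for b in range(n_layers):
                if a != b:
                    two_mu += omega
                    if partition[(i, a)] == partition[(i, b)]:
                        total += omega
    return total / two_mu

nodes = [0, 1, 2, 3]
layers = [[(0, 1), (2, 3)], [(0, 1), (2, 3)]]
group = {0: "x", 1: "x", 2: "y", 3: "y"}
# Same community for a node in both layers vs. separate communities per layer
consistent = {(i, a): group[i] for i in nodes for a in range(2)}
layer_specific = {(i, a): (group[i], a) for i in nodes for a in range(2)}
q_consistent = multilayer_modularity(layers, nodes, consistent)
q_layer_specific = multilayer_modularity(layers, nodes, layer_specific)
print(q_consistent, q_layer_specific)
```

With omega=1.0 the cross-layer partition scores 0.75 and the layer-specific one 0.25, because only the former collects the coupling term.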
Advantages
Accounts for multilayer structure
Implements state-of-the-art algorithm
Configurable inter-layer coupling
Published in Science (Mucha et al. 2010)
Disadvantages
More computationally expensive
Requires parameter tuning
May not scale to very large networks (>100k nodes)
Evaluating Community Quality
Modularity Score
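import networkx as nx
# Group the {node: community_id} dict into node sets, as nx.community.modularity expects
community_sets = [{n for n, c in communities.items() if c == cid} for cid in set(communities.values())]
# Compute modularity
modularity = nx.community.modularity(network.core_network, community_sets)
print(f"Modularity: {modularity:.3f}")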
Interpretation:
\(Q > 0.3\): Good community structure
\(Q > 0.5\): Strong community structure
\(Q < 0.3\): Weak or no community structure
Coverage and Performance
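# Group the {node: community_id} dict into node sets (a partition of the nodes)
community_sets = [{n for n, c in communities.items() if c == cid} for cid in set(communities.values())]
# partition_quality returns (coverage, performance); the separate
# nx.community.coverage / performance functions are not available in NetworkX 3.x
coverage, performance = nx.community.partition_quality(network.core_network, community_sets)
# Coverage: fraction of edges within communities
# Performance: fraction of correctly classified node pairs
print(f"Coverage: {coverage:.3f}")
print(f"Performance: {performance:.3f}")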
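For intuition about what these two scores measure, here is a minimal pure-Python sketch that computes both on a toy graph. It assumes a simple undirected graph without self-loops; the function name `coverage_and_performance` is illustrative.

```python
from itertools import combinations

def coverage_and_performance(nodes, edges, partition):
    """Return (coverage, performance) for a partition of a simple undirected graph.

    coverage: intra-community edges / all edges.
    performance: (intra-community edges + non-adjacent inter-community pairs)
                 / all node pairs.
    """
    edge_set = {frozenset(e) for e in edges}
    intra_edges = 0
    correct_pairs = 0
    total_pairs = 0
    for u, v in combinations(nodes, 2):
        total_pairs += 1
        connected = frozenset((u, v)) in edge_set
        same = partition[u] == partition[v]
        if connected and same:
            intra_edges += 1
        if connected == same:
            # Either linked inside a community or unlinked across communities
            correct_pairs += 1
    return intra_edges / len(edge_set), correct_pairs / total_pairs

# Two triangles joined by the bridge edge (2, 3)
nodes = list(range(6))
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
cov, perf = coverage_and_performance(nodes, edges, partition)
print(f"Coverage: {cov:.3f}, Performance: {perf:.3f}")
```

Six of the seven edges fall inside a community (coverage 6/7), and 14 of the 15 node pairs are classified consistently with the partition (performance 14/15).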
Visualizing Communities
Color by Community
from py3plex.visualization.multilayer import hairball_plot
import matplotlib.pyplot as plt
# Map communities to colors
node_colors = [communities.get(node, 0) for node in network.core_network.nodes()]
# Visualize with community colors
hairball_plot(
    network.core_network,
    node_color=node_colors,
    layout_algorithm='force',
    cmap='tab10'
)
plt.show()
Community Size Distribution
from collections import Counter
import matplotlib.pyplot as plt
# Count community sizes
community_sizes = Counter(communities.values())
sizes = list(community_sizes.values())
# Plot distribution
plt.hist(sizes, bins=20)
plt.xlabel('Community Size')
plt.ylabel('Frequency')
plt.title('Community Size Distribution')
plt.show()
Understanding Single-Layer vs. Multilayer Community Detection
The algorithms above differ in how they treat layers, so it helps to understand the conceptual differences between the approaches.
Single-Layer Community Detection
Traditional community detection finds groups in a single graph. Applied to a multilayer network, you have two options:
Flatten and detect: Aggregate all layers into one graph, then find communities. This loses layer information.
Detect per layer: Find communities independently in each layer. This ignores cross-layer structure.
Neither captures the full multilayer picture.
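The two single-layer options can be illustrated on a plain edge list. The names and layers below are hypothetical, and the sketch only prepares the inputs; any single-layer algorithm would then run on either result.

```python
from collections import defaultdict

# Hypothetical multilayer edge list: (source, target, layer)
multilayer_edges = [
    ("Alice", "Bob", "work"),
    ("Bob", "Carol", "work"),
    ("Alice", "Bob", "social"),
    ("Carol", "Dave", "social"),
]

# Option 1: flatten. An edge seen in several layers becomes one weighted edge,
# and the information about which layer it came from is lost.
flattened = defaultdict(int)
for u, v, _layer in multilayer_edges:
    flattened[frozenset((u, v))] += 1

# Option 2: per layer. Each layer keeps its own edge list,
# but nothing links Alice in "work" to Alice in "social".
per_layer = defaultdict(list)
for u, v, layer in multilayer_edges:
    per_layer[layer].append((u, v))

print({tuple(sorted(pair)): weight for pair, weight in flattened.items()})
print(dict(per_layer))
```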
Multilayer Community Detection
Multilayer algorithms find communities that are consistent across layers while respecting layer-specific structure. They ask: “Which nodes cluster together across multiple contexts?”
Key insight: A node that is moderately connected in many layers may be more “community-central” than a node highly connected in just one layer.
Overlapping vs. Non-Overlapping Communities
Non-overlapping: Each node belongs to exactly one community. Algorithms like Louvain and Leiden produce non-overlapping partitions.
Overlapping: Nodes can belong to multiple communities. Algorithms like NoRC and clique percolation find overlapping structure.
When to use overlapping: When nodes naturally belong to multiple groups (e.g., a person in both a work community and a hobby community).
Flow-Based vs. Modularity-Based Views
Modularity-based (Louvain, Leiden): Optimize a quality function that compares edge density within communities to expected density. Fast, widely used, but has resolution limit issues.
Flow-based (Infomap): Model random walks on the network and find community structure that minimizes description length of those walks. Theoretically grounded, finds hierarchical structure, but slower.
When to use which:
Use modularity-based for speed and when you don’t need hierarchical structure
Use flow-based when you care about information flow or want to find nested communities
Parameter Tuning Cookbook
Tuning Resolution (Gamma)
The resolution parameter γ controls community size:
from py3plex.algorithms.community_detection.multilayer_modularity import (
    louvain_multilayer
)
# Experiment with different resolution values
for gamma in [0.5, 1.0, 1.5, 2.0]:
    partition = louvain_multilayer(network, gamma=gamma, omega=1.0, random_state=42)
    num_comms = len(set(partition.values()))
    print(f"gamma={gamma}: {num_comms} communities")
Interpretation guide:
Very few communities (2-5) when you expect more: γ is too low → increase γ
Many singleton communities: γ is too high → decrease γ
One giant community + many tiny ones: Resolution limit problem → try γ > 1 or use Leiden
Recommended starting procedure:
Start with γ=1.0 (standard modularity)
Look at community size distribution
If too coarse, try γ=1.5, 2.0
If too fine, try γ=0.5, 0.25
Tuning Inter-Layer Coupling (Omega)
The coupling parameter ω controls how much layers influence each other:
# Experiment with different coupling values
for omega in [0.1, 0.5, 1.0, 2.0, 5.0]:
    partition = louvain_multilayer(network, gamma=1.0, omega=omega, random_state=42)
    num_comms = len(set(partition.values()))
    # Check cross-layer consistency
    # (how often does the same node get same community across layers?)
    # ... (compute consistency metric)
    print(f"omega={omega}: {num_comms} communities")
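One possible consistency metric is the fraction of multi-layer nodes whose copies all land in the same community. The sketch below assumes the partition is keyed by `(node, layer)` tuples, which is an assumption about the return format; adapt the key handling if your partition is shaped differently. The function name is illustrative.

```python
from collections import defaultdict

def cross_layer_consistency(partition):
    """Fraction of multi-layer nodes whose copies all share one community.

    partition: dict mapping (node, layer) -> community id (assumed key format).
    Nodes present in only one layer are ignored.
    """
    labels_by_node = defaultdict(list)
    for (node, _layer), community in partition.items():
        labels_by_node[node].append(community)
    multi = [labels for labels in labels_by_node.values() if len(labels) > 1]
    if not multi:
        return float("nan")  # no node appears in more than one layer
    return sum(len(set(labels)) == 1 for labels in multi) / len(multi)

example_partition = {
    ("Alice", "work"): 0, ("Alice", "social"): 0,  # consistent
    ("Bob", "work"): 0, ("Bob", "social"): 1,      # split across communities
    ("Carol", "work"): 1,                          # single layer, ignored
}
consistency = cross_layer_consistency(example_partition)
print(f"Cross-layer consistency: {consistency:.2f}")
```

Tracking this value while sweeping omega shows when coupling starts to force nodes into the same community across layers.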
Interpretation guide:
ω = 0: Layers are independent (equivalent to detecting per-layer, then combining)
ω = 1: Balanced coupling (default, usually good)
ω > 1: Strong coupling (forces cross-layer consistency)
ω → ∞: All layers must have identical community structure
Domain-specific guidance:
Multiplex social networks: Start with ω=1.0 (people are the same across platforms)
Temporal networks: ω=0.5 to 1.0 (communities can evolve but not too fast)
Heterogeneous networks: ω=0.1 to 0.5 (different node types may have different community structure)
Diagnosing Bad Partitions
Problem: All nodes in one community
if len(set(partition.values())) == 1:
    print("All nodes in single community - try increasing gamma")
Causes: Network is too dense, γ too low, or network genuinely has no community structure.
Problem: Each node is its own community
if len(set(partition.values())) == len(partition):
    print("All singletons - try decreasing gamma or increasing omega")
Causes: Network is too sparse, γ too high, ω too low.
Problem: Communities don’t match domain expectations
Actions:
Visualize communities and examine specific nodes
Check if high-degree nodes are correctly assigned
Verify that known groups (e.g., departments) are recovered
Consider using ground-truth labels for NMI comparison
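The checks in this section can be bundled into one helper. This is a sketch: the function name `diagnose_partition` and the `giant_fraction` threshold of 0.9 are illustrative choices, not values prescribed by py3plex.

```python
from collections import Counter

def diagnose_partition(partition, giant_fraction=0.9):
    """Flag trivial or lopsided partitions.

    partition: dict mapping node (or node-layer pair) -> community id.
    giant_fraction: illustrative threshold for calling a community "giant".
    """
    sizes = Counter(partition.values())
    n_nodes = len(partition)
    if len(sizes) == 1:
        return "single community: try increasing gamma"
    if len(sizes) == n_nodes:
        return "all singletons: try decreasing gamma or increasing omega"
    if max(sizes.values()) / n_nodes >= giant_fraction:
        return "one giant community: inspect the size distribution"
    return "no trivial pattern detected"

print(diagnose_partition({"a": 0, "b": 0, "c": 0}))
print(diagnose_partition({"a": 0, "b": 1, "c": 2}))
print(diagnose_partition({"a": 0, "b": 0, "c": 1, "d": 1}))
```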
Mini Case Studies
Case Study 1: Biological Network Communities
Scenario: A protein-protein interaction network with 3 layers representing different experimental evidence types (yeast two-hybrid, co-immunoprecipitation, affinity purification).
Goal: Find functional modules (groups of proteins with shared biological function).
Approach:
from py3plex.core import multinet
from py3plex.algorithms.community_detection.multilayer_modularity import (
    louvain_multilayer
)
# Load network
network = multinet.multi_layer_network().load_network(
    "ppi_multilayer.txt", input_type="multiedgelist", directed=False
)
# Use moderate coupling - different evidence types should
# contribute to same modules, but we don't require perfect consistency
partition = louvain_multilayer(
    network,
    gamma=1.0,  # Standard resolution
    omega=0.5,  # Moderate coupling
    random_state=42
)
# Validate: Do communities correspond to GO biological process terms?
# Compare community assignments to known functional annotations
Expected outcome: Communities should correspond to functional modules like “cell cycle,” “DNA repair,” “metabolic pathways.” Proteins appearing in multiple layers with high connectivity should be community hubs.
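One simple way to run the validation step is to check how dominant the most common annotation is within each community. The sketch below uses made-up protein IDs and annotations and assumes the partition has already been reduced to one community per protein. The function name `community_purity` is illustrative, and a real analysis would use proper enrichment statistics.

```python
from collections import Counter, defaultdict

def community_purity(partition, annotations):
    """Map each community to (most common annotation, share of annotated members)."""
    members = defaultdict(list)
    for node, community in partition.items():
        if node in annotations:  # skip proteins without an annotation
            members[community].append(annotations[node])
    result = {}
    for community, labels in members.items():
        label, count = Counter(labels).most_common(1)[0]
        result[community] = (label, count / len(labels))
    return result

# Hypothetical protein IDs and annotations, for illustration only
example_partition = {"P1": 0, "P2": 0, "P3": 0, "P4": 1, "P5": 1}
annotations = {
    "P1": "cell cycle", "P2": "cell cycle", "P3": "DNA repair",
    "P4": "DNA repair", "P5": "DNA repair",
}
purity = community_purity(example_partition, annotations)
for community, (label, share) in sorted(purity.items()):
    print(f"Community {community}: {label} ({share:.0%})")
```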
Case Study 2: Transportation Network Communities
Scenario: A multi-modal transportation network with layers for metro, bus, and bike-share in a city.
Goal: Find “travel basins”—regions where people travel together within a mode and switch between modes at hubs.
Approach:
# Load network
network = multinet.multi_layer_network().load_network(
    "transport_network.txt", input_type="multiedgelist", directed=False
)
# Higher coupling - the same station serves multiple modes
partition = louvain_multilayer(
    network,
    gamma=1.2,  # Slightly higher to find smaller regions
    omega=1.5,  # Strong coupling at multimodal hubs
    random_state=42
)
# Validate: Do communities correspond to geographic regions?
# Are major transfer stations correctly identified as community boundaries?
Expected outcome: Communities should correspond to neighborhoods or districts. Multimodal hubs (stations serving metro + bus + bike) should appear at community boundaries or as bridges between communities.
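To check whether hubs sit on community boundaries, one option is to list the nodes that have a neighbor in a different community. The station names and partition below are hypothetical; the helper name `boundary_nodes` is illustrative.

```python
def boundary_nodes(edges, partition):
    """Return nodes that have at least one neighbour in a different community."""
    boundary = set()
    for u, v in edges:
        if partition[u] != partition[v]:
            boundary.update((u, v))
    return boundary

# Hypothetical stations and community assignments, for illustration only
edges = [
    ("North", "Central"), ("Central", "South"),
    ("South", "Harbor"), ("North", "Park"),
]
station_partition = {"North": 0, "Park": 0, "Central": 0, "South": 1, "Harbor": 1}
bridges = sorted(boundary_nodes(edges, station_partition))
print(bridges)
```

In this toy case only Central and South touch the other community, so they are the candidates to compare against the known transfer stations.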
Comparing Algorithms
Run Multiple Algorithms
# Run different algorithms
louvain_comms = community_louvain.best_partition(network.core_network)
label_prop_comms = label_propagation.propagate(network.core_network)
# Compare number of communities
print(f"Louvain: {len(set(louvain_comms.values()))} communities")
print(f"Label Prop: {len(set(label_prop_comms.values()))} communities")
# Compare modularity
louvain_mod = nx.community.modularity(
    network.core_network,
    [set(n for n, c in louvain_comms.items() if c == i)
     for i in set(louvain_comms.values())])
label_mod = nx.community.modularity(
    network.core_network,
    [set(n for n, c in label_prop_comms.items() if c == i)
     for i in set(label_prop_comms.values())])
print(f"Louvain modularity: {louvain_mod:.3f}")
print(f"Label Prop modularity: {label_mod:.3f}")
Normalized Mutual Information
Compare similarity between community structures:
from sklearn.metrics import normalized_mutual_info_score
# Convert to lists
louvain_list = [louvain_comms[node] for node in network.core_network.nodes()]
label_list = [label_prop_comms[node] for node in network.core_network.nodes()]
# Compute NMI
nmi = normalized_mutual_info_score(louvain_list, label_list)
print(f"NMI between Louvain and Label Prop: {nmi:.3f}")
Interpretation:
NMI = 1.0: Identical community structures
NMI = 0.0: Completely different structures
NMI > 0.5: Similar structures
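To see what the score measures, here is a minimal pure-Python sketch of NMI using the arithmetic-mean normalization, 2·I / (H_a + H_b). This is, to my understanding, also scikit-learn's default, but treat that as an assumption and prefer the library function in practice. The function name `nmi` is illustrative.

```python
from collections import Counter
from math import log

def nmi(labels_a, labels_b):
    """Normalized mutual information: 2 * I(A; B) / (H(A) + H(B))."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    mutual = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        return 1.0  # both partitions put every node in a single community
    return 2 * mutual / (h_a + h_b)

# Same grouping with swapped label names: NMI ignores the names themselves
print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))
# Groupings that share no information
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))
```

The first call yields 1.0 (up to floating-point error) because NMI compares groupings rather than label values; the second yields 0.0.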
Best Practices
Algorithm Selection
| Network Size | Speed Priority | Quality Priority | Recommendation |
|---|---|---|---|
| Small (<1K) | Any | Any | Try all algorithms |
| Medium (1K-10K) | Louvain | Louvain/Infomap | Louvain (good balance) |
| Large (10K-100K) | Louvain/Label Prop | Louvain | Louvain |
| Very Large (>100K) | Label Prop | Louvain | Label Prop or sample |
Parameter Guidelines
Louvain resolution:
Start with resolution=1.0
Increase if communities are too large
Decrease if communities are too fragmented
Multilayer coupling (omega):
omega=1.0 - Default, balanced
omega<1.0 - Emphasize layer-specific structure
omega>1.0 - Emphasize cross-layer structure
Validation
Always validate community detection results:
Visual inspection — Plot and examine communities
Modularity — Check modularity score (>0.3 is good)
Size distribution — Check for giant communities or singletons
Domain knowledge — Do communities make sense for your application?
Ground truth comparison — If you have labels, compute NMI or Adjusted Rand Index
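For the ground-truth comparison, the Adjusted Rand Index can be computed directly from pair counts. The sketch below is a plain-Python illustration that assumes at least two nodes; the function name is illustrative, and in practice a library implementation such as scikit-learn's `adjusted_rand_score` is the safer choice.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same nodes (n >= 2)."""
    n = len(labels_a)
    # Pairs of nodes grouped together in both labelings, in A only, in B only
    sum_joint = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are trivial
    return (sum_joint - expected) / (max_index - expected)

detected = [0, 0, 1, 1, 2, 2]
ground_truth = ["x", "x", "y", "y", "z", "z"]
print(adjusted_rand_index(detected, ground_truth))
```

Unlike the raw Rand index, the adjusted version is corrected for chance, so a labeling that agrees with the ground truth no better than random scores around zero and can even go negative.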
Common Failure Modes
Trivial partitions: All-in-one or all-singletons → tune γ and ω
Unstable results: Different runs give very different partitions → use random_state and run multiple times
Over-fragmentation: Too many small communities → decrease γ or try Leiden
Resolution limit: Can’t find small communities in large networks → increase γ or use hierarchical methods
What You Learned
This chapter covered community detection in multilayer networks:
Algorithms:
Louvain — Fast, O(n log n), BSD license, good for most use cases
Infomap — Flow-based, finds hierarchical structure, AGPLv3 license
Label Propagation — Very fast, linear in edges, supports semi-supervised detection
Multilayer Modularity — True multilayer detection with inter-layer coupling
Parameter tuning:
Resolution γ — Higher = more, smaller communities; lower = fewer, larger
Coupling ω — Higher = cross-layer consistency; lower = layer-specific structure
Start with γ=1.0, ω=1.0 and adjust based on results
Interpretation:
Trivial partitions (all-in-one or all-singletons) indicate parameter tuning needed
High modularity (>0.3) suggests good community structure
Validate with visualization, domain knowledge, and ground truth if available
Conceptual differences:
Single-layer detection either flattens the layers or analyzes each layer independently
Multilayer detection finds communities consistent across layers
Overlapping vs. non-overlapping communities serve different use cases
References
Louvain:
Blondel, V. D., et al. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics.
Infomap:
Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. PNAS, 105(4), 1118-1123.
Multilayer Modularity:
Mucha, P. J., et al. (2010). Community structure in time-dependent, multiscale, and multiplex networks. Science, 328(5980), 876-878.
See Citation and References for complete citations with DOIs.
What’s Next?
Random Walk Algorithms — Generate embeddings for ML tasks
Visualization — Visualize communities with color-coding
Algorithm Landscape — Overview of all algorithms
Related Examples:
examples/communities/example_community_detection.py — Complete workflow
examples/communities/example_multilayer_louvain.py — Parameter tuning