Community Detection Tutorial

This tutorial demonstrates community detection algorithms for multilayer networks using py3plex.

Overview

Community detection identifies groups of nodes that are more densely connected to each other than to the rest of the network. For multilayer networks, communities can span multiple layers, accounting for both intra-layer and inter-layer structure.

Supported Algorithms

py3plex provides several community detection algorithms:

Louvain - Fast modularity optimization (recommended for most use cases)
Infomap - Flow-based community detection (requires external binary)
Label Propagation - Semi-supervised approach with known seed communities
Multilayer Modularity - True multilayer community detection (Mucha et al. 2010)

Louvain Algorithm

Basic Usage

Fastest algorithm for large networks:

from py3plex.core import multinet
from py3plex.algorithms.community_detection import community_louvain

# Create or load network
network = multinet.multi_layer_network()
network.load_network("data.graphml", input_type="graphml")

# Detect communities using Louvain
communities = community_louvain.best_partition(network.core_network)

# Print results
for node, community_id in communities.items():
    print(f"Node {node} -> Community {community_id}")

Parameters

# With custom resolution parameter
communities = community_louvain.best_partition(
    network.core_network,
    resolution=1.0  # Higher = more communities, Lower = fewer communities
)

Resolution parameter:

resolution=1.0 - Standard modularity
resolution>1.0 - More, smaller communities
resolution<1.0 - Fewer, larger communities

Advantages

Very fast: \(O(n \log n)\)
Scales to millions of nodes
BSD license (commercial-friendly)
Well-established and widely used

Disadvantages

Non-deterministic (random initialization)
Cannot find overlapping communities
Resolution limit issues

Infomap Algorithm

Basic Usage

Flow-based approach for detecting communities:

from py3plex.algorithms.community_detection import community_wrapper

# Detect communities using Infomap
communities = community_wrapper.infomap_communities(
    network.core_network,
    binary_path="/path/to/infomap"  # Path to Infomap binary
)

With Hierarchical Structure

# Get hierarchical community structure
hierarchical_communities = community_wrapper.infomap_communities(
    network.core_network,
    binary_path="/path/to/infomap",
    hierarchical=True
)

Advantages

Can detect overlapping communities
Flow-based (natural for many applications)
Hierarchical structure
Information-theoretic foundation

Disadvantages

Requires external binary
AGPLv3 license (viral copyleft - problematic for commercial use)
Slower than Louvain

Label Propagation

Semi-Supervised Detection

Use when you have some known community memberships:

from py3plex.algorithms.community_detection import label_propagation

# Provide seed labels for some nodes
seed_labels = {
    'node1': 0,
    'node2': 0,
    'node3': 1,
    'node4': 1
}

# Propagate labels to unlabeled nodes
communities = label_propagation.propagate(
    network.core_network,
    seed_labels=seed_labels,
    max_iter=100
)

Fully Unsupervised

# Without seed labels (random initialization)
communities = label_propagation.propagate(
    network.core_network,
    max_iter=100
)

Advantages

Very fast: \(O(m)\) linear in edges
Can incorporate prior knowledge
MIT license
Simple and interpretable

Disadvantages

Non-deterministic
Sensitive to initialization
May not converge
Lower quality than Louvain/Infomap

Multilayer Modularity

True Multilayer Detection

Accounts for multilayer structure following Mucha et al. (2010):

from py3plex.algorithms.community_detection import multilayer_modularity as mlm

# Get supra-adjacency matrix
supra_adj = network.get_supra_adjacency_matrix(sparse=True)

# Detect communities with multilayer modularity
communities = mlm.multilayer_louvain(
    supra_adj,
    gamma=1.0,      # Resolution parameter
    omega=1.0       # Inter-layer coupling strength
)

Parameter Tuning

# Emphasize layer-specific structure
communities = mlm.multilayer_louvain(
    supra_adj,
    gamma=1.0,
    omega=0.1  # Low coupling = layer-specific communities
)

# Emphasize cross-layer structure
communities = mlm.multilayer_louvain(
    supra_adj,
    gamma=1.0,
    omega=10.0  # High coupling = cross-layer communities
)

Mathematical Formulation

Multilayer modularity is defined as:

\[Q^{ML} = \frac{1}{2\mu} \sum_{ij\alpha\beta} \left[ (A_{ij}^{[\alpha]} - \gamma_{\alpha} P_{ij}^{[\alpha]})\delta_{\alpha\beta} + \omega_{\alpha\beta}\delta_{ij} \right] \delta(g_{i}^{[\alpha]}, g_{j}^{[\beta]})\]

Where:

\(A_{ij}^{[\alpha]}\) is the adjacency matrix of layer \(\alpha\)
\(P_{ij}^{[\alpha]}\) is the null model (e.g., configuration model)
\(\gamma_{\alpha}\) is the resolution parameter for layer \(\alpha\)
\(\omega_{\alpha\beta}\) is the coupling strength between layers
\(\delta(g_{i}^{[\alpha]}, g_{j}^{[\beta]})\) is 1 if nodes are in the same community, 0 otherwise

Advantages

Accounts for multilayer structure
Implements state-of-the-art algorithm
Configurable inter-layer coupling
Published in Science (Mucha et al. 2010)

Disadvantages

More computationally expensive
Requires parameter tuning
May not scale to very large networks (>100k nodes)

Evaluating Community Quality

Modularity Score

import networkx as nx

# Compute modularity
modularity = nx.community.modularity(network.core_network, communities)
print(f"Modularity: {modularity:.3f}")

Interpretation:

\(Q > 0.3\): Good community structure
\(Q > 0.5\): Strong community structure
\(Q < 0.3\): Weak or no community structure

Coverage and Performance

# Coverage: fraction of edges within communities
coverage = nx.community.coverage(network.core_network, communities)

# Performance: fraction of correctly classified node pairs
performance = nx.community.performance(network.core_network, communities)

print(f"Coverage: {coverage:.3f}")
print(f"Performance: {performance:.3f}")

Visualizing Communities

Color by Community

from py3plex.visualization.multilayer import hairball_plot
import matplotlib.pyplot as plt

# Map communities to colors
node_colors = [communities.get(node, 0) for node in network.core_network.nodes()]

# Visualize with community colors
hairball_plot(
    network.core_network,
    node_color=node_colors,
    layout_algorithm='force',
    cmap='tab10'
)
plt.show()

Community Size Distribution

from collections import Counter
import matplotlib.pyplot as plt

# Count community sizes
community_sizes = Counter(communities.values())
sizes = list(community_sizes.values())

# Plot distribution
plt.hist(sizes, bins=20)
plt.xlabel('Community Size')
plt.ylabel('Frequency')
plt.title('Community Size Distribution')
plt.show()

Comparing Algorithms

Run Multiple Algorithms

# Run different algorithms
louvain_comms = community_louvain.best_partition(network.core_network)
label_prop_comms = label_propagation.propagate(network.core_network)

# Compare number of communities
print(f"Louvain: {len(set(louvain_comms.values()))} communities")
print(f"Label Prop: {len(set(label_prop_comms.values()))} communities")

# Compare modularity
louvain_mod = nx.community.modularity(network.core_network,
                                      [set(n for n, c in louvain_comms.items() if c == i)
                                       for i in set(louvain_comms.values())])
label_mod = nx.community.modularity(network.core_network,
                                    [set(n for n, c in label_prop_comms.items() if c == i)
                                     for i in set(label_prop_comms.values())])

print(f"Louvain modularity: {louvain_mod:.3f}")
print(f"Label Prop modularity: {label_mod:.3f}")

Normalized Mutual Information

Compare similarity between community structures:

from sklearn.metrics import normalized_mutual_info_score

# Convert to lists
louvain_list = [louvain_comms[node] for node in network.core_network.nodes()]
label_list = [label_prop_comms[node] for node in network.core_network.nodes()]

# Compute NMI
nmi = normalized_mutual_info_score(louvain_list, label_list)
print(f"NMI between Louvain and Label Prop: {nmi:.3f}")

Interpretation:

NMI = 1.0: Identical community structures
NMI = 0.0: Completely different structures
NMI > 0.5: Similar structures

Best Practices

Algorithm Selection

Network Size	Speed Priority	Quality Priority	Recommendation
Small (<1K)	Any	Any	Try all algorithms
Medium (1K-10K)	Louvain	Louvain/Infomap	Louvain (good balance)
Large (10K-100K)	Louvain/Label Prop	Louvain	Louvain
Very Large (>100K)	Label Prop	Louvain	Label Prop or sample

Parameter Guidelines

Louvain resolution:

Start with resolution=1.0
Increase if communities are too large
Decrease if communities are too fragmented

Multilayer coupling (omega):

omega=1.0 - Default, balanced
omega<1.0 - Emphasize layer-specific structure
omega>1.0 - Emphasize cross-layer structure

Validation

Always validate community detection results:

Visual inspection - Plot and examine communities
Modularity - Check modularity score (>0.3 is good)
Size distribution - Check for giant communities or singletons
Domain knowledge - Do communities make sense for your application?

References

Louvain:

Blondel, V. D., et al. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics.

Infomap:

Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. PNAS, 105(4), 1118-1123.

Multilayer Modularity:

Mucha, P. J., et al. (2010). Community structure in time-dependent, multiscale, and multiplex networks. Science, 328(5980), 876-878.

See Citation and References for complete citations with DOIs.

Next Steps

Multilayer Network Centrality Measures - Centrality measures tutorial
Multilayer Modularity and Community Detection - Multilayer modularity details
Algorithm Selection Guide - Algorithm selection guide
Examples in examples/communities/example_community_detection.py