How to Run Community Detection on Multilayer Networks
Goal: This guide demonstrates how to apply community detection algorithms to multilayer networks and interpret their results. Community detection identifies mesoscale structure—groups of nodes that are more densely connected internally than to the rest of the network. In multilayer networks, communities can exist within single layers, span multiple layers, or emerge from inter-layer coupling patterns. This analysis is essential for understanding functional modules, organizational structure, and hierarchical clustering in complex systems.
📓 Run this guide online
You can run this tutorial in your browser without any local installation. Alternatively, see the full executable example: example_community_detection.py
Prerequisites:
A loaded multilayer network (see How to Load and Build Networks)
Basic familiarity with network terminology (nodes, edges, layers)
Understanding of modularity as a quality metric (covered in this guide)
When to use community detection:
Identifying functional modules in biological networks
Detecting organizational units in social networks
Finding coherent topics in multi-relational knowledge graphs
Analyzing temporal evolution of communities across time-sliced networks
Discovering cross-layer relationships in multiplex systems
Quick Start: Louvain Algorithm
What is Louvain?
The Louvain algorithm (Blondel et al., 2008) is a fast, greedy method that optimizes modularity, defined as:
\[ Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j) \]
where \(A_{ij}\) is the adjacency matrix, \(k_i\) is the degree of node \(i\), \(m\) is the total number of edges, and \(\delta(c_i, c_j)=1\) if nodes \(i,j\) are in the same community (0 otherwise). Higher \(Q\) indicates stronger community structure.
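To make the formula concrete, here is a quick sanity check on a toy graph using NetworkX's built-in modularity function (a minimal sketch, not a py3plex call):
import networkx as nx
from networkx.algorithms.community import modularity as nx_modularity

# Two triangles joined by a single bridge edge
G = nx.Graph([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
partition = [{0, 1, 2}, {3, 4, 5}]
print(f"Q = {nx_modularity(G, partition):.3f}")  # ≈ 0.357 for this split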
How it works:
Initialize: each node starts in its own community
For each node, compute \(\Delta Q\) from moving to each neighbor’s community
Move the node to the community with maximum positive \(\Delta Q\)
Aggregate: collapse communities into super-nodes and repeat
Stop when no further improvement is possible
Time complexity: \(O(n \log n)\) for sparse networks
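The greedy step hinges on evaluating \(\Delta Q\) cheaply. Below is a minimal sketch of the gain from moving an isolated node into a community, assuming an unweighted networkx.Graph (py3plex's implementation handles the general weighted case):
def modularity_gain(G, node, community, m):
    """Gain ΔQ = k_in/m - (Σ_tot · k_i) / (2m²) for moving an
    isolated `node` into `community` (m = total number of edges)."""
    k_i = G.degree(node)
    # Edges from `node` into the candidate community
    k_in = sum(1 for nbr in G.neighbors(node) if nbr in community)
    # Total degree of the candidate community
    sigma_tot = sum(G.degree(v) for v in community)
    return k_in / m - (sigma_tot * k_i) / (2 * m * m)
Louvain evaluates this gain for every neighboring community and applies the best positive move.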
Basic example:
from py3plex.core import multinet
from py3plex.algorithms.community_detection.community_wrapper import louvain_communities
# Load multilayer network
network = multinet.multi_layer_network(directed=False)
network.load_network(
"datasets/synthetic_multilayer.txt",
input_type="multiedgelist"
)
# Run Louvain (operates on flattened network by default)
communities = louvain_communities(network)
# Analyze results
from collections import Counter
comm_sizes = Counter(communities.values())
print(f"Number of communities: {len(comm_sizes)}")
print(f"Largest community: {max(comm_sizes.values())} nodes")
print(f"Smallest community: {min(comm_sizes.values())} nodes")
print(f"Average size: {sum(comm_sizes.values())/len(comm_sizes):.1f}")
# Sample assignments
for node, comm_id in list(communities.items())[:5]:
print(f" {node} → Community {comm_id}")
Expected output:
Number of communities: 4
Largest community: 45 nodes
Smallest community: 8 nodes
Average size: 22.8
('A1', 'layer1') → Community 0
('A2', 'layer1') → Community 0
('B1', 'layer1') → Community 1
('B2', 'layer2') → Community 1
('C1', 'layer2') → Community 2
Note: The standard louvain_communities function flattens the multilayer network into a single-layer graph (projecting all nodes across layers into a unified node set). For layer-aware detection, use louvain_multilayer (see next section).
Multilayer-Specific: Multilayer Louvain
What makes multilayer community detection different?
Standard Louvain treats a multilayer network as a single flattened graph, losing layer identity. Multilayer Louvain (Mucha et al., 2010) optimizes the multilayer modularity:
\[ Q = \frac{1}{2\mu} \sum_{ij\alpha\beta} \left[ \left( A^{\alpha}_{ij} - \gamma^{\alpha} \frac{k^{\alpha}_i k^{\alpha}_j}{2 m_{\alpha}} \right) \delta_{\alpha\beta} + \omega_{\alpha\beta}\, \delta_{ij} \right] \delta(g_{i\alpha}, g_{j\beta}) \]
where:
\(A^\alpha_{ij}\): adjacency in layer \(\alpha\); \(k^\alpha_i\) is the degree of node \(i\) and \(m_\alpha\) the number of edges in layer \(\alpha\)
\(\gamma^\alpha\): resolution parameter for layer \(\alpha\) (default 1.0)
\(\omega_{\alpha\beta}\): inter-layer coupling strength (default 1.0)
\(\delta_{\alpha\beta}=1\) if \(\alpha=\beta\) (the adjacency term is intra-layer); \(\delta_{ij}=1\) if \(i=j\) (inter-layer edges connect the same node across layers)
\(\delta(g_{i\alpha}, g_{j\beta})=1\) if node \(i\) in layer \(\alpha\) and node \(j\) in layer \(\beta\) are in the same community
\(\mu\): total edge weight in the supra-network
Key insight: The coupling term \(\omega_{\alpha\beta}\) controls whether communities span layers:
ω = 0: Layers are independent → separate communities per layer
ω → ∞: Strong coupling → communities span all layers
0 < ω < ∞: Partial coupling → communities can span some layers
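The coupling is easiest to picture in the supra-adjacency matrix, where intra-layer adjacencies sit on the diagonal blocks and \(\omega I\) connects copies of each node across layers. A minimal NumPy illustration for two layers of three nodes (py3plex builds this structure internally):
import numpy as np

n = 3
A1 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])  # layer 1 adjacency
A2 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])  # layer 2 adjacency
omega = 0.5

# Supra-adjacency: intra-layer blocks on the diagonal, ω·I off-diagonal
supra = np.block([
    [A1, omega * np.eye(n)],
    [omega * np.eye(n), A2],
])
print(supra.shape)  # (6, 6): one row/column per node-layer pair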
Full workflow example:
from py3plex.core import multinet
from py3plex.algorithms.community_detection.multilayer_modularity import (
louvain_multilayer,
multilayer_modularity
)
from collections import Counter, defaultdict
# Load multilayer network
network = multinet.multi_layer_network(directed=False)
network.load_network(
"datasets/synthetic_multilayer.txt",
input_type="multiedgelist"
)
print("Network structure:")
print(f" Layers: {network.get_layers()}")
print(f" Nodes: {len(network.get_nodes())}")
print(f" Edges (total): {network.number_of_edges()}")
# Run multilayer Louvain with different coupling strengths
for omega in [0.0, 0.5, 1.0, 2.0]:
print(f"\n--- Coupling ω={omega} ---")
communities = louvain_multilayer(
network,
gamma=1.0, # Resolution (default)
omega=omega, # Inter-layer coupling
random_state=42 # For reproducibility
)
# Count communities
n_communities = len(set(communities.values()))
# Calculate multilayer modularity
Q = multilayer_modularity(network, communities, gamma=1.0, omega=omega)
# Analyze layer coverage
layer_coverage = defaultdict(set) # community -> set of layers
for (node, layer), comm_id in communities.items():
layer_coverage[comm_id].add(layer)
cross_layer = sum(1 for layers in layer_coverage.values() if len(layers) > 1)
single_layer = len(layer_coverage) - cross_layer
print(f" Communities: {n_communities}")
print(f" Modularity Q: {Q:.4f}")
print(f" Cross-layer communities: {cross_layer}")
print(f" Single-layer communities: {single_layer}")
# Size distribution
comm_sizes = Counter(communities.values())
avg_size = sum(comm_sizes.values()) / len(comm_sizes)
print(f" Average community size: {avg_size:.1f} node-layers")
Expected output:
Network structure:
Layers: ['layer1', 'layer2', 'layer3']
Nodes: 120 (40 nodes × 3 layers)
Edges (total): 284
--- Coupling ω=0.0 ---
Communities: 12
Modularity Q: 0.3456
Cross-layer communities: 0
Single-layer communities: 12
Average community size: 10.0 node-layers
--- Coupling ω=0.5 ---
Communities: 8
Modularity Q: 0.4123
Cross-layer communities: 3
Single-layer communities: 5
Average community size: 15.0 node-layers
--- Coupling ω=1.0 ---
Communities: 5
Modularity Q: 0.4589
Cross-layer communities: 4
Single-layer communities: 1
Average community size: 24.0 node-layers
--- Coupling ω=2.0 ---
Communities: 4
Modularity Q: 0.4234
Cross-layer communities: 4
Single-layer communities: 0
Average community size: 30.0 node-layers
Interpretation:
ω=0.0: Each layer has independent communities (useful for baseline)
ω=0.5-1.0: Balanced trade-off, some communities span layers
ω>1.0: Forces global communities across all layers (may over-integrate)
Choosing ω:
Use domain knowledge: biological function (high ω), temporal snapshots (low ω)
Grid search: try ω ∈ [0.1, 0.5, 1.0, 2.0, 5.0] and pick the maximum Q (see the sketch after this list)
Consensus clustering: aggregate results across multiple ω values
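A minimal grid-search sketch, reusing louvain_multilayer and multilayer_modularity from above (note this is a heuristic: changing ω also changes the objective being maximized, so Q values are only roughly comparable across ω):
best_omega, best_Q = None, float("-inf")
for omega in [0.1, 0.5, 1.0, 2.0, 5.0]:
    comms = louvain_multilayer(network, gamma=1.0, omega=omega, random_state=42)
    Q = multilayer_modularity(network, comms, gamma=1.0, omega=omega)
    if Q > best_Q:
        best_omega, best_Q = omega, Q
print(f"Best ω: {best_omega} (Q = {best_Q:.4f})")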
Infomap Algorithm
What is Infomap?
Infomap (Rosvall & Bergstrom, 2008) uses information theory to find communities by minimizing the map equation:
\[ L(M) = q_{\curvearrowright} H(Q) + \sum_i p^{i}_{\circlearrowright} H(P^i) \]
where:
\(q_\curvearrowright\): probability of switching between modules (inter-module flow)
\(H(Q)\): entropy of module codebook
\(p_{\circlearrowright}^i\): probability of staying within module \(i\) (intra-module flow)
\(H(P^i)\): entropy of nodes within module \(i\)
Key insight: Infomap simulates a random walker and finds communities that compress the description length of the walker’s trajectory. Communities are regions where the walker gets “trapped” for extended periods.
Pros/cons vs. Louvain:
Pros: Often finds better communities for flow-based systems (e.g., citation networks, web graphs)
Cons: Requires external binary (not pure Python), slower than Louvain, harder to interpret parameters
Installation:
Infomap requires the standalone binary from https://www.mapequation.org/infomap/:
# Download and install
wget https://www.mapequation.org/downloads/Infomap.zip
unzip Infomap.zip
cd Infomap
make
sudo cp Infomap /usr/local/bin/infomap
# Or install Python wrapper (alternative)
pip install infomap
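If you installed the pip package, you can also drive Infomap directly through its Python API instead of the standalone binary. A minimal sketch on a toy edge list (this assumes the infomap package's 2.x API and is independent of py3plex):
from infomap import Infomap

im = Infomap("--two-level --silent --seed 42")
for source, target in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    im.add_link(source, target)
im.run()

print(f"Found {im.num_top_modules} modules")
for node in im.tree:
    if node.is_leaf:
        print(f"  node {node.node_id} -> module {node.module_id}")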
Basic usage:
from py3plex.core import multinet
from py3plex.algorithms.community_detection.community_wrapper import infomap_communities
import os
# Load network
network = multinet.multi_layer_network(directed=False)
network.load_network(
"datasets/synthetic_multilayer.txt",
input_type="multiedgelist"
)
# Check if binary exists
binary_path = "/usr/local/bin/infomap" # Adjust to your installation
if not os.path.exists(binary_path):
print(f"Infomap binary not found at {binary_path}")
print("Please install from: https://www.mapequation.org/infomap/")
print("Falling back to Louvain...")
# Use Louvain as fallback
from py3plex.algorithms.community_detection.community_wrapper import louvain_communities
communities = louvain_communities(network)
else:
# Run Infomap
communities = infomap_communities(
network,
binary=binary_path,
multiplex=True, # Use multiplex mode for multilayer networks
iterations=1000, # More iterations = better convergence
seed=42, # For reproducibility
verbose=False # Set True to see Infomap output
)
# Analyze results
from collections import Counter
comm_sizes = Counter(communities.values())
print(f"Number of communities: {len(comm_sizes)}")
print(f"Largest community: {max(comm_sizes.values())} nodes")
print(f"Average size: {sum(comm_sizes.values())/len(comm_sizes):.1f}")
Expected output:
Number of communities: 6
Largest community: 38 nodes
Average size: 20.0
Multiplex mode:
When multiplex=True, Infomap treats layers as separate networks but allows random walkers to switch layers (implicitly modeling inter-layer coupling). This is different from Louvain’s explicit \(\omega\) parameter.
Comparison workflow:
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score
# Run both algorithms
louvain_comms = louvain_communities(network)
infomap_comms = infomap_communities(network, binary=binary_path, seed=42)
# Convert to aligned label vectors
nodes = list(louvain_comms.keys())
louvain_labels = [louvain_comms[n] for n in nodes]
infomap_labels = [infomap_comms[n] for n in nodes]
# Compute similarity
ari = adjusted_rand_score(louvain_labels, infomap_labels)
nmi = normalized_mutual_info_score(louvain_labels, infomap_labels)
print(f"Agreement between Louvain and Infomap:")
print(f" ARI: {ari:.3f} (1.0 = perfect agreement)")
print(f" NMI: {nmi:.3f} (1.0 = perfect agreement)")
Expected output:
Agreement between Louvain and Infomap:
ARI: 0.723 (1.0 = perfect agreement)
NMI: 0.815 (1.0 = perfect agreement)
When to use Infomap:
Citation/web networks with clear flow patterns
Networks where you care about information diffusion
When Louvain gives unsatisfying results (try both and compare)
When you have the binary installed (otherwise, stick with Louvain)
Label Propagation
What is Label Propagation?
Label propagation (Raghavan et al., 2007) is an extremely fast, near-linear time algorithm that works by iteratively assigning each node to the most common community among its neighbors.
Algorithm:
Initialize: each node gets a unique label (community ID)
For t=1 to T iterations:
Randomize node order
For each node \(i\):
Count neighbor labels: \(n_c = |\{j \in N(i) : c_j = c\}|\)
Assign \(c_i = \arg\max_c n_c\) (ties broken randomly)
Stop when labels stabilize or max iterations reached
Time complexity: \(O(m)\) per iteration (linear in edges)
Pros/cons:
Pros: Very fast, scales to millions of nodes, no parameters to tune
Cons: Non-deterministic (order-dependent), lower quality than Louvain/Infomap, may not converge
Implementation note:
py3plex uses NetworkX’s label propagation for single-layer networks:
from py3plex.core import multinet
import networkx as nx
from networkx.algorithms.community import asyn_lpa_communities
from collections import defaultdict
# Load network
network = multinet.multi_layer_network(directed=False)
network.load_network(
"datasets/synthetic_multilayer.txt",
input_type="multiedgelist"
)
# Convert to NetworkX (flattened single-layer graph)
G = nx.Graph()
for edge in network.core_network.edges():
G.add_edge(edge[0], edge[1])
# Run label propagation
communities_list = asyn_lpa_communities(G, seed=42)
# Convert to dict format: node -> community_id
communities = {}
for comm_id, comm_nodes in enumerate(communities_list):
for node in comm_nodes:
communities[node] = comm_id
# Analyze results
from collections import Counter
comm_sizes = Counter(communities.values())
print(f"Number of communities: {len(comm_sizes)}")
print(f"Largest community: {max(comm_sizes.values())} nodes")
print(f"Average size: {sum(comm_sizes.values())/len(comm_sizes):.1f}")
# Run multiple times to check stability (LPA is order-dependent)
print("\nStability check (5 runs with different seeds):")
for run in range(5):
    comms_run = list(asyn_lpa_communities(G, seed=run))
    n_comms = len(comms_run)
    print(f"  Run {run+1}: {n_comms} communities")
Expected output:
Number of communities: 7
Largest community: 34 nodes
Average size: 17.1
Stability check (5 runs with different seeds):
Run 1: 7 communities
Run 2: 7 communities
Run 3: 7 communities
Run 4: 8 communities
Run 5: 7 communities
Layer-aware label propagation (custom implementation):
For multilayer networks, you can implement layer-aware label propagation:
import random
from collections import Counter
def multilayer_label_propagation(network, max_iter=100, seed=42):
"""
Layer-aware label propagation for multilayer networks.
Propagates labels within each layer independently.
"""
random.seed(seed)
# Initialize: each node-layer gets unique label
labels = {nl: i for i, nl in enumerate(network.get_nodes())}
# Get layer-specific edges
layer_edges = {}
for layer in network.get_layers():
layer_edges[layer] = [
(e[0], e[1]) for e in network.core_network.edges()
if e[0][1] == layer and e[1][1] == layer
]
# Iterate
for iteration in range(max_iter):
changed = False
nodes = list(labels.keys())
random.shuffle(nodes)
for node, layer in nodes:
# Get neighbors in same layer
neighbors = [
target for source, target in layer_edges.get(layer, [])
if source == (node, layer)
] + [
source for source, target in layer_edges.get(layer, [])
if target == (node, layer)
]
if not neighbors:
continue
# Count neighbor labels
neighbor_labels = [labels[n] for n in neighbors]
label_counts = Counter(neighbor_labels)
# Assign most common label (ties broken randomly)
most_common = label_counts.most_common()
max_count = most_common[0][1]
candidates = [lbl for lbl, cnt in most_common if cnt == max_count]
new_label = random.choice(candidates)
if new_label != labels[(node, layer)]:
labels[(node, layer)] = new_label
changed = True
if not changed:
print(f"Converged after {iteration+1} iterations")
break
# Renumber communities
unique_labels = sorted(set(labels.values()))
label_map = {old: new for new, old in enumerate(unique_labels)}
return {nl: label_map[lbl] for nl, lbl in labels.items()}
# Run custom implementation
communities = multilayer_label_propagation(network, max_iter=100, seed=42)
comm_sizes = Counter(communities.values())
print(f"\nLayer-aware label propagation:")
print(f" Communities: {len(comm_sizes)}")
print(f" Average size: {sum(comm_sizes.values())/len(comm_sizes):.1f}")
Expected output:
Converged after 23 iterations
Layer-aware label propagation:
Communities: 9
Average size: 13.3
When to use label propagation:
Very large networks (>100k nodes) where Louvain is too slow
Exploratory analysis where you need quick initial results
Streaming settings where you process edges incrementally
Not recommended for publication-quality results (use Louvain or Infomap instead)
Analyzing Community Structure
After detecting communities, you need to analyze and interpret the results. This section shows robust workflows for understanding community properties.
Count Nodes Per Community
Basic counting:
from collections import Counter
import numpy as np
# Assuming 'communities' is a dict: node -> community_id
comm_sizes = Counter(communities.values())
print(f"Total communities: {len(comm_sizes)}")
print(f"\nTop 10 largest communities:")
for comm_id, size in comm_sizes.most_common(10):
print(f" Community {comm_id}: {size} nodes")
# Size statistics
sizes = np.array(list(comm_sizes.values()))
print(f"\nSize distribution:")
print(f" Mean: {np.mean(sizes):.2f}")
print(f" Median: {np.median(sizes):.2f}")
print(f" Std dev: {np.std(sizes):.2f}")
print(f" Min: {np.min(sizes)}")
print(f" Max: {np.max(sizes)}")
print(f" Q1/Q3: {np.percentile(sizes, 25):.0f} / {np.percentile(sizes, 75):.0f}")
Expected output:
Total communities: 5
Top 10 largest communities:
Community 0: 45 nodes
Community 1: 38 nodes
Community 2: 22 nodes
Community 3: 10 nodes
Community 4: 5 nodes
Size distribution:
Mean: 24.00
Median: 22.00
Std dev: 15.48
Min: 5
Max: 45
Q1/Q3: 10 / 38
Layer coverage analysis (for multilayer networks):
from collections import defaultdict
# communities: {(node, layer): comm_id}
layer_coverage = defaultdict(lambda: defaultdict(set)) # comm -> layer -> nodes
for (node, layer), comm_id in communities.items():
layer_coverage[comm_id][layer].add(node)
print("Community layer coverage:")
for comm_id in sorted(layer_coverage.keys()):
layers = layer_coverage[comm_id]
total_size = sum(len(nodes) for nodes in layers.values())
print(f"\nCommunity {comm_id} (total: {total_size} node-layers):")
for layer, nodes in sorted(layers.items()):
print(f" {layer}: {len(nodes)} nodes")
# Cross-layer nodes (nodes appearing in multiple layers within same community)
all_nodes = set()
for nodes in layers.values():
all_nodes.update(nodes)
unique_nodes = len(all_nodes)
redundancy = total_size / unique_nodes if unique_nodes > 0 else 0
print(f" Unique nodes: {unique_nodes}, Redundancy: {redundancy:.2f}x")
Expected output:
Community layer coverage:
Community 0 (total: 45 node-layers):
layer1: 18 nodes
layer2: 15 nodes
layer3: 12 nodes
Unique nodes: 18, Redundancy: 2.50x
Community 1 (total: 38 node-layers):
layer1: 20 nodes
layer2: 18 nodes
Unique nodes: 20, Redundancy: 1.90x
Community 2 (total: 22 node-layers):
layer3: 22 nodes
Unique nodes: 22, Redundancy: 1.00x
Visualize Communities
Hairball plot with community colors:
from py3plex.visualization.multilayer import hairball_plot
import matplotlib.pyplot as plt
from py3plex.visualization.colors import colors_default
# Select top N communities to color
top_n = 8
top_communities = [c for c, _ in comm_sizes.most_common(top_n)]
# Create color mapping
color_map = dict(zip(
top_communities,
colors_default[:top_n]
))
# Assign colors to nodes
node_colors = []
for node in network.get_nodes():
comm_id = communities.get(node, -1)
if comm_id in color_map:
node_colors.append(color_map[comm_id])
else:
node_colors.append('lightgray') # Small communities
# Plot
plt.figure(figsize=(12, 10))
hairball_plot(
network.core_network,
color_list=node_colors,
layout_algorithm='force',
layout_parameters={'iterations': 500},
scale_by_size=True,
legend=False
)
plt.title('Community Structure (Top 8 Communities Colored)', fontsize=16)
plt.tight_layout()
plt.savefig('community_hairball.png', dpi=300, bbox_inches='tight')
plt.show()
print("Visualization saved to: community_hairball.png")
Size distribution histogram:
import matplotlib.pyplot as plt
import numpy as np
sizes = list(comm_sizes.values())
plt.figure(figsize=(10, 6))
plt.hist(sizes, bins=20, edgecolor='black', alpha=0.7)
plt.xlabel('Community Size (number of nodes)', fontsize=12)
plt.ylabel('Frequency', fontsize=12)
plt.title(f'Community Size Distribution (n={len(sizes)} communities)', fontsize=14)
plt.axvline(np.mean(sizes), color='red', linestyle='--', label=f'Mean: {np.mean(sizes):.1f}')
plt.axvline(np.median(sizes), color='blue', linestyle='--', label=f'Median: {np.median(sizes):.1f}')
plt.legend()
plt.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('community_size_distribution.png', dpi=300)
plt.show()
Layer-specific visualization:
For multilayer networks, visualize community composition across layers:
import pandas as pd
import seaborn as sns
# Build matrix: communities × layers
layers = network.get_layers()
comm_ids = sorted(set(communities.values()))
matrix = np.zeros((len(comm_ids), len(layers)))
for (node, layer), comm_id in communities.items():
layer_idx = layers.index(layer)
comm_idx = comm_ids.index(comm_id)
matrix[comm_idx, layer_idx] += 1
# Heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(
matrix,
xticklabels=layers,
yticklabels=[f'C{i}' for i in comm_ids],
cmap='YlOrRd',
annot=True,
fmt='.0f',
cbar_kws={'label': 'Number of nodes'}
)
plt.xlabel('Layer', fontsize=12)
plt.ylabel('Community', fontsize=12)
plt.title('Community × Layer Composition Heatmap', fontsize=14)
plt.tight_layout()
plt.savefig('community_layer_heatmap.png', dpi=300)
plt.show()
Export Communities
CSV export (most common):
import pandas as pd
# Convert to DataFrame
data = []
for (node, layer), comm_id in communities.items():
data.append({
'node': node,
'layer': layer,
'community': comm_id
})
df = pd.DataFrame(data)
# Add community size
size_map = dict(comm_sizes)
df['community_size'] = df['community'].map(size_map)
# Sort by community, then layer, then node
df = df.sort_values(['community', 'layer', 'node'])
# Save
df.to_csv('communities.csv', index=False)
print(f"Exported {len(df)} node-layer assignments to communities.csv")
print(f"\nFirst few rows:")
print(df.head(10))
Expected output:
Exported 120 node-layer assignments to communities.csv
First few rows:
node layer community community_size
0 A1 layer1 0 45
1 A1 layer2 0 45
2 A1 layer3 0 45
3 A2 layer1 0 45
4 A2 layer2 0 45
5 B1 layer1 1 38
6 B1 layer2 1 38
7 B2 layer1 1 38
8 C1 layer3 2 22
9 C2 layer3 2 22
JSON export (for web apps):
import json
# Group by community
community_dict = defaultdict(list)
for (node, layer), comm_id in communities.items():
community_dict[str(comm_id)].append({
'node': node,
'layer': layer
})
# Add metadata
output = {
'num_communities': len(community_dict),
'num_nodes': len(set(node for node, _ in communities.keys())),
'num_layers': len(network.get_layers()),
'communities': dict(community_dict)
}
with open('communities.json', 'w') as f:
json.dump(output, f, indent=2)
print("Exported to communities.json")
Cytoscape format (for visualization):
# Node table
node_df = pd.DataFrame([
{
'node_id': f"{node}_{layer}",
'node': node,
'layer': layer,
'community': communities.get((node, layer), -1)
}
for node, layer in network.get_nodes()
])
node_df.to_csv('cytoscape_nodes.csv', index=False)
# Edge table
edge_data = []
for source, target in network.core_network.edges():
edge_data.append({
'source': f"{source[0]}_{source[1]}",
'target': f"{target[0]}_{target[1]}",
'source_community': communities.get(source, -1),
'target_community': communities.get(target, -1),
'is_intra_community': communities.get(source, -1) == communities.get(target, -1)
})
edge_df = pd.DataFrame(edge_data)
edge_df.to_csv('cytoscape_edges.csv', index=False)
print("Exported to cytoscape_nodes.csv and cytoscape_edges.csv")
print("Import these into Cytoscape for interactive visualization")
Query Communities with DSL
Goal: Use py3plex’s Domain-Specific Language (DSL) to query and analyze community-detected networks efficiently.
The DSL provides a declarative, SQL-like interface for querying multilayer networks. After detecting communities, you can use DSL queries to filter nodes by community membership, compute community-level statistics, and extract subnetworks.
Prerequisites:
Community detection results (e.g., from louvain_communities())
Familiarity with DSL basics (see How to Query Multilayer Graphs with the SQL-like DSL for the full tutorial)
DSL Basics for Communities
String Syntax - SQL-like queries:
from py3plex.core import multinet
from py3plex.algorithms.community_detection.community_wrapper import louvain_communities
from py3plex.dsl import execute_query
# Load network and detect communities
network = multinet.multi_layer_network(directed=False)
network.load_network(
"py3plex/datasets/_data/synthetic_multilayer.edges",
input_type="multiedgelist"
)
communities = louvain_communities(network)
# Attach community labels as node attributes
for (node, layer), comm_id in communities.items():
network.core_network.nodes[(node, layer)]['community'] = comm_id
# DSL Query: Find nodes in community 0
result = execute_query(
network,
'SELECT nodes WHERE community=0'
)
print(f"Nodes in community 0: {len(result)}")
for node in list(result)[:5]:
print(f" {node}")
Expected output:
Nodes in community 0: 18
('node1', 'layer1')
('node1', 'layer2')
('node2', 'layer1')
('node3', 'layer1')
('node3', 'layer3')
Builder API - Chainable operations:
from py3plex.dsl import Q, L
# Find high-degree nodes in a specific community
result = (
Q.nodes()
.where(community=0)
.compute("degree")
.where(degree__gt=5)
.order_by("degree", reverse=True)
.execute(network)
)
# Convert to pandas for analysis
import pandas as pd
df = pd.DataFrame([
{
'node': node[0],
'layer': node[1],
'degree': data['degree'],
'community': data.get('community', -1)
}
for node, data in result.items()
])
print("High-degree nodes in community 0:")
print(df.head(10))
Expected output:
High-degree nodes in community 0:
node layer degree community
0 node1 layer1 12 0
1 node1 layer2 10 0
2 node2 layer1 9 0
3 node5 layer1 8 0
4 node5 layer3 7 0
Community-Level Queries
Count nodes per community:
# Get all communities
community_ids = set(communities.values())
for comm_id in sorted(community_ids):
result = execute_query(
network,
f'SELECT nodes WHERE community={comm_id}'
)
print(f"Community {comm_id}: {len(result)} nodes")
Find inter-community edges:
from py3plex.dsl import Q
# Attach community labels to edges based on endpoint communities
for edge in network.core_network.edges():
source, target = edge
source_comm = communities.get(source, -1)
target_comm = communities.get(target, -1)
network.core_network.edges[edge]['source_community'] = source_comm
network.core_network.edges[edge]['target_community'] = target_comm
network.core_network.edges[edge]['is_intra_community'] = (source_comm == target_comm)
# Query inter-community edges
inter_comm_edges = (
Q.edges()
.where(is_intra_community=False)
.execute(network)
)
intra_comm_edges = (
Q.edges()
.where(is_intra_community=True)
.execute(network)
)
print(f"Intra-community edges: {len(intra_comm_edges)}")
print(f"Inter-community edges: {len(inter_comm_edges)}")
print(f"Ratio: {len(inter_comm_edges)/len(intra_comm_edges):.3f}")
Expected output:
Intra-community edges: 245
Inter-community edges: 39
Ratio: 0.159
Layer-Specific Community Queries
Find nodes in a specific community and layer:
from py3plex.dsl import Q, L
# Community 0 nodes in layer1 only
result = (
Q.nodes()
.from_layers(L["layer1"])
.where(community=0)
.compute("degree")
.execute(network)
)
print(f"Community 0 in layer1: {len(result)} nodes")
print(f"Average degree: {sum(d['degree'] for d in result.values())/len(result):.2f}")
Compare community structure across layers:
layers = network.get_layers()
for layer in layers:
# Count communities present in this layer
layer_nodes = (
Q.nodes()
.from_layers(L[layer])
.execute(network)
)
layer_communities = set(
communities.get(node, -1)
for node in layer_nodes
)
print(f"{layer}: {len(layer_communities)} communities, {len(layer_nodes)} nodes")
Expected output:
layer1: 5 communities, 40 nodes
layer2: 4 communities, 40 nodes
layer3: 3 communities, 40 nodes
Extract Community Subnetworks
Extract a single community as a subnetwork:
from py3plex.dsl import Q
# Extract community 0
comm_0_nodes = execute_query(
network,
'SELECT nodes WHERE community=0'
)
# Get induced subgraph
subgraph = network.core_network.subgraph(comm_0_nodes)
# Convert to new multilayer network
community_network = multinet.multi_layer_network(directed=False)
community_network.core_network = subgraph.copy()
print(f"Community 0 subnetwork:")
print(f" Nodes: {community_network.number_of_nodes()}")
print(f" Edges: {community_network.number_of_edges()}")
print(f" Layers: {community_network.get_layers()}")
Expected output:
Community 0 subnetwork:
Nodes: 18
Edges: 67
Layers: ['layer1', 'layer2', 'layer3']
Compute Community-Level Statistics
Average centrality per community:
from py3plex.dsl import Q
from collections import defaultdict
# Compute centrality for all nodes
result = (
Q.nodes()
.compute("betweenness_centrality", "degree")
.execute(network)
)
# Group by community
comm_stats = defaultdict(list)
for node, data in result.items():
comm_id = data.get('community', -1)
comm_stats[comm_id].append({
'degree': data['degree'],
'betweenness': data['betweenness_centrality']
})
# Calculate averages
print("Community-level statistics:")
print(f"{'Community':<12} {'Nodes':<8} {'Avg Degree':<12} {'Avg Betweenness':<18}")
print("-" * 50)
for comm_id in sorted(comm_stats.keys()):
stats = comm_stats[comm_id]
n_nodes = len(stats)
avg_degree = sum(s['degree'] for s in stats) / n_nodes
avg_betw = sum(s['betweenness'] for s in stats) / n_nodes
print(f"{comm_id:<12} {n_nodes:<8} {avg_degree:<12.2f} {avg_betw:<18.6f}")
Expected output:
Community-level statistics:
Community Nodes Avg Degree Avg Betweenness
--------------------------------------------------
0 18 7.44 0.012345
1 15 6.13 0.008234
2 12 5.25 0.005678
3 8 4.50 0.003456
4 7 3.86 0.002123
Complex DSL Workflows
Multi-step analysis: Find bridge nodes between communities:
from py3plex.dsl import Q
# Bridge nodes: high betweenness + connect multiple communities
# First, compute betweenness
result = (
Q.nodes()
.compute("betweenness_centrality", "degree")
.execute(network)
)
# Identify potential bridges (high betweenness)
bridges = [
(node, data['betweenness_centrality'])
for node, data in result.items()
if data['betweenness_centrality'] > 0.01 # Threshold
]
print(f"Potential bridge nodes (betweenness > 0.01): {len(bridges)}")
# For each bridge, check which communities its neighbors belong to
for node, betw in sorted(bridges, key=lambda x: x[1], reverse=True)[:5]:
# Get neighbors
neighbors = list(network.core_network.neighbors(node))
neighbor_comms = set(communities.get(n, -1) for n in neighbors)
print(f" {node}: betweenness={betw:.6f}, connects {len(neighbor_comms)} communities")
Expected output:
Potential bridge nodes (betweenness > 0.01): 12
('node7', 'layer1'): betweenness=0.045678, connects 3 communities
('node12', 'layer2'): betweenness=0.034567, connects 2 communities
('node3', 'layer1'): betweenness=0.023456, connects 3 communities
('node15', 'layer3'): betweenness=0.019876, connects 2 communities
('node8', 'layer2'): betweenness=0.015432, connects 2 communities
Temporal community analysis (for time-sliced networks):
from py3plex.dsl import Q, L
# Assuming layers represent time slices: t1, t2, t3
time_layers = ['t1', 't2', 't3']
# Track specific nodes across time
tracked_nodes = ['Alice', 'Bob', 'Carol']
print("Community membership over time:")
for node in tracked_nodes:
print(f"\n{node}:")
for t_layer in time_layers:
node_key = (node, t_layer)
comm_id = communities.get(node_key, None)
if comm_id is not None:
print(f" {t_layer}: Community {comm_id}")
else:
print(f" {t_layer}: Not present")
Why use DSL for community analysis?
Declarative: Express what you want, not how to compute it
Composable: Chain operations to build complex queries
Efficient: DSL optimizes query execution internally
Readable: SQL-like syntax is self-documenting
Interoperable: Results integrate seamlessly with pandas, NumPy, and visualization tools
Next steps with DSL:
Full DSL tutorial: How to Query Multilayer Graphs with the SQL-like DSL - Comprehensive guide with advanced patterns
Builder API reference: ../reference/dsl_api - Complete API documentation
Temporal queries: How to Query Multilayer Graphs with the SQL-like DSL (Temporal Queries section) - Time-varying networks
Compare Algorithms
Different algorithms optimize different objective functions and may produce different community structures. Comparing multiple algorithms helps validate findings and understand algorithm-specific biases.
Metrics for comparing partitions:
Adjusted Rand Index (ARI): Measures similarity adjusted for chance
Range: [-1, 1], where 1 = perfect agreement, 0 = random
Adjusted for cluster size imbalance
Normalized Mutual Information (NMI): Information-theoretic similarity
Range: [0, 1], where 1 = perfect agreement
Symmetric, handles different number of communities well
Variation of Information (VI): Distance metric (lower = more similar)
Range: [0, ∞], where 0 = identical partitions
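VI is not computed in the workflow below, but it is easy to add. A minimal sketch, assuming aligned label vectors and using scikit-learn's mutual information (in nats):
import numpy as np
from sklearn.metrics import mutual_info_score

def variation_of_information(labels1, labels2):
    """VI(X, Y) = H(X) + H(Y) - 2·I(X; Y); 0 means identical partitions."""
    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        probs = counts / counts.sum()
        return -np.sum(probs * np.log(probs))
    return entropy(labels1) + entropy(labels2) - 2 * mutual_info_score(labels1, labels2)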
Full comparison workflow:
from py3plex.core import multinet
from py3plex.algorithms.community_detection.community_wrapper import (
louvain_communities,
infomap_communities
)
from py3plex.algorithms.community_detection.multilayer_modularity import (
louvain_multilayer,
multilayer_modularity
)
from py3plex.algorithms.community_detection.community_louvain import modularity
import networkx as nx
from networkx.algorithms.community import asyn_lpa_communities
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score
from scipy.spatial.distance import jensenshannon
import numpy as np
from collections import Counter
# Load network
network = multinet.multi_layer_network(directed=False)
network.load_network(
"datasets/synthetic_multilayer.txt",
input_type="multiedgelist"
)
print("=" * 70)
print("COMMUNITY DETECTION ALGORITHM COMPARISON")
print("=" * 70)
# Run multiple algorithms
print("\n1. Running algorithms...")
# Louvain (flattened)
louvain_comms = louvain_communities(network)
# Multilayer Louvain (ω=1.0)
multilayer_comms = louvain_multilayer(
network, gamma=1.0, omega=1.0, random_state=42
)
# Label propagation (flattened NetworkX graph)
G = nx.Graph()
for edge in network.core_network.edges():
G.add_edge(edge[0], edge[1])
lpa_comms_list = asyn_lpa_communities(G, seed=42)
lpa_comms = {}
for comm_id, nodes in enumerate(lpa_comms_list):
for node in nodes:
lpa_comms[node] = comm_id
# (Optional) Infomap - skip if binary not available
try:
infomap_comms = infomap_communities(
network, binary="/usr/local/bin/infomap",
seed=42, verbose=False
)
has_infomap = True
except Exception:
has_infomap = False
print(" [SKIP] Infomap not available")
# Store results
algorithms = {
'Louvain (flat)': louvain_comms,
'Louvain (multilayer)': multilayer_comms,
'Label Propagation': lpa_comms,
}
if has_infomap:
algorithms['Infomap'] = infomap_comms
# 2. Basic statistics
print("\n2. Basic statistics:")
print(f"{'Algorithm':<25} {'#Comm':<10} {'Largest':<10} {'Avg Size':<10}")
print("-" * 70)
for name, comms in algorithms.items():
sizes = Counter(comms.values())
n_comms = len(sizes)
largest = max(sizes.values())
avg_size = sum(sizes.values()) / n_comms
print(f"{name:<25} {n_comms:<10} {largest:<10} {avg_size:<10.1f}")
# 3. Modularity scores
print("\n3. Modularity scores:")
print(f"{'Algorithm':<25} {'Modularity (Q)':<15}")
print("-" * 70)
for name, comms in algorithms.items():
if name == 'Louvain (multilayer)':
# Use multilayer modularity
Q = multilayer_modularity(network, comms, gamma=1.0, omega=1.0)
else:
# Use single-layer modularity on flattened graph
Q = modularity(comms, G, weight='weight')
print(f"{name:<25} {Q:<15.4f}")
# 4. Pairwise agreement
print("\n4. Pairwise agreement (ARI / NMI):")
# Align all partitions to same node set
alg_names = list(algorithms.keys())
alg_labels = {}
# Get common nodes (for multilayer, use node-layer pairs)
all_nodes = set()
for comms in algorithms.values():
all_nodes.update(comms.keys())
common_nodes = sorted(all_nodes)
# Convert to label vectors
for name, comms in algorithms.items():
alg_labels[name] = [comms.get(node, -1) for node in common_nodes]
# Compute pairwise metrics
print(f"\n{'Pair':<45} {'ARI':<10} {'NMI':<10}")
print("-" * 70)
for i in range(len(alg_names)):
for j in range(i+1, len(alg_names)):
name1, name2 = alg_names[i], alg_names[j]
labels1 = alg_labels[name1]
labels2 = alg_labels[name2]
ari = adjusted_rand_score(labels1, labels2)
nmi = normalized_mutual_info_score(labels1, labels2)
print(f"{name1} vs {name2:<25} {ari:<10.3f} {nmi:<10.3f}")
# 5. Size distribution comparison
print("\n5. Size distribution similarity:")
# Normalize size distributions
def normalize_sizes(comms):
sizes = list(Counter(comms.values()).values())
sizes_array = np.array(sorted(sizes, reverse=True))
# Pad to same length
max_len = max(len(Counter(c.values())) for c in algorithms.values())
padded = np.zeros(max_len)
padded[:len(sizes_array)] = sizes_array
return padded / padded.sum()
size_dists = {name: normalize_sizes(comms) for name, comms in algorithms.items()}
print(f"{'Pair':<45} {'JS Divergence':<15}")
print("-" * 70)
for i in range(len(alg_names)):
for j in range(i+1, len(alg_names)):
name1, name2 = alg_names[i], alg_names[j]
js_div = jensenshannon(size_dists[name1], size_dists[name2])
print(f"{name1} vs {name2:<25} {js_div:<15.4f}")
Expected output:
======================================================================
COMMUNITY DETECTION ALGORITHM COMPARISON
======================================================================
1. Running algorithms...
[SKIP] Infomap not available
2. Basic statistics:
Algorithm #Comm Largest Avg Size
----------------------------------------------------------------------
Louvain (flat) 5 45 24.0
Louvain (multilayer) 4 52 30.0
Label Propagation 7 38 17.1
3. Modularity scores:
Algorithm Modularity (Q)
----------------------------------------------------------------------
Louvain (flat) 0.4234
Louvain (multilayer) 0.4589
Label Propagation 0.3891
4. Pairwise agreement (ARI / NMI):
Pair ARI NMI
----------------------------------------------------------------------
Louvain (flat) vs Louvain (multilayer) 0.812 0.878
Louvain (flat) vs Label Propagation 0.623 0.745
Louvain (multilayer) vs Label Propagation 0.589 0.712
5. Size distribution similarity:
Pair JS Divergence
----------------------------------------------------------------------
Louvain (flat) vs Louvain (multilayer) 0.1234
Louvain (flat) vs Label Propagation 0.2456
Louvain (multilayer) vs Label Propagation 0.2789
Interpretation:
High ARI/NMI (>0.8): Algorithms agree strongly → robust communities
Medium ARI/NMI (0.5-0.8): Partial agreement → sensitive to algorithm choice
Low ARI/NMI (<0.5): Strong disagreement → no clear community structure or algorithm-specific artifacts
Consensus clustering:
When algorithms disagree, use consensus clustering to find stable communities:
from collections import defaultdict
# Build co-occurrence matrix: how often do pairs of nodes appear together?
co_occurrence = defaultdict(int)
n_algorithms = len(algorithms)
for comms in algorithms.values():
# For each community in this partition
comm_groups = defaultdict(list)
for node, comm_id in comms.items():
comm_groups[comm_id].append(node)
# Increment co-occurrence for all pairs in same community
for nodes in comm_groups.values():
for i, node1 in enumerate(nodes):
for node2 in nodes[i+1:]:
pair = tuple(sorted([node1, node2]))
co_occurrence[pair] += 1
# Threshold: keep pairs that co-occur in ≥50% of algorithms
threshold = n_algorithms * 0.5
stable_pairs = {pair for pair, count in co_occurrence.items() if count >= threshold}
print(f"\nConsensus clustering:")
print(f" Total node pairs: {len(co_occurrence)}")
print(f" Stable pairs (≥50% agreement): {len(stable_pairs)}")
print(f" Stability ratio: {len(stable_pairs)/len(co_occurrence):.2%}")
Expected output:
Consensus clustering:
Total node pairs: 1845
Stable pairs (≥50% agreement): 1234
Stability ratio: 66.88%
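To turn the stable pairs into an actual consensus partition, one simple option is to treat each stable pair as an edge and take connected components (a sketch, assuming stable_pairs from the code above; nodes in no stable pair become singletons and are omitted here):
import networkx as nx

G_consensus = nx.Graph()
G_consensus.add_edges_from(stable_pairs)

consensus = {}
for comm_id, component in enumerate(nx.connected_components(G_consensus)):
    for node in component:
        consensus[node] = comm_id

print(f"Consensus communities: {len(set(consensus.values()))}")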
Layer-Specific Communities
Motivation:
In multilayer networks, you may want to detect communities within individual layers and then compare them across layers. This reveals:
Layer-specific structure (e.g., friendship communities vs. work communities)
How community organization changes across contexts
Which communities are stable vs. layer-dependent
Workflow:
from py3plex.core import multinet
from py3plex.algorithms.community_detection.community_wrapper import louvain_communities
from py3plex.algorithms.community_detection.community_louvain import modularity
import networkx as nx
from collections import Counter
# Load multilayer network
network = multinet.multi_layer_network(directed=False)
network.load_network(
"datasets/synthetic_multilayer.txt",
input_type="multiedgelist"
)
print("LAYER-SPECIFIC COMMUNITY DETECTION")
print("=" * 70)
# Extract and analyze each layer separately
layer_communities = {}
layer_stats = {}
for layer in network.get_layers():
print(f"\n--- Layer: {layer} ---")
# Extract layer-specific edges
layer_edges = [
(e[0][0], e[1][0]) # (node, node) without layer info
for e in network.core_network.edges()
if e[0][1] == layer and e[1][1] == layer
]
# Build single-layer graph
G_layer = nx.Graph()
G_layer.add_edges_from(layer_edges)
print(f" Nodes: {G_layer.number_of_nodes()}")
print(f" Edges: {G_layer.number_of_edges()}")
if G_layer.number_of_edges() == 0:
print(f" [SKIP] No edges in this layer")
continue
# Run Louvain on this layer
communities = louvain_communities(G_layer)
layer_communities[layer] = communities
# Statistics
comm_sizes = Counter(communities.values())
n_comms = len(comm_sizes)
Q = modularity(communities, G_layer, weight='weight')
layer_stats[layer] = {
'n_communities': n_comms,
'modularity': Q,
'sizes': comm_sizes
}
print(f" Communities: {n_comms}")
print(f" Modularity: {Q:.4f}")
print(f" Largest community: {max(comm_sizes.values())} nodes")
print(f" Average size: {sum(comm_sizes.values())/n_comms:.1f}")
Expected output:
LAYER-SPECIFIC COMMUNITY DETECTION
======================================================================
--- Layer: layer1 ---
Nodes: 40
Edges: 95
Communities: 4
Modularity: 0.4123
Largest community: 15 nodes
Average size: 10.0
--- Layer: layer2 ---
Nodes: 40
Edges: 102
Communities: 5
Modularity: 0.3876
Largest community: 12 nodes
Average size: 8.0
--- Layer: layer3 ---
Nodes: 40
Edges: 87
Communities: 3
Modularity: 0.4456
Largest community: 18 nodes
Average size: 13.3
Cross-layer stability analysis:
Check how consistently nodes are grouped across layers:
from sklearn.metrics import normalized_mutual_info_score
import pandas as pd
# Build node-level community assignments per layer
node_layer_assignments = {}
all_nodes = set()
for layer, communities in layer_communities.items():
for node, comm_id in communities.items():
if node not in node_layer_assignments:
node_layer_assignments[node] = {}
node_layer_assignments[node][layer] = comm_id
all_nodes.add(node)
# For each node, check consistency across layers
print("\n" + "=" * 70)
print("CROSS-LAYER STABILITY")
print("=" * 70)
layers = list(layer_communities.keys())
# Pairwise NMI between layers
print(f"\nPairwise NMI between layers:")
print(f"{'Layer Pair':<30} {'NMI':<10} {'Interpretation'}")
print("-" * 70)
for i in range(len(layers)):
for j in range(i+1, len(layers)):
layer1, layer2 = layers[i], layers[j]
# Get common nodes
nodes1 = set(layer_communities[layer1].keys())
nodes2 = set(layer_communities[layer2].keys())
common = nodes1 & nodes2
if not common:
continue
# Compute NMI
labels1 = [layer_communities[layer1][n] for n in common]
labels2 = [layer_communities[layer2][n] for n in common]
nmi = normalized_mutual_info_score(labels1, labels2)
# Interpret
if nmi > 0.8:
interp = "Very similar"
elif nmi > 0.5:
interp = "Moderately similar"
else:
interp = "Different"
print(f"{layer1} vs {layer2:<20} {nmi:<10.3f} {interp}")
# Node-level stability score
print(f"\nNode-level stability:")
node_stability = []
for node in sorted(all_nodes):
assignments = node_layer_assignments.get(node, {})
# How many layers does this node appear in?
n_layers = len(assignments)
if n_layers < 2:
continue
# Are the community IDs consistent?
# (This is a simplified measure - in reality, IDs may differ but structure may be same)
comm_ids = list(assignments.values())
is_stable = len(set(comm_ids)) == 1 # All same community ID
node_stability.append({
'node': node,
'n_layers': n_layers,
'is_stable': is_stable,
'assignments': assignments
})
stable_nodes = sum(1 for s in node_stability if s['is_stable'])
print(f" Nodes appearing in ≥2 layers: {len(node_stability)}")
print(f" Stable nodes (same community ID): {stable_nodes}")
print(f" Stability rate: {stable_nodes/len(node_stability)*100:.1f}%")
# Example unstable nodes
print(f"\n Example unstable nodes:")
unstable = [s for s in node_stability if not s['is_stable']][:5]
for item in unstable:
print(f" {item['node']}: {item['assignments']}")
Expected output:
======================================================================
CROSS-LAYER STABILITY
======================================================================
Pairwise NMI between layers:
Layer Pair NMI Interpretation
----------------------------------------------------------------------
layer1 vs layer2 0.723 Moderately similar
layer1 vs layer3 0.456 Different
layer2 vs layer3 0.512 Moderately similar
Node-level stability:
Nodes appearing in ≥2 layers: 40
Stable nodes (same community ID): 18
Stability rate: 45.0%
Example unstable nodes:
A5: {'layer1': 0, 'layer2': 1, 'layer3': 0}
B12: {'layer1': 2, 'layer2': 3}
C3: {'layer1': 1, 'layer2': 0, 'layer3': 2}
D7: {'layer1': 0, 'layer2': 2}
E9: {'layer1': 3, 'layer2': 1, 'layer3': 1}
Visualization - Alluvial diagram:
Show how community membership flows across layers (requires external tools or manual construction):
import pandas as pd
# Export data for alluvial diagram (use R ggalluvial or similar)
alluvial_data = []
for node in all_nodes:
assignments = node_layer_assignments.get(node, {})
if len(assignments) >= 2:
row = {'node': node}
for layer in layers:
row[f'comm_{layer}'] = assignments.get(layer, -1)
alluvial_data.append(row)
df_alluvial = pd.DataFrame(alluvial_data)
df_alluvial.to_csv('alluvial_data.csv', index=False)
print("\nExported alluvial_data.csv for visualization in R/Python")
print("Example R code:")
print(" library(ggalluvial)")
print(" ggplot(data, aes(axis1=comm_layer1, axis2=comm_layer2, axis3=comm_layer3)) +")
print(" geom_alluvium(aes(fill=node)) + geom_stratum()")
When to use layer-specific detection:
Exploratory analysis: Understand layer-specific structure before multilayer methods
Heterogeneous layers: Layers represent fundamentally different relationships (e.g., co-authorship vs. citation)
Baseline comparison: Compare layer-specific vs. multilayer results to quantify benefit of multilayer methods
Dynamic networks: Detect communities in temporal snapshots and track evolution
Cross-Layer Community Analysis
Motivation:
After detecting communities in the full multilayer network, you want to understand:
Do communities span multiple layers?
Which layers contribute most to each community?
Are there inter-layer bridges (nodes connecting different layer-specific communities)?
Community × Layer composition:
from py3plex.core import multinet
from py3plex.algorithms.community_detection.multilayer_modularity import louvain_multilayer
from collections import defaultdict
import numpy as np
import pandas as pd
# Load network and detect communities
network = multinet.multi_layer_network(directed=False)
network.load_network(
"datasets/synthetic_multilayer.txt",
input_type="multiedgelist"
)
communities = louvain_multilayer(network, gamma=1.0, omega=1.0, random_state=42)
print("CROSS-LAYER COMMUNITY ANALYSIS")
print("=" * 70)
# Build composition matrix: community × layer
layers = network.get_layers()
comm_ids = sorted(set(communities.values()))
composition = defaultdict(lambda: defaultdict(int))
for (node, layer), comm_id in communities.items():
composition[comm_id][layer] += 1
# Convert to DataFrame for easier manipulation
data = []
for comm_id in comm_ids:
row = {'community': comm_id}
for layer in layers:
row[layer] = composition[comm_id][layer]
row['total'] = sum(composition[comm_id].values())
data.append(row)
df_comp = pd.DataFrame(data)
print("\nCommunity × Layer composition:")
print(df_comp.to_string(index=False))
# Calculate layer entropy for each community
print("\n" + "-" * 70)
print("Community layer diversity (entropy):")
print(f"{'Community':<12} {'Entropy':<10} {'Interpretation'}")
print("-" * 70)
for comm_id in comm_ids:
# Calculate entropy: H = -Σ p_i log2(p_i)
counts = [composition[comm_id][layer] for layer in layers]
total = sum(counts)
if total == 0:
continue
probs = np.array(counts) / total
probs = probs[probs > 0] # Remove zeros
entropy = -np.sum(probs * np.log2(probs))
max_entropy = np.log2(len(layers)) # Maximum possible entropy
normalized_entropy = entropy / max_entropy if max_entropy > 0 else 0
# Interpret
if normalized_entropy > 0.9:
interp = "Highly dispersed (spans all layers)"
elif normalized_entropy > 0.5:
interp = "Moderately dispersed (multi-layer)"
else:
interp = "Concentrated (layer-specific)"
print(f"C{comm_id:<11} {entropy:<10.3f} {interp}")
Expected output:
CROSS-LAYER COMMUNITY ANALYSIS
======================================================================
Community × Layer composition:
community layer1 layer2 layer3 total
0 15 14 16 45
1 18 20 0 38
2 0 0 22 22
3 7 6 2 15
----------------------------------------------------------------------
Community layer diversity (entropy):
Community Entropy Interpretation
----------------------------------------------------------------------
C0 1.585 Highly dispersed (spans all layers)
C1 0.997 Moderately dispersed (multi-layer)
C2 0.000 Concentrated (layer-specific)
C3 1.429 Highly dispersed (spans all layers)
Inter-layer bridges:
Identify nodes that connect different communities across layers:
print("\n" + "=" * 70)
print("INTER-LAYER BRIDGE ANALYSIS")
print("=" * 70)
# For each node, check if it belongs to different communities in different layers
node_communities = defaultdict(dict) # node -> layer -> comm_id
for (node, layer), comm_id in communities.items():
node_communities[node][layer] = comm_id
# Identify bridge nodes
bridge_nodes = []
for node, layer_comms in node_communities.items():
if len(layer_comms) < 2:
continue
# Check if community IDs differ across layers
comm_ids = set(layer_comms.values())
if len(comm_ids) > 1:
bridge_nodes.append({
'node': node,
'n_layers': len(layer_comms),
'n_communities': len(comm_ids),
'assignments': dict(layer_comms)
})
print(f"\nBridge nodes (spanning multiple communities across layers):")
print(f" Total nodes: {len(node_communities)}")
print(f" Bridge nodes: {len(bridge_nodes)} ({len(bridge_nodes)/len(node_communities)*100:.1f}%)")
# Show examples
print(f"\n Top 10 bridge nodes:")
print(f" {'Node':<15} {'Layers':<10} {'Communities':<15} {'Assignments'}")
print(" " + "-" * 65)
bridge_nodes_sorted = sorted(bridge_nodes, key=lambda x: x['n_communities'], reverse=True)
for item in bridge_nodes_sorted[:10]:
node = item['node']
n_layers = item['n_layers']
n_comms = item['n_communities']
assignments = ', '.join([f"{l}:C{c}" for l, c in sorted(item['assignments'].items())])
print(f" {str(node):<15} {n_layers:<10} {n_comms:<15} {assignments}")
Expected output:
======================================================================
INTER-LAYER BRIDGE ANALYSIS
======================================================================
Bridge nodes (spanning multiple communities across layers):
Total nodes: 40
Bridge nodes: 12 (30.0%)
Top 10 bridge nodes:
Node Layers Communities Assignments
-----------------------------------------------------------------
A5 3 3 layer1:C0, layer2:C1, layer3:C2
B12 3 2 layer1:C0, layer2:C1, layer3:C1
C3 3 2 layer1:C1, layer2:C0, layer3:C0
D7 2 2 layer1:C0, layer2:C3
E9 3 2 layer1:C3, layer2:C1, layer3:C1
F4 2 2 layer1:C1, layer2:C0
G8 3 2 layer1:C0, layer2:C0, layer3:C2
H2 2 2 layer1:C3, layer2:C0
I6 3 2 layer1:C1, layer2:C1, layer3:C2
J11 2 2 layer1:C0, layer2:C1
Community connectivity graph:
Build a meta-graph where nodes are communities and edges represent inter-layer bridges:
import networkx as nx
import matplotlib.pyplot as plt
# Build community connectivity graph
G_meta = nx.Graph()
# Add community nodes
for comm_id in comm_ids:
G_meta.add_node(f"C{comm_id}")
# Add edges for bridge nodes
for item in bridge_nodes:
comms = sorted(set(item['assignments'].values()))  # distinct communities only (avoids self-loops)
# Connect all pairs of communities this node bridges
for i in range(len(comms)):
for j in range(i+1, len(comms)):
c1, c2 = f"C{comms[i]}", f"C{comms[j]}"
if G_meta.has_edge(c1, c2):
G_meta[c1][c2]['weight'] += 1
else:
G_meta.add_edge(c1, c2, weight=1)
print(f"\n" + "=" * 70)
print("COMMUNITY CONNECTIVITY")
print("=" * 70)
print(f"\nCommunity-level connectivity:")
print(f" Communities: {G_meta.number_of_nodes()}")
print(f" Inter-community bridges: {G_meta.number_of_edges()}")
if G_meta.number_of_edges() > 0:
print(f"\n Strongest bridges (top 5):")
edges_sorted = sorted(G_meta.edges(data=True), key=lambda x: x[2]['weight'], reverse=True)
for c1, c2, data in edges_sorted[:5]:
print(f" {c1} ↔ {c2}: {data['weight']} bridge nodes")
# Visualize meta-graph
plt.figure(figsize=(8, 8))
pos = nx.spring_layout(G_meta, seed=42)
# Edge widths proportional to weight
weights = [G_meta[u][v]['weight'] for u, v in G_meta.edges()]
max_weight = max(weights) if weights else 1
edge_widths = [3 * w / max_weight for w in weights]
nx.draw_networkx_nodes(G_meta, pos, node_size=800, node_color='lightblue')
nx.draw_networkx_labels(G_meta, pos, font_size=12, font_weight='bold')
nx.draw_networkx_edges(G_meta, pos, width=edge_widths, alpha=0.6)
# Edge labels
edge_labels = {(u, v): f"{G_meta[u][v]['weight']}" for u, v in G_meta.edges()}
nx.draw_networkx_edge_labels(G_meta, pos, edge_labels, font_size=8)
plt.title('Community Connectivity Meta-Graph\n(Edge width = number of bridge nodes)', fontsize=14)
plt.axis('off')
plt.tight_layout()
plt.savefig('community_connectivity.png', dpi=300, bbox_inches='tight')
plt.show()
print(f"\n Visualization saved to: community_connectivity.png")
Expected output:
======================================================================
COMMUNITY CONNECTIVITY
======================================================================
Community-level connectivity:
Communities: 4
Inter-community bridges: 5
Strongest bridges (top 5):
C0 ↔ C1: 5 bridge nodes
C1 ↔ C2: 3 bridge nodes
C0 ↔ C3: 2 bridge nodes
C1 ↔ C3: 1 bridge nodes
C0 ↔ C2: 1 bridge nodes
Visualization saved to: community_connectivity.png
Use cases:
Biological networks: Proteins bridging functional modules across different interaction types
Social networks: Individuals connecting different social circles across contexts
Transportation: Transfer hubs connecting regional clusters across transport modes
Quality Metrics
Why quality metrics matter:
Quality metrics help you:
Compare algorithms objectively
Tune parameters (e.g., choosing optimal \(\omega\) in multilayer Louvain)
Validate results (high Q suggests real structure, not random fluctuations)
Detect overfitting (too many tiny communities = over-segmentation)
Compute Modularity
Single-layer modularity:
For flattened networks, use the Newman-Girvan modularity:
from py3plex.core import multinet
from py3plex.algorithms.community_detection.community_wrapper import louvain_communities
from py3plex.algorithms.community_detection.community_louvain import modularity
import networkx as nx
# Load network
network = multinet.multi_layer_network(directed=False)
network.load_network(
"datasets/synthetic_multilayer.txt",
input_type="multiedgelist"
)
# Detect communities
communities = louvain_communities(network)
# Convert to NetworkX for modularity calculation
G = nx.Graph()
for edge in network.core_network.edges():
G.add_edge(edge[0], edge[1])
# Calculate modularity
Q = modularity(communities, G, weight='weight')
print(f"Modularity Q: {Q:.4f}")
# Interpretation
if Q > 0.7:
print(" Interpretation: Excellent community structure")
elif Q > 0.4:
print(" Interpretation: Strong community structure")
elif Q > 0.2:
print(" Interpretation: Moderate community structure")
else:
print(" Interpretation: Weak or no community structure")
Expected output:
Modularity Q: 0.4234
Interpretation: Strong community structure
Multilayer modularity:
For multilayer networks, use the generalized modularity that accounts for inter-layer coupling:
from py3plex.algorithms.community_detection.multilayer_modularity import (
louvain_multilayer,
multilayer_modularity
)
# Run multilayer Louvain
communities = louvain_multilayer(
network,
gamma=1.0,
omega=1.0,
random_state=42
)
# Calculate multilayer modularity
Q_multi = multilayer_modularity(
network,
communities,
gamma=1.0,
omega=1.0
)
print(f"Multilayer modularity Q: {Q_multi:.4f}")
Expected output:
Multilayer modularity Q: 0.4589
Modularity resolution:
Modularity has a resolution limit (Fortunato & Barthélemy, 2007): optimization can fail to resolve communities whose internal edge count falls below roughly \(\sqrt{m/2}\), where \(m\) is the total number of edges, merging them into larger ones. The resolution parameter \(\gamma\) can help:
# Test different resolution parameters
print("Modularity vs. resolution:")
print(f"{'γ':<10} {'#Comm':<10} {'Q':<10} {'Avg Size':<10}")
print("-" * 45)
for gamma in [0.5, 1.0, 1.5, 2.0]:
comms = louvain_multilayer(
network, gamma=gamma, omega=1.0, random_state=42
)
n_comms = len(set(comms.values()))
Q = multilayer_modularity(network, comms, gamma=gamma, omega=1.0)
avg_size = len(comms) / n_comms
print(f"{gamma:<10.1f} {n_comms:<10} {Q:<10.4f} {avg_size:<10.1f}")
Expected output:
Modularity vs. resolution:
γ #Comm Q Avg Size
---------------------------------------------
0.5 3 0.3456 40.0
1.0 5 0.4589 24.0
1.5 8 0.4123 15.0
2.0 12 0.3678 10.0
Interpretation:
Lower γ: Fewer, larger communities (under-segmentation)
Higher γ: More, smaller communities (over-segmentation)
Optimal γ: Maximum Q, but check that the communities are meaningful (see the selection sketch below)
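Building on the sweep above, here is a minimal selection sketch (reusing louvain_multilayer and multilayer_modularity from the imports above; the candidate γ grid is an arbitrary choice):
# Pick gamma by maximum Q, then inspect the winning partition manually:
# Q alone can favor partitions that are not substantively meaningful.
q_by_gamma = {}
for gamma in [0.5, 1.0, 1.5, 2.0]:
    comms = louvain_multilayer(network, gamma=gamma, omega=1.0, random_state=42)
    q_by_gamma[gamma] = multilayer_modularity(network, comms, gamma=gamma, omega=1.0)
best_gamma = max(q_by_gamma, key=q_by_gamma.get)
print(f"Best gamma by Q: {best_gamma} (Q = {q_by_gamma[best_gamma]:.4f})")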
Additional Quality Metrics
1. Coverage (fraction of edges within communities):
def calculate_coverage(network, communities):
"""Fraction of edges within communities."""
intra_edges = 0
total_edges = 0
for source, target in network.core_network.edges():
total_edges += 1
if communities.get(source) == communities.get(target):
intra_edges += 1
return intra_edges / total_edges if total_edges > 0 else 0
coverage = calculate_coverage(network, communities)
print(f"Coverage: {coverage:.4f} (fraction of intra-community edges)")
Expected output:
Coverage: 0.8234 (fraction of intra-community edges)
2. Performance (combines intra-community edges and inter-community non-edges):
def calculate_performance(network, communities):
"""Performance metric (Fortunato 2010)."""
nodes = list(communities.keys())
n = len(nodes)
# Count intra-community edges and inter-community non-edges
intra_edges = 0
inter_non_edges = 0
total_pairs = 0
for i in range(len(nodes)):
for j in range(i+1, len(nodes)):
node1, node2 = nodes[i], nodes[j]
same_community = communities[node1] == communities[node2]
is_edge = network.core_network.has_edge(node1, node2)
if same_community and is_edge:
intra_edges += 1
elif not same_community and not is_edge:
inter_non_edges += 1
total_pairs += 1
return (intra_edges + inter_non_edges) / total_pairs if total_pairs > 0 else 0
performance = calculate_performance(network, communities)
print(f"Performance: {performance:.4f}")
Expected output:
Performance: 0.7456
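The exact pair loop above scales as O(n²) and becomes impractical beyond a few thousand nodes. A hedged sketch of a Monte Carlo estimate (approximate_performance is a hypothetical helper, not a py3plex function):
import random

def approximate_performance(network, communities, n_samples=100000, seed=42):
    """Monte Carlo estimate of the performance metric for large networks."""
    rng = random.Random(seed)
    nodes = list(communities.keys())
    hits = 0
    for _ in range(n_samples):
        u, v = rng.sample(nodes, 2)  # sample a random node pair
        same = communities[u] == communities[v]
        edge = network.core_network.has_edge(u, v)
        # A pair counts as "correct" if it is intra-community and linked,
        # or inter-community and unlinked
        if (same and edge) or (not same and not edge):
            hits += 1
    return hits / n_samples

print(f"Approx. performance: {approximate_performance(network, communities):.4f}")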
3. Conductance (quality of community boundaries):
def calculate_conductance(network, communities, comm_id):
"""Conductance of a specific community (lower is better)."""
comm_nodes = [n for n, c in communities.items() if c == comm_id]
if not comm_nodes:
return None
# Count edges
internal_edges = 0
boundary_edges = 0
for node in comm_nodes:
neighbors = list(network.core_network.neighbors(node))
for neighbor in neighbors:
if communities.get(neighbor) == comm_id:
internal_edges += 0.5 # Count each edge once
else:
boundary_edges += 1
volume = internal_edges * 2 + boundary_edges # Volume of the community
return boundary_edges / volume if volume > 0 else 0
# Calculate for all communities
print("\nConductance per community (lower = better defined):")
for comm_id in sorted(set(communities.values())):
cond = calculate_conductance(network, communities, comm_id)
if cond is not None:
print(f" Community {comm_id}: {cond:.4f}")
Expected output:
Conductance per community (lower = better defined):
Community 0: 0.1234
Community 1: 0.2456
Community 2: 0.0987
Community 3: 0.3123
Community 4: 0.1789
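To compare whole partitions rather than individual communities, you can average conductance across communities (a simple unweighted mean, sketched here; weighting by community size is a reasonable alternative):
conds = [calculate_conductance(network, communities, c)
         for c in sorted(set(communities.values()))]
conds = [c for c in conds if c is not None]
print(f"Mean conductance: {sum(conds) / len(conds):.4f}")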
4. Null model comparison (compare to random partitions):
import random
import numpy as np
# Calculate Q for real partition
Q_real = multilayer_modularity(network, communities, gamma=1.0, omega=1.0)
# Generate random partitions and calculate Q
nodes = list(communities.keys())
n_communities = len(set(communities.values()))
Q_random = []
for trial in range(100):
# Random partition with same number of communities
random_comms = {node: random.randint(0, n_communities-1) for node in nodes}
Q_rand = multilayer_modularity(network, random_comms, gamma=1.0, omega=1.0)
Q_random.append(Q_rand)
Q_rand_mean = np.mean(Q_random)
Q_rand_std = np.std(Q_random)
z_score = (Q_real - Q_rand_mean) / Q_rand_std if Q_rand_std > 0 else 0
print(f"\nNull model comparison:")
print(f" Real Q: {Q_real:.4f}")
print(f" Random Q (mean ± std): {Q_rand_mean:.4f} ± {Q_rand_std:.4f}")
print(f" Z-score: {z_score:.2f}")
if z_score > 3:
print(f" Interpretation: Highly significant (real structure)")
elif z_score > 2:
print(f" Interpretation: Significant (likely real structure)")
else:
print(f" Interpretation: Not significant (could be random)")
Expected output:
Null model comparison:
Real Q: 0.4589
Random Q (mean ± std): 0.0023 ± 0.0145
Z-score: 31.49
Interpretation: Highly significant (real structure)
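The uniform random partition above ignores the size distribution of the detected communities, which can make the z-score look more impressive than it is. A stricter null, sketched here, permutes the detected labels so that community sizes are preserved exactly:
import random
import numpy as np

# Label-permutation null: same community sizes, randomized node assignment
labels = list(communities.values())
Q_perm = []
for trial in range(100):
    random.shuffle(labels)
    permuted = dict(zip(communities.keys(), labels))
    Q_perm.append(multilayer_modularity(network, permuted, gamma=1.0, omega=1.0))
print(f"Permutation null Q (mean ± std): {np.mean(Q_perm):.4f} ± {np.std(Q_perm):.4f}")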
Summary of metrics:
Modularity (Q): Overall quality, general-purpose
Coverage: Simple interpretability (% internal edges)
Performance: Balances true positives and true negatives
Conductance: Community boundary quality (per-community)
Null model: Statistical significance test
Recommendation: Always report modularity + at least one other metric to get a complete picture.
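Following that recommendation, the helpers from this section can be bundled into a single summary (community_report is a hypothetical convenience function, not part of py3plex; it reuses multilayer_modularity and calculate_coverage defined above):
def community_report(network, communities, gamma=1.0, omega=1.0):
    """One-line quality summary combining modularity and coverage."""
    Q = multilayer_modularity(network, communities, gamma=gamma, omega=omega)
    cov = calculate_coverage(network, communities)
    n_comms = len(set(communities.values()))
    print(f"Q = {Q:.4f} | coverage = {cov:.4f} | {n_comms} communities")

community_report(network, communities)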
CLI Cross-Reference (Optional)
py3plex provides command-line tools for quick community detection without writing Python code.
Basic usage:
# Detect communities using Louvain (default algorithm)
py3plex community datasets/network.edgelist \
--algorithm louvain \
--output communities.json
# Using Infomap (requires Infomap binary installed)
py3plex community datasets/network.edgelist \
--algorithm infomap \
--output communities.json
# Using Label Propagation (fast for large networks)
py3plex community datasets/network.edgelist \
--algorithm label_prop \
--output communities.json
# With custom resolution parameter for Louvain
py3plex community datasets/network.edgelist \
--algorithm louvain \
--resolution 1.5 \
--output communities.json
Available algorithms:
louvain: Fast Louvain method (default) - optimizes modularity on the flattened network
infomap: Infomap algorithm - requires the Infomap binary (https://www.mapequation.org/infomap/)
label_prop: Label propagation - very fast, suitable for large networks
Output format:
The CLI outputs JSON files with structure:
{
"algorithm": "louvain",
"num_communities": 5,
"communities": {
"node1": 0,
"node2": 0,
"node3": 1,
...
},
"community_sizes": {
"0": 42,
"1": 27,
...
}
}
Note on multilayer networks:
The current CLI community command operates on flattened networks. For multilayer-specific community detection (with inter-layer coupling), use the Python API with louvain_multilayer() as shown in the examples above. Future CLI versions may add multilayer support.
Viewing results:
After running the CLI command, you can either print statistics straight to the console or analyze the JSON output in Python (see the loading sketch below):
# View community statistics
py3plex community network.edgelist --algorithm louvain
# Output printed to console if no --output specified
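For example, a minimal sketch that reads back the file produced by --output communities.json, assuming the JSON structure documented above:
import json
from collections import Counter

with open("communities.json") as fh:
    result = json.load(fh)

print(f"Algorithm: {result['algorithm']}")
print(f"Communities: {result['num_communities']}")
# Recompute sizes from the assignments as a sanity check
sizes = Counter(result["communities"].values())
print("Largest communities:", sizes.most_common(3))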
For full CLI documentation, see the Command-Line Interface (CLI) Tutorial.
Next Steps
Further reading:
Algorithms: Algorithm Landscape - Deep dive into community detection theory
Visualization: How to Visualize Multilayer Networks - Advanced community visualization techniques
Benchmark: ../tutorials/benchmark_communities - Compare with ground-truth communities
Temporal analysis: ../tutorials/temporal_communities - Track community evolution over time
Recommended workflows:
Exploratory: Start with Louvain → visualize → if unsatisfied, try multilayer Louvain or Infomap
Publication: Run multiple algorithms → compare → report consensus + metrics
Large-scale: Use label propagation for initial exploration → refine with Louvain on filtered subgraph
Temporal: Detect communities in snapshots → track with NMI (see the sketch below) → visualize with alluvial diagrams
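For the temporal workflow, a sketch of snapshot-to-snapshot tracking with NMI; this assumes scikit-learn is installed, and partition_nmi is a hypothetical helper:
from sklearn.metrics import normalized_mutual_info_score

def partition_nmi(comm_t0, comm_t1):
    """NMI between two partitions, restricted to nodes present in both snapshots."""
    shared = [n for n in comm_t0 if n in comm_t1]
    labels_t0 = [comm_t0[n] for n in shared]
    labels_t1 = [comm_t1[n] for n in shared]
    return normalized_mutual_info_score(labels_t0, labels_t1)

# NMI near 1 indicates stable communities; a sharp drop flags reorganization.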
Common pitfalls:
Resolution limit: Modularity cannot resolve communities with fewer than roughly \(\sqrt{m}\) internal edges
Non-determinism: Many algorithms are stochastic; always set random seeds for reproducibility
Overfitting: Too many tiny communities suggests over-segmentation; try lower resolution
Layer coupling: For multilayer networks, always try multiple \(\omega\) values (see the sweep sketch below)
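A sketch of such an ω sweep, mirroring the γ sweep from the Quality Metrics section (reusing louvain_multilayer and multilayer_modularity; the ω grid is an arbitrary choice):
print(f"{'ω':<10} {'#Comm':<10} {'Q':<10}")
for omega in [0.1, 0.5, 1.0, 2.0]:
    comms = louvain_multilayer(network, gamma=1.0, omega=omega, random_state=42)
    Q = multilayer_modularity(network, comms, gamma=1.0, omega=omega)
    print(f"{omega:<10.1f} {len(set(comms.values())):<10} {Q:<10.4f}")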
Community detection checklist:
[ ] Run at least 2 different algorithms
[ ] Calculate modularity and at least one other quality metric
[ ] Visualize size distribution to check for over/under-segmentation
[ ] Compare with null model to ensure statistical significance
[ ] For multilayer: test multiple \(\omega\) values
[ ] Export results to CSV for downstream analysis (see the export sketch below)
[ ] Document random seeds for reproducibility
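A sketch of the CSV export step, assuming community keys are (node, layer) tuples as in the earlier output samples:
import csv

with open("communities.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["node", "layer", "community"])
    for key, comm_id in communities.items():
        node, layer = key  # assumes (node, layer) keys
        writer.writerow([node, layer, comm_id])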
Questions?
GitHub Issues: https://github.com/SkBlaz/py3plex/issues
Documentation: https://skblaz.github.io/py3plex/
Examples:
examples/communities/ directory in the repository
Key References:
Blondel, V. D., Guillaume, J. L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.
Mucha, P. J., Richardson, T., Macon, K., Porter, M. A., & Onnela, J. P. (2010). Community structure in time-dependent, multiscale, and multiplex networks. Science, 328(5980), 876-878.
Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3-5), 75-174.
Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4), 1118-1123.
Raghavan, U. N., Albert, R., & Kumara, S. (2007). Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76(3), 036106.