How to Reproduce Common Analysis Workflows

Goal: Use ready-made recipes for common multilayer network analysis tasks.

Prerequisites: Basic understanding of py3plex (see 10-Minute Tutorial).

Complete Workflows

This guide links to detailed recipes and case studies. For step-by-step implementations, see the recipe index below and the full workflows that follow.

Quick Recipe Index

  • Network Construction

  • Statistical Analysis

  • Community Detection

  • Network Embeddings

  • Visualization

Domain-Specific Workflows

Social Networks

Multi-platform social analysis:

# See: user_guide/case_studies.rst - Social Network Case Study
# 1. Load data from multiple platforms
# 2. Detect cross-platform communities
# 3. Identify influential users
# 4. Analyze information diffusion

See Use Cases & Case Studies for complete implementation.
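
A minimal sketch of steps 1 and 2, reusing the loading and community-detection calls shown later in this guide (the file name social_platforms.edges is a placeholder; each layer stands for one platform):

from py3plex.core import multinet
from py3plex.algorithms.community_detection.community_wrapper import louvain_communities
from collections import defaultdict

# Placeholder input: one layer per platform (e.g. twitter, forum)
network = multinet.multi_layer_network(directed=False)
network.load_network("social_platforms.edges", input_type="multiedgelist")

# Detect communities over the full multilayer structure
communities = louvain_communities(network)

# Cross-platform communities are those whose members span several layers
community_layers = defaultdict(set)
for (node, layer), comm_id in communities.items():
    community_layers[comm_id].add(layer)

cross_platform = [c for c, ls in community_layers.items() if len(ls) > 1]
print(f"Cross-platform communities: {len(cross_platform)}")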

Biological Networks

Multi-omics integration:

# See: user_guide/case_studies.rst - Biological Network Case Study
# 1. Integrate protein-protein + gene regulation + metabolic pathways
# 2. Find key regulators using multilayer centrality
# 3. Detect functional modules
# 4. Prioritize disease genes

See Use Cases & Case Studies for complete implementation.
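
A minimal sketch of step 2 (ranking candidate regulators by multilayer centrality), using the DSL pattern from this guide; the file name multi_omics.edges is a placeholder:

from py3plex.core import multinet
from py3plex.dsl import Q

# Placeholder input: layers such as ppi, regulation, metabolism
network = multinet.multi_layer_network(directed=False)
network.load_network("multi_omics.edges", input_type="multiedgelist")

# Rank nodes by betweenness centrality across all layers
regulators = (
    Q.nodes()
     .compute("degree", "betweenness_centrality")
     .order_by("betweenness_centrality", reverse=True)
     .execute(network)
)

for (node, layer), data in list(regulators.items())[:10]:
    print(f"{node} ({layer}): betweenness={data['betweenness_centrality']:.4f}")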

Transportation Networks

Multimodal route analysis:

# See: examples/index.rst - Transportation Example
# 1. Model different transportation modes as layers
# 2. Add transfer connections between layers
# 3. Compute optimal multimodal routes
# 4. Identify critical transfer points

See Examples & Recipes for runnable code.
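
A minimal sketch of steps 2 and 3, working directly on network.core_network (a NetworkX graph whose node keys are (node, layer) tuples, as used elsewhere in this guide); the path, stop names, and mode names are placeholders:

import networkx as nx
from py3plex.core import multinet

network = multinet.multi_layer_network(directed=False)
network.load_network("transport.edges", input_type="multiedgelist")

G = network.core_network  # node keys are (stop, mode) tuples

# Step 2: add a weighted transfer edge where a stop exists in both modes
for stop, mode in list(G.nodes()):
    if mode == "bus" and (stop, "rail") in G:
        G.add_edge((stop, "bus"), (stop, "rail"), weight=5)  # transfer cost

# Step 3: shortest multimodal route between two placeholder stops
route = nx.shortest_path(G, source=("A", "bus"), target=("Z", "rail"), weight="weight")
print(" -> ".join(f"{stop}[{mode}]" for stop, mode in route))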

Config-Driven Workflows

Use configuration files for reproducibility:

# workflow_config.yaml
network:
  input_file: "data.multiedgelist"
  input_type: "multiedgelist"

analysis:
  - name: "statistics"
    metrics: ["degree", "betweenness_centrality"]

  - name: "community_detection"
    algorithm: "louvain"
    params:
      resolution: 1.0

  - name: "visualization"
    output: "network.png"
    layout: "force_directed"

Execute workflow:

from py3plex.workflows import execute_workflow

results = execute_workflow("workflow_config.yaml")

See Analysis Recipes & Workflows for complete config-driven workflow examples.
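
As a rough mental model (not py3plex internals), a config-driven runner reduces to loading the YAML and dispatching each analysis step in order. A minimal sketch, assuming PyYAML and hypothetical per-step handler callables:

import yaml

def run_workflow(path, handlers):
    """Load a workflow config and dispatch each analysis step to a handler."""
    with open(path) as f:
        config = yaml.safe_load(f)

    results = {}
    for step in config.get("analysis", []):
        handler = handlers[step["name"]]  # e.g. "statistics" -> a callable
        results[step["name"]] = handler(config["network"], step)
    return results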

DSL-Driven Analysis Workflows

Goal: Use py3plex’s DSL to create reproducible, declarative analysis pipelines.

The DSL provides a powerful way to express analysis workflows as queries rather than imperative code. This makes workflows more readable, maintainable, and reproducible.

Basic DSL Workflow Pattern

Template for DSL-first analysis:

from py3plex.core import multinet
from py3plex.dsl import Q, L, execute_query

# 1. Load network
network = multinet.multi_layer_network(directed=False)
network.load_network(
    "py3plex/datasets/_data/synthetic_multilayer.edges",
    input_type="multiedgelist"
)

# 2. Query and filter nodes with DSL
high_degree_nodes = (
    Q.nodes()
     .compute("degree", "betweenness_centrality")
     .where(degree__gt=5)
     .order_by("betweenness_centrality", reverse=True)
     .execute(network)
)

# 3. Extract subnetwork
subgraph = network.core_network.subgraph(high_degree_nodes.keys())

# 4. Analyze subnetwork
print(f"High-degree subnetwork:")
print(f"  Nodes: {len(high_degree_nodes)}")
print(f"  Edges: {subgraph.number_of_edges()}")

# 5. Export results
import pandas as pd
df = pd.DataFrame([
    {
        'node': node[0],
        'layer': node[1],
        'degree': data['degree'],
        'betweenness': data['betweenness_centrality']
    }
    for node, data in high_degree_nodes.items()
])
df.to_csv('high_degree_analysis.csv', index=False)

Expected output:

High-degree subnetwork:
  Nodes: 25
  Edges: 89

Multilayer Exploration Workflow

Systematic multilayer network analysis:

from py3plex.core import multinet
from py3plex.dsl import Q, L
import pandas as pd

# Load network
network = multinet.multi_layer_network(directed=False)
network.load_network(
    "py3plex/datasets/_data/synthetic_multilayer.edges",
    input_type="multiedgelist"
)

print("MULTILAYER NETWORK EXPLORATION")
print("=" * 70)

# Step 1: Per-layer statistics
print("\n1. Per-Layer Statistics:")
layer_stats = []

for layer in network.get_layers():
    # Query layer nodes
    layer_nodes = Q.nodes().from_layers(L[layer]).execute(network)
    layer_edges = Q.edges().from_layers(L[layer]).execute(network)

    # Compute metrics
    result = (
        Q.nodes()
         .from_layers(L[layer])
         .compute("degree")
         .execute(network)
    )

    avg_degree = sum(d['degree'] for d in result.values()) / len(result) if result else 0

    layer_stats.append({
        'layer': layer,
        'nodes': len(layer_nodes),
        'edges': len(layer_edges),
        'avg_degree': avg_degree
    })

    print(f"  {layer}: {len(layer_nodes)} nodes, {len(layer_edges)} edges, avg_degree={avg_degree:.2f}")

# Step 2: Find versatile nodes (present in multiple layers)
print("\n2. Versatile Nodes (multilayer presence):")
from collections import Counter

node_layer_count = Counter()
for node, layer in network.get_nodes():
    node_layer_count[node] += 1

versatile_nodes = {
    node: count for node, count in node_layer_count.items()
    if count >= 2
}

print(f"  Total versatile nodes: {len(versatile_nodes)}")
print(f"  Top 5 most versatile:")
for node, count in sorted(versatile_nodes.items(), key=lambda x: x[1], reverse=True)[:5]:
    print(f"    {node}: {count} layers")

# Step 3: Layer comparison
print("\n3. Layer Overlap Analysis:")
layers = list(network.get_layers())

for i, layer1 in enumerate(layers):
    for layer2 in layers[i+1:]:
        nodes1 = set(n[0] for n in Q.nodes().from_layers(L[layer1]).execute(network).keys())
        nodes2 = set(n[0] for n in Q.nodes().from_layers(L[layer2]).execute(network).keys())

        overlap = nodes1 & nodes2
        jaccard = len(overlap) / len(nodes1 | nodes2) if (nodes1 | nodes2) else 0

        print(f"  {layer1}{layer2}: {len(overlap)} nodes, Jaccard={jaccard:.3f}")

# Step 4: Hub identification across layers
print("\n4. Cross-Layer Hub Nodes:")
all_metrics = (
    Q.nodes()
     .compute("degree", "betweenness_centrality")
     .where(degree__gt=7)
     .execute(network)
)

print(f"  Hub nodes (degree > 7): {len(all_metrics)}")

# Group hubs by base node ID
from collections import defaultdict
hub_layers = defaultdict(set)

for (node, layer), data in all_metrics.items():
    hub_layers[node].add(layer)

print(f"  Unique hub node IDs: {len(hub_layers)}")
print(f"  Top 5 hub nodes:")
for node, node_layers in sorted(hub_layers.items(), key=lambda x: len(x[1]), reverse=True)[:5]:
    print(f"    {node}: present in {len(node_layers)} layers - {sorted(node_layers)}")

Expected output:

MULTILAYER NETWORK EXPLORATION
======================================================================

1. Per-Layer Statistics:
  layer1: 40 nodes, 95 edges, avg_degree=4.75
  layer2: 40 nodes, 87 edges, avg_degree=4.35
  layer3: 40 nodes, 102 edges, avg_degree=5.10

2. Versatile Nodes (multilayer presence):
  Total versatile nodes: 35
  Top 5 most versatile:
    node7: 3 layers
    node12: 3 layers
    node3: 3 layers
    node15: 3 layers
    node1: 3 layers

3. Layer Overlap Analysis:
  layer1 ∩ layer2: 35 nodes, Jaccard=0.875
  layer1 ∩ layer3: 32 nodes, Jaccard=0.800
  layer2 ∩ layer3: 33 nodes, Jaccard=0.825

4. Cross-Layer Hub Nodes:
  Hub nodes (degree > 7): 18
  Unique hub node IDs: 12
  Top 5 hub nodes:
    node7: present in 3 layers - ['layer1', 'layer2', 'layer3']
    node12: present in 3 layers - ['layer1', 'layer2', 'layer3']
    node3: present in 3 layers - ['layer1', 'layer2', 'layer3']
    node15: present in 2 layers - ['layer1', 'layer3']
    node8: present in 2 layers - ['layer2', 'layer3']

Community Detection + DSL Workflow

Combine community detection with DSL queries:

from py3plex.algorithms.community_detection.community_wrapper import louvain_communities
from py3plex.dsl import Q, execute_query
from collections import Counter

# Detect communities
communities = louvain_communities(network)

# Attach as node attributes
for (node, layer), comm_id in communities.items():
    network.core_network.nodes[(node, layer)]['community'] = comm_id

print("COMMUNITY-BASED ANALYSIS")
print("=" * 70)

# Query each community
community_ids = set(communities.values())

for comm_id in sorted(community_ids):
    # Use DSL to get community members
    comm_nodes = execute_query(
        network,
        f'SELECT nodes WHERE community={comm_id}'
    )

    # Compute community metrics
    comm_result = (
        Q.nodes()
         .where(community=comm_id)
         .compute("degree", "betweenness_centrality")
         .execute(network)
    )

    # Statistics
    avg_degree = sum(d['degree'] for d in comm_result.values()) / len(comm_result)
    avg_betw = sum(d['betweenness_centrality'] for d in comm_result.values()) / len(comm_result)

    # Layer composition
    layer_counts = Counter(node[1] for node in comm_nodes)

    print(f"\nCommunity {comm_id}:")
    print(f"  Size: {len(comm_nodes)} nodes")
    print(f"  Avg degree: {avg_degree:.2f}")
    print(f"  Avg betweenness: {avg_betw:.6f}")
    print(f"  Layer composition: {dict(layer_counts)}")

Dynamics + DSL Workflow

Epidemic simulation with DSL-based analysis:

from py3plex.dynamics import SIRDynamics
from py3plex.dsl import Q, L

# Run SIR simulation
sir = SIRDynamics(
    network,
    beta=0.3,
    gamma=0.1,
    initial_infected=0.05
)
sir.set_seed(42)
results = sir.run(steps=100)

# Attach final state
final_state = results.trajectory[-1]
for node, state in final_state.items():
    network.core_network.nodes[node]['sir_state'] = state

print("EPIDEMIC ANALYSIS")
print("=" * 70)

# Per-layer infection analysis
for layer in network.get_layers():
    layer_nodes = Q.nodes().from_layers(L[layer]).execute(network)

    state_counts = Counter(
        network.core_network.nodes[node].get('sir_state', 'unknown')
        for node in layer_nodes.keys()
    )

    total = len(layer_nodes)
    s_pct = state_counts.get('S', 0) / total * 100 if total > 0 else 0
    infected_pct = state_counts.get('I', 0) / total * 100 if total > 0 else 0
    recovered_pct = state_counts.get('R', 0) / total * 100 if total > 0 else 0

    print(f"\n{layer}:")
    print(f"  S: {state_counts.get('S', 0)} ({s_pct:.1f}%)")
    print(f"  I: {state_counts.get('I', 0)} ({infected_pct:.1f}%)")
    print(f"  R: {state_counts.get('R', 0)} ({recovered_pct:.1f}%)")

# Identify superspreaders (infected nodes with high degree)
superspreaders = (
    Q.nodes()
     .where(sir_state='I')
     .compute("degree", "betweenness_centrality")
     .where(degree__gt=6)
     .order_by("degree", reverse=True)
     .execute(network)
)

print(f"\nSuperspreaders (infected, degree > 6): {len(superspreaders)}")
for node, data in list(superspreaders.items())[:5]:
    print(f"  {node}: degree={data['degree']}, betw={data['betweenness_centrality']:.4f}")

Reusable DSL Query Functions

Create reusable query templates:

def get_layer_hubs(network, layer, degree_threshold=5):
    """Get high-degree nodes in a specific layer."""
    from py3plex.dsl import Q, L

    return (
        Q.nodes()
         .from_layers(L[layer])
         .compute("degree")
         .where(degree__gt=degree_threshold)
         .order_by("degree", reverse=True)
         .execute(network)
    )

def get_versatile_nodes(network, min_layers=2):
    """Get nodes present in multiple layers."""
    from collections import Counter

    node_layer_count = Counter()
    for node, layer in network.get_nodes():
        node_layer_count[node] += 1

    return {
        node: count for node, count in node_layer_count.items()
        if count >= min_layers
    }

def compare_layer_centrality(network, layer1, layer2):
    """Compare centrality distributions between two layers."""
    from py3plex.dsl import Q, L
    import numpy as np

    result1 = (
        Q.nodes()
         .from_layers(L[layer1])
         .compute("betweenness_centrality")
         .execute(network)
    )

    result2 = (
        Q.nodes()
         .from_layers(L[layer2])
         .compute("betweenness_centrality")
         .execute(network)
    )

    betw1 = [d['betweenness_centrality'] for d in result1.values()]
    betw2 = [d['betweenness_centrality'] for d in result2.values()]

    return {
        'layer1': {'mean': np.mean(betw1), 'std': np.std(betw1)},
        'layer2': {'mean': np.mean(betw2), 'std': np.std(betw2)}
    }

# Use reusable functions
hubs_layer1 = get_layer_hubs(network, 'layer1', degree_threshold=7)
versatile = get_versatile_nodes(network, min_layers=3)
centrality_comp = compare_layer_centrality(network, 'layer1', 'layer2')

print(f"Layer1 hubs: {len(hubs_layer1)}")
print(f"Highly versatile nodes: {len(versatile)}")
print(f"Centrality comparison: {centrality_comp}")

Why use DSL-driven workflows?

  • Declarative: Express what to analyze, not how to compute

  • Composable: Chain queries to build complex analyses

  • Reproducible: Queries are self-documenting and version-controllable

  • Efficient: DSL optimizes execution internally

  • Readable: SQL-like syntax is intuitive for data analysis

Next steps with DSL workflows: combine reusable query functions like those above with the batch-processing pattern below.

Batch Processing

Process multiple networks:

import glob
from py3plex.core import multinet

results = []

for filename in glob.glob("data/*.multiedgelist"):
    # Load network
    network = multinet.multi_layer_network()
    network.load_network(filename, input_type="multiedgelist")

    # Apply analysis pipeline
    stats = analyze_network(network)  # Your custom function (sketched below)

    results.append({
        'filename': filename,
        'stats': stats
    })

# Aggregate results
summary = aggregate_results(results)
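
analyze_network and aggregate_results are yours to define before the loop runs; an illustrative sketch (field names are examples, not a py3plex API):

def analyze_network(network):
    """Collect basic per-network statistics."""
    G = network.core_network
    return {
        'nodes': G.number_of_nodes(),
        'edges': G.number_of_edges(),
        'layers': len(list(network.get_layers())),
    }

def aggregate_results(results):
    """Average node and edge counts across all processed networks."""
    n = len(results)
    return {
        'networks': n,
        'avg_nodes': sum(r['stats']['nodes'] for r in results) / n if n else 0,
        'avg_edges': sum(r['stats']['edges'] for r in results) / n if n else 0,
    }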

Complete Example Templates

The following locations contain complete, runnable examples:

  1. User Guide Recipes (Analysis Recipes & Workflows)

    • Recipe-style solutions with code + explanation

    • Focused on single tasks

  2. Case Studies (Use Cases & Case Studies)

    • End-to-end analyses

    • Real-world datasets

    • Publication-ready results

  3. Examples Gallery (Examples & Recipes)

    • Standalone Python scripts

    • Minimal, focused examples

    • Easy to adapt

Next Steps