Architecture and Design

This document describes the system architecture, design patterns, and extension points of py3plex. It is a map of how the layers fit together, what invariants each layer keeps, and where to extend the library safely without breaking caller expectations.

System Overview

py3plex is built as a modular, layered architecture with clear separation of concerns. Each layer depends only on the one below it, which keeps algorithm code focused on logic instead of plumbing. All layers operate on encoded node-layer identifiers supplied by the core layer so that layer membership is always explicit.

┌─────────────────────────────────────────────────────┐
│          High-Level Interfaces (Wrappers)           │
│  node2vec_embedding, benchmark_nodes                │
└─────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│              Algorithms Layer                        │
│  ┌──────────────┬───────────────┬─────────────────┐ │
│  │  Community   │  Statistics   │  Multilayer     │ │
│  │  Detection   │               │  Algorithms     │ │
│  └──────────────┴───────────────┴─────────────────┘ │
└─────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│            Visualization Layer                       │
│  ┌──────────────┬───────────────┬─────────────────┐ │
│  │  Multilayer  │  Drawing      │  Layout         │ │
│  │  Plots       │  Machinery    │  Algorithms     │ │
│  └──────────────┴───────────────┴─────────────────┘ │
└─────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│                Core Layer                            │
│  ┌──────────────┬───────────────┬─────────────────┐ │
│  │  multinet    │  Parsers      │  Converters     │ │
│  │  (MultiLayer │               │                 │ │
│  │   Network)   │               │                 │ │
│  └──────────────┴───────────────┴─────────────────┘ │
└─────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│           NetworkX Foundation                        │
│  MultiDiGraph, MultiGraph, Algorithms                │
└─────────────────────────────────────────────────────┘

Architectural Layers

Each layer exposes a narrow surface to the one above it. When adding new features, decide which layer should own the responsibility first to avoid pushing implementation details upward.

Core Layer

Purpose: Fundamental data structures and I/O operations. All higher layers manipulate networks through this layer.

Key Components:

  • multinet.py - multi_layer_network class

  • parsers.py - Input/output for supported formats

  • converters.py - Format conversion helpers

  • random_generators.py - Synthetic network generators

  • HINMINE/ - Heterogeneous network decomposition

Responsibilities:

  • Network construction and manipulation

  • File I/O (GraphML, GML, GEXF, edge lists, etc.)

  • Layer management

  • Matrix representations (adjacency, supra-adjacency) with clear encoding rules

  • NetworkX integration so downstream algorithms work on familiar graphs

  • Encoding invariants that keep node-layer separation consistent across the stack

Design Pattern: Facade Pattern - multi_layer_network provides a unified interface to complex NetworkX operations and centralizes encoding/decoding rules

Algorithms Layer

Purpose: Network analysis algorithms optimized for multilayer networks.

Key Components:

  • community_detection/ - Community detection algorithms

  • statistics/ - Network statistics and metrics

  • multilayer_algorithms/ - Multilayer-specific algorithms

  • node_ranking/ - Centrality and ranking measures

  • general/ - General-purpose algorithms (random walks, etc.)

Responsibilities:

  • Community detection (Louvain, Infomap, Label Propagation)

  • Statistical analysis (density, overlap, correlation)

  • Centrality computation (degree, betweenness, PageRank, etc.)

  • Random walks and embeddings

  • Network decomposition using explicit layer boundaries or coupling rules

Design Pattern: Strategy Pattern - Different algorithms implement common interfaces

Visualization Layer

Purpose: Network plotting and rendering.

Key Components:

  • multilayer.py - High-level multilayer plotting

  • drawing_machinery.py - Core drawing primitives

  • layout_algorithms.py - Layout computation

  • colors.py - Color scheme generators

  • fa2/ - ForceAtlas2 layout

Responsibilities:

  • Diagonal projection plots (dense but layer-aware)

  • Force-directed layouts

  • Matrix visualizations

  • Color mapping and legends

  • Interactive plots (via Plotly)

Design Pattern: Template Method Pattern - Layout algorithms follow a common template and separate layout computation from rendering

Wrappers Layer

Purpose: High-level interfaces for common workflows.

Key Components:

  • node2vec_embedding.py - Node2Vec embedding generation

  • benchmark_nodes.py - Node classification benchmarking

Responsibilities:

  • Simplified interfaces for complex workflows

  • Integration with external tools

  • Benchmarking and evaluation

Design Pattern: Facade Pattern - Simplify complex multi-step operations

Core Data Structure

The multi_layer_network Class

Central to py3plex, this class manages multilayer network state and enforces encoding conventions.

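import networkx as nx

class multi_layer_network:
    def __init__(self, directed=True, label_delimiter="---", coupling_weight=1.0):
        # Multigraph variants are used so parallel edges are allowed
        self.core_network = nx.MultiDiGraph() if directed else nx.MultiGraph()
        self.layer_name_map = {}  # layer name -> integer ID
        self.label_delimiter = label_delimiter  # joins node ID and layer ID
        self.coupling_weight = coupling_weight  # default inter-layer edge weight
        self.embedding = None  # cached embedding matrix, filled on demand
        self.labels = None  # node classification labels, if loaded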

Key Attributes:

  • core_network - Underlying NetworkX graph (MultiDiGraph if directed=True)

  • layer_name_map - Maps layer names to integer IDs

  • label_delimiter - Separator for node-layer encoding (default: "---"). Avoid using the delimiter inside raw node IDs.

  • coupling_weight - Default weight for inter-layer edges when none is provided

  • embedding - Cached node embedding matrix

  • labels - Node classification labels

Encoding Scheme:

Nodes are encoded as "{node_id}{delimiter}{layer_id}" before they are added to core_network.

Example: Node ‘A’ in layer ‘social’ becomes "A---social". This keeps the multilayer structure explicit while allowing standard NetworkX algorithms to run.

Callers should treat encoded node keys as opaque. When returning results (e.g., centrality), keep them encoded to preserve layer information unless they are explicitly decoded for presentation. If raw node IDs are needed, split on the network's label_delimiter attribute rather than a hard-coded "---", so that a non-default delimiter does not silently corrupt parsing.
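
The encoding rule above can be sketched with plain strings. The helper names encode_node and decode_node are hypothetical and not part of py3plex; the sketch only illustrates why a delimiter inside a raw node ID makes decoding ambiguous. IDs are converted to strings, so decoded values are always strings.

```python
DEFAULT_DELIMITER = "---"  # mirrors the documented default for label_delimiter


def encode_node(node_id, layer_id, delimiter=DEFAULT_DELIMITER):
    """Join a node ID and a layer ID into one encoded key (hypothetical helper)."""
    node_id, layer_id = str(node_id), str(layer_id)
    encoded = f"{node_id}{delimiter}{layer_id}"
    # The key is only usable if splitting at the first delimiter recovers the
    # original pair; this rejects IDs such as "A---B" or "A-".
    if encoded.partition(delimiter) != (node_id, delimiter, layer_id):
        raise ValueError(f"{node_id!r} cannot be encoded unambiguously with {delimiter!r}")
    return encoded


def decode_node(encoded, delimiter=DEFAULT_DELIMITER):
    """Split an encoded key back into (node_id, layer_id) (hypothetical helper)."""
    node_id, sep, layer_id = encoded.partition(delimiter)
    if not sep:
        raise ValueError(f"{encoded!r} does not contain the delimiter {delimiter!r}")
    return node_id, layer_id


key = encode_node("A", "social")
pair = decode_node(key)
```

Passing the delimiter explicitly, instead of relying on the default, keeps decoding correct for networks created with a non-default label_delimiter.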

Design Patterns

Facade Pattern

Used in: multi_layer_network, wrappers

Purpose: Provide simplified interface to complex subsystems

from py3plex.core import multinet

# Complex underlying operations hidden behind a simple interface
network = multinet.multi_layer_network()
network.add_edges(edges, input_type='list')  # Handles parsing, encoding, validation
network.basic_stats()  # Aggregates multiple NetworkX calls and keeps caches coherent

Strategy Pattern

Used in: Algorithms, layout computation

Purpose: Interchangeable algorithms following common interface

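# Different community detection strategies
def detect_communities(network, method='louvain'):
    # Each strategy is called with the encoded NetworkX graph
    strategies = {
        'louvain': community_louvain.best_partition,
        'infomap': community_wrapper.infomap_communities,
        'label_prop': label_propagation.propagate
    }
    if method not in strategies:
        # Fail with a readable message instead of a bare KeyError
        raise ValueError(f"Unknown method '{method}'; choose from {sorted(strategies)}")
    return strategies[method](network.core_network)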

Template Method Pattern

Used in: Visualization, layout algorithms

Purpose: Define algorithm skeleton, allow customization in subclasses

class LayoutAlgorithm:
    def compute(self, graph):
        self.initialize(graph)
        self.iterate()
        return self.finalize()

    def initialize(self, graph):
        raise NotImplementedError

    def iterate(self):
        raise NotImplementedError

    def finalize(self):
        raise NotImplementedError
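
As a sketch of how a subclass fills in the skeleton, the block below repeats the base class and adds a CircularLayout subclass. CircularLayout is a hypothetical example, not a py3plex class, and it accepts any iterable of hashable nodes instead of a NetworkX graph so that it runs without dependencies.

```python
import math


class LayoutAlgorithm:
    """Skeleton from the text: compute() fixes the order of the three steps."""

    def compute(self, graph):
        self.initialize(graph)
        self.iterate()
        return self.finalize()

    def initialize(self, graph):
        raise NotImplementedError

    def iterate(self):
        raise NotImplementedError

    def finalize(self):
        raise NotImplementedError


class CircularLayout(LayoutAlgorithm):
    """Hypothetical subclass: place nodes evenly on the unit circle."""

    def initialize(self, graph):
        # Sort the nodes so the layout is deterministic across runs.
        self.nodes = sorted(graph)
        self.positions = {}

    def iterate(self):
        # A circular layout needs a single pass, not repeated refinement.
        count = len(self.nodes)
        for index, node in enumerate(self.nodes):
            angle = 2 * math.pi * index / count
            self.positions[node] = (math.cos(angle), math.sin(angle))

    def finalize(self):
        return self.positions


pos = CircularLayout().compute(["A---social", "B---social", "A---work"])
```

Callers only ever invoke compute(), so a force-directed subclass could replace CircularLayout without changes at the call site.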

Dependency Injection

Used in: Configuration, algorithm parameters

Purpose: Inject dependencies rather than hard-coding

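# Configuration injected rather than hard-coded
from py3plex.config import DEFAULT_COLORS, LAYOUT_PARAMS

def draw_network(network, colors=None, layout_params=None):
    # Fall back to the library defaults only when the caller passes nothing
    colors = DEFAULT_COLORS if colors is None else colors
    layout_params = LAYOUT_PARAMS if layout_params is None else layout_params
    # Use the injected configuration from here on
    ...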

Data Flow

The steps below show how a typical workflow moves through the layers: creation in the core, processing in algorithms, summarization, and presentation in visualization or wrappers. The examples use encoded nodes throughout so layer context is preserved.

Typical Workflow

  1. Input: Load or create network

    network = multinet.multi_layer_network()
    network.load_network("data.graphml", input_type="graphml")
    
  2. Processing: Apply algorithms

    communities = community_louvain.best_partition(network.core_network)
    centrality = calc.multilayer_degree_centrality(network)
    
  3. Analysis: Compute statistics

    density = mls.layer_density(network, 'layer1')
    correlation = mls.inter_layer_degree_correlation(network, 'L1', 'L2')
    
  4. Visualization: Render results

    draw_multilayer_default([network], display=True)
    
  5. Output: Export results

    network.save_network("output.graphml", output_type="graphml")
    

State Management

Immutable Operations: Most algorithms don’t modify the network.

# These don't modify the network
centrality = calc.multilayer_degree_centrality(network)
communities = community_louvain.best_partition(network.core_network)

Mutable Operations: Some operations modify network state and should trigger cache invalidation (e.g., supra-adjacency, cached layouts).

# These modify the network
network.add_edges(new_edges, input_type='list')
network.aggregate_layers(['L1', 'L2'], 'combined')

When mutating, clear or recompute any cached matrices or embeddings that depend on structure before running further analysis. Mutations should be localized (e.g., via helper methods) so cache invalidation stays centralized and deterministic.
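
A minimal sketch of centralized invalidation follows. CachedEdgeStore is a toy stand-in rather than the py3plex implementation: every structural change goes through one method, and that method is the only place where the cache is cleared.

```python
class CachedEdgeStore:
    """Toy stand-in for a network object with one lazily computed cache."""

    def __init__(self):
        self._edges = set()
        self._degree_cache = None  # derived data, rebuilt on demand
        self.recomputations = 0    # counter kept only to make the behaviour visible

    def add_edge(self, source, target):
        # Single mutation entry point, so invalidation happens in one place.
        self._edges.add((source, target))
        self._degree_cache = None

    @property
    def degrees(self):
        if self._degree_cache is None:
            counts = {}
            for source, target in self._edges:
                counts[source] = counts.get(source, 0) + 1
                counts[target] = counts.get(target, 0) + 1
            self._degree_cache = counts
            self.recomputations += 1
        return self._degree_cache


store = CachedEdgeStore()
store.add_edge("A---social", "B---social")
first = dict(store.degrees)
store.degrees                      # served from the cache, no recomputation
store.add_edge("A---social", "C---social")
second = dict(store.degrees)       # cache was cleared by add_edge, so this recomputes
```

The same shape applies to heavier caches such as a supra-adjacency matrix or an embedding: readers go through a property, writers go through helpers that reset it.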

Extension Points

Custom Algorithms

Add new algorithms by following existing patterns. Prefer operating on network.core_network (already encoded) and return results keyed by encoded nodes or layers so downstream utilities work unchanged. Keep any cacheable intermediates (e.g., supra-adjacency) out of global scope to avoid cross-test leakage:

# py3plex/algorithms/my_module/my_algorithm.py
from py3plex.core.multinet import multi_layer_network

def my_centrality(network: multi_layer_network) -> dict:
    """
    Custom centrality measure.

    Parameters
    ----------
    network : multi_layer_network
        Input network

    Returns
    -------
    dict
        Node centrality scores
    """
    G = network.core_network
    centrality = {}

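    for node in G.nodes():
        # compute_score is a placeholder for your own scoring logic;
        # node is an encoded node-layer key, so results keep layer context
        centrality[node] = compute_score(node, G)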

    return centrality
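
Because results stay keyed by encoded nodes, a downstream utility can regroup them by layer without access to the network. The sketch below assumes string keys of the form "{node_id}---{layer_id}" as described under Encoding Scheme; scores_by_layer is a hypothetical helper, not a py3plex function.

```python
from collections import defaultdict


def scores_by_layer(centrality, delimiter="---"):
    """Group {encoded_node: score} into {layer_id: {node_id: score}}."""
    grouped = defaultdict(dict)
    for encoded, score in centrality.items():
        node_id, sep, layer_id = encoded.partition(delimiter)
        if not sep:
            raise ValueError(f"{encoded!r} is not an encoded node-layer key")
        grouped[layer_id][node_id] = score
    return dict(grouped)


scores = {"A---social": 0.5, "B---social": 0.25, "A---work": 1.0}
per_layer = scores_by_layer(scores)
# per_layer == {'social': {'A': 0.5, 'B': 0.25}, 'work': {'A': 1.0}}
```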

Custom Visualizations

Create custom plots using drawing machinery. Keep layout computation separate from drawing so alternative layouts can be swapped in:

from py3plex.visualization import drawing_machinery as dm
import matplotlib.pyplot as plt

def my_custom_plot(network):
    """Custom visualization."""
    fig, ax = plt.subplots(figsize=(10, 8))

    # Compute layout
    pos = dm.compute_layout(network.core_network, 'force')

    # Draw elements
    dm.draw_nodes(ax, network.core_network, pos, node_size=50)
    dm.draw_edges(ax, network.core_network, pos, edge_width=1)
    dm.draw_labels(ax, pos, labels=network.get_node_labels())

    plt.show()

Custom Parsers

Add support for new file formats. Keep parsing side effects minimal and reuse existing helpers for validation where possible. Preserve the node-layer encoding contract so downstream algorithms receive encoded nodes:

# py3plex/core/parsers.py
def parse_my_format(input_file, **kwargs):
    """
    Parse custom file format.

    Parameters
    ----------
    input_file : str
        Path to input file

    Returns
    -------
    multi_layer_network
        Parsed network
    """
    network = multi_layer_network()

    with open(input_file, 'r') as f:
        for line in f:
            # Parse line and add to network
            pass

    return network
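
To make the parsing step concrete, the sketch below reads a hypothetical whitespace-separated format with one edge per line: node1 layer1 node2 layer2, followed by an optional weight. The format and the function name parse_edge_lines are assumptions made for illustration. The function returns plain edge records instead of a network so that it has no side effects and can be tested without py3plex.

```python
import io


def parse_edge_lines(lines, default_weight=1.0):
    """Parse 'node1 layer1 node2 layer2 [weight]' records (hypothetical format)."""
    edges = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) not in (4, 5):
            raise ValueError(
                f"line {line_number}: expected 4 or 5 fields, got {len(fields)}"
            )
        node1, layer1, node2, layer2 = fields[:4]
        weight = float(fields[4]) if len(fields) == 5 else default_weight
        edges.append([node1, layer1, node2, layer2, weight])
    return edges


sample = io.StringIO("# toy data\nA social B social 2.5\nA work C work\n")
edges = parse_edge_lines(sample)
```

If add_edges accepts five-field lists of this shape with input_type='list' (an assumption to verify against the installed version), the records can be handed to the network in one call, which leaves node-layer encoding to the core layer.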

Configuration System

Centralized Configuration

py3plex/config.py provides centralized configuration:

# Default color palettes (8 options including colorblind-safe)
DEFAULT_COLORS = 'Set1'
COLORBLIND_SAFE = 'colorblind'

# Visualization defaults
DEFAULT_NODE_SIZE = 20
DEFAULT_EDGE_WIDTH = 1.0
DEFAULT_ALPHA = 0.7

# Layout parameters
LAYOUT_PARAMS = {
    'force': {'iterations': 500, 'optimal_distance': 1.0},
    'fa2': {'iterations': 1000, 'gravity': 1.0}
}

# Performance settings
SPARSE_THRESHOLD = 1000  # Use sparse matrices above this node count
MEMORY_WARNING_THRESHOLD = 10000  # Warn for large dense matrices

Usage:

from py3plex.config import DEFAULT_COLORS, LAYOUT_PARAMS

colors = DEFAULT_COLORS
iterations = LAYOUT_PARAMS['force']['iterations']

Override configuration at call sites instead of mutating globals to keep tests reproducible. If a global default must change, document the reason and expected downstream effects on visualization, performance, or memory.
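
One way to override at the call site is to merge per-call values into a copy of the defaults. In this sketch LAYOUT_PARAMS is a local stand-in that repeats the values listed above, and resolve_layout_params is a hypothetical helper, not part of py3plex.config.

```python
import copy

# Stand-in for the documented defaults; the real values live in py3plex/config.py.
LAYOUT_PARAMS = {
    "force": {"iterations": 500, "optimal_distance": 1.0},
    "fa2": {"iterations": 1000, "gravity": 1.0},
}


def resolve_layout_params(layout, overrides=None, defaults=LAYOUT_PARAMS):
    """Return per-call parameters without touching the shared defaults."""
    if layout not in defaults:
        raise KeyError(f"Unknown layout {layout!r}; expected one of {sorted(defaults)}")
    params = copy.deepcopy(defaults[layout])  # never hand out the shared dict
    params.update(overrides or {})
    return params


fast = resolve_layout_params("force", {"iterations": 50})
```

The shared dictionary still reports 500 iterations afterwards, so a test that runs later sees the same defaults as a test that runs first.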

Testing Architecture

Test Organization

tests/
├── test_core_functionality.py      # Core data structure tests
├── test_multilayer_*.py            # Multilayer algorithm tests
├── test_random_walks.py            # Random walk tests
├── test_io_*.py                    # I/O and parsing tests
├── test_config_api.py              # Configuration tests
└── test_utils.py                   # Utility function tests

Test Patterns

Unit Tests: Test individual functions in isolation

def test_layer_density():
    network = create_test_network()
    density = mls.layer_density(network, 'layer1')
    assert 0 <= density <= 1

Integration Tests: Test workflows across modules

def test_community_detection_workflow():
    network = load_network("test_data.graphml")
    communities = community_louvain.best_partition(network.core_network)
    assert len(communities) > 0

Property-Based Tests: Test invariants

def test_centrality_normalization():
    network = create_random_network()
    centrality = calc.multilayer_degree_centrality(network)
    # Centrality values should be normalized
    assert all(0 <= v <= 1 for v in centrality.values())

Use small synthetic networks for speed, seed any randomness (e.g., layouts or random walks), and isolate file I/O behind fixtures. Validate invariants on encoded nodes (node-layer pairs) to catch subtle encoding regressions early.
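
The advice above can be sketched as a seeded generator plus an invariant check on encoded keys. Both function names are hypothetical, and the "---" delimiter and the layer names L1 and L2 are assumptions taken from the examples in this document.

```python
import random


def make_encoded_edges(seed, n_nodes=5, layers=("L1", "L2"), n_edges=12, delimiter="---"):
    """Build a small reproducible edge list over encoded node-layer keys."""
    rng = random.Random(seed)  # local generator: global random state is untouched
    nodes = [f"n{i}{delimiter}{layer}" for layer in layers for i in range(n_nodes)]
    return [(rng.choice(nodes), rng.choice(nodes)) for _ in range(n_edges)]


def test_encoded_edges_are_reproducible_and_well_formed():
    first = make_encoded_edges(seed=42)
    second = make_encoded_edges(seed=42)
    assert first == second  # same seed, same network

    for source, target in first:
        for key in (source, target):
            node_id, sep, layer_id = key.partition("---")
            # Every key must decode to a non-empty node ID and a known layer.
            assert sep == "---" and node_id and layer_id in ("L1", "L2")


test_encoded_edges_are_reproducible_and_well_formed()
```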

Performance Considerations

Lazy Evaluation

Expensive operations are computed on-demand:

class multi_layer_network:
    @property
    def supra_adjacency(self):
        if self._supra_adj_cache is None:
            self._supra_adj_cache = self._compute_supra_adjacency()
        return self._supra_adj_cache

Sparse Matrices

Use sparse representations for large networks:

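import numpy as np
import scipy.sparse

def get_supra_adjacency_matrix(self, sparse=True):
    # Caveat: this sketch still builds the dense n x n array first, so it
    # shrinks the returned object but not peak memory
    adj = self._compute_dense_adjacency()
    # Networks above SPARSE_THRESHOLD are returned sparse even if sparse=False
    if sparse or len(self.get_nodes()) > SPARSE_THRESHOLD:
        return scipy.sparse.csr_matrix(adj)
    return np.array(adj)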

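In this section, n is the number of encoded nodes (node-layer pairs) and m the number of edges in the core graph. A dense supra-adjacency matrix stores n² entries regardless of m, and dense factorizations or eigendecompositions cost O(n³) time, whereas sparse storage grows with n + m. With N nodes replicated across L layers, n = N·L, so dense storage grows as N²L²: quadratic in the layer count, even though the block structure typically leaves most entries zero. Prefer sparse matrices once n exceeds SPARSE_THRESHOLD, and treat them as mandatory beyond a few thousand encoded nodes.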
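
A back-of-the-envelope estimate shows the gap between the two representations. The sketch assumes 8-byte float values and 4-byte CSR indices, and the figure of about ten stored entries per row is an illustrative assumption, not a measurement.

```python
def dense_bytes(n, value_bytes=8):
    """Memory for an n x n dense matrix of float64 values."""
    return n * n * value_bytes


def csr_bytes(n, nnz, value_bytes=8, index_bytes=4):
    """Approximate CSR memory: values + column indices + row pointers."""
    return nnz * value_bytes + nnz * index_bytes + (n + 1) * index_bytes


nodes_per_layer, layers = 2_000, 5
n = nodes_per_layer * layers   # 10,000 encoded node-layer pairs
nnz = 10 * n                   # assumed: about 10 stored entries per row

print(dense_bytes(n))     # 800000000 bytes, roughly 800 MB
print(csr_bytes(n, nnz))  # 1240004 bytes, roughly 1.2 MB
```

Doubling the number of layers quadruples the dense figure but only doubles the sparse one, as long as the number of entries per row stays fixed.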

Vectorization

Prefer NumPy vectorized operations:

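# Bad: Python-level loop with one neighbor scan per node
degrees = [sum(1 for _ in G.neighbors(node)) for node in G.nodes()]

# Better: one pass over the DegreeView, stored as a NumPy array for vectorized math.
# Note: degree counts parallel edges (and in- plus out-edges on directed graphs),
# so the two snippets agree only on simple undirected graphs without self-loops.
degrees = np.fromiter((d for _, d in G.degree()), dtype=int, count=G.number_of_nodes())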

Logging Infrastructure

Centralized Logging

py3plex/logging_config.py provides structured logging:

import logging
from py3plex.logging_config import get_logger

logger = get_logger(__name__)

logger.info("Processing network with %d nodes", num_nodes)
logger.warning("Large network detected, using sparse matrices")
logger.error("Invalid layer: %s", layer_name)

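Configure logging once per process, at the application entry point. Library code should not attach handlers on import: each extra handler on the same logger emits every record again, which is the usual cause of duplicated log lines. Note that logging.basicConfig does nothing once the root logger has handlers (unless force=True is passed), so calling it repeatedly does not reconfigure anything.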
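
A guard on the handler list makes one-time configuration safe to call repeatedly. The function below is a sketch built on the standard logging module; configure_logging and the "py3plex" logger name are assumptions and do not describe what py3plex.logging_config.get_logger does internally.

```python
import logging


def configure_logging(level=logging.INFO, logger_name="py3plex"):
    """Attach one stream handler to the named logger; safe to call repeatedly."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    if not logger.handlers:  # guard: a second call must not add a second handler
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


configure_logging()
configure_logging()  # leaves the handler list unchanged
```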

Log Levels

  • DEBUG: Detailed diagnostic information

  • INFO: General informational messages

  • WARNING: Warning messages (e.g., performance concerns)

  • ERROR: Error messages

  • CRITICAL: Critical errors

Error Handling

Custom Exceptions

py3plex/exceptions.py defines domain-specific exceptions:

class NetworkError(Exception):
    """Base exception for network errors."""
    pass

class LayerNotFoundError(NetworkError):
    """Raised when layer doesn't exist."""
    pass

class InvalidFormatError(NetworkError):
    """Raised when file format is invalid."""
    pass

Usage:

from py3plex.exceptions import LayerNotFoundError

def get_layer(self, layer_name):
    if layer_name not in self.layer_name_map:
        raise LayerNotFoundError(f"Layer '{layer_name}' not found")
    return self.layer_name_map[layer_name]
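
On the caller's side, the shared base class means one except clause covers every domain-specific failure. The sketch repeats two of the exception classes so that it runs on its own; get_layer_id is a hypothetical stand-alone version of the lookup above.

```python
class NetworkError(Exception):
    """Base exception for network errors."""


class LayerNotFoundError(NetworkError):
    """Raised when layer doesn't exist."""


def get_layer_id(layer_name_map, layer_name):
    # Stand-alone version of the lookup shown above (hypothetical helper).
    if layer_name not in layer_name_map:
        raise LayerNotFoundError(f"Layer '{layer_name}' not found")
    return layer_name_map[layer_name]


layer_ids = {"social": 0, "work": 1}
try:
    get_layer_id(layer_ids, "family")
except NetworkError as exc:  # the base class also catches LayerNotFoundError
    message = str(exc)
```

Catching NetworkError instead of Exception keeps unrelated bugs, such as a TypeError in calling code, visible.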

Future Architecture

Planned Improvements

  1. Backend Registry: Support for igraph, cugraph backends

  2. Streaming API: Process networks larger than memory

  3. Distributed Computing: Dask/Ray integration for large-scale analysis

  4. Plugin System: Easy addition of third-party algorithms

  5. Type System: Expand type hints coverage and enforce mypy across core and algorithms

See Also