Architecture and Design

This document describes the system architecture, design patterns, and extension points of py3plex. It is a map of how the layers fit together, what invariants each layer keeps, and where to extend the library safely without breaking caller expectations.

System Overview

py3plex is built as a modular, layered architecture with clear separation of concerns. Each layer depends only on the layers below it, which keeps algorithm code focused on logic instead of plumbing. All layers operate on encoded node-layer identifiers supplied by the core layer so that layer membership is always explicit.

┌─────────────────────────────────────────────────────┐
│          High-Level Interfaces (Wrappers)           │
│  node2vec_embedding, benchmark_nodes                │
└─────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│              Algorithms Layer                        │
│  ┌──────────────┬───────────────┬─────────────────┐ │
│  │  Community   │  Statistics   │  Multilayer     │ │
│  │  Detection   │               │  Algorithms     │ │
│  └──────────────┴───────────────┴─────────────────┘ │
└─────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│            Visualization Layer                       │
│  ┌──────────────┬───────────────┬─────────────────┐ │
│  │  Multilayer  │  Drawing      │  Layout         │ │
│  │  Plots       │  Machinery    │  Algorithms     │ │
│  └──────────────┴───────────────┴─────────────────┘ │
└─────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│                Core Layer                            │
│  ┌──────────────┬───────────────┬─────────────────┐ │
│  │  multinet    │  Parsers      │  Converters     │ │
│  │  (MultiLayer │               │                 │ │
│  │   Network)   │               │                 │ │
│  └──────────────┴───────────────┴─────────────────┘ │
└─────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│           NetworkX Foundation                        │
│  MultiDiGraph, MultiGraph, Algorithms                │
└─────────────────────────────────────────────────────┘

Architectural Layers

Each layer exposes a narrow surface to the one above it. When adding new features, decide which layer should own the responsibility first to avoid pushing implementation details upward.

Core Layer

Purpose: Fundamental data structures and I/O operations. All higher layers manipulate networks through this layer.

Key Components:

multinet.py - multi_layer_network class
parsers.py - Input/output for supported formats
converters.py - Format conversion helpers
random_generators.py - Synthetic network generators
HINMINE/ - Heterogeneous network decomposition

Responsibilities:

Network construction and manipulation
File I/O (GraphML, GML, GEXF, edge lists, etc.)
Layer management
Matrix representations (adjacency, supra-adjacency) with clear encoding rules
NetworkX integration so downstream algorithms work on familiar graphs
Encoding invariants that keep node-layer separation consistent across the stack

Design Pattern: Facade Pattern - multi_layer_network provides a unified interface to complex NetworkX operations and centralizes encoding/decoding rules

Algorithms Layer

Purpose: Network analysis algorithms optimized for multilayer networks.

Key Components:

community_detection/ - Community detection algorithms
statistics/ - Network statistics and metrics
multilayer_algorithms/ - Multilayer-specific algorithms
node_ranking/ - Centrality and ranking measures
general/ - General-purpose algorithms (random walks, etc.)

Responsibilities:

Community detection (Louvain, Infomap, Label Propagation)
Statistical analysis (density, overlap, correlation)
Centrality computation (degree, betweenness, PageRank, etc.)
Random walks and embeddings
Network decomposition using explicit layer boundaries or coupling rules

Design Pattern: Strategy Pattern - Different algorithms implement common interfaces

Visualization Layer

Purpose: Network plotting and rendering.

Key Components:

multilayer.py - High-level multilayer plotting
drawing_machinery.py - Core drawing primitives
layout_algorithms.py - Layout computation
colors.py - Color scheme generators
fa2/ - ForceAtlas2 layout

Responsibilities:

Diagonal projection plots (dense but layer-aware)
Force-directed layouts
Matrix visualizations
Color mapping and legends
Interactive plots (via Plotly)

Design Pattern: Template Method Pattern - Layout algorithms follow a common template and separate layout computation from rendering

Wrappers Layer

Purpose: High-level interfaces for common workflows.

Key Components:

node2vec_embedding.py - Node2Vec embedding generation
benchmark_nodes.py - Node classification benchmarking

Responsibilities:

Simplified interfaces for complex workflows
Integration with external tools
Benchmarking and evaluation

Design Pattern: Facade Pattern - Simplify complex multi-step operations

Core Data Structure

The multi_layer_network Class

Central to py3plex, this class manages multilayer network state and enforces encoding conventions.

import networkx as nx

class multi_layer_network:
    def __init__(self, directed=True, label_delimiter="---", coupling_weight=1.0):
        self.core_network = nx.MultiDiGraph() if directed else nx.MultiGraph()
        self.layer_name_map = {}  # Bidirectional mapping
        self.label_delimiter = label_delimiter
        self.coupling_weight = coupling_weight
        self.embedding = None
        self.labels = None

Key Attributes:

core_network - Underlying NetworkX graph (MultiDiGraph if directed=True)
layer_name_map - Maps layer names to integer IDs
label_delimiter - Separator for node-layer encoding (default: "---"). Avoid using the delimiter inside raw node IDs.
coupling_weight - Default weight for inter-layer edges when none is provided
embedding - Cached node embedding matrix
labels - Node classification labels

Encoding Scheme:

Nodes are encoded as "{node_id}{delimiter}{layer_id}" before they are added to core_network.

Example: Node ‘A’ in layer ‘social’ becomes "A---social". This keeps the multilayer structure explicit while allowing standard NetworkX algorithms to run.

Encoded node keys remain opaque to callers. When returning results (e.g., centrality), keep them encoded to preserve layer information unless explicitly decoded for presentation. If raw node IDs are needed, split on label_delimiter consistently so delimiter changes do not silently corrupt parsing.
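
The encoding rule above can be sketched with two small helpers. This is a minimal illustration, not py3plex's own code: encode_node and decode_node are hypothetical names, and the sketch assumes string node IDs and layer names that never contain the delimiter.

```python
from typing import Tuple

DEFAULT_DELIMITER = "---"  # mirrors the documented default label_delimiter


def encode_node(node_id: str, layer: str, delimiter: str = DEFAULT_DELIMITER) -> str:
    """Build the encoded key "{node_id}{delimiter}{layer}" (hypothetical helper)."""
    if delimiter in node_id or delimiter in layer:
        # Reject ambiguous inputs up front instead of corrupting later parsing
        raise ValueError(f"delimiter {delimiter!r} must not appear in node or layer names")
    return f"{node_id}{delimiter}{layer}"


def decode_node(encoded: str, delimiter: str = DEFAULT_DELIMITER) -> Tuple[str, str]:
    """Split an encoded key back into (node_id, layer) (hypothetical helper)."""
    node_id, sep, layer = encoded.partition(delimiter)
    if not sep:
        raise ValueError(f"{encoded!r} does not contain delimiter {delimiter!r}")
    return node_id, layer


key = encode_node("A", "social")
print(key)               # A---social
print(decode_node(key))  # ('A', 'social')
```

Passing the delimiter explicitly to both helpers keeps encoding and decoding in step when a network is created with a non-default label_delimiter.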

Design Patterns

Facade Pattern

Used in: multi_layer_network, wrappers

Purpose: Provide simplified interface to complex subsystems

from py3plex.core import multinet

# Complex underlying operations hidden behind simple interface
network = multinet.multi_layer_network()
network.add_edges(edges, input_type='list')  # Handles parsing, encoding, validation
network.basic_stats()  # Aggregates multiple NetworkX calls and keeps caches coherent

Strategy Pattern

Used in: Algorithms, layout computation

Purpose: Interchangeable algorithms following common interface

# Different community detection strategies
def detect_communities(network, method='louvain'):
    strategies = {
        'louvain': community_louvain.best_partition,
        'infomap': community_wrapper.infomap_communities,
        'label_prop': label_propagation.propagate
    }
    return strategies[method](network.core_network)
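
A self-contained variant of the dispatch above may make the pattern easier to try out. The two strategies here (single_community, singleton_communities) are placeholders invented for the sketch, not py3plex algorithms; the point is the name-to-callable registry and an explicit error for unknown method names instead of a bare KeyError.

```python
from typing import Callable, Dict, Hashable, Iterable

# A strategy maps an iterable of nodes to {node: community_id}
Strategy = Callable[[Iterable[Hashable]], Dict[Hashable, int]]


def single_community(nodes: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Placeholder strategy: put every node in community 0."""
    return {node: 0 for node in nodes}


def singleton_communities(nodes: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Placeholder strategy: give every node its own community."""
    return {node: i for i, node in enumerate(nodes)}


STRATEGIES: Dict[str, Strategy] = {
    "single": single_community,
    "singleton": singleton_communities,
}


def detect_communities(nodes: Iterable[Hashable], method: str = "single") -> Dict[Hashable, int]:
    """Look up the strategy by name and apply it."""
    try:
        strategy = STRATEGIES[method]
    except KeyError:
        known = ", ".join(sorted(STRATEGIES))
        raise ValueError(f"unknown method {method!r}; expected one of: {known}") from None
    return strategy(nodes)


nodes = ["A---social", "B---social", "A---work"]
print(detect_communities(nodes, method="singleton"))
# {'A---social': 0, 'B---social': 1, 'A---work': 2}
```

Adding a new algorithm then means adding one entry to the registry; callers keep using the same function.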

Template Method Pattern

Used in: Visualization, layout algorithms

Purpose: Define algorithm skeleton, allow customization in subclasses

class LayoutAlgorithm:
    def compute(self, graph):
        self.initialize(graph)
        self.iterate()
        return self.finalize()

    def initialize(self, graph):
        raise NotImplementedError

    def iterate(self):
        raise NotImplementedError

    def finalize(self):
        raise NotImplementedError
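
To show how a subclass plugs into that skeleton, here is a minimal sketch with one concrete layout. CircularLayout is a hypothetical example class, not part of py3plex; it only assumes that the graph argument is iterable over its nodes.

```python
import math
from typing import Dict, Hashable, List, Tuple


class LayoutAlgorithm:
    """Skeleton from the text: compute() fixes the order of the steps."""

    def compute(self, graph):
        self.initialize(graph)
        self.iterate()
        return self.finalize()

    def initialize(self, graph):
        raise NotImplementedError

    def iterate(self):
        raise NotImplementedError

    def finalize(self):
        raise NotImplementedError


class CircularLayout(LayoutAlgorithm):
    """Hypothetical subclass: place nodes evenly on a unit circle."""

    def initialize(self, graph):
        # Only the node order matters here; any iterable of nodes works
        self.nodes: List[Hashable] = list(graph)
        self.positions: Dict[Hashable, Tuple[float, float]] = {}

    def iterate(self):
        # A single pass is enough; iterative layouts would loop here
        count = len(self.nodes)
        for index, node in enumerate(self.nodes):
            angle = 2 * math.pi * index / count
            self.positions[node] = (math.cos(angle), math.sin(angle))

    def finalize(self):
        return self.positions


pos = CircularLayout().compute(["A---social", "B---social", "A---work", "B---work"])
print(pos["A---social"])  # (1.0, 0.0)
```

The subclass never overrides compute(), so every layout runs the same three steps in the same order and rendering code can treat all layouts alike.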

Dependency Injection

Used in: Configuration, algorithm parameters

Purpose: Inject dependencies rather than hard-coding

# Configuration injected rather than hard-coded
from py3plex.config import DEFAULT_COLORS, LAYOUT_PARAMS

def draw_network(network, colors=None, layout_params=None):
    colors = colors or DEFAULT_COLORS
    layout_params = layout_params or LAYOUT_PARAMS
    # Use injected configuration

Data Flow

The steps below show how a typical workflow moves through the layers: creation in the core, processing in algorithms, summarization, and presentation in visualization or wrappers. The examples use encoded nodes throughout so layer context is preserved.

Typical Workflow

Input: Load or create network

network = multinet.multi_layer_network()
network.load_network("data.graphml", input_type="graphml")

Processing: Apply algorithms

communities = community_louvain.best_partition(network.core_network)
centrality = calc.multilayer_degree_centrality(network)

Analysis: Compute statistics

density = mls.layer_density(network, 'layer1')
correlation = mls.inter_layer_degree_correlation(network, 'L1', 'L2')

Visualization: Render results

draw_multilayer_default([network], display=True)

Output: Export results

network.save_network("output.graphml", output_type="graphml")

State Management

Read-only Operations: Most algorithms do not modify the network.

# These don't modify the network
centrality = calc.multilayer_degree_centrality(network)
communities = community_louvain.best_partition(network.core_network)

Mutable Operations: Some operations modify network state and should trigger cache invalidation (e.g., supra-adjacency, cached layouts).

# These modify the network
network.add_edges(new_edges, input_type='list')
network.aggregate_layers(['L1', 'L2'], 'combined')

When mutating, clear or recompute any cached matrices or embeddings that depend on structure before running further analysis. Mutations should be localized (e.g., via helper methods) so cache invalidation stays centralized and deterministic.
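
One way to keep invalidation centralized is to route every mutation through a method that resets the caches in a single place. The sketch below uses a hypothetical CachedNetwork class with a plain edge set and one derived cache; it does not reproduce py3plex internals.

```python
class CachedNetwork:
    """Hypothetical stand-in for a network object with one derived cache."""

    def __init__(self):
        self._edges = set()
        self._degree_cache = None  # derived from _edges; None means "stale"

    def _invalidate_caches(self):
        # Single place where structure-dependent caches are dropped
        self._degree_cache = None

    def add_edge(self, source, target):
        self._edges.add((source, target))
        self._invalidate_caches()

    @property
    def degrees(self):
        # Recompute only when a mutation has marked the cache as stale
        if self._degree_cache is None:
            counts = {}
            for source, target in self._edges:
                counts[source] = counts.get(source, 0) + 1
                counts[target] = counts.get(target, 0) + 1
            self._degree_cache = counts
        return self._degree_cache


net = CachedNetwork()
net.add_edge("A---social", "B---social")
print(net.degrees["A---social"])  # 1
net.add_edge("A---social", "A---work")
print(net.degrees["A---social"])  # 2
```

A new cache (for example a cached layout) only needs one extra line in _invalidate_caches, so no mutating method can forget it.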

Extension Points

Custom Algorithms

Add new algorithms by following existing patterns. Prefer operating on network.core_network (already encoded) and return results keyed by encoded nodes or layers so downstream utilities work unchanged. Keep any cacheable intermediates (e.g., supra-adjacency) out of global scope to avoid cross-test leakage:

# py3plex/algorithms/my_module/my_algorithm.py
from py3plex.core.multinet import multi_layer_network

def my_centrality(network: multi_layer_network) -> dict:
    """
    Custom centrality measure.

    Parameters
    ----------
    network : multi_layer_network
        Input network

    Returns
    -------
    dict
        Node centrality scores
    """
    G = network.core_network
    centrality = {}

    for node in G.nodes():
        # Implement custom logic
        centrality[node] = compute_score(node, G)

    return centrality
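
As a concrete instance of the pattern, the sketch below computes a simple per-node score from an edge list and returns it keyed by encoded nodes. cross_layer_ratio and layer_of are hypothetical names, and the delimiter is assumed to match the network's label_delimiter. With a real network, the edge pairs could come from network.core_network.edges().

```python
from typing import Dict, Iterable, Tuple

DELIMITER = "---"  # assumed to match the network's label_delimiter


def layer_of(encoded: str) -> str:
    """Return the layer part of an encoded node key."""
    return encoded.partition(DELIMITER)[2]


def cross_layer_ratio(edges: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Fraction of each encoded node's edge endpoints that lead to another layer."""
    total: Dict[str, int] = {}
    crossing: Dict[str, int] = {}
    for source, target in edges:
        crosses = layer_of(source) != layer_of(target)
        for node in (source, target):
            total[node] = total.get(node, 0) + 1
            crossing[node] = crossing.get(node, 0) + int(crosses)
    # Keys stay encoded so layer membership survives in the result
    return {node: crossing[node] / total[node] for node in total}


edges = [
    ("A---social", "B---social"),  # intra-layer edge
    ("A---social", "A---work"),    # inter-layer coupling
]
print(cross_layer_ratio(edges))
# {'A---social': 0.5, 'B---social': 0.0, 'A---work': 1.0}
```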

Custom Visualizations

Create custom plots using drawing machinery. Keep layout computation separate from drawing so alternative layouts can be swapped in:

from py3plex.visualization import drawing_machinery as dm
import matplotlib.pyplot as plt

def my_custom_plot(network):
    """Custom visualization."""
    fig, ax = plt.subplots(figsize=(10, 8))

    # Compute layout
    pos = dm.compute_layout(network.core_network, 'force')

    # Draw elements
    dm.draw_nodes(ax, network.core_network, pos, node_size=50)
    dm.draw_edges(ax, network.core_network, pos, edge_width=1)
    dm.draw_labels(ax, pos, labels=network.get_node_labels())

    plt.show()

Custom Parsers

Add support for new file formats. Keep parsing side effects minimal and reuse existing helpers for validation where possible. Preserve the node-layer encoding contract so downstream algorithms receive encoded nodes:

# py3plex/core/parsers.py
def parse_my_format(input_file, **kwargs):
    """
    Parse custom file format.

    Parameters
    ----------
    input_file : str
        Path to input file

    Returns
    -------
    multi_layer_network
        Parsed network
    """
    network = multi_layer_network()

    with open(input_file, 'r') as f:
        for line in f:
            # Parse line and add to network
            pass

    return network

Configuration System

Centralized Configuration

py3plex/config.py provides centralized configuration:

# Default color palettes (8 options including colorblind-safe)
DEFAULT_COLORS = 'Set1'
COLORBLIND_SAFE = 'colorblind'

# Visualization defaults
DEFAULT_NODE_SIZE = 20
DEFAULT_EDGE_WIDTH = 1.0
DEFAULT_ALPHA = 0.7

# Layout parameters
LAYOUT_PARAMS = {
    'force': {'iterations': 500, 'optimal_distance': 1.0},
    'fa2': {'iterations': 1000, 'gravity': 1.0}
}

# Performance settings
SPARSE_THRESHOLD = 1000  # Use sparse matrices above this node count
MEMORY_WARNING_THRESHOLD = 10000  # Warn for large dense matrices

Usage:

from py3plex.config import DEFAULT_COLORS, LAYOUT_PARAMS

colors = DEFAULT_COLORS
iterations = LAYOUT_PARAMS['force']['iterations']

Override configuration at call sites instead of mutating globals to keep tests reproducible. If a global default must change, document the reason and expected downstream effects on visualization, performance, or memory.
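
A call-site override can be as small as merging the caller's values onto a copy of the defaults. In this sketch, resolve_layout_params is a hypothetical helper and LAYOUT_PARAMS is a local stand-in for the module-level dictionary shown earlier.

```python
from typing import Any, Dict, Optional

# Stand-in for the module-level defaults shown above
LAYOUT_PARAMS: Dict[str, Dict[str, Any]] = {
    "force": {"iterations": 500, "optimal_distance": 1.0},
}


def resolve_layout_params(name: str, overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge per-call overrides onto a copy of the defaults (hypothetical helper)."""
    params = dict(LAYOUT_PARAMS[name])  # copy, so the global stays untouched
    params.update(overrides or {})
    return params


fast = resolve_layout_params("force", {"iterations": 50})
print(fast["iterations"])                    # 50
print(LAYOUT_PARAMS["force"]["iterations"])  # 500
```

Because the defaults are copied before the merge, one test that asks for a faster layout cannot change the behavior of the next test.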

Testing Architecture

Test Organization

tests/
├── test_core_functionality.py      # Core data structure tests
├── test_multilayer_*.py            # Multilayer algorithm tests
├── test_random_walks.py            # Random walk tests
├── test_io_*.py                    # I/O and parsing tests
├── test_config_api.py              # Configuration tests
└── test_utils.py                   # Utility function tests

Test Patterns

Unit Tests: Test individual functions in isolation

def test_layer_density():
    network = create_test_network()
    density = mls.layer_density(network, 'layer1')
    assert 0 <= density <= 1

Integration Tests: Test workflows across modules

def test_community_detection_workflow():
    network = load_network("test_data.graphml")
    communities = community_louvain.best_partition(network.core_network)
    assert len(communities) > 0

Property-Based Tests: Test invariants

def test_centrality_normalization():
    network = create_random_network()
    centrality = calc.multilayer_degree_centrality(network)
    # Centrality values should be normalized
    assert all(0 <= v <= 1 for v in centrality.values())

Use small synthetic networks for speed, seed any randomness (e.g., layouts or random walks), and isolate file I/O behind fixtures. Validate invariants on encoded nodes (node-layer pairs) to catch subtle encoding regressions early.
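
These guidelines might look as follows for a random-walk test. The random_walk function is a hypothetical stand-in written for this sketch (it is not the py3plex random-walk API); it takes its own seed so the test does not depend on global random state.

```python
import random
from typing import Dict, List


def random_walk(adjacency: Dict[str, List[str]], start: str, length: int, seed: int) -> List[str]:
    """Walk up to `length` steps from `start`, using a private seeded generator."""
    rng = random.Random(seed)  # local RNG: no dependence on global random state
    walk = [start]
    for _ in range(length):
        neighbors = adjacency[walk[-1]]
        if not neighbors:
            break  # dead end: stop early
        walk.append(rng.choice(neighbors))
    return walk


def test_random_walk_is_reproducible_and_stays_encoded():
    # Tiny two-layer network keyed by encoded nodes
    adjacency = {
        "A---social": ["B---social", "A---work"],
        "B---social": ["A---social"],
        "A---work": ["A---social"],
    }
    first = random_walk(adjacency, "A---social", length=10, seed=7)
    second = random_walk(adjacency, "A---social", length=10, seed=7)
    assert first == second                       # same seed, same walk
    assert all("---" in node for node in first)  # encoding invariant holds
    assert len(first) == 11                      # start node plus 10 steps


test_random_walk_is_reproducible_and_stays_encoded()
```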

Performance Considerations

Lazy Evaluation

Expensive operations are computed on-demand:

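class multi_layer_network:
    def __init__(self):
        self._supra_adj_cache = None  # filled on first access

    @property
    def supra_adjacency(self):
        # Compute once, then reuse until the cache is reset to None
        if self._supra_adj_cache is None:
            self._supra_adj_cache = self._compute_supra_adjacency()
        return self._supra_adj_cache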

Sparse Matrices

Use sparse representations for large networks:

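def get_supra_adjacency_matrix(self, sparse=True):
    # Large networks always get a sparse result, even when sparse=False
    if sparse or len(self.get_nodes()) > SPARSE_THRESHOLD:
        # Build the CSR matrix directly so the dense array is never materialized
        return nx.to_scipy_sparse_array(self.core_network, format="csr")
    return nx.to_numpy_array(self.core_network)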

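In this section, n counts encoded nodes (node-layer pairs) and m counts edges in the core graph. A dense supra-adjacency matrix stores n² entries regardless of m, and with N nodes spread across L layers n can reach N·L, so memory grows quadratically in the layer count; dense factorizations and eigendecompositions add O(n³) time on top of that. A sparse matrix stores O(n + m) values. Switch to sparse storage once n passes SPARSE_THRESHOLD (1000 in the configuration above), and at the latest by a few thousand encoded nodes.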
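
A back-of-the-envelope estimate makes the gap concrete. The sketch below assumes float64 values, 32-bit indices, and an average of 10 stored entries per row; all three are illustrative assumptions, and the helper names are hypothetical.

```python
def dense_bytes(n: int, itemsize: int = 8) -> int:
    """Bytes for an n x n dense matrix of float64 values."""
    return n * n * itemsize


def csr_bytes(n: int, nnz: int, itemsize: int = 8, indexsize: int = 4) -> int:
    """Bytes for a CSR matrix: values + column indices + row pointers."""
    return nnz * (itemsize + indexsize) + (n + 1) * indexsize


nodes, layers, avg_degree = 2_000, 5, 10
n = nodes * layers    # encoded node-layer pairs
nnz = n * avg_degree  # stored entries, assuming about 10 per row

print(f"dense: {dense_bytes(n) / 1e6:.0f} MB")     # dense: 800 MB
print(f"csr:   {csr_bytes(n, nnz) / 1e6:.2f} MB")  # csr:   1.24 MB
```

Doubling the number of layers doubles n, which quadruples the dense figure but only doubles the sparse one when the average degree stays the same.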

Vectorization

Prefer NumPy vectorized operations:

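# Slow: one generator and one Python-level count per node
degrees = [sum(1 for _ in G.neighbors(node)) for node in G.nodes()]

# Better: a single degree call, converted to an array for vectorized follow-up work
# (on directed graphs or multigraphs the results can differ: neighbors() follows
# successors only and ignores parallel edges, degree counts every incident edge)
degrees = np.array([d for _, d in G.degree()])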

Logging Infrastructure

Centralized Logging

py3plex/logging_config.py provides structured logging:

import logging
from py3plex.logging_config import get_logger

logger = get_logger(__name__)

logger.info("Processing network with %d nodes", num_nodes)
logger.warning("Large network detected, using sparse matrices")
logger.error("Invalid layer: %s", layer_name)

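Configure logging once per process, at the application entry point. Library code should not call logging.basicConfig or attach handlers at import time: basicConfig silently does nothing once the root logger has handlers, and attaching the same handler more than once duplicates every log line.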
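
A minimal sketch of a configure-once setup, using only the standard logging module. The function bodies and the _py3plex_handler marker attribute are assumptions made for illustration; they do not describe the actual contents of py3plex/logging_config.py.

```python
import logging


def get_logger(name: str) -> logging.Logger:
    """Return a named logger without touching global logging configuration."""
    return logging.getLogger(name)


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Attach one stream handler to the 'py3plex' logger; safe to call repeatedly."""
    package_logger = logging.getLogger("py3plex")
    already_set = any(getattr(h, "_py3plex_handler", False) for h in package_logger.handlers)
    if not already_set:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        handler._py3plex_handler = True  # marker so repeat calls add nothing
        package_logger.addHandler(handler)
    package_logger.setLevel(level)
    return package_logger


configure_logging()
configure_logging()  # second call adds no handler
logger = get_logger("py3plex.example")
logger.info("Processing network with %d nodes", 42)  # emitted once
```

Child loggers such as py3plex.example propagate to the package logger, so one handler on the package name covers every module.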

Log Levels

DEBUG: Detailed diagnostic information
INFO: General informational messages
WARNING: Warning messages (e.g., performance concerns)
ERROR: Error messages
CRITICAL: Critical errors

Error Handling

Custom Exceptions

py3plex/exceptions.py defines domain-specific exceptions:

class NetworkError(Exception):
    """Base exception for network errors."""
    pass

class LayerNotFoundError(NetworkError):
    """Raised when layer doesn't exist."""
    pass

class InvalidFormatError(NetworkError):
    """Raised when file format is invalid."""
    pass

Usage:

from py3plex.exceptions import LayerNotFoundError

def get_layer(self, layer_name):
    if layer_name not in self.layer_name_map:
        raise LayerNotFoundError(f"Layer '{layer_name}' not found")
    return self.layer_name_map[layer_name]
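
Because the exceptions share a base class, callers can handle any network error in one place. The sketch below repeats two of the classes so it runs on its own, and rewrites get_layer as a free function that takes the layer map as an argument; that signature is an adaptation for the example, not the library's.

```python
class NetworkError(Exception):
    """Base exception for network errors."""


class LayerNotFoundError(NetworkError):
    """Raised when layer doesn't exist."""


def get_layer(layer_name_map: dict, layer_name: str) -> int:
    """Look up a layer ID, raising the domain-specific error on a miss."""
    if layer_name not in layer_name_map:
        raise LayerNotFoundError(f"Layer '{layer_name}' not found")
    return layer_name_map[layer_name]


layer_ids = {"social": 0, "work": 1}

try:
    get_layer(layer_ids, "family")
except NetworkError as exc:  # the base class also catches LayerNotFoundError
    print(f"{type(exc).__name__}: {exc}")
# LayerNotFoundError: Layer 'family' not found
```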

Future Architecture

Planned Improvements

Backend Registry: Support for igraph, cugraph backends
Streaming API: Process networks larger than memory
Distributed Computing: Dask/Ray integration for large-scale analysis
Plugin System: Easy addition of third-party algorithms
Type System: Expand type hints coverage and enforce mypy across core and algorithms
