CSV Data Loading Examples

This guide provides practical examples for loading multilayer network data from CSV files, which is one of the most common formats for network data.

Standard CSV Schema

Multilayer Edge List Format

The most common format for multilayer networks is a CSV with these columns:

Required columns: source, target, layer
Optional columns: weight (defaults to 1.0)

Example CSV file (save as network.csv):

source,target,layer,weight
A,B,collaboration,1.0
A,C,dependency,0.8
B,C,collaboration,1.0
A,B,dependency,0.5
C,D,collaboration,0.9
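To make the schema concrete, the sketch below parses the sample rows with Python's standard csv module and applies the 1.0 default when the weight column is absent or empty. The helper name read_multilayer_edges is hypothetical, and this is not how py3plex parses the file internally; it only illustrates the column layout described above.

```python
import csv
import io

SAMPLE = """source,target,layer,weight
A,B,collaboration,1.0
A,C,dependency,0.8
B,C,collaboration,1.0
A,B,dependency,0.5
C,D,collaboration,0.9
"""

REQUIRED = ("source", "target", "layer")


def read_multilayer_edges(handle, default_weight=1.0):
    """Hypothetical helper: return a list of (source, target, layer, weight)."""
    reader = csv.DictReader(handle)
    missing = [col for col in REQUIRED if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    edges = []
    for row in reader:
        # An absent or empty weight falls back to the documented default.
        raw = (row.get("weight") or "").strip()
        weight = float(raw) if raw else default_weight
        edges.append((row["source"], row["target"], row["layer"], weight))
    return edges


edges = read_multilayer_edges(io.StringIO(SAMPLE))
for edge in edges:
    print(edge)
```

Passing a file opened with open("network.csv", newline="", encoding="utf-8") instead of the in-memory sample works the same way.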

Loading Multilayer CSV

from py3plex.core import multinet

# Create network object
network = multinet.multi_layer_network()

# Load from CSV
network.load_network(
    "network.csv",
    input_type="multiedgelist",
    directed=False
)

# Display basic statistics
network.basic_stats()

Expected Output:

Number of nodes: 4
Number of edges: 5
Number of unique nodes (as node-layer tuples): 7
Number of unique node IDs (across all layers): 4
Nodes per layer:
  Layer 'collaboration': 4 nodes
  Layer 'dependency': 3 nodes
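The node-layer counts can be cross-checked by hand. The following sketch recomputes them from the five sample edges in plain Python, assuming each edge connects two nodes within the same layer; it is an independent check, not py3plex's own bookkeeping.

```python
from collections import defaultdict

# Edges from network.csv as (source, target, layer) triples.
edges = [
    ("A", "B", "collaboration"),
    ("A", "C", "dependency"),
    ("B", "C", "collaboration"),
    ("A", "B", "dependency"),
    ("C", "D", "collaboration"),
]

# Collect the set of node IDs that appear in each layer.
nodes_per_layer = defaultdict(set)
for source, target, layer in edges:
    nodes_per_layer[layer].update((source, target))

node_ids = set().union(*nodes_per_layer.values())
node_layer_pairs = {
    (node, layer)
    for layer, members in nodes_per_layer.items()
    for node in members
}

print("Unique node IDs:", len(node_ids))
print("Node-layer tuples:", len(node_layer_pairs))
for layer, members in sorted(nodes_per_layer.items()):
    print(f"  Layer {layer!r}: {len(members)} nodes")
```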

Simple Edge List Format

For single-layer networks, use a simpler format:

Required columns: source, target
Optional columns: weight (defaults to 1.0)

Example CSV file (save as simple_network.csv):

source,target,weight
A,B,1.0
B,C,0.8
C,D,1.5
D,A,0.9

Loading Simple CSV

from py3plex.core import multinet

network = multinet.multi_layer_network()

# Load simple edge list
network.load_network(
    "simple_network.csv",
    input_type="edgelist",
    directed=False
)

network.basic_stats()
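With directed=False, each row describes an undirected edge, so A,B also connects B to A. A plain-Python sketch of that interpretation, using only the standard library and the sample rows above (the adjacency dictionary is an illustration, not py3plex's internal representation):

```python
import csv
import io
from collections import defaultdict

SAMPLE = """source,target,weight
A,B,1.0
B,C,0.8
C,D,1.5
D,A,0.9
"""

# adjacency[u][v] holds the weight of the undirected edge between u and v.
adjacency = defaultdict(dict)
for row in csv.DictReader(io.StringIO(SAMPLE)):
    weight = float(row.get("weight") or 1.0)  # default weight when missing
    adjacency[row["source"]][row["target"]] = weight
    adjacency[row["target"]][row["source"]] = weight  # mirror the edge

for node in sorted(adjacency):
    print(node, dict(sorted(adjacency[node].items())))
```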

Creating CSV from Pandas DataFrame

Building Network from Existing Data

If you have data in a Pandas DataFrame, convert it to CSV format:

import pandas as pd
from py3plex.core import multinet

# Create sample data
data = {
    'source': ['Alice', 'Alice', 'Bob', 'Bob', 'Charlie'],
    'target': ['Bob', 'Charlie', 'Charlie', 'David', 'David'],
    'layer': ['friendship', 'work', 'friendship', 'work', 'friendship'],
    'weight': [1.0, 0.8, 1.0, 0.6, 0.9]
}
df = pd.DataFrame(data)

# Save to CSV
df.to_csv('social_network.csv', index=False)

# Load into py3plex
network = multinet.multi_layer_network()
network.load_network('social_network.csv', input_type="multiedgelist")

print(f"Loaded {network.core_network.number_of_nodes()} nodes")
print(f"Loaded {network.core_network.number_of_edges()} edges")
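If pandas is not available, the same file can be produced with the standard csv module. This is a minimal sketch that writes to an in-memory buffer so the result can be inspected; the variable names are illustrative.

```python
import csv
import io

rows = [
    {"source": "Alice", "target": "Bob", "layer": "friendship", "weight": 1.0},
    {"source": "Alice", "target": "Charlie", "layer": "work", "weight": 0.8},
    {"source": "Bob", "target": "Charlie", "layer": "friendship", "weight": 1.0},
    {"source": "Bob", "target": "David", "layer": "work", "weight": 0.6},
    {"source": "Charlie", "target": "David", "layer": "friendship", "weight": 0.9},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["source", "target", "layer", "weight"],
    lineterminator="\n",
)
writer.writeheader()      # first row carries the column names
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, replace the buffer with open("social_network.csv", "w", newline="", encoding="utf-8").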

Alternative Formats

Space-Delimited Multiedgelist

Py3plex also supports space-delimited files (no header):

Format: source layer1 target layer2 [weight]

Example file (save as network.txt):

A layer1 B layer1 1.0
B layer1 C layer1 1.0
A layer2 B layer2 0.8
B layer2 D layer2 0.6

from py3plex.core import multinet

# Load space-delimited format
network = multinet.multi_layer_network()
network.load_network(
    "network.txt",
    input_type="multiedgelist",
    directed=False
)
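A CSV edge list can be rewritten into this space-delimited layout with a few lines of standard-library Python. The sketch below assumes every edge stays within one layer, so the same layer name is emitted for both endpoints, and it rejects names containing whitespace because they would break the format. The helper name csv_to_space_delimited is hypothetical.

```python
import csv
import io

CSV_TEXT = """source,target,layer,weight
A,B,layer1,1.0
B,C,layer1,1.0
A,B,layer2,0.8
B,D,layer2,0.6
"""


def csv_to_space_delimited(handle):
    """Hypothetical helper: CSV rows -> 'source layer target layer weight'."""
    lines = []
    for row in csv.DictReader(handle):
        # Intra-layer assumption: both endpoints share the row's layer.
        fields = [row["source"], row["layer"], row["target"], row["layer"]]
        for field in fields:
            if any(ch.isspace() for ch in field):
                raise ValueError(f"Whitespace not allowed in {field!r}")
        weight = (row.get("weight") or "1.0").strip()
        lines.append(" ".join(fields + [weight]))
    return lines


lines = csv_to_space_delimited(io.StringIO(CSV_TEXT))
print("\n".join(lines))
```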

Common Issues and Solutions

Issue: “Layer name missing” error

Problem: CSV is missing the ‘layer’ column

Solution 1: Add layer column to CSV

import pandas as pd

# Read CSV without layer
df = pd.read_csv('network.csv')

# Add default layer
df['layer'] = 'layer1'

# Save updated CSV
df.to_csv('network_with_layer.csv', index=False)

Solution 2: Use ‘edgelist’ format for single-layer networks

from py3plex.core import multinet

network = multinet.multi_layer_network()

# If your data has only source/target columns
network.load_network(
    "network.csv",
    input_type="edgelist",  # Use edgelist instead of multiedgelist
    directed=False
)
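The choice between the two solutions can be automated by inspecting the header row. The sketch below returns the input_type string to pass to load_network, based only on whether a layer column is present; guess_input_type is a hypothetical helper, not part of py3plex.

```python
import csv
import io


def guess_input_type(handle):
    """Hypothetical helper: choose an input_type from the CSV header row."""
    header = next(csv.reader(handle), [])
    columns = set(header)
    if not {"source", "target"} <= columns:
        raise ValueError(f"Expected 'source' and 'target' columns, got {header}")
    # A layer column means multilayer data; otherwise a plain edge list.
    return "multiedgelist" if "layer" in columns else "edgelist"


multilayer = guess_input_type(io.StringIO("source,target,layer,weight\nA,B,x,1.0\n"))
simple = guess_input_type(io.StringIO("source,target,weight\nA,B,1.0\n"))
print(multilayer)
print(simple)
```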

Issue: “Could not load network” error

Problem: CSV format doesn’t match expected schema

Solution: Verify CSV structure

import pandas as pd

# Check CSV columns
df = pd.read_csv('network.csv')
print("Columns:", df.columns.tolist())
print("\nFirst few rows:")
print(df.head())

# Verify required columns exist
required = ['source', 'target', 'layer']
missing = [col for col in required if col not in df.columns]

if missing:
    print(f"\nMissing columns: {missing}")
else:
    print("\n[OK] All required columns present")

Issue: Encoding errors with special characters

Problem: CSV contains non-ASCII characters (é, ñ, 中文, etc.)

Solution: Specify encoding when creating CSV

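import pandas as pd

# Sample edge list with non-ASCII node names (illustrative data)
df = pd.DataFrame({'source': ['José'], 'target': ['Zoë'], 'layer': ['friendship']})

# Save with UTF-8 encoding
df.to_csv('network.csv', index=False, encoding='utf-8')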

# If loading fails, try reading with explicit encoding
df = pd.read_csv('network.csv', encoding='utf-8')
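When a file arrives in a legacy encoding, it can be converted to UTF-8 before loading. The sketch below tries UTF-8 first and falls back to Latin-1; the fallback encoding is an assumption about where the file came from, and because Latin-1 accepts any byte sequence, a wrong guess yields garbled names rather than an error. The helper to_utf8 is hypothetical.

```python
def to_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Hypothetical helper: return UTF-8 bytes, decoding with `fallback` if needed."""
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


# Simulate a CSV exported in Latin-1 by a legacy tool.
legacy = "source,target\nJosé,Zoë\n".encode("latin-1")
fixed = to_utf8(legacy)
print(fixed.decode("utf-8"))
```

For files on disk, read the bytes with open(path, "rb"), convert, and write the result back with open(path, "wb").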

Complete Example Workflow

End-to-End Pipeline: CSV → Analysis → Visualization

import pandas as pd
from py3plex.core import multinet
from py3plex.visualization.multilayer import draw_multilayer_default
from py3plex.algorithms.community_detection import community_louvain
import matplotlib.pyplot as plt

# Step 1: Create sample CSV data
data = {
    'source': ['A', 'A', 'B', 'B', 'C', 'C', 'D'],
    'target': ['B', 'C', 'C', 'D', 'D', 'E', 'E'],
    'layer': ['social', 'work', 'social', 'work', 'social', 'work', 'social'],
    'weight': [1.0, 0.8, 1.0, 0.7, 0.9, 0.6, 0.8]
}
df = pd.DataFrame(data)
df.to_csv('example_network.csv', index=False)

# Step 2: Load network
network = multinet.multi_layer_network()
network.load_network('example_network.csv', input_type="multiedgelist")

# Step 3: Basic analysis
print("=== Network Statistics ===")
network.basic_stats()

# Step 4: Community detection
communities = community_louvain.best_partition(network.core_network)
print(f"\nDetected {len(set(communities.values()))} communities")

# Step 5: Centrality analysis
from py3plex.algorithms.statistics import basic_statistics
hubs = basic_statistics.identify_n_hubs(network.core_network, top_n=3)
print("\nTop 3 hub nodes:")
for node, degree in hubs.items():
    print(f"  {node}: degree {degree}")

# Step 6: Visualization
fig, ax = plt.subplots(1, 1, figsize=(10, 8))
draw_multilayer_default(
    network.get_layers(),
    display=False,
    axis=ax,
    labels=True,
    background_shape="circle"
)
plt.title("Multilayer Network from CSV")
plt.savefig('network_visualization.png', dpi=300, bbox_inches='tight')
print("\n[OK] Visualization saved to network_visualization.png")

# Step 7: Export to NetworkX for further analysis
nx_graph = network.to_nx_network()
print(f"\n[OK] Exported to NetworkX: {nx_graph.number_of_nodes()} nodes, "
      f"{nx_graph.number_of_edges()} edges")

Expected Output (illustrative; the community count and the order of tied hubs can differ between runs and py3plex versions):

=== Network Statistics ===
Number of nodes: 5
Number of edges: 7
Number of unique nodes (as node-layer tuples): 10
Number of unique node IDs (across all layers): 5
Nodes per layer:
  Layer 'social': 5 nodes
  Layer 'work': 5 nodes

Detected 2 communities

Top 3 hub nodes:
  ('C', 'social'): degree 2
  ('D', 'social'): degree 2
  ('B', 'social'): degree 2

[OK] Visualization saved to network_visualization.png
[OK] Exported to NetworkX: 10 nodes, 7 edges
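The node-layer count and hub degrees for this example can be recomputed independently of py3plex. The sketch below treats each (node, layer) pair as a separate node and assumes there are no inter-layer coupling edges; it is a sanity check on the input data, not a reimplementation of identify_n_hubs.

```python
from collections import Counter

# Same edges as example_network.csv, as (source, target, layer) triples.
edges = list(zip(
    ["A", "A", "B", "B", "C", "C", "D"],
    ["B", "C", "C", "D", "D", "E", "E"],
    ["social", "work", "social", "work", "social", "work", "social"],
))

# Each edge contributes one to the degree of both of its node-layer endpoints.
degree = Counter()
for source, target, layer in edges:
    degree[(source, layer)] += 1
    degree[(target, layer)] += 1

print("Node-layer tuples:", len(degree))
for node, count in degree.most_common(3):
    print(f"  {node}: degree {count}")
```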

Validation Before Loading

Pre-validate CSV Format

Use the validation module to check CSV format before loading:

from py3plex.core import multinet
from py3plex.validation import validate_network_data, ParsingError

try:
    # Validate before loading
    validate_network_data('network.csv', 'multiedgelist')
    print("[OK] Validation passed")

    # Safe to load
    network = multinet.multi_layer_network()
    network.load_network('network.csv', input_type='multiedgelist')

except ParsingError as e:
    print(f"[X] Validation failed:\n{e}")
    # Fix CSV and try again

This performs checks for:

  • File exists and is readable

  • Required columns are present

  • CSV format is valid

  • Data types are appropriate
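For readers who want to see what such checks involve, here is a standard-library sketch covering the same four points. It is not the implementation behind validate_network_data; check_csv and its messages are hypothetical, and the numeric-weight rule is an assumption about what "appropriate" data types means.

```python
import csv
import os
import tempfile

REQUIRED = ("source", "target", "layer")


def check_csv(path):
    """Hypothetical pre-flight check; returns a list of problem descriptions."""
    if not os.path.isfile(path):
        return [f"File not found: {path}"]
    problems = []
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        missing = [col for col in REQUIRED if col not in columns]
        if missing:
            return [f"Missing columns: {missing}"]
        # Header is line 1, so data rows start at line 2.
        for line_no, row in enumerate(reader, start=2):
            weight = row.get("weight")
            if weight not in (None, ""):
                try:
                    float(weight)
                except ValueError:
                    problems.append(
                        f"Line {line_no}: weight {weight!r} is not numeric"
                    )
    return problems


# Demonstrate on a temporary file with one bad weight.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "network.csv")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write("source,target,layer,weight\nA,B,work,1.0\nB,C,work,heavy\n")
    report = check_csv(path)
    missing_report = check_csv(os.path.join(tmp, "absent.csv"))

print(report)
print(missing_report)
```

An empty list means no problems were found.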

Tips and Best Practices

  1. Always include a header row in CSV files (the first row should contain the column names)

  2. Use UTF-8 encoding for files with special characters

  3. Validate data before loading large networks

  4. Use meaningful layer names (e.g., ‘friendship’, ‘work’ instead of ‘layer1’, ‘layer2’)

  5. Normalize weights if needed (e.g., scale to [0, 1] range)

  6. Handle missing values appropriately:

import pandas as pd

# Load CSV and handle missing values
df = pd.read_csv('network.csv')

# Fill missing weights with default
df['weight'] = df['weight'].fillna(1.0)

# Remove rows with missing source, target, or layer
df = df.dropna(subset=['source', 'target', 'layer'])

# Save cleaned data
df.to_csv('network_cleaned.csv', index=False)
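For tip 5, one common way to normalize weights is min-max scaling. The sketch below is a minimal plain-Python version; min_max_scale is a hypothetical helper, and mapping a constant weight list to 1.0 is a design choice made here, not a py3plex convention.

```python
def min_max_scale(weights):
    """Scale weights linearly to [0, 1]; constant input maps to 1.0."""
    low, high = min(weights), max(weights)
    if high == low:
        # All weights equal: avoid division by zero.
        return [1.0 for _ in weights]
    return [(w - low) / (high - low) for w in weights]


weights = [1.0, 0.8, 1.5, 0.9]
scaled = min_max_scale(weights)
print([round(w, 3) for w in scaled])
```

Note that min-max scaling maps the smallest weight to 0, which some algorithms treat as a missing edge. Dividing every weight by the maximum instead keeps all weights positive.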

Next Steps

After loading CSV data, see the examples/ directory in the GitHub repository for more examples.