Allen's Graph Theory Basics

AI Translated from Chinese

Introduction

What is Graph Theory?

Graph theory is a very interesting branch of discrete mathematics, with “graphs” as the main research object, studying the internal structure of graphs, exploring their combinatorial, algebraic, and topological properties, and finding “good graphs” that satisfy conditions. For example, the famous Seven Bridges problem (Euler circuit) belongs to graph theory.

So, what is a graph?

Simply put, a graph (Graph) is an object composed of multiple points and lines connecting these points. For specific definitions, see Wikipedia - Graph, OI Wiki - Graph Theory Concepts

Graph

In computers, graphs are also a very important data structure. Problems like shortest paths, minimum spanning trees, and network flows often require graphs to implement.

Graph Storage

This article uses $ n $ to refer to the number of nodes in a graph, $ m $ to refer to the number of edges, and $ d^+(u) $ to refer to the out-degree of node $ u $, i.e., the number of edges starting from $ u $.

Adjacency Matrix

Method

From the concept of graphs, we know that graphs consist of node sets and edge sets. We can use a 2D array G to store edges, where G[u][v] = 1 indicates an edge exists from u to v. If the edge has weight w, we can set G[u][v] = w.

#include <iostream>
#include <cstring>
using namespace std;

const int maxn = 1000;
int n, m;
bool vis[maxn];
int G[maxn][maxn];

// Check if edge (u, v) exists
bool find_edge(int u, int v) { return G[u][v]; }

void dfs(int u) {
    if (vis[u]) return;
    vis[u] = true;
    cout << u << ' ';
    for (int v = 1; v <= n; v++)
        if (G[u][v]) dfs(v);
}

int main() {
    // Total n nodes, m edges
    cin >> n >> m;
    for (int i = 1; i <= m; i++) {
        int u, v, w;
        cin >> u >> v >> w;
        G[u][v] = w;
    }
    cout << endl;

    for (int i = 1; i <= n; i++) {
        memset(vis, false, sizeof(vis));
        dfs(i);
        cout << '\n';
    }
    cout << endl;

    cout << find_edge(3, 1) << endl;
    cout << find_edge(4, 5) << endl;
    return 0;
}
/* Input example:
5 6
1 2 1
2 3 2
1 3 3
3 1 10
3 4 3
5 2 7
*/

Example problem: P1359 Renting a Yacht

Use adjacency matrix to store the graph. Note that in this problem, the adjacency matrix is a half-matrix (similar to an upper triangular matrix, but diagonal elements are 0, i.e., a[i][i] = 0). Note that plain DFS will time out, so pruning is needed.

Dynamic programming can also solve it, and the code is very concise (seems like it was originally a DP problem, but I hardcoded it into DFS).

Complexity

Query if an edge exists: $ O(1) $

Traverse all outgoing edges of a node: $ O(n) $

Traverse the entire graph: $ O(n^2) $

Space complexity: $ O(n^2) $

Applications

Adjacency matrix is only suitable when there are no multiple edges (or multiple edges can be ignored).

Its most significant advantage is being able to query whether an edge exists in $ O(1) $.

Since adjacency matrix has very low efficiency on sparse graphs (especially on graphs with many nodes where space is unaffordable), it’s generally only used on dense graphs.

(The above information about complexity and applications is from OI Wiki - Graph Storage)

Adjacency List

Method

We can store edges associated with each node separately, like:

1: -->a -->b -->c
2: -->b -->c
3: -->c -->d -->e
4: -->a -->d

As you can see, an adjacency list is a kind of linked list that stores edges in any order. But in algorithm competitions, we can use arrays or vector to simulate linked lists.

We can also store the endpoints of edges, as shown in the diagram:

1: [2, 3, 5]
2: [5]
3: [1]
4: [3]
5: []

For undirected edges, just store both endpoints once. For example, undirected edge (1, 2):

1: [2]
2: [1]

Here is the C++ implementation:

#include <iostream>
#include <vector>
using namespace std;

struct Node {
    int v;  // Endpoint of edge
    int w;  // Edge weight
};

int n, m;
vector<int> vis;
vector<vector<Node> > G;  // Store all outgoing edge information of node u (endpoint, edge weight, etc.)

// Check if edge (u, v) exists
bool find_edge(int u, int v) {
    for (int i = 0; i < (int)G[u].size(); i++)
        if (G[u][i].v == v) return true;
    return false;
}

void dfs(int u) {
    if (vis[u]) return;
    vis[u] = 1;
    cout << u << ' ';
    for (int i = 0; i < (int)G[u].size(); i++) dfs(G[u][i].v);
}

int main() {
    // Total n nodes, m edges
    cin >> n >> m;

    vis.resize(n+1, false);
    G.resize(n+1);

    for (int i = 1; i <= m; i++) {
        int u, v, w;
        cin >> u >> v >> w;
        G[u].push_back({v, w});
    }

    for (int i = 1; i <= n; i++) {
        dfs(i);
        cout << endl;
    }

    cout << find_edge(1, 2) << endl;

    return 0;
}

Complexity

Query if edge from u to v exists: $ O(d^+(u)) $ (if sorted beforehand, binary search can achieve $ O(\log(d^+(u))) $)

Traverse all outgoing edges of node u: $ O(d^+(u)) $

Traverse the entire graph: $ O(n+m) $

Space complexity: $ O(m) $

Applications

Suitable for storing various graphs, unless there are special requirements (such as needing to quickly query if an edge exists with few nodes, adjacency matrix can be used).

Especially suitable when needing to sort all outgoing edges of a node.

Chain Forward Star

Method

An upgraded version of adjacency list, which is essentially a statically built adjacency list.

Its characteristic is numbering edges, with the first edge numbered 0.

The key is understanding these variables:

Edge array: e[i] stores the edge weight of the i-th edge
ne, representing the number of the previous edge with the same starting point
Head node array: h[i] represents the number of the last edge starting from i

The h array is generally initialized to -1, and when traversing the graph, ne = -1 is used as the termination condition.

#include <iostream>
#include <cstring>
#include <algorithm>
using namespace std;

const int N = 1e5 + 10;
int n, m, idx;
int h[N], ne[N], to[N], e[N], vis[N];


// h[u] initial value is -1
void add_edge(int u, int v, int w) {
    to[idx] = v;     // Endpoint
    e[idx] = w;      // Edge weight
    ne[idx] = h[u];  // Previous edge number starting from u
    h[u] = idx++;    // Update the previous edge number starting from u
}

void dfs(int u) {
    cout << u << ' ';
    vis[u] = true;
    for (int i = h[u]; ~i; i = ne[i]) {  // ~i is equivalent to i != -1
        if (!vis[to[i]]) dfs(to[i]);
    }
}

void print() {   // Traverse endpoints of outgoing edges for each node
    for (int i = 1; i <= n; i++) {
        cout << i << ": ";
        for (int j = h[i]; ~j; j = ne[j])
            cout << to[j] << ' ';
        cout << "\n";
    }
}

int main() {
    memset(h, -1, sizeof(h));  // Initialize; ne[i] = -1 means reaching endpoint

    // Store graph
    cin >> n >> m;
    for (int i = 1; i <= m; i++) {
        int u, v, w;
        cin >> u >> v >> w;
        add_edge(u, v, w);
        // add_edge(v, u, w);  // Undirected edge

        // Output storage process
        cout<<"to["<<idx-1<<"] = "<<to[idx-1]<<"; "
            <<"ne["<<idx-1<<"] = "<<ne[idx-1]<<"; "
            <<"h["<<u<<"] = "<<h[u]<<";\n";
    }

    print();

    for (int i = 1; i <= n; i++) {
        memset(vis, 0, sizeof(vis));
        dfs(i);
    }

    // Output storage results
    cout << "\n";
    for (int i = 0; i <= n; i++) {
        cout << "h[" << i << "] = " << h[i] << ", ";
        cout << "ne[" << i << "] = " << ne[i] << ", ";
        cout << "to[" << i << "] = " << to[i] << ", ";
        cout << "e[" << i << "] = " << e[i] << ",\n";
    }
    return 0;
}
/* Input example:
4 3
1 2 5
2 4 10
4 3 15
*/

Complexity

Query if edge from u to v exists: $ O(d^+(u)) $

Traverse all outgoing edges of node u: $ O(d^+(u)) $

Traverse the entire graph: $ O(n+m) $

Space complexity: $ O(m) $

Applications

Suitable for storing various graphs, but cannot quickly query if an edge exists, nor conveniently sort outgoing edges of a node.

The advantage is that edges are numbered, which can sometimes be very useful. Also, if idx’s initial value is odd, when storing bidirectional edges, i ^ 1 is the reverse edge of i (often used in network flow).

Example Problem

P3916 Graph Traversal

The trick for this problem is to store the graph in reverse, to find the largest numbered node each point can reach in reverse, i.e.:

cin >> u >> v >> w;
add_edge(v, u);

Reverse DFS to get the result:

for (int i = n; i > 0; i--)
    if (!vis[i]) dfs(i, i);

Finally output vis[].

Topological Sort

In graph theory, a sequence of vertices of a directed acyclic graph is called a topological sort of the graph if and only if it satisfies the following conditions:

1. The sequence contains each vertex exactly once;
1. If A comes before B in the sequence, there is no path from B to A in the graph.

Topological sorting is obtaining a total order from a partial order. Simply put, it’s traversing all nodes of the graph in a specific order: before traversing a node, all nodes pointing to it must have been traversed. As you can see, only directed acyclic graphs (DAG) have topological order.

We can devise such an algorithm:

L = []  # Topological order
while there are nodes with in-degree 1:
    randomly select a node u with in-degree 1
    L.append(u)
    for v in G[u]:
        v's degree -= 1
    delete node u from the graph
if len(L) == initial graph's total nodes:
    return True  # Topological order exists
else:
    return False

For example, the diagram below:

# Its topological order can be:
1 2 3 4 5
# Or:
1 3 2 4 5

Can be implemented with BFS or DFS. Since only one traversal of nodes is needed, the complexity is quite reasonable, at $O(V+E)$.

BFS

int n, m, _in[N];  // Nodes, edges, in-degree
bool vis[N];
vector<vector<int>> G;
vector<int> topo;


bool toposort() {
    queue<int> q;
    for (int i = 1; i <= n; i++) {
        if (!_in[i]) q.push(i);
    }
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        topo.emplace_back(u);
        for (auto &v : G[u]) {
            _in[v]--;
            if (!_in[v]) q.push(v);
        }
    }
    return (int)topo.size() == n;
}

DFS

void dfs(int i) {
    if (!_in[i] && !vis[i] && !tmp[i]) {
        topo.emplace_back(i);
        vis[i] = true;
        tmp[i] = true;  // Temporary mark. If traversing to already marked node, not a DAG
        for (auto &j : G[i]) {
            _in[j]--;
            dfs(j);
        }
        tmp[i] = false;
    }
}

bool toposort() {
    for (int i = 1; i <= n; i++) dfs(i);
    return ((int)topo.size() == n);
}

Example Problem

Codeforces Round 748 (Div. 3) E. Gardener and Tree

Problem: Given an undirected tree, each operation deletes all leaf nodes. How many nodes remain after k operations?

Analogous to topological sort, each operation selects nodes with degree 1, traverses their outgoing edges and reduces the degree of the endpoint by 1, then deletes these nodes. Note: The given tree is undirected, so DFS cannot be used directly (for example, if the tree is a single chain, DFS will traverse from one end to the other directly). BFS is better to write, and the code is similar. The only difference is to check if a neighbor’s degree is 1 before deletion (meaning the neighbor is already in the queue).

#include <bits/stdc++.h>

#define IOS ios::sync_with_stdio(false);cin.tie(0);cout.tie(0);
#define _all(x) (x).begin(), (x).end()
#define _abs(x) ((x >= 0) ? (x) : (-x))
#define PII pair<int, int>
#define PLL pair<LL, LL>
#define PDD pair<double, double>
#define fi first
#define se second

typedef long long LL;
typedef long double LD;
typedef unsigned long long ULL;

const int INF = 0x3f3f3f3f;
const double PI = acos(-1.0);

using namespace std;

const int N = 4e5 + 10;
int d[N], vis[N], n, k;  // d[] is node degree, vis[] is number of operations
vector<vector<int>> G;


void init() {
    memset(d, 0, sizeof(d));
    memset(vis, 0, sizeof(vis));
    cin >> n >> k;
    G.clear();
    G.resize(n+1);
    for (int i = 0; i < n-1; i++) {
        int u, v;
        cin >> u >> v;
        G[u].emplace_back(v);
        G[v].emplace_back(u);
        d[u]++, d[v]++;
    }
}

// BFS-based topological sort
void solve() {
    queue<int> q;
    for (int i = 1; i <= n; i++) {
        if (d[i] == 1) {
            q.push(i);
            vis[i] = 1;
        }
    }
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (auto &v : G[u]) {
            // Neighbor degree is 1, meaning it's already in the queue
            if (d[v] != 1) {
                d[v]--;
                // After deleting u, v's degree is 1, v enters queue
                if (d[v] == 1) {
                    vis[v] = vis[u] + 1;
                    q.push(v);
                }
            }
        }
    }
    int ans = 0;
    for (int i = 1; i <= n; i++) {
        if (vis[i] > k) ans++;
    }
    cout << ans << "\n";
}

int main() {
    IOS
    int test_case;
    cin >> test_case;
    while (test_case--) {
        init();
        solve();
    }
    return 0;
}

Euler Path, Euler Circuit

(To be filled…)

Minimum Spanning Tree

(To be filled…)

Appendix

References

[1]Wikipedia - Graph Theory

[2]OI Wiki - Graph Theory Concepts

[3]OI Wiki - Graph Storage

[4]Shuichi Miyazaki “Programmer’s Mathematics 4: Graph Theory Introduction”

[5]Chain Forward Star – The Most Easy-to-Understand Explanation

Introduction#

What is Graph Theory?#

So, what is a graph?#

Graph Storage#

Adjacency Matrix#

Method#

Complexity#

Applications#

Adjacency List#

Method#

Complexity#

Applications#

Chain Forward Star#

Method#

Complexity#

Applications#

Example Problem#

Graph Theory Related Problems#

Topological Sort#

BFS#

DFS#

Example Problem#

Euler Path, Euler Circuit#

Minimum Spanning Tree#

Appendix#

References#

Introduction

What is Graph Theory?

So, what is a graph?

Graph Storage

Adjacency Matrix

Method

Complexity

Applications

Adjacency List

Method

Complexity

Applications

Chain Forward Star

Method

Complexity

Applications

Example Problem

Graph Theory Related Problems

Topological Sort

BFS

DFS

Example Problem

Euler Path, Euler Circuit

Minimum Spanning Tree

Appendix

References