{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "import copy\n", "import re\n", "from dataclasses import dataclass\n", "from functools import lru_cache\n", "\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import numpy.random as rnd\n", "\n", "from alns import ALNS\n", "from alns.accept import HillClimbing\n", "from alns.select import SegmentedRouletteWheel\n", "from alns.stop import MaxIterations" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "SEED = 42" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "# The resource-constrained project scheduling problem\n", "\n", "The following explanation is largely based on [this paper](https://pms2020.sciencesconf.org/300164/document).\n", "\n", "The goal of the RCPSP is to schedule a set of project activities $V = \\{ 0, 1, 2, \\ldots, n \\}$, such that the makespan of the project is minimised.\n", "Each activity $i \\in V$ has a duration $d_i \\in \\mathbb{N}$.\n", "Precedence constraints impose that an activity $i \\in V$ can only start after all its predecessor activities have been completed.\n", "The precedence constraints are given by a set of edges $E \\subset V \\times V$, where $(i, j) \\in E$ means that $i$ must be completed before $j$ can commence.\n", "Resource constraints, on the other hand, impose that an activity can only be scheduled if sufficient resources are available.\n", "There are $K = \\{ 1, 2, \\ldots, m \\}$ renewable resources available, with $R_k$ indicating the availability of resource $k$.\n", "Each activity $i \\in V$ requires $r_{ik}$ units of resource $k$.\n", "A solution to the RCPSP is a schedule of activities $S = \\{ S_0, S_1, \\ldots, S_n \\}$, where $S_i$ is the starting time of activity $i$.\n", "The project starts at time $S_0 = 0$, and completes at $S_n$, where activities $0$ and $n$ are dummy activities that represent the start and completion of the project, respectively.\n", "\n", "In this notebook, we solve an instance of the RCPSP using ALNS.\n", "In particular, we solve instance `j9041_6` of the [PSPLib](http://www.om-db.wi.tum.de/psplib/library.html) benchmark suite.\n", "This instance consists of 90 jobs, and four resources.\n", "The optimal makespan of this instance is known to be between 123 and 135." ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Data instance\n", "\n", "We use the [dataclass](https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass) decorator to simplify our class representation a little." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "@dataclass(frozen=True)\n", "class ProblemData:\n", " num_jobs: int\n", " num_resources: int\n", "\n", " duration: np.ndarray # job durations\n", " successors: list[list[int]] # job successors\n", " predecessors: list[list[int]] # job predecessors\n", " needs: np.ndarray # job resource needs\n", " resources: np.ndarray # resource capacities\n", "\n", " def __hash__(self) -> int:\n", " return id(self)\n", "\n", " @property\n", " def first_job(self) -> int:\n", " return 0\n", "\n", " @property\n", " def last_job(self) -> int:\n", " return self.num_jobs - 1\n", "\n", " @property\n", " @lru_cache(1)\n", " def all_predecessors(self) -> list[list[int]]:\n", " pred = [set() for _ in range(self.num_jobs)]\n", "\n", " for job, pre in enumerate(self.predecessors):\n", " for p in pre:\n", " pred[job] |= pred[p] | {p}\n", "\n", " return [sorted(p) for p in pred]\n", "\n", " @property\n", " @lru_cache(1)\n", " def all_successors(self) -> list[list[int]]:\n", " succ = [set() for _ in range(self.num_jobs)]\n", "\n", " for job, suc in zip(\n", " reversed(range(self.num_jobs)), reversed(self.successors)\n", " ):\n", " for s in suc:\n", " succ[job] |= succ[s] | {s}\n", "\n", " return [sorted(s) for s in succ]\n", "\n", " @classmethod\n", " def read_instance(cls, path: str) -> \"ProblemData\":\n", " \"\"\"\n", " Reads an instance of the RCPSP from a file.\n", " Assumes the data is in the PSPLib format.\n", "\n", " Loosely based on:\n", " https://github.com/baobabsoluciones/hackathonbaobab2020.\n", " \"\"\"\n", " with open(path) as fh:\n", " lines = fh.readlines()\n", "\n", " prec_idx = lines.index(\"PRECEDENCE RELATIONS:\\n\")\n", " req_idx = lines.index(\"REQUESTS/DURATIONS:\\n\")\n", " avail_idx = lines.index(\"RESOURCEAVAILABILITIES:\\n\")\n", "\n", " successors = []\n", "\n", " for line in lines[prec_idx + 2 : req_idx - 1]:\n", " _, _, modes, num_succ, *jobs, _ = re.split(\"\\s+\", line)\n", " successors.append(list(map(lambda x: int(x) - 1, jobs)))\n", "\n", " predecessors = [[] for _ in range(len(successors))]\n", "\n", " for job in range(len(successors)):\n", " for succ in successors[job]:\n", " predecessors[succ].append(job)\n", "\n", " needs = []\n", " durations = []\n", "\n", " for line in lines[req_idx + 3 : avail_idx - 1]:\n", " _, _, _, duration, *consumption, _ = re.split(\"\\s+\", line)\n", "\n", " needs.append(list(map(int, consumption)))\n", " durations.append(int(duration))\n", "\n", " _, *avail, _ = re.split(\"\\s+\", lines[avail_idx + 2])\n", " resources = list(map(int, avail))\n", "\n", " return ProblemData(\n", " len(durations),\n", " len(resources),\n", " np.array(durations),\n", " successors,\n", " predecessors,\n", " np.array(needs),\n", " np.array(resources),\n", " )" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "instance = ProblemData.read_instance(\"data/j9041_6.sm\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "DELTA = 0.75 # resource utilisation threshold\n", "ITERS = 5_000\n", "\n", "START_TRESH = 5 # start threshold for RRT\n", "STEP = 20 / ITERS # step size for RRT\n", "\n", "THETA = 0.9 # weight decay parameter\n", "WEIGHTS = [25, 5, 1, 0] # weight scheme weights\n", "SEG_LENGTH = 100 # weight scheme segment length\n", "\n", "Q = int(0.2 * instance.num_jobs)\n", "\n", "LB = 123 # lower bound on optimal makespan\n", "UB = 135 # upper bound on optimal makespan" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Solution state" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "@lru_cache(32)\n", "def schedule(jobs: tuple[int]) -> tuple[np.ndarray, np.ndarray]:\n", " \"\"\"\n", " Computes a serial schedule of the given list of jobs. See Figure 1\n", " in Fleszar and Hindi (2004) for the algorithm. Returns the schedule,\n", " and the resources used.\n", "\n", " Fleszar, K. and K.S. Hindi. 2004. Solving the resource-constrained\n", " project scheduling problem by a variable neighbourhood search.\n", " _European Journal of Operational Research_. 155 (2): 402 -- 413.\n", " \"\"\"\n", " used = np.zeros((instance.duration.sum(), instance.num_resources))\n", " sched = np.zeros(instance.num_jobs, dtype=int)\n", "\n", " for job in jobs:\n", " pred = instance.predecessors[job]\n", " t = max(sched[pred] + instance.duration[pred], default=0)\n", "\n", " needs = instance.needs[job]\n", " duration = instance.duration[job]\n", "\n", " # This efficiently determines the first feasible insertion point\n", " # after t. We compute whether resources are available, and add the\n", " # offset s of the first time sufficient are available for the\n", " # duration of the job.\n", " res_ok = np.all(used[t:] + needs <= instance.resources, axis=1)\n", " for s in np.flatnonzero(res_ok):\n", " if np.all(res_ok[s : s + duration]):\n", " sched[job] = t + s\n", " used[t + s : t + s + duration] += needs\n", " break\n", "\n", " return sched, used[: sched[instance.last_job]]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "class RcpspState:\n", " \"\"\"\n", " Solution state for the resource-constrained project scheduling problem.\n", "\n", " We use a list representation of the scheduled jobs, where job i is\n", " scheduled before j if i precedes j (i.e., the jobs are sorted\n", " topologically).\n", " \"\"\"\n", "\n", " def __init__(self, jobs: list[int]):\n", " self.jobs = jobs\n", "\n", " def __copy__(self):\n", " return RcpspState(self.jobs.copy())\n", "\n", " @property\n", " def indices(self) -> np.ndarray:\n", " \"\"\"\n", " Returns a mapping from job -> idx in the schedule. Unscheduled\n", " jobs have index ``len(self.jobs)``.\n", " \"\"\"\n", " indices = np.full(instance.num_jobs, len(self.jobs), dtype=int)\n", "\n", " for idx, job in enumerate(self.jobs):\n", " indices[job] = idx\n", "\n", " return indices\n", "\n", " @property\n", " def unscheduled(self) -> list[int]:\n", " \"\"\"\n", " All jobs that are not currently scheduled, in topological order.\n", " \"\"\"\n", " return sorted(set(range(instance.num_jobs)) - set(self.jobs))\n", "\n", " def objective(self) -> int:\n", " s, _ = schedule(tuple(self.jobs))\n", " return s[instance.last_job]\n", "\n", " def plot(self):\n", " \"\"\"\n", " Plots the current schedule. The plot includes a Gantt chart, the\n", " lower and upper bounds on an optimal makespan, and bar charts for\n", " resource use.\n", " \"\"\"\n", " fig = plt.figure(figsize=(12, 6 + instance.num_resources))\n", "\n", " hr = [1] * (instance.num_resources + 1)\n", " hr[0] = 6\n", "\n", " gs = plt.GridSpec(\n", " nrows=1 + instance.num_resources, ncols=1, height_ratios=hr\n", " )\n", "\n", " s, u = schedule(tuple(self.jobs))\n", " idcs = np.argsort(s)\n", "\n", " gantt = fig.add_subplot(gs[0, 0])\n", " gantt.axvspan(LB, UB, alpha=0.25, color=\"grey\")\n", " gantt.barh(\n", " np.arange(instance.num_jobs), instance.duration[idcs], left=s[idcs]\n", " )\n", "\n", " gantt.set_xlim(0, self.objective())\n", " gantt.set_ylim(0, instance.last_job)\n", " gantt.invert_yaxis()\n", "\n", " gantt.set_title(\"Gantt chart\")\n", "\n", " for res in range(instance.num_resources):\n", " res_ax = fig.add_subplot(gs[res + 1, 0], sharex=gantt)\n", " res_ax.bar(np.arange(u.shape[0]), u[:, res], align=\"edge\")\n", "\n", " res_ax.set_ylim(0, instance.resources[res])\n", " res_ax.set_ylabel(f\"R{res + 1}\")\n", "\n", " if res == instance.num_resources - 1:\n", " res_ax.set_xlabel(\"Time\")\n", "\n", " plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Destroy operators" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "def most_mobile_removal(state, rng):\n", " \"\"\"\n", " This operator unschedules those jobs that are most mobile, that is, those\n", " that can be 'moved' most within the schedule, as determined by their\n", " scheduled predecessors and successors. Based on Muller (2009).\n", "\n", " Muller, LF. 2009. An Adaptive Large Neighborhood Search Algorithm\n", " for the Resource-constrained Project Scheduling Problem. In _MIC\n", " 2009: The VIII Metaheuristics International Conference_.\n", " \"\"\"\n", " state = copy.copy(state)\n", " indices = state.indices\n", "\n", " # Left and right limits. These are the indices of the job's last\n", " # predecessor and first successor in the schedule. That indicates\n", " # the extent of the job's movement.\n", " ll = np.array(\n", " [\n", " np.max(indices[instance.predecessors[job]], initial=0)\n", " for job in range(instance.num_jobs)\n", " ]\n", " )\n", "\n", " rl = np.array(\n", " [\n", " np.min(\n", " indices[instance.successors[job]], initial=instance.num_jobs\n", " )\n", " for job in range(instance.num_jobs)\n", " ]\n", " )\n", "\n", " mobility = np.maximum(rl - ll, 0)\n", " mobility[[instance.first_job, instance.last_job]] = 0\n", " p = mobility / mobility.sum()\n", "\n", " for job in rng.choice(instance.num_jobs, Q, replace=False, p=p):\n", " state.jobs.remove(job)\n", "\n", " return state" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "def non_peak_removal(state: RcpspState, rng):\n", " \"\"\"\n", " Removes up to Q jobs that are scheduled in periods with limited resource\n", " use. Those jobs might be grouped together better when they are rescheduled.\n", " Based on Muller (2009).\n", "\n", " Muller, LF. 2009. An Adaptive Large Neighborhood Search Algorithm\n", " for the Resource-constrained Project Scheduling Problem. In _MIC\n", " 2009: The VIII Metaheuristics International Conference_.\n", " \"\"\"\n", " state = copy.copy(state)\n", "\n", " start, used = schedule(tuple(state.jobs))\n", " end = start + instance.duration\n", "\n", " # Computes a measure of resource utilisation in each period, and\n", " # determines periods of high resource use.\n", " used = used / instance.resources\n", " high_util = np.argwhere(np.mean(used, axis=1) > DELTA)\n", "\n", " # These are all non-peak jobs, that is, jobs that are completely\n", " # scheduled in periods of limited resource use.\n", " jobs = [\n", " job\n", " for job in range(instance.num_jobs)\n", " if np.all((high_util <= start[job]) | (high_util >= end[job]))\n", " ]\n", "\n", " for job in rng.choice(jobs, min(len(jobs), Q), replace=False):\n", " state.jobs.remove(job)\n", "\n", " return state" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "def segment_removal(state, rng):\n", " \"\"\"\n", " Removes a whole segment of jobs from the current solution.\n", " \"\"\"\n", " state = copy.copy(state)\n", " offset = rng.integers(1, instance.num_jobs - Q)\n", "\n", " del state.jobs[offset : offset + Q]\n", "\n", " return state" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Repair operators\n", "\n", "We only define a single repair operator: `random_insert`.\n", "This operator takes the unscheduled jobs, and randomly inserts them in feasible locations in the schedule.\n", "Together with a justification technique (shown below) that further improves the resulting schedule, this results in a new, hopefully improved solution." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "def justify(state):\n", " \"\"\"\n", " Helper method that double-justifies the schedule. Based on the\n", " justification technique of Valls et al. (2005).\n", "\n", " Valls, V. Ballestín, F. and S. Quintanilla. 2005. Jusitfication and\n", " RCPSP: A technique that pays. _ European Journal of Operational\n", " Research_. 165 (2): 375 -- 386.\n", " \"\"\"\n", " # We first right-justify the current schedule. That more or less means\n", " # that we schedule jobs from the right, such that no job can be started\n", " # later without increases the makespan.\n", " makespan = state.objective()\n", " used = np.zeros((makespan, instance.num_resources))\n", " sched = np.zeros(instance.num_jobs, dtype=int)\n", "\n", " for job in reversed(state.jobs):\n", " needs = instance.needs[job]\n", " duration = instance.duration[job]\n", "\n", " t = min(sched[instance.successors[job]], default=makespan)\n", " res_ok = np.all(used[:t] + needs <= instance.resources, axis=1)\n", "\n", " for s in reversed(np.flatnonzero(res_ok[: t - duration + 1])):\n", " if np.all(res_ok[s : s + duration]):\n", " sched[job] = s\n", " used[s : s + duration, :] += needs\n", " break\n", "\n", " # Right-justify the schedule, and then left-justify it again. This\n", " # results in a double-justified schedule that is hopefully better\n", " # than what we got initially.\n", " right_justified = np.argsort(sched)\n", " sched, _ = schedule(tuple(right_justified))\n", " left_justified = np.argsort(sched).tolist()\n", "\n", " return RcpspState(left_justified)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "def random_insert(state, rng):\n", " \"\"\"\n", " Randomly inserts jobs into the schedule. The resulting solution state\n", " is guaranteed to be feasible.\n", " \"\"\"\n", " indices = state.indices\n", " preds = instance.all_predecessors\n", " succs = instance.all_successors\n", "\n", " for job in state.unscheduled:\n", " # Left and right insertion limits. The job must be inserted\n", " # between these indices - the interval is [ll, rl).\n", " ll = np.max(indices[preds[job]], initial=-1) + 1\n", " rl = np.min(indices[succs[job]], initial=len(state.jobs))\n", "\n", " idx = rng.integers(ll, rl) if ll < rl else ll\n", " state.jobs.insert(idx, job)\n", "\n", " indices[indices >= idx] += 1\n", " indices[job] = idx\n", "\n", " return justify(state)" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Initial solution\n", "\n", "Our solution representation is a list of jobs.\n", "We can thus easily generate an initial solution as the list of all jobs, in the (topological) order we got them." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "init_sol = RcpspState(list(range(instance.num_jobs)))\n", "print(f\"Initial solution has objective {init_sol.objective()}.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "init_sol.plot()" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Heuristic solution\n", "\n", "With our initial solution in hand, we can now use ALNS to further improve it.\n", "We use a segmented roulette wheel operator selection strategy, and a simple hill-climbing acceptance criterion." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "rng = rnd.default_rng(SEED)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "alns = ALNS(rng)\n", "\n", "alns.add_destroy_operator(most_mobile_removal)\n", "alns.add_destroy_operator(non_peak_removal)\n", "alns.add_destroy_operator(segment_removal)\n", "\n", "alns.add_repair_operator(random_insert)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "select = SegmentedRouletteWheel(WEIGHTS, THETA, SEG_LENGTH, 3, 1)\n", "accept = HillClimbing()\n", "stop = MaxIterations(ITERS)\n", "\n", "res = alns.iterate(init_sol, select, accept, stop)\n", "sol = res.best_state\n", "\n", "print(f\"Heuristic solution has objective {sol.objective()}.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "_, ax = plt.subplots(figsize=(12, 6))\n", "res.plot_objectives(ax=ax)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "sol.plot()" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "## Conclusion\n", "\n", "In this notebook we solved a challenging instance of the resource-constrained project scheduling problem, using several operators and enhancement techniques from the literature.\n", "The resulting heuristic solution is competitive with other heuristics for this problem: the best known solution achieves a makespan of 135, and we find 142, just 5% higher." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.19" } }, "nbformat": 4, "nbformat_minor": 4 }