Operator selection schemes

The alns.select module contains the various operator selection schemes the alns package ships with. These are used during the ALNS search to select a destroy and repair operator pair in each iteration.

All operator selection schemes inherit from OperatorSelectionScheme.

class alns.select.OperatorSelectionScheme.OperatorSelectionScheme(num_destroy: int, num_repair: int, op_coupling: Optional[ndarray] = None)

Base class describing an operator selection scheme.

Parameters

num_destroy: Number of destroy operators.
num_repair: Number of repair operators.
op_coupling: Optional 2D boolean matrix that indicates coupling between destroy and repair operators. Entry (i, j) is True if destroy operator i can be used together with repair operator j, and False otherwise.

Attributes

num_destroy
num_repair
op_coupling

Methods

`__call__`(rnd, best, curr)	Determine which destroy and repair operator pair to apply in this iteration.
`update`(candidate, d_idx, r_idx, outcome)	Updates the selection schame based on the outcome of the applied destroy (d_idx) and repair (r_idx) operators.

abstract update(candidate: State, d_idx: int, r_idx: int, outcome: Outcome)

Updates the selection schame based on the outcome of the applied destroy (d_idx) and repair (r_idx) operators.

Parameters

candidate: The candidate solution state.
d_idx: Destroy operator index.
r_idx: Repair operator index.
outcome: Score enum value used for the various iteration outcomes.

class alns.select.AlphaUCB.AlphaUCB(scores: List[float], alpha: float, num_destroy: int, num_repair: int, op_coupling: Optional[ndarray] = None)

The \(\alpha\)-UCB (upper confidence bound) bandit scheme adapted from Hendel (2022).

The action space \(A\) is defined as each pair of (destroy, repair) operators that is allowed by the operator coupling matrix. The \(\alpha\)-UCB algorithm plays the following action in each iteration \(t\), computed as

\[Q(t) = \arg \max_{a \in A} \left\{ \bar{r}_a (t - 1) + \sqrt{\frac{\alpha \ln(1 + t)}{T_a (t - 1)}} \right\},\]

where \(T_a(t - 1)\) is the number of times action \(a\) has been played, and \(\bar r_a(t - 1)\) is the average reward of action \(a\), both in the first \(t - 1\) iterations. The value that is maximised over the actions \(a \in A\) consists of the average reward term \(\bar r_a(t - 1)\) and an exploration bonus depending on \(t\) and the number of times \(a\) has been played.

See update() for details on how the average reward \(\bar r_a\) is updated.

Note

The average reward \(\bar r_a(0)\) of each action \(a \in A\) is initialised to 1. The scores list passed into the \(\alpha\)-UCB scheme should be ‘reasonable’ with respect to this default.

Parameters

scores: A list of four non-negative elements, representing the rewards when the candidate solution results in a new global best (idx 0), is better than the current solution (idx 1), the solution is accepted (idx 2), or rejected (idx 3).
alpha: The \(\alpha \in [0, 1]\) parameter controls how much exploration is performed. Values of \(\alpha\) near one result in much exploration, whereas values of \(\alpha\) nearer to zero result in more exploitation of good operator pairs. Typically, \(\alpha \le 0.1\) is a good choice.
num_destroy: Number of destroy operators.
num_repair: Number of repair operators.
op_coupling: Optional boolean matrix that indicates coupling between destroy and repair operators. Entry (i, j) is True if destroy operator i can be used together with repair operator j, and False otherwise.

References

1: Hendel, G. 2022. Adaptive large neighborhood search for mixed integer programming. Mathematical Programming Computation 14: 185 - 221.

Attributes

alpha
num_destroy
num_repair
op_coupling
scores

Methods

`__call__`(rnd, best, curr)	Returns the (destroy, repair) operator pair that maximises the average reward and exploration bonus.
`update`(candidate, d_idx, r_idx, outcome)	Updates the average reward of the given destroy and repair operator combination `(d_idx, r_idx)`.

update(candidate, d_idx, r_idx, outcome)

Updates the average reward of the given destroy and repair operator combination (d_idx, r_idx).

In particular, the reward of the action \(a\) associated with this operator combination is updated as

\[\bar r_a (t) = \frac{T_a(t - 1) \bar r_a(t - 1) + \text{scores}[\text{outcome}]}{T_a(t - 1) + 1},\]

and \(T_a(t) = T_a(t - 1) + 1\).

class alns.select.MABSelector.MABSelector(scores: List[float], num_destroy: int, num_repair: int, learning_policy: LearningPolicyType, neighborhood_policy: Optional[NeighborhoodPolicyType] = None, seed: Optional[int] = None, op_coupling: Optional[ndarray] = None, **kwargs)

A selector that uses any multi-armed-bandit algorithm from MABWiser.

Warning

ALNS does not install MABWiser by default. You can install it as an extra dependency via pip install alns[mabwiser].

This selector is a wrapper around the many multi-armed bandit algorithms available in the MABWiser library. Since ALNS operator selection can be framed as a multi-armed-bandit problem (where each [destroy, repair] operator pair is a bandit arm), this wrapper allows you to use a variety of existing multi-armed-bandit algorithms as operator selectors instead of having to reimplement them.

Note

If the provided learning policy is a contextual bandit algorithm, your state class must implement a get_context method that returns a context vector for the current state. See the ContextualState protocol for details.

Parameters

scores: A list of four non-negative elements, representing the rewards when the candidate solution results in a new global best (idx 0), is better than the current solution (idx 1), the solution is accepted (idx 2), or rejected (idx 3).
num_destroy: Number of destroy operators.
num_repair: Number of repair operators.
learning_policy: A MABWiser learning policy that acts as an operator selector. See the MABWiser documentation for a list of available learning policies.
neighborhood_policy: The neighborhood policy that MABWiser should use. Only available for contextual learning policies. See the MABWiser documentation for a list of available neighborhood policies.
seed: A seed that will be passed to the underlying MABWiser object.
op_coupling: Optional boolean matrix that indicates coupling between destroy and repair operators. Entry (i, j) is True if destroy operator i can be used together with repair operator j, and False otherwise.
kwargs: Any additional arguments. These will be passed to the underlying MAB object.

References

1: Emily Strong, Bernard Kleynhans, & Serdar Kadioglu (2021). MABWiser: Parallelizable Contextual Multi-armed Bandits. Int. J. Artif. Intell. Tools, 30(4), 2150021: 1 - 19.

Attributes

mab
num_destroy
num_repair
op_coupling
scores

Methods

`__call__`(rnd_state, best, curr)	Returns the (destroy, repair) operator pair from the underlying MAB strategy.
`update`(cand, d_idx, r_idx, outcome)	Updates the underlying MAB algorithm given the reward of the chosen destroy and repair operator combination `(d_idx, r_idx)`.

update(cand: ContextualState, d_idx: int, r_idx: int, outcome: Outcome): Updates the underlying MAB algorithm given the reward of the chosen destroy and repair operator combination (d_idx, r_idx).

class alns.select.RandomSelect.RandomSelect(num_destroy: int, num_repair: int, op_coupling: Optional[ndarray] = None)

Randomly selects operator pairs with uniform probability. The operator pairs respect the operator coupling matrix.

Attributes

num_destroy
num_repair
op_coupling

Methods

`__call__`(rnd, best, curr)	Selects a (destroy, repair) operator pair with uniform probability.
`update`(candidate, d_idx, r_idx, outcome)	Updates the selection schame based on the outcome of the applied destroy (d_idx) and repair (r_idx) operators.

update(candidate, d_idx, r_idx, outcome)

Updates the selection schame based on the outcome of the applied destroy (d_idx) and repair (r_idx) operators.

Parameters

candidate: The candidate solution state.
d_idx: Destroy operator index.
r_idx: Repair operator index.
outcome: Score enum value used for the various iteration outcomes.

class alns.select.RouletteWheel.RouletteWheel(scores: List[float], decay: float, num_destroy: int, num_repair: int, op_coupling: Optional[ndarray] = None)

The RouletteWheel scheme updates operator weights as a convex combination of the current weight, and the new score.

When the algorithm starts, all operators \(i\) are assigned weight \(\omega_i = 1\). In each iteration, a destroy and a repair operator are selected by the ALNS algorithm, based on the normalised current weights \(\omega_i\). The selected operators are applied to the current solution, resulting in a new candidate solution. This candidate is evaluated by the ALNS algorithm, which leads to one of four outcomes:

The candidate solution is a new global best.
The candidate solution is better than the current solution, but not a new global best.
The candidate solution is accepted.
The candidate solution is rejected.

Each of these four outcomes is assigned a score \(s_j\) (with \(j = 1,...,4\)). After observing outcome \(j\), the weights of the selected destroy and repair operators \(d\) and \(r\) that were applied are updated as follows:

\[\begin{split}\begin{align} \omega_d &= \theta \omega_d + (1 - \theta) s_j, \\ \omega_r &= \theta \omega_r + (1 - \theta) s_j, \end{align}\end{split}\]

where \(0 \le \theta \le 1\) (known as the operator decay rate) is a parameter.

Parameters

scores: A list of four non-negative elements, representing the weight updates when the candidate solution results in a new global best (idx 0), is better than the current solution (idx 1), the solution is accepted (idx 2), or rejected (idx 3).
decay: Operator decay parameter \(\theta \in [0, 1]\). This parameter is used to weigh the running performance of each operator.
num_destroy: Number of destroy operators.
num_repair: Number of repair operators.
op_coupling: Optional boolean matrix that indicates coupling between destroy and repair operators. Entry (i, j) is True if destroy operator i can be used together with repair operator j, and False otherwise.

Attributes

decay
destroy_weights
num_destroy
num_repair
op_coupling
repair_weights
scores

Methods

`__call__`(rnd_state, best, curr)	Selects a destroy and repair operator pair to apply in this iteration.
`update`(cand, d_idx, r_idx, outcome)	Updates the selection schame based on the outcome of the applied destroy (d_idx) and repair (r_idx) operators.

update(cand, d_idx, r_idx, outcome)

Updates the selection schame based on the outcome of the applied destroy (d_idx) and repair (r_idx) operators.

Parameters

candidate: The candidate solution state.
d_idx: Destroy operator index.
r_idx: Repair operator index.
outcome: Score enum value used for the various iteration outcomes.

class alns.select.SegmentedRouletteWheel.SegmentedRouletteWheel(scores: List[float], decay: float, seg_length: int, num_destroy: int, num_repair: int, op_coupling: Optional[ndarray] = None)

Note

First read the documentation for RouletteWheel, the parent of this class.

The SegmentedRouletteWheel scheme extends the RouletteWheel scheme by fixing operator weights for a number of iterations (the segment length). This allows certain sets of operators to be selected more often in different neighbourhoods.

Initially, all weights are set to one, as in RouletteWheel. A separate score is tracked for each operator \(d\) and \(r\), to which the observed scores \(s_j\) are added in each iteration where \(d\) and \(r\) are applied. After the segment concludes, these summed scores are added to the existing weights \(\omega_d\) and \(\omega_r\) as a convex combination using a parameter \(\theta\) (the segment decay rate), as in RouletteWheel. The separate score list is then reset to zero, and a new segment begins.

Parameters

scores: A list of four non-negative elements, representing the weight updates when the candidate solution results in a new global best (idx 0), is better than the current solution (idx 1), the solution is accepted (idx 2), or rejected (idx 3).
decay: Operator decay parameter \(\theta \in [0, 1]\). This parameter is used to weigh the running performance of each operator.
seg_length: Length of a single segment.
num_destroy: Number of destroy operators.
num_repair: Number of repair operators.
op_coupling: Optional boolean matrix that indicates coupling between destroy and repair operators. Entry (i, j) is True if destroy operator i can be used together with repair operator j, and False otherwise.

Attributes

decay
destroy_weights
num_destroy
num_repair
op_coupling
repair_weights
scores
seg_length

Methods

`__call__`(rnd_state, best, curr)	Selects a destroy and repair operator pair to apply in this iteration.
`update`(cand, d_idx, r_idx, outcome)	Updates the selection schame based on the outcome of the applied destroy (d_idx) and repair (r_idx) operators.

update(cand, d_idx, r_idx, outcome)

Updates the selection schame based on the outcome of the applied destroy (d_idx) and repair (r_idx) operators.

Parameters

candidate: The candidate solution state.
d_idx: Destroy operator index.
r_idx: Repair operator index.
outcome: Score enum value used for the various iteration outcomes.