
Hungarian Algorithm for Assignment Problem | Set 1 (Introduction)

  • Step 1: For each row of the matrix, find the smallest element and subtract it from every element in that row.
  • Step 2: Do the same (as in step 1) for all columns.
  • Step 3: Cover all zeros in the matrix using the minimum number of horizontal and vertical lines.
  • Step 4: Test for optimality. If the minimum number of covering lines is n, an optimal assignment is possible and we are finished. If fewer than n lines are needed, we have not yet found the optimal assignment and must proceed to step 5.
  • Step 5: Determine the smallest entry not covered by any line. Subtract this entry from each uncovered row, and then add it to each covered column. Return to step 3.
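The row and column reductions at the start of the procedure can be sketched in a few lines of Python. This is a minimal illustration rather than a full implementation: the function name reduce_rows_and_columns and the 3 x 3 cost matrix are assumptions made for the example, and the line-covering and adjustment steps are not shown.

```python
def reduce_rows_and_columns(cost):
    """Subtract each row's minimum, then each column's minimum."""
    # Row reduction: every row ends up with at least one zero.
    reduced = [[x - min(row) for x in row] for row in cost]
    # Column reduction: every column ends up with at least one zero.
    n_cols = len(reduced[0])
    col_mins = [min(row[j] for row in reduced) for j in range(n_cols)]
    return [[row[j] - col_mins[j] for j in range(n_cols)] for row in reduced]


# Hypothetical cost matrix: rows are workers, columns are jobs.
cost = [
    [2500, 4000, 3500],
    [4000, 6000, 3500],
    [2000, 4000, 2500],
]
reduced = reduce_rows_and_columns(cost)
for row in reduced:
    print(row)
```

After both reductions every row and every column contains at least one zero, which is the starting point for the covering test.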
Explanation for above simple example:

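  An example that doesn’t lead to the optimal value on the first attempt: In the above example, the first check for optimality did give us a solution. What if the number of covering lines is less than n? In that case the adjustment step is applied and the zeros are covered again, repeating until n lines are required.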

Time complexity: O(n^3), where n is the number of workers and jobs. In an efficient implementation, each of the n rows is assigned by an augmentation step that costs O(n^2), which gives O(n^3) overall.

Space complexity: O(n^2), where n is the number of workers and jobs. The algorithm stores an n x n cost matrix holding the cost of assigning each worker to each job, plus additional arrays of size n for the labels, matches, and auxiliary information it needs.
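One way to reach the O(n^3) bound is sketched below. Note that this is not the line-covering procedure described above: it is the widely used variant that maintains row and column potentials and grows the matching one row at a time. The function name hungarian and the sample matrix are assumptions made for this example.

```python
def hungarian(cost):
    """Return (total_cost, assignment); assignment[i] is the column chosen for row i."""
    n = len(cost)
    INF = float("inf")
    u = [0] * (n + 1)    # row potentials (1-indexed)
    v = [0] * (n + 1)    # column potentials (1-indexed)
    p = [0] * (n + 1)    # p[j] = row currently matched to column j (0 = free)
    way = [0] * (n + 1)  # back-pointers used to augment the matching
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = INF
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift potentials so that at least one new zero-cost edge appears.
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Augment along the alternating path that was found.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment


# Hypothetical cost matrix: rows are workers, columns are jobs.
cost = [
    [2500, 4000, 3500],
    [4000, 6000, 3500],
    [2000, 4000, 2500],
]
total, assignment = hungarian(cost)
print(total, assignment)
```

The outer loop runs n times and each pass does O(n^2) work in the worst case, which matches the stated complexity. The only extra storage beyond the cost matrix is a handful of arrays of length n + 1.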

In the next post, we will discuss an implementation of the above algorithm. The implementation requires more steps because the program has to find the minimum number of lines that cover all 0’s.

References:
http://www.math.harvard.edu/archive/20_spring_05/handouts/assignment_overheads.pdf
https://www.youtube.com/watch?v=dQDZNHwuuOY

The assignment problem revisited

  • Original Paper
  • Published: 16 August 2021
  • Volume 16 , pages 1531–1548, ( 2022 )

  • Carlos A. Alfaro (ORCID: 0000-0001-9783-8587),
  • Sergio L. Perez,
  • Carlos E. Valencia &
  • Marcos C. Vargas

First, we give a detailed review of two algorithms that solve the minimization case of the assignment problem: the Bertsekas auction algorithm and the Goldberg & Kennedy algorithm. It has previously been suggested that both algorithms are equivalent; we give a detailed proof of this equivalence. We also report experimental results comparing the performance of three algorithms for the assignment problem: the \(\epsilon\)-scaling auction algorithm, the Hungarian algorithm and the FlowAssign algorithm. The experiments show that the auction algorithm still performs and scales better in practice than the other algorithms, which are harder to implement and have better theoretical time complexity.

Acknowledgements

This research was partially supported by SNI and CONACyT.

Author information

Authors and affiliations.

Banco de México, Mexico City, Mexico

Carlos A. Alfaro & Marcos C. Vargas

Mountain View, CA, 94043, USA

Sergio L. Perez

Departamento de Matemáticas, CINVESTAV del IPN, Apartado postal 14-740, 07000, Mexico City, Mexico

Carlos E. Valencia

Corresponding author

Correspondence to Carlos A. Alfaro .

Ethics declarations

Conflict of interest.

There is no conflict of interest.

About this article

Alfaro, C.A., Perez, S.L., Valencia, C.E. et al. The assignment problem revisited. Optim Lett 16 , 1531–1548 (2022). https://doi.org/10.1007/s11590-021-01791-4

Received : 26 March 2020

Accepted : 03 August 2021

Published : 16 August 2021

Issue Date : June 2022

DOI : https://doi.org/10.1007/s11590-021-01791-4

  • Assignment problem
  • Bertsekas auction algorithm
  • Combinatorial optimization and matching
QuickMatch: A very fast algorithm for the assignment problem

James Orlin

Related Papers

James Orlin

The shortest path problem is a classic and important combinatorial optimization problem. It often appears as a subproblem when solving difficult combinatorial problems like multicommodity network flow (MCNF) problems. Most shortest path algorithms in the literature compute either the 1-ALL single source shortest path (SSSP) tree or the ALL-ALL all pairs shortest paths (APSP). However, most real world applications require only multiple pairs shortest paths (MPSP), where the shortest paths and distances between only some specific pairs of origin-destination nodes in a network are desired. The traditional single source shortest path (SSSP) and all pairs shortest paths (APSP) algorithms may do unnecessary computations to solve the MPSP problem. We survey and summarize many shortest path algorithms, and discuss their pros and cons. We also investigate the Least Squares Primal-Dual method, a new LP algorithm that avoids degenerate pivots in each primal-dual iteration, for solving 1-1 and 1-ALL shortest path problems with nonnegative arc lengths, show its equivalence to the classic Dijkstra's algorithm, and compare it with the original primal-dual method. We propose two new shortest path algorithms to save computational work when solving the MPSP problem. Our MPSP algorithms are especially suitable for applications with fixed network topology but changeable arc lengths. We discuss the theoretical details and complexity analyses. We test several implementations of our new MPSP algorithms extensively and compare them with many state-of-the-art SSSP algorithms for solving many families of artificially generated networks and a real Asia-Pacific flight network. Our MPSP computational experiments show that there exists no "killer" shortest path algorithm. Our algorithms have better performance for dense networks, but become worse for larger networks.
Although they do not have the best performance for the artificially generated graphs, they seem to be competitive for the real Asia-Pacific flight network. We provide an extensive survey on both the applications and solution methods for MCNF problems in this thesis. Among those methods, we investigate the combination of the primal-dual algorithm with the key path decomposition method. In particular, to efficiently solve the restricted primal problem (RPP) in each primal-dual iteration, we relax the nonnegativity constraints for some set of basic variables, which makes the relaxed RPP smaller and easier to solve since the convexity constraint will be implicitly maintained. We implement our new primal-dual key path method (KEY), propose new methods to identify max step length, and suggest perturbation methods to avoid degenerate pivots and indefinite cycling problems caused by primal and dual degeneracy. We perform limited computational experiments to compare the running time of the generic primal-dual (PD) method, the Dantzig-Wolfe (DW) decomposition method, and the CPLEX LP solver that solves the node-arc form (NA) of the MCNF problems, with our method KEY. Observations from the computational experiments suggest directions for better DW implementation and possible remedies for improving PD and KEY.

Panos M Pardalos

In this paper we present computational testing results on very large scale random assignment problems. We consider a fully dense assignment problem with 2n nodes. Some conjectured or derived properties regarding fully dense assignment problems, including the convergence of the optimal objective function value and the portion of nodes assigned with their kth best arc, have been verified for networks up to n = 100,000 in size.

Recently a new least-squares primal-dual (LSPD) algorithm that is impervious to degeneracy has been applied effectively to solving linear programming problems by Barnes et al., 2002. In this paper, we show that an application of LSPD to shortest path problems with nonnegative arc lengths is equivalent to Dijkstra’s algorithm. We also compare the LSPD algorithm with the conventional primal-dual algorithm in solving shortest path problems and show how they differ, due to degeneracy, in solving the 1-1 shortest path problems.

Quadratic assignment problem

Author: Thomas Kueny, Eric Miller, Natasha Rice, Joseph Szczerba, David Wittmann (SysEn 5800 Fall 2020)

Introduction

The Quadratic Assignment Problem (QAP), introduced by Koopmans and Beckmann in 1957 [1] , is a mathematical optimization model created to describe the location of indivisible economic activities. Although it is NP-hard, the model can be applied to many optimization problems outside the field of economics. It has been used for backboard wiring, inter-plant transportation, hospital layout, and exam scheduling, along with many other applications not described on this page.

Theory, Methodology, and/or Algorithmic Discussions

Koopmans-Beckmann Mathematical Formulation

Economists Koopmans and Beckmann began their investigation of the QAP to ascertain the optimal method of locating important economic resources in a given area. The Koopmans-Beckmann formulation of the QAP assigns facilities to locations so as to minimize the overall cost. Below is the Koopmans-Beckmann formulation of the QAP as described by neos-guide.org.

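Quadratic Assignment Problem Formulation

Parameters

\(F = (F_{ij})\) is the \(n \times n\) matrix of flows between facilities and \(D = (D_{ij})\) is the \(n \times n\) matrix of distances between locations. \(X_{\phi}\) is the permutation matrix of an assignment \(\phi\) of facilities to locations.

Inner Product

For two \(n \times n\) matrices \(A, B\), the inner product is \(\langle A, B \rangle = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} B_{ij}\).

Note: The true objective cost function only requires summing entries above the diagonal in the matrix composed of the elements

\[ F_{i,j} \, (X_{\phi} D X_{\phi}^{T})_{i,j} \]

Since this matrix is symmetric with zeroes on the diagonal, summing all of its entries counts each pair twice, and dividing that sum by 2 removes the double count to give the correct cost value. See the Numerical Example section for an example of this note.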

Optimization Problem

With all of this information, the QAP can be summarized as:

\[ \min_{X \in P} \langle F, X D X^{T} \rangle \]

Computational Complexity

QAP is NP-hard and is therefore computationally difficult to solve. Its hardness was proven by Sahni and Gonzalez in 1976, who state that of all combinatorial optimization problems, QAP is the “hardest of the hard”. [2]

Algorithmic Discussions

While an algorithm that can solve QAP in polynomial time is unlikely to exist, there are three primary methods for acquiring the optimal solution to a QAP problem:

  • Dynamic Program
  • Cutting Plane
  • Branch and Bound

Branch and Bound Procedures

The third method, branch and bound, has proven to be the most effective in solving QAP, although when n > 15, QAP instances become very hard to solve exactly in practice.

The Branch and Bound method was first proposed by Ailsa Land and Alison Doig in 1960 and is the most commonly used tool for solving NP-hard optimization problems.

A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root. The algorithm explores branches of this tree, which represent subsets of the solution set. Before one lists all of the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and the branch is eliminated if it cannot produce a better solution than the best one found so far by the algorithm.
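The pruning idea described above can be sketched for small QAP instances as follows. This is an illustrative sketch under stated assumptions, not a competitive solver: the function names and the 4 x 4 matrices F and D are made up for the example, and the lower bound is simply the cost of the partial assignment, which is valid only when all flows and distances are nonnegative. Practical solvers rely on much stronger bounds, such as the Gilmore-Lawler bound.

```python
from itertools import permutations


def qap_cost(F, D, perm):
    """Sum of F[i][j] * D[perm[i]][perm[j]] over the facilities placed so far."""
    k = len(perm)
    return sum(F[i][j] * D[perm[i]][perm[j]] for i in range(k) for j in range(k))


def qap_branch_and_bound(F, D):
    """Depth-first search over partial assignments with cost-based pruning."""
    n = len(F)
    best = {"cost": float("inf"), "perm": None}

    def search(perm, used):
        cost = qap_cost(F, D, perm)
        # Bound: with nonnegative data, extending perm can never lower the cost.
        if cost >= best["cost"]:
            return
        if len(perm) == n:
            best["cost"], best["perm"] = cost, list(perm)
            return
        # Branch: place the next facility at every unused location.
        for loc in range(n):
            if loc not in used:
                perm.append(loc)
                used.add(loc)
                search(perm, used)
                perm.pop()
                used.remove(loc)

    search([], set())
    return best["cost"], best["perm"]


def qap_brute_force(F, D):
    """Exhaustive check over all n! permutations, for comparison only."""
    n = len(F)
    return min(qap_cost(F, D, p) for p in permutations(range(n)))


# Hypothetical symmetric instance with zero diagonals.
F = [[0, 3, 0, 2], [3, 0, 0, 1], [0, 0, 0, 4], [2, 1, 4, 0]]
D = [[0, 22, 53, 53], [22, 0, 40, 62], [53, 40, 0, 55], [53, 62, 55, 0]]

bb_cost, bb_perm = qap_branch_and_bound(F, D)
print(bb_cost, bb_perm)
```

The search tree has the empty assignment at its root and one level per facility. A branch is abandoned as soon as its partial cost reaches the best complete cost found so far.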

Linearizations

The first attempts to solve the QAP eliminated the quadratic term in the objective function of

\[ \min \sum_{i=1}^{n} \sum_{j=1}^{n} c_{\phi(i)\phi(j)} + \sum_{i=1}^{n} b_{\phi(i)} \]

in order to transform the problem into a (mixed) 0-1 linear program. The objective function is usually linearized by introducing new variables and new linear (and binary) constraints. Then existing methods for (mixed) linear integer programming (MILP) can be applied. The very large number of new variables and constraints, however, usually poses an obstacle for efficiently solving the resulting linear integer programs. MILP formulations provide LP relaxations of the problem which can be used to compute lower bounds.

Numerical Example

QAP with 3 Facilities

\[ D = \begin{bmatrix} 0 & 5 & 6 \\ 5 & 0 & 3.6 \\ 6 & 3.6 & 0 \end{bmatrix} \]
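To make the objective and the divide-by-two note concrete, the sketch below evaluates every assignment for the distance matrix D above. The flow matrix F is a hypothetical one chosen for illustration (it is not taken from the original example), and the helper names are likewise assumptions.

```python
from itertools import permutations

# Distance matrix D from the example above.
D = [[0, 5, 6],
     [5, 0, 3.6],
     [6, 3.6, 0]]

# Hypothetical flow matrix F (an assumption, symmetric with a zero diagonal).
F = [[0, 1, 2],
     [1, 0, 1],
     [2, 1, 0]]


def full_sum(perm):
    """<F, X D X^T>: every unordered pair is counted twice."""
    n = len(perm)
    return sum(F[i][j] * D[perm[i]][perm[j]] for i in range(n) for j in range(n))


def upper_triangle_sum(perm):
    """Sum over entries above the diagonal: every pair is counted once."""
    n = len(perm)
    return sum(F[i][j] * D[perm[i]][perm[j]]
               for i in range(n) for j in range(i + 1, n))


# perm[i] is the location assigned to facility i.
costs = {perm: full_sum(perm) / 2 for perm in permutations(range(3))}
best_perm = min(costs, key=costs.get)
print(best_perm, costs[best_perm])
```

With only 3 facilities there are 3! = 6 assignments, so exhaustive enumeration is practical here. For every assignment, half of the full double sum agrees with the sum taken above the diagonal.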

Applications

Inter-plant Transportation Problem

The QAP was first introduced by Koopmans and Beckmann to address how economic decisions could be made to optimize the transportation costs of goods between manufacturing plants and locations. [1] Factoring in both the location of each manufacturing plant and the volume of goods moved between locations in order to maximize revenue is what distinguishes this problem from linear assignment problems.

The Backboard Wiring Problem

As the QAP is focused on minimizing the cost of traveling from one location to another, it is an ideal approach to determining the placement of components in many modern electronics. Leon Steinberg proposed a QAP solution to optimize the layout of elements on a backboard by minimizing the total amount of wiring required. [4]

When defining the problem, Steinberg states that we have a set of n elements

\[ E = \{E_{1}, E_{2}, \ldots, E_{n}\} \]

as well as a set of r points

\[ P_{1}, P_{2}, \ldots, P_{r} \]

In his paper he derives the formula below:

\[ \min \sum_{1 \leq i \leq j \leq n} C_{ij} \, d_{s(i)s(j)} \]

In his paper, Steinberg considers a backboard with a 9 by 4 array, allowing for 36 potential positions for the 34 components that needed to be placed on the backboard. For the calculation, he selected a random initial placement s1 and chose a random family of 25 unconnected sets.

[Figure: initial placement of components]

After the initial placement of elements, it took an additional 35 iterations to reach the final optimized backboard layout, for a total of 59 iterations and a final wire length of 4,969.440.

[Figure: final optimized backboard layout]

Hospital Layout

Building new hospitals was a common event in 1977 when Alwalid N. Elshafei wrote his paper "Hospital Layout as a Quadratic Assignment Problem". [5] Given the high initial cost of constructing and staffing a hospital, it is important to ensure that it operates as efficiently as possible. Elshafei's paper was commissioned to create an optimization formula that locates clinics within a building in a way that minimizes the total distance a patient travels within the hospital throughout the year. While studying a major hospital in Cairo, he determined that the Outpatient ward was acting as a bottleneck in the hospital and focused his efforts on optimizing the 17 departments there.

Elshafei identified the following QAP to determine where clinics should be placed:

\[ \min \sum_{i,j} \sum_{k,q} f_{ik} \, d_{jq} \, y_{ij} \, y_{kq} \]

The Cairo hospital had 17 clinics and one receiving and recording room, for a total of 18 facilities. By running the above optimization, Elshafei reduced the total distance per year from 13,973,298 for the original hospital layout to 11,281,887.

Exam Scheduling System

The scheduling system uses matrices for Exams, Time Slots, and Rooms with the goal of reducing the rate of schedule conflicts. To accomplish this goal, the “examination with the highest cross faculty student is been prioritized in the schedule after which the examination with the highest number of cross-program is considered and finally with the highest number of repeating student, at each stage group with the highest number of student are prioritized.” [6]

References

  • [1] Koopmans, T., & Beckmann, M. (1957). Assignment Problems and the Location of Economic Activities. Econometrica, 25(1), 53-76. doi:10.2307/1907742
  • [2] Quadratic Assignment Problem. (2020). Retrieved December 14, 2020, from https://neos-guide.org/content/quadratic-assignment-problem
  • [3] Burkard, R. E., Çela, E., Pardalos, P. M., & Pitsoulis, L. S. (2013). The Quadratic Assignment Problem. https://www.opt.math.tugraz.at/~cela/papers/qap_bericht.pdf
  • [4] Steinberg, L. (1961). The Backboard Wiring Problem: A Placement Algorithm. SIAM Review, 3(1), 37.
  • [5] Elshafei, A. N. (1977). Hospital Layout as a Quadratic Assignment Problem. Operational Research Quarterly (1970-1977), 28(1), 167. doi:10.2307/300878
  • [6] Muktar, D., & Ahmad, Z. M. (2014). Examination Scheduling System Based On Quadratic Assignment.

Algorithm for approximately solving quadratic assignment problems.

sogartar/faqap

Fast Approximate Quadratic Assignment Problem Solver

This is a Python implementation of an algorithm for approximately solving quadratic assignment problems described in

Joshua T. Vogelstein and John M. Conroy and Vince Lyzinski and Louis J. Podrazik and Steven G. Kratzer and Eric T. Harley and Donniell E. Fishkind and R. Jacob Vogelstein and Carey E. Priebe (2012) Fast Approximate Quadratic Programming for Large (Brain) Graph Matching. arXiv:1112.5507 .

The problem being solved is

min_{𝑃∈𝒫} ⟨𝐹, 𝑃𝐷𝑃ᵀ⟩

where 𝐷, 𝐹 ∈ ℝ^(𝑛×𝑛), 𝒫 is the set of 𝑛×𝑛 permutation matrices and ⟨., .⟩ denotes the Frobenius inner product.

The implementation employs the Frank–Wolfe algorithm .

GPU Support

GPU support is enabled through Torch. It is an optional dependency. In order to use the GPU you must pass Torch tensors that are on the CUDA device. If you pass CPU tensors the GPU will not be used.

Note that linear sum assignment, which is a part of the algorithm, is done on the CPU through SciPy. On a system with a GeForce RTX 2080 SUPER GPU and an AMD Ryzen Threadripper 2920X CPU (single thread at 3.5 - 4.3 GHz), for a float32 problem of size 128, linear sum assignment takes ~60% of the execution time. It may be possible to move that part to the GPU as well, but currently there are no good off-the-shelf GPU implementations for that. It is also unclear whether there would be any significant speedup.

Installation

Dependencies

  • Python (>=3.5)
  • NumPy (>=1.10)
  • SciPy (>=1.4)
  • Torch (optional)

Maximum Flow and the Linear Assignment Problem

The Hungarian graph algorithm solves the linear assignment problem in polynomial time. By modeling resources (e.g., contractors and available contracts) as a graph, the Hungarian algorithm can be used to efficiently determine an optimum way of allocating resources.

By Dmitri Ivanovich Arkhipov

Dmitri has a PhD in computer science from UC Irvine and works primarily in UNIX/Linux ecosystems. He specializes in Python and Java.

Here’s a problem: Your business assigns contractors to fulfill contracts. You look through your rosters and decide which contractors are available for a one month engagement and you look through your available contracts to see which of them are for one month long tasks. Given that you know how effectively each contractor can fulfill each contract, how do you assign contractors to maximize the overall effectiveness for that month?

This is an example of the assignment problem, and the problem can be solved with the classical Hungarian algorithm .

Bipartite Matching

The Hungarian algorithm (also known as the Kuhn-Munkres algorithm) is a polynomial time algorithm that finds a maximum-weight matching in a weighted bipartite graph. Here, the contractors and the contracts can be modeled as a bipartite graph, with their effectiveness as the weights of the edges between the contractor and the contract nodes.

In this article, you will learn about an implementation of the Hungarian algorithm that uses the Edmonds-Karp algorithm to solve the linear assignment problem. You will also learn how the Edmonds-Karp algorithm is a slight modification of the Ford-Fulkerson method and how this modification is important.

The Maximum Flow Problem

The maximum flow problem itself can be described informally as the problem of moving some fluid or gas through a network of pipes from a single source to a single terminal. This is done with an assumption that the pressure in the network is sufficient to ensure that the fluid or gas cannot linger in any length of pipe or pipe fitting (those places where different lengths of pipe meet).

By making certain changes to the graph, the assignment problem can be turned into a maximum flow problem.

Preliminaries

The ideas needed to solve these problems arise in many mathematical and engineering disciplines, and similar concepts are often known by different names and expressed in different ways (e.g., adjacency matrices and adjacency lists). Since these ideas are quite esoteric, choices are made regarding how generally these concepts will be defined for any given setting.

This article will not assume any prior knowledge beyond a little introductory set theory.

The implementations in this post represent the problems as directed graphs (digraphs).

A digraph has two attributes , setOfNodes and setOfArcs . Both of these attributes are sets (unordered collections). In the code blocks on this post, I’m actually using Python’s frozenset , but that detail isn’t particularly important.

(Note: This, and all other code in this article, are written in Python 3.6.)

A node n is composed of two attributes:

  • n.uid : A unique identifier. This means that any two distinct nodes x and y in a digraph satisfy x.uid != y.uid .

  • n.datum : This represents any data object.

An arc a is composed of three attributes:

a.fromNode : This is a node , as defined above.

a.toNode : This is a node , as defined above.

a.datum : This represents any data object.
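The attributes listed above map naturally onto named tuples. The following is a minimal sketch that assumes namedtuple-based definitions; the type names Node, Arc and DiGraph and the sample values are assumptions made for illustration. Because nodes and arcs are stored in a frozenset, every datum must be hashable in this representation.

```python
from collections import namedtuple

# Assumed type definitions that mirror the attribute names used in the text.
Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
DiGraph = namedtuple("DiGraph", ["setOfNodes", "setOfArcs"])

# A two-node digraph with a single arc a -> b.
a = Node(uid="a", datum=None)
b = Node(uid="b", datum=None)
ab = Arc(fromNode=a, toNode=b, datum=3)  # datum must be hashable, e.g. a number

G = DiGraph(setOfNodes=frozenset([a, b]), setOfArcs=frozenset([ab]))

# The arc a -> b exists, but the reverse arc b -> a does not.
reverse = Arc(fromNode=b, toNode=a, datum=3)
print(ab in G.setOfArcs, reverse in G.setOfArcs)
```

Named tuples are immutable and compare by value, so two arcs with the same endpoints and datum are the same element of the set.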

The set of arcs in the digraph represents a binary relation on the nodes in the digraph . The existence of arc a implies that a relationship exists between a.fromNode and a.toNode .

In a directed graph (as opposed to an undirected graph), the existence of a relationship between a.fromNode and a.toNode does not imply that a similar relationship between a.toNode and a.fromNode exists.

This is because in a directed graph, the relationship being expressed is not necessarily symmetric.

Nodes and arcs can be used to define a digraph , but for convenience, in the algorithms below, a digraph will be represented using a dictionary .

Here’s a method that can convert the graph representation above into a dictionary representation similar to this one :
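A minimal sketch of such a conversion is given below, assuming namedtuple-based Node, Arc and DiGraph types. The function name digraph_to_dict and the type definitions are assumptions for illustration. Each node is mapped to the frozenset of arcs that leave it.

```python
from collections import namedtuple

# Assumed type definitions that mirror the attribute names used in the text.
Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
DiGraph = namedtuple("DiGraph", ["setOfNodes", "setOfArcs"])


def digraph_to_dict(G):
    """Map every node of G to the frozenset of its outgoing arcs."""
    G_as_dict = {n: frozenset() for n in G.setOfNodes}
    for a in G.setOfArcs:
        if a.fromNode not in G.setOfNodes or a.toNode not in G.setOfNodes:
            raise KeyError("arc endpoint is not a node of G")
        G_as_dict[a.fromNode] = G_as_dict[a.fromNode] | {a}
    return G_as_dict


x, y, z = (Node(uid=u, datum=None) for u in "xyz")
xy = Arc(fromNode=x, toNode=y, datum=None)
xz = Arc(fromNode=x, toNode=z, datum=None)
G = DiGraph(setOfNodes=frozenset([x, y, z]), setOfArcs=frozenset([xy, xz]))

G_as_dict = digraph_to_dict(G)
print(len(G_as_dict[x]), len(G_as_dict[y]))
```

Nodes without outgoing arcs still appear as keys, mapped to an empty frozenset, so later lookups do not need a special case.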

And here’s another one that converts it into a dictionary of dictionaries, another operation that will be useful:
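A dictionary of dictionaries can be sketched in the same way, again assuming namedtuple-based types. The function name digraph_to_double_dict is an assumption. The outer key is the tail node, the inner key is the head node, and the value is the frozenset of arcs between them.

```python
from collections import namedtuple

# Assumed type definitions that mirror the attribute names used in the text.
Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
DiGraph = namedtuple("DiGraph", ["setOfNodes", "setOfArcs"])


def digraph_to_double_dict(G):
    """Return d such that d[u][v] is the frozenset of arcs from u to v."""
    G_as_dd = {n: {} for n in G.setOfNodes}
    for a in G.setOfArcs:
        inner = G_as_dd[a.fromNode]
        inner[a.toNode] = inner.get(a.toNode, frozenset()) | {a}
    return G_as_dd


x, y = Node(uid="x", datum=None), Node(uid="y", datum=None)
xy1 = Arc(fromNode=x, toNode=y, datum=1)
xy2 = Arc(fromNode=x, toNode=y, datum=2)  # a parallel arc with another datum
G = DiGraph(setOfNodes=frozenset([x, y]), setOfArcs=frozenset([xy1, xy2]))

G_as_dd = digraph_to_double_dict(G)
print(len(G_as_dd[x][y]), y in G_as_dd and x in G_as_dd[y])
```

Storing a set of arcs per node pair keeps parallel arcs, which can only differ in their datum, from overwriting each other.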

When the article talks about a digraph as represented by a dictionary, it will mention G_as_dict to refer to it.

Sometimes it’s helpful to fetch a node from a digraph G by looking it up through its uid (unique identifier):
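A lookup of this kind can be sketched as follows, assuming namedtuple-based types. The function name find_node_by_uid and the choice to raise KeyError when the uid is missing or ambiguous are assumptions for illustration.

```python
from collections import namedtuple

# Assumed type definitions that mirror the attribute names used in the text.
Node = namedtuple("Node", ["uid", "datum"])
DiGraph = namedtuple("DiGraph", ["setOfNodes", "setOfArcs"])


def find_node_by_uid(find_uid, G):
    """Return the single node of G whose uid equals find_uid."""
    matches = [n for n in G.setOfNodes if n.uid == find_uid]
    if len(matches) != 1:
        raise KeyError("expected exactly one node with uid %r, found %d"
                       % (find_uid, len(matches)))
    return matches[0]


s = Node(uid="s", datum="source")
t = Node(uid="t", datum="sink")
G = DiGraph(setOfNodes=frozenset([s, t]), setOfArcs=frozenset())

found = find_node_by_uid("t", G)
print(found.datum)
```

The scan is linear in the number of nodes. For repeated lookups a dictionary from uid to node would be the more usual choice.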

In defining graphs, people sometimes use the terms node and vertex to refer to the same concept; the same is true of the terms arc and edge.

Two popular graph representations in Python are this one which uses dictionaries and another which uses objects to represent graphs. The representation in this post is somewhere in between these two commonly used representations.

This is my digraph representation. There are many like it, but this one is mine.

Walks and Paths

Let S_Arcs be a finite sequence (ordered collection) of arcs in a digraph G such that if a is any arc in S_Arcs except for the last, and b follows a in the sequence, then there must be a node n = a.fromNode in G.setOfNodes such that a.toNode = b.fromNode .

Starting from the first arc in S_Arcs, and continuing until the last arc in S_Arcs, collect (in the order encountered) all nodes n as defined above between each two consecutive arcs in S_Arcs. Label the ordered collection of nodes collected during this operation S_Nodes.

If any node appears more than once in the sequence S_Nodes then call S_Arcs a Walk on digraph G .

Otherwise, call S_Arcs a path from list(S_Nodes)[0] to list(S_Nodes)[-1] on digraph G .

Source Node

Call node s a source node in digraph G if s is in G.setOfNodes and G.setOfArcs contains no arc a such that a.toNode = s .

Terminal Node

Call node t a terminal node in digraph G if t is in G.setOfNodes and G.setOfArcs contains no arc a such that a.fromNode = t .
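The two definitions translate almost directly into set comprehensions. The sketch below assumes namedtuple versions of Node, Arc and DiGraph; get_source_nodes and get_terminal_nodes are illustrative names.

```python
from collections import namedtuple

Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
DiGraph = namedtuple("DiGraph", ["setOfNodes", "setOfArcs"])

def get_source_nodes(G):
    """Nodes of G that no arc points to."""
    has_incoming = {a.toNode for a in G.setOfArcs}
    return frozenset(n for n in G.setOfNodes if n not in has_incoming)

def get_terminal_nodes(G):
    """Nodes of G that no arc leaves."""
    has_outgoing = {a.fromNode for a in G.setOfArcs}
    return frozenset(n for n in G.setOfNodes if n not in has_outgoing)

# Example: the chain s -> m -> t has one source node and one terminal node.
s, m, t = (Node(uid, None) for uid in "smt")
G = DiGraph(frozenset({s, m, t}), frozenset({Arc(s, m, None), Arc(m, t, None)}))
sources = get_source_nodes(G)
terminals = get_terminal_nodes(G)
```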

Cuts and s-t Cuts

A cut cut of a connected digraph G is a subset of arcs from G.setOfArcs which partitions the set of nodes G.setOfNodes in G. G is connected if every node n in G.setOfNodes has at least one arc a in G.setOfArcs such that either n = a.fromNode or n = a.toNode, but a.fromNode != a.toNode.

The definition above refers to a subset of arcs , but it can also define a partition of the nodes of G.setOfNodes .

For the functions predecessors_of and successors_of , n is a node in set G.setOfNodes of digraph G , and cut is a cut of G :

Let cut be a cut of digraph G .

cut is a cut of digraph G if: (get_first_part(cut, G).union(get_second_part(cut, G)) == G.setOfNodes) and (len(get_first_part(cut, G).intersection(get_second_part(cut, G))) == 0).

cut is called an x-y cut if (x in get_first_part(cut, G)) and (y in get_second_part(cut, G)) and (x != y).

When the node x in an x-y cut cut is a source node and node y in the x-y cut is a terminal node, then this cut is called an s-t cut.

Flow Networks

You can use a digraph G to represent a flow network.

Assign each node n , where n is in G.setOfNodes an n.datum that is a FlowNodeDatum :

Assign each arc a, where a is in G.setOfArcs, an a.datum that is a FlowArcDatum.

flowNodeDatum.flowIn and flowNodeDatum.flowOut are non-negative real numbers.

flowArcDatum.capacity and flowArcDatum.flow are also non-negative real numbers.

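For every node n in G.setOfNodes, n.datum.flowIn is the sum of a.datum.flow over all arcs a with a.toNode = n, and n.datum.flowOut is the sum of a.datum.flow over all arcs a with a.fromNode = n.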

Digraph G now represents a flow network .

The flow of G refers to the a.flow for all arcs a in G .

Feasible Flows

Let digraph G represent a flow network .

The flow network represented by G has feasible flows if:

For every node n in G.setOfNodes except for source nodes and terminal nodes : n.datum.flowIn = n.datum.flowOut .

For every arc a in G.setOfArcs: a.datum.flow <= a.datum.capacity.

Condition 1 is called a conservation constraint .

Condition 2 is called a capacity constraint .
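Both constraints can be checked mechanically. The following is a minimal sketch under a few assumptions: the types are namedtuples, flow into and out of a node is recomputed from the arcs rather than read from n.datum, and EPS is an assumed tolerance for comparing floating point sums. Since arcs live in a set, two parallel arcs with identical capacity and flow would collapse into one, which this sketch does not handle.

```python
from collections import namedtuple

Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
DiGraph = namedtuple("DiGraph", ["setOfNodes", "setOfArcs"])
FlowArcDatum = namedtuple("FlowArcDatum", ["capacity", "flow"])

EPS = 1e-9  # assumed tolerance for comparing float sums

def has_feasible_flows(G):
    # Capacity constraint: 0 <= flow <= capacity on every arc.
    for a in G.setOfArcs:
        if a.datum.flow < 0 or a.datum.flow > a.datum.capacity:
            return False
    # Conservation constraint: flow in equals flow out, except at
    # source nodes (no incoming arcs) and terminal nodes (no outgoing arcs).
    sources = G.setOfNodes - {a.toNode for a in G.setOfArcs}
    terminals = G.setOfNodes - {a.fromNode for a in G.setOfArcs}
    for n in G.setOfNodes - sources - terminals:
        flow_in = sum(a.datum.flow for a in G.setOfArcs if a.toNode == n)
        flow_out = sum(a.datum.flow for a in G.setOfArcs if a.fromNode == n)
        if abs(flow_in - flow_out) > EPS:
            return False
    return True

def chain(flow_sm, flow_mt):
    """Build s -> m -> t with capacities 2 and 1 and the given flows."""
    s, m, t = (Node(uid, None) for uid in "smt")
    arcs = {Arc(s, m, FlowArcDatum(2, flow_sm)), Arc(m, t, FlowArcDatum(1, flow_mt))}
    return DiGraph(frozenset({s, m, t}), frozenset(arcs))
```

For example, chain(1, 1) is feasible, chain(2, 1) breaks conservation at m, and chain(1, 2) breaks the capacity constraint on the arc into t.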

Cut Capacity

The cut capacity of an s-t cut stCut with source node s and terminal node t of a flow network represented by a digraph G is:
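One way to compute it is to sum the capacities of the arcs that lead from the partition containing s to the partition containing t. The sketch below simplifies the article's representation: instead of a cut object it takes s_part, the set of nodes on the source side, which is an assumption made for brevity.

```python
from collections import namedtuple

Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
FlowArcDatum = namedtuple("FlowArcDatum", ["capacity", "flow"])

def cut_capacity(s_part, setOfArcs):
    """Sum of capacities of arcs leaving s_part (arcs entering it are ignored)."""
    return sum(a.datum.capacity for a in setOfArcs
               if a.fromNode in s_part and a.toNode not in s_part)

s, a, b, t = (Node(uid, None) for uid in "sabt")
arcs = frozenset({
    Arc(s, a, FlowArcDatum(3, 0)), Arc(s, b, FlowArcDatum(2, 0)),
    Arc(a, b, FlowArcDatum(1, 0)),
    Arc(a, t, FlowArcDatum(2, 0)), Arc(b, t, FlowArcDatum(3, 0)),
})
print(cut_capacity({s}, arcs))     # 5
print(cut_capacity({s, b}, arcs))  # 6 (the arc a -> b points back into the s side)
```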

Minimum Capacity Cut

Let stCut = stCut(s,t,cut) be an s-t cut of a flow network represented by a digraph G .

stCut is the minimum capacity cut of the flow network represented by G if there is no other s-t cut stCutCompetitor in this flow network such that the cut capacity of stCutCompetitor is less than the cut capacity of stCut.

Stripping the Flows Away

I would like to refer to the digraph that would be the result of taking a digraph G and stripping away all the flow data from all the nodes in G.setOfNodes and also all the arcs in G.setOfArcs .

Maximum Flow Problem

Given a flow network represented as a digraph G, a source node s in G.setOfNodes and a terminal node t in G.setOfNodes, G can represent a maximum flow problem if:

Label this representation:

Where sourceNodeUid = s.uid , terminalNodeUid = t.uid , and maxFlowProblemStateUid is an identifier for the problem instance.

Maximum Flow Solution

Let maxFlowProblem represent a maximum flow problem . The solution to maxFlowProblem can be represented by a flow network represented as a digraph H .

Digraph H is a feasible solution to the maximum flow problem on input maxFlowProblem if:

strip_flows(maxFlowProblem.G) == strip_flows(H) .

H is a flow network and has feasible flows .

If in addition to 1 and 2:

  • There can be no other flow network represented by digraph K such that strip_flows(G) == strip_flows(K) and find_node_by_uid(t.uid,G).flowIn < find_node_by_uid(t.uid,K).flowIn .

Then H is also an optimal solution to maxFlowProblem .

In other words a feasible maximum flow solution can be represented by a digraph , which:

Is identical to digraph G of the corresponding maximum flow problem with the exception that the n.datum.flowIn , n.datum.flowOut and the a.datum.flow of any of the nodes and arcs may be different.

Represents a flow network that has feasible flows .

And, it can represent an optimal maximum flow solution if additionally:

  • The flowIn for the node corresponding to the terminal node in the maximum flow problem is as large as possible (when conditions 1 and 2 are still satisfied).

If digraph H represents a feasible maximum flow solution, then find_node_by_uid(s.uid,H).flowOut = find_node_by_uid(t.uid,H).flowIn. This follows from the max flow, min cut theorem (discussed below). Informally, since H is assumed to have feasible flows, flow can neither be ‘created’ (except at source node s) nor ‘destroyed’ (except at terminal node t) while crossing any (other) node (conservation constraints).

Since a maximum flow problem contains only a single source node s and a single terminal node t , all flow ‘created’ at s must be ‘destroyed’ at t or the flow network does not have feasible flows (the conservation constraint would have been violated).

Let digraph H represent a feasible maximum flow solution ; the value above is called the s-t Flow Value of H .

This means that mfps is a successor state of maxFlowProblem, which just means that mfps is exactly like maxFlowProblem with the exception that the values of a.flow for arcs a in maxFlowProblem.setOfArcs may be different than a.flow for arcs a in mfps.setOfArcs.

Here’s a visualization of a mfps along with its associated maxFlowProb. Each arc a in the image has a label, and these labels are a.datum.flowFrom / a.datum.flowTo. Each node n in the image has a label, and these labels are n.uid {n.datum.flowIn / n.datum.flowOut}.

s-t Cut Flow

Let mfps represent a MaxFlowProblemState and let stCut represent a cut of mfps.G . The cut flow of stCut is defined:

s-t cut flow is the sum of flows from the partition containing the source node to the partition containing the terminal node minus the sum of flows from the partition containing the terminal node to the partition containing the source node .
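As a small illustration of this definition, the sketch below computes the net flow across a cut. As a simplifying assumption, the cut is given as s_part, the set of nodes on the source side, rather than as a cut object. The example flows are feasible with s-t flow value 5, and every s-t cut of the example has the same cut flow.

```python
from collections import namedtuple

Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
FlowArcDatum = namedtuple("FlowArcDatum", ["capacity", "flow"])

def get_stcut_flow(s_part, setOfArcs):
    """Flow leaving s_part minus flow entering s_part."""
    forward = sum(a.datum.flow for a in setOfArcs
                  if a.fromNode in s_part and a.toNode not in s_part)
    backward = sum(a.datum.flow for a in setOfArcs
                   if a.fromNode not in s_part and a.toNode in s_part)
    return forward - backward

s, a, b, t = (Node(uid, None) for uid in "sabt")
arcs = frozenset({
    Arc(s, a, FlowArcDatum(3, 3)), Arc(s, b, FlowArcDatum(2, 2)),
    Arc(a, b, FlowArcDatum(1, 1)),
    Arc(a, t, FlowArcDatum(2, 2)), Arc(b, t, FlowArcDatum(3, 3)),
})
# For the cut {s, b}: 3 + 3 flows forward and 1 flows back over a -> b.
print(get_stcut_flow({s, b}, arcs))  # 5
```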

Max Flow, Min Cut

Let maxFlowProblem represent a maximum flow problem and let the solution to maxFlowProblem be represented by a flow network represented as digraph H .

Let minStCut be the minimum capacity cut of the flow network represented by maxFlowProblem.G .

Because in the maximum flow problem flow originates in only a single source node and terminates at a single terminal node, and because of the capacity constraints and the conservation constraints, we know that all of the flow entering maxFlowProblem.terminalNodeUid must cross any s-t cut; in particular it must cross minStCut. This means that the s-t flow value of H can be no larger than the cut capacity of minStCut.

Solving the Maximum Flow Problem

The basic idea for solving a maximum flow problem maxFlowProblem is to start with a maximum flow solution represented by digraph H. Such a starting point can be H = strip_flows(maxFlowProblem.G). The task is then to use H and, by some greedy modification of the a.datum.flow values of some arcs a in H.setOfArcs, to produce another maximum flow solution represented by digraph K such that K still represents a flow network with feasible flows and get_flow_value(H, maxFlowProblem) < get_flow_value(K, maxFlowProblem). As long as this process continues, the quality (get_flow_value(K, maxFlowProblem)) of the most recently encountered maximum flow solution (K) is better than that of any other maximum flow solution that has been found. If the process reaches a point where it knows that no other improvement is possible, the process can terminate and it will return the optimal maximum flow solution.

The description above is general and skips many proofs, such as whether such a process is possible or how long it may take. I’ll give a few more details and the algorithm.

The Max Flow, Min Cut Theorem

From the book Flows in Networks by Ford and Fulkerson , the statement of the max flow, min cut theorem (Theorem 5.1) is:

For any network, the maximal flow value from s to t is equal to the minimum cut capacity of all cuts separating s and t .

Using the definitions in this post, that translates to:

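The solution to a maxFlowProblem represented by a flow network represented as digraph H is optimal if the s-t flow value of H is equal to the cut capacity of the minimum capacity cut of maxFlowProblem.G.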

I like this proof of the theorem and Wikipedia has another one .

The max flow, min cut theorem is used to prove the correctness and completeness of the Ford-Fulkerson method .

I’ll also give a proof of the theorem in the section after augmenting paths .

The Ford-Fulkerson Method and the Edmonds-Karp Algorithm

CLRS defines the Ford-Fulkerson method like so (section 26.2):

Residual Graph

The Residual Graph of a flow network represented as the digraph G can be represented as a digraph G_f :

agg_n_to_u_cap(n,u,G_as_dict) returns the sum of a.datum.capacity for all arcs in the subset of G.setOfArcs where arc a is in the subset if a.fromNode = n and a.toNode = u .

agg_n_to_u_flow(n,u,G_as_dict) returns the sum of a.datum.flow for all arcs in the subset of G.setOfArcs where arc a is in the subset if a.fromNode = n and a.toNode = u.

Briefly, the residual graph G_f represents certain actions which can be performed on the digraph G .

Each pair of nodes n,u in G.setOfNodes of the flow network represented by digraph G can generate 0, 1, or 2 arcs in the residual graph G_f of G .

The pair n,u does not generate any arcs in G_f if there is no arc a in G.setOfArcs such that a.fromNode = n and a.toNode = u .

The pair n,u generates the arc a in G_f.setOfArcs, where a is labeled a push flow arc from n to u, a = Arc(n,u,datum=ResidualDatum(n_to_u_cap_sum - n_to_u_flow_sum)), if n_to_u_cap_sum > n_to_u_flow_sum.

The pair n,u generates the arc a in G_f.setOfArcs, where a is labeled a pull flow arc from u to n, a = Arc(u,n,datum=ResidualDatum(n_to_u_flow_sum)), if n_to_u_flow_sum > 0.0. A pull flow arc runs opposite to the flow it can cancel.

Each push flow arc in G_f.setOfArcs represents the action of adding a total of x <= n_to_u_cap_sum - n_to_u_flow_sum flow to arcs in the subset of G.setOfArcs where arc a is in the subset if a.fromNode = n and a.toNode = u .

Each pull flow arc in G_f.setOfArcs represents the action of subtracting a total of x <= n_to_u_flow_sum flow from arcs in the subset of G.setOfArcs where arc a is in the subset if a.fromNode = n and a.toNode = u.

Performing an individual push or pull action from G_f on the applicable arcs in G might generate a flow network without feasible flows because the capacity constraints or the conservation constraints might be violated in the generated flow network .
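To make the construction concrete, here is a minimal sketch of building a residual graph. It assumes namedtuple types and a ResidualDatum with two fields, residualCapacity and action (the second field is an assumption). Parallel arcs between the same pair of nodes are aggregated, and each pull arc is directed from u to n, opposite to the flow it can cancel, as is usual for residual graphs.

```python
from collections import namedtuple

Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
DiGraph = namedtuple("DiGraph", ["setOfNodes", "setOfArcs"])
FlowArcDatum = namedtuple("FlowArcDatum", ["capacity", "flow"])
ResidualDatum = namedtuple("ResidualDatum", ["residualCapacity", "action"])

def get_residual_graph_of(G):
    # Aggregate capacity and flow over parallel arcs from n to u.
    cap, flow = {}, {}
    for a in G.setOfArcs:
        key = (a.fromNode, a.toNode)
        cap[key] = cap.get(key, 0.0) + a.datum.capacity
        flow[key] = flow.get(key, 0.0) + a.datum.flow
    residual_arcs = set()
    for (n, u), n_to_u_cap_sum in cap.items():
        n_to_u_flow_sum = flow[(n, u)]
        if n_to_u_cap_sum > n_to_u_flow_sum:   # room to push more flow n -> u
            unused = n_to_u_cap_sum - n_to_u_flow_sum
            residual_arcs.add(Arc(n, u, ResidualDatum(unused, "push")))
        if n_to_u_flow_sum > 0.0:              # existing flow can be pulled back
            residual_arcs.add(Arc(u, n, ResidualDatum(n_to_u_flow_sum, "pull")))
    return DiGraph(G.setOfNodes, frozenset(residual_arcs))

# Two parallel arcs s -> t: total capacity 5, total flow 1.
s, t = Node("s", None), Node("t", None)
G = DiGraph(frozenset({s, t}),
            frozenset({Arc(s, t, FlowArcDatum(3, 1)), Arc(s, t, FlowArcDatum(2, 0))}))
G_f = get_residual_graph_of(G)
summary = {(a.fromNode.uid, a.toNode.uid, a.datum.residualCapacity, a.datum.action)
           for a in G_f.setOfArcs}
```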

Here’s a visualization of the residual graph of the previous example visualization of a maximum flow solution. In the visualization, each arc a is labeled with a.residualCapacity.

Augmenting Path

Let maxFlowProblem be a max flow problem, and let G_f = get_residual_graph_of(maxFlowProblem.G) be the residual graph of maxFlowProblem.G.

An augmenting path augmentingPath for maxFlowProblem is any path from find_node_by_uid(maxFlowProblem.sourceNode,G_f) to find_node_by_uid(maxFlowProblem.terminalNode,G_f) .

It turns out that an augmenting path augmentingPath can be applied to a max flow solution represented by digraph H generating another max flow solution represented by digraph K where get_flow_value(H, maxFlowProblem) < get_flow_value(K, maxFlowProblem) if H is not optimal .

Here’s how:
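A minimal sketch of such an augment step follows. It makes several simplifying assumptions: there is at most one arc between any ordered pair of nodes, the augmenting path is a list of residual arcs whose ResidualDatum carries a residualCapacity and an action field, pull arcs run opposite to the arc whose flow they reduce, and node data (flowIn, flowOut) is not maintained. TOL is the number of decimal places passed to round.

```python
from collections import namedtuple

Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
DiGraph = namedtuple("DiGraph", ["setOfNodes", "setOfArcs"])
FlowArcDatum = namedtuple("FlowArcDatum", ["capacity", "flow"])
ResidualDatum = namedtuple("ResidualDatum", ["residualCapacity", "action"])

TOL = 10

def augment(augmentingPath, H):
    # The path can carry only as much as its tightest residual arc allows.
    delta = min(r.datum.residualCapacity for r in augmentingPath)
    change = {}
    for r in augmentingPath:
        if r.datum.action == "push":
            change[(r.fromNode, r.toNode)] = delta    # add flow on n -> u
        else:
            change[(r.toNode, r.fromNode)] = -delta   # remove flow on the reversed arc
    new_arcs = frozenset(
        Arc(a.fromNode, a.toNode,
            FlowArcDatum(a.datum.capacity,
                         round(a.datum.flow + change.get((a.fromNode, a.toNode), 0.0), TOL)))
        for a in H.setOfArcs)
    return DiGraph(H.setOfNodes, new_arcs)

# All capacities are 1; one unit of flow currently runs s -> a -> b -> t.
s, a, b, t = (Node(uid, None) for uid in "sabt")
H = DiGraph(frozenset({s, a, b, t}), frozenset({
    Arc(s, a, FlowArcDatum(1, 1)), Arc(s, b, FlowArcDatum(1, 0)),
    Arc(a, b, FlowArcDatum(1, 1)),
    Arc(a, t, FlowArcDatum(1, 0)), Arc(b, t, FlowArcDatum(1, 1)),
}))
augmentingPath = [Arc(s, b, ResidualDatum(1.0, "push")),
                  Arc(b, a, ResidualDatum(1.0, "pull")),
                  Arc(a, t, ResidualDatum(1.0, "push"))]
K = augment(augmentingPath, H)
flows = {(x.fromNode.uid, x.toNode.uid): x.datum.flow for x in K.setOfArcs}
```

In the example, the pull arc cancels the unit of flow on a -> b, which frees both arcs into t to carry flow, so the flow into t rises from 1 to 2.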

In the above, TOL is some tolerance value for rounding the flow values in the network. This is to avoid cascading imprecision of floating point calculations . So, for example, I used TOL = 10 to mean round to 10 significant digits .

Let K = augment(augmentingPath, H), then K represents a feasible max flow solution for maxFlowProblem. For the statement to be true, the flow network represented by K must have feasible flows (not violate the capacity constraint or the conservation constraint).

Here’s why: In the method above, each node added to the new flow network represented by digraph K is either an exact copy of a node from digraph H or a node n which has had the same number added to its n.datum.flowIn as its n.datum.flowOut. This means that the conservation constraint is satisfied in K as long as it was satisfied in H. The capacity constraint is satisfied because we explicitly check that any new arc a in the network has a.datum.flow <= a.datum.capacity; thus, as long as the arcs from the set H.setOfArcs which were copied unmodified into K.setOfArcs do not violate the capacity constraint, then K does not violate the capacity constraint.

It’s also true that get_flow_value(H, maxFlowProblem) < get_flow_value(K, maxFlowProblem) if H is not optimal .

Here’s why: If an augmenting path augmentingPath exists in the digraph representation of the residual graph G_f of a max flow problem maxFlowProblem, then the last arc a on augmentingPath must be a ‘push’ arc with a.toNode == maxFlowProblem.terminalNode. An augmenting path is defined as one which terminates at the terminal node of the max flow problem for which it is an augmenting path. From the definition of the residual graph, the last arc in an augmenting path must be a ‘push’ arc: a ‘pull’ arc b with b.toNode == maxFlowProblem.terminalNode could exist only if some arc in G carried flow out of the terminal node (pull arcs run opposite to the flow they reduce), and by definition a terminal node has no outgoing arcs. Additionally, from the definition of path, the terminal node appears only once, so it is modified only once by the augment method. Thus augment modifies maxFlowProblem.terminalNode.flowIn exactly once, through the last arc in augmentingPath. From the definition of augment as it applies to ‘push’ arcs, maxFlowProblem.terminalNode.flowIn can only be increased, not decreased.

Some Proofs from Sedgewick and Wayne

The book Algorithms, fourth edition by Robert Sedgewick and Kevin Wayne has some wonderful and short proofs (pages 892-894) that will be useful. I’ll recreate them here, though I’ll use language fitting in with previous definitions. My labels for the proofs are the same as in the Sedgewick book.

Proposition E: For any digraph H representing a feasible maximum flow solution to a maximum flow problem maxFlowProblem , for any stCut get_stcut_flow(stCut,H,maxFlowProblem) = get_flow_value(H, maxFlowProblem) .

Proof: Let stCut=stCut(maxFlowProblem.sourceNode,maxFlowProblem.terminalNode,set([a for a in H.setOfArcs if a.toNode == maxFlowProblem.terminalNode])). Proposition E holds for stCut directly from the definition of s-t flow value. Suppose that we wish to move some node n from the s-partition (get_first_part(stCut.cut, G)) into the t-partition (get_second_part(stCut.cut, G)). To do so we need to change stCut.cut, which could change stcut_flow = get_stcut_flow(stCut,H,maxFlowProblem) and invalidate proposition E. However, let’s see how the value of stcut_flow will change as we make this change.

Node n is at equilibrium, meaning that the sum of flow into node n is equal to the sum of flow out of it (this is necessary for H to represent a feasible solution). Notice that all flow which is part of the stcut_flow entering node n enters it from the s-partition (flow entering node n from the t-partition either directly or indirectly would not have been counted in the stcut_flow value because it is heading the wrong direction based on the definition). Additionally, all flow exiting n will eventually (directly or indirectly) flow into the terminal node (proved earlier).

When we move node n into the t-partition, all the flow entering n from the s-partition must be added to the new stcut_flow value; however, all flow exiting n must then be subtracted from the new stcut_flow value. The part of the flow heading directly into the t-partition is subtracted because this flow is now internal to the new t-partition and is not counted as stcut_flow. The part of the flow from n heading into nodes in the s-partition must also be subtracted from stcut_flow: after n is moved into the t-partition, these flows are directed from the t-partition into the s-partition and so must not be counted toward stcut_flow. Since these flows are removed, the inflow into the s-partition must be reduced by the sum of these flows, and the outflow from the s-partition into the t-partition (where all flows from s-t must end up) must be reduced by the same amount.

As node n was at equilibrium prior to the process, the update will have added the same value to the new stcut_flow value as it subtracted, thus leaving proposition E true after the update. The validity of proposition E then follows from induction on the size of the t-partition.

Here are some example flow networks to help visualize the less obvious cases where proposition E holds; in the image, the red areas indicate the s-partition, the blue areas represent the t-partition, and the green arcs indicate an s-t cut. In the second image, flow between node A and node B increases while the flow into terminal node t doesn’t change:

Corollary: No s-t cut flow value can exceed the capacity of any s-t cut .

Proposition F. (max flow, min cut theorem): Let f be an s-t flow . The following 3 conditions are equivalent:

There exists an s-t cut whose capacity equals the value of the flow f .

f is a max flow .

There is no augmenting path with respect to f .

Condition 1 implies condition 2 by the corollary. Condition 2 implies condition 3 because the existence of an augmenting path implies the existence of a flow with a larger value, contradicting the maximality of f. Condition 3 implies condition 1: Let C_s be the set of all nodes that can be reached from s with an augmenting path in the residual graph. Let C_t be the remaining nodes; then t must be in C_t (by our assumption). The arcs crossing from C_s to C_t then form an s-t cut which contains only arcs a where either a.datum.capacity = a.datum.flow or a.datum.flow = 0.0. If this were otherwise, then the nodes connected by an arc with remaining residual capacity to C_s would be in the set C_s, since there would then be an augmenting path from s to such a node. The flow across the s-t cut is equal to the s-t cut’s capacity (since arcs from C_s to C_t have flow equal to capacity) and also to the value of the s-t flow (by proposition E).

This statement of the max flow, min cut theorem implies the earlier statement from Flows in Networks .

Corollary (integrality property): When capacities are integers, there exists an integer-valued max flow, and the Ford-Fulkerson algorithm finds it.

Proof: Each augmenting path increases the flow by a positive integer, the minimum of the unused capacities in the ‘push’ arcs and the flows in the ‘pull’ arcs , all of which are always positive integers.

This justifies the Ford-Fulkerson method description from CLRS. The method is to keep finding augmenting paths and applying augment to the latest maxFlowSolution, coming up with better solutions, until no augmenting path remains, meaning that the latest maximum flow solution is optimal.

From Ford-Fulkerson to Edmonds-Karp

The remaining questions regarding solving maximum flow problems are:

How should augmenting paths be constructed?

Will the method terminate if we use real numbers and not integers?

How long will it take to terminate (if it does)?

The Edmonds-Karp algorithm specifies that each augmenting path is constructed by a breadth first search ( BFS ) of the residual graph ; it turns out that this decision of point 1 above will also force the algorithm to terminate (point 2) and allows the asymptotic time and space complexity to be determined.

First, here’s a BFS implementation:
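What follows is a minimal sketch of such a search rather than a definitive implementation. It assumes namedtuple types, and returns a shortest path as a list of arcs, or None when the terminal node cannot be reached.

```python
from collections import deque, namedtuple

Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
DiGraph = namedtuple("DiGraph", ["setOfNodes", "setOfArcs"])

def bfs(sourceNode, terminalNode, G_f):
    """Return a shortest path (list of arcs) from sourceNode to terminalNode, or None."""
    arcs_from = {}
    for a in G_f.setOfArcs:
        arcs_from.setdefault(a.fromNode, []).append(a)
    arc_into = {sourceNode: None}   # visited node -> arc used to reach it
    queue = deque([sourceNode])
    while queue:
        n = queue.popleft()
        if n == terminalNode:
            break
        for a in arcs_from.get(n, []):
            if a.toNode not in arc_into:
                arc_into[a.toNode] = a
                queue.append(a.toNode)
    if terminalNode not in arc_into:
        return None
    # Walk backwards from the terminal node to rebuild the path.
    path = []
    n = terminalNode
    while arc_into[n] is not None:
        path.append(arc_into[n])
        n = arc_into[n].fromNode
    return path[::-1]

# s -> a -> t is shorter than s -> b -> c -> t.
s, a, b, c, t, lonely = (Node(uid, None) for uid in ["s", "a", "b", "c", "t", "lonely"])
G_f = DiGraph(frozenset({s, a, b, c, t, lonely}), frozenset({
    Arc(s, a, None), Arc(a, t, None),
    Arc(s, b, None), Arc(b, c, None), Arc(c, t, None),
}))
shortest = [(x.fromNode.uid, x.toNode.uid) for x in bfs(s, t, G_f)]
```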

I used a deque from the Python collections module.

To answer question 2 above, I’ll paraphrase another proof from Sedgewick and Wayne:

Proposition G. The number of augmenting paths needed in the Edmonds-Karp algorithm with N nodes and A arcs is at most NA/2.

Proof: Every augmenting path has a bottleneck arc: an arc that is deleted from the residual graph because it corresponds either to a ‘push’ arc that becomes filled to capacity or a ‘pull’ arc through which the flow becomes 0. Each time an arc becomes a bottleneck arc, the length of any augmenting path through it must increase by at least 2. This is because each node in a path may appear only once or not at all (from the definition of path). Since the paths are being explored from shortest to longest, at least one more node must be visited by the next path that goes through the particular bottleneck arc, which means an additional 2 arcs on the path before we arrive at the node. Since an augmenting path is of length at most N, each arc can be on at most N/2 augmenting paths, and the total number of augmenting paths is at most NA/2.

The Edmonds-Karp algorithm executes in O(NA^2). If at most NA/2 paths will be explored during the algorithm and exploring each path with BFS takes O(N+A) time, then the most significant term of the product, and hence the asymptotic complexity, is O(NA^2).
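Spelled out, the product looks like this. The last step assumes a connected network, so that A >= N - 1; that assumption is mine and is not stated in the text.

```latex
\frac{NA}{2}\,(N + A) \;=\; \frac{N^{2}A}{2} + \frac{NA^{2}}{2},
\qquad
N \le A + 1 \;\Rightarrow\; N^{2}A \le (A + 1)\,NA = NA^{2} + NA,
\qquad\text{so}\qquad
\frac{NA}{2}\,(N + A) \in O(NA^{2}).
```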

Let mfp be a maxFlowProblemState .

The version above is inefficient and has worse complexity than O(NA^2) since it constructs a new maximum flow solution and a new residual graph each time (rather than modifying existing digraphs as the algorithm advances). To get to a true O(NA^2) solution the algorithm must maintain both the digraph representing the maximum flow problem state and its associated residual graph. So the algorithm must avoid iterating over arcs and nodes unnecessarily and update their values and associated values in the residual graph only as necessary.

To write a faster Edmonds-Karp algorithm, I rewrote several pieces of code from the above. I hope that going through the code which generated a new digraph was helpful in understanding what’s going on. In the fast algorithm, I use some new tricks and Python data structures that I don’t want to go over in detail. I will say that a.fromNode and a.toNode are now treated as strings holding the uids of nodes. For this code, let mfps be a maxFlowProblemState.
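As a minimal sketch of the in-place idea (a stand-in under stated assumptions, not necessarily the structure used for mfps), the version below keeps a single residual-capacity dictionary keyed by uid strings and updates it along each BFS path instead of rebuilding digraphs. The input format, a dictionary of dictionaries of capacities, is an assumption, and no rounding with TOL is applied.

```python
from collections import deque

def edmonds_karp(capacity, s, t):
    """capacity: dict u -> dict v -> capacity. Returns (flow value, net flow per arc)."""
    # residual[u][v] is the remaining capacity from u to v; reverse entries start at 0.
    residual = {s: {}, t: {}}
    for u, targets in capacity.items():
        for v, c in targets.items():
            residual.setdefault(u, {}).setdefault(v, 0)
            residual[u][v] += c
            residual.setdefault(v, {}).setdefault(u, 0)
    value = 0
    while True:
        # BFS finds a shortest augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break   # no augmenting path: the current flow is optimal
        # The bottleneck is the smallest residual capacity on the path.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Update the residual capacities in place along the path.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        value += bottleneck
    flow = {u: {v: max(0, c - residual[u][v]) for v, c in targets.items()}
            for u, targets in capacity.items()}
    return value, flow

capacity = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
value, flow = edmonds_karp(capacity, "s", "t")
print(value)  # 5
```

The returned flow is the net flow on each input arc, which is all that can be recovered from one residual dictionary when arcs in both directions exist between the same pair of nodes.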

Here’s a visualization of how this algorithm solves the example flow network from above. The visualization shows the steps as they are reflected in the digraph G representing the most up-to-date flow network and as they are reflected in the residual graph of that flow network. Augmenting paths in the residual graph are shown as red paths, and in the digraph representing the problem, the set of nodes and arcs affected by a given augmenting path is highlighted in green. In each case, I’ll highlight the parts of the graph that will be changed (in red or green) and then show the graph after the changes (just in black).

Here’s another visualization of how this algorithm solves a different example flow network. Notice that this one uses real numbers and contains multiple arcs with the same fromNode and toNode values.

Also notice that because arcs with a ‘pull’ ResidualDatum may be part of the augmenting path, the nodes affected in the digraph of the flow network may not be on a path in G!

Bipartite Graphs

Suppose we have a digraph G. G is bipartite if it’s possible to partition the nodes in G.setOfNodes into two sets (part_1 and part_2) such that for any arc a in G.setOfArcs it cannot be true that a.fromNode in part_1 and a.toNode in part_1. It also cannot be true that a.fromNode in part_2 and a.toNode in part_2.

In other words G is bipartite if it can be partitioned into two sets of nodes such that every arc must connect a node in one set to a node in the other set.

Testing Bipartite

Suppose we have a digraph G , we want to test if it is bipartite . We can do this in O(|G.setOfNodes|+|G.setOfArcs|) by greedy coloring the graph into two colors.

First, we need to generate a new digraph H. This graph will have the same set of nodes as G, but it will have more arcs than G. Every arc a in G will create 2 arcs in H; the first arc will be identical to a, and the second arc reverses the direction of a (b = Arc(a.toNode,a.fromNode,a.datum)).
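The test can be sketched as a breadth first two-coloring. Instead of materializing H, the version below records each arc in both directions in a neighbor table, which has the same effect; get_bipartition is an illustrative name and the namedtuple types are assumptions.

```python
from collections import deque, namedtuple

Node = namedtuple("Node", ["uid", "datum"])
Arc = namedtuple("Arc", ["fromNode", "toNode", "datum"])
DiGraph = namedtuple("DiGraph", ["setOfNodes", "setOfArcs"])

def get_bipartition(G):
    """Return (part_1, part_2) if G is bipartite, otherwise None."""
    neighbors = {n: set() for n in G.setOfNodes}
    for a in G.setOfArcs:
        neighbors[a.fromNode].add(a.toNode)
        neighbors[a.toNode].add(a.fromNode)   # plays the role of the reversed arc b
    color = {}
    for start in G.setOfNodes:                # handle every connected component
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            n = queue.popleft()
            for u in neighbors[n]:
                if u not in color:
                    color[u] = 1 - color[n]   # give neighbors the opposite color
                    queue.append(u)
                elif color[u] == color[n]:
                    return None               # two adjacent nodes share a color
    part_1 = frozenset(n for n in color if color[n] == 0)
    part_2 = frozenset(n for n in color if color[n] == 1)
    return part_1, part_2

a, b, c = (Node(uid, None) for uid in "abc")
path_graph = DiGraph(frozenset({a, b, c}),
                     frozenset({Arc(a, b, None), Arc(b, c, None)}))
triangle = DiGraph(frozenset({a, b, c}),
                   frozenset({Arc(a, b, None), Arc(b, c, None), Arc(c, a, None)}))
```

Each node and each arc is handled a constant number of times, which matches the O(|G.setOfNodes|+|G.setOfArcs|) bound.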

Matchings and Maximum Matchings

Suppose we have a digraph G and matching is a subset of arcs from G.setOfArcs . matching is a matching if for any two arcs a and b in matching : len(frozenset( {a.fromNode} ).union( {a.toNode} ).union( {b.fromNode} ).union( {b.toNode} )) == 4 . In other words, no two arcs in a matching share a node .

Matching matching , is a maximum matching if there is no other matching alt_matching in G such that len(matching) < len(alt_matching) . In other words, matching is a maximum matching if it is the largest set of arcs from G.setOfArcs that still satisfies the definition of matching (the addition of any arc not already in the matching will break the matching definition).

A maximum matching matching is a perfect matching if for every node n in G.setOfNodes there exists an arc a in matching where a.fromNode == n or a.toNode == n.

Maximum Bipartite Matching

A maximum bipartite matching is a maximum matching on a digraph G which is bipartite .

Given that G is bipartite , the problem of finding a maximum bipartite matching can be transformed into a maximum flow problem solvable with the Edmonds-Karp algorithm and then the maximum bipartite matching can be recovered from the solution to the maximum flow problem .

Let bipartition be a bipartition of G .

To do this, I need to generate a new digraph (H) with some new nodes (H.setOfNodes) and some new arcs (H.setOfArcs). H.setOfNodes contains all the nodes in G.setOfNodes and two more nodes, s (a source node) and t (a terminal node).

H.setOfArcs will contain one arc for each arc in G.setOfArcs. If an arc a is in G.setOfArcs and a.fromNode is in bipartition.firstPart and a.toNode is in bipartition.secondPart then include a in H (adding a FlowArcDatum(1,0)).

If a.fromNode is in bipartition.secondPart and a.toNode is in bipartition.firstPart , then include Arc(a.toNode,a.fromNode,FlowArcDatum(1,0)) in H.setOfArcs .

The definition of a bipartite graph ensures that no arc connects any nodes where both nodes are in the same partition. H.setOfArcs also contains an arc from node s to each node in bipartition.firstPart. Finally, H.setOfArcs contains an arc from each node in bipartition.secondPart to node t. a.datum.capacity = 1 for all a in H.setOfArcs.

First partition the nodes in G.setOfNodes into two disjoint sets (part1 and part2) such that no arc in G.setOfArcs is directed from one set to the same set (this partition is possible because G is bipartite). Next, add all arcs in G.setOfArcs which are directed from part1 to part2 into H.setOfArcs. Then create a single source node s and a single terminal node t and create some more arcs: one from s to each node in part1 and one from each node in part2 to t.

Then, construct a maxFlowProblemState .
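The whole reduction can be sketched compactly with plain dictionaries. This is an illustration under assumptions rather than the article's maxFlowProblemState: nodes are hashable labels, edges are (u, v) pairs already oriented from the first part to the second, and the two tuple sentinels used for s and t are hypothetical and assumed not to clash with real nodes.

```python
from collections import deque

def max_bipartite_matching(first_part, second_part, edges):
    """edges: list of (u, v) with u in first_part and v in second_part."""
    s, t = ("source",), ("sink",)   # hypothetical sentinel nodes
    residual = {s: {}, t: {}}

    def add_arc(u, v):
        residual.setdefault(u, {})[v] = 1             # capacity 1, flow 0
        residual.setdefault(v, {}).setdefault(u, 0)   # reverse residual entry

    for u in first_part:
        add_arc(s, u)
    for v in second_part:
        add_arc(v, t)
    for u, v in edges:
        add_arc(u, v)

    while True:
        # BFS for a shortest augmenting path from s to t.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        # All capacities are 1, so every augmenting path carries exactly 1 unit.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= 1
            residual[v][u] += 1
            v = u
    # An original edge is in the matching exactly when it is saturated.
    return {(u, v) for u, v in edges if residual[u][v] == 0}

workers = ["w1", "w2", "w3"]
tasks = ["t1", "t2", "t3"]
edges = [("w1", "t1"), ("w1", "t2"), ("w2", "t1"), ("w3", "t2"), ("w3", "t3")]
matching = max_bipartite_matching(workers, tasks, edges)
```

In the example every worker can be matched, but only if w1 gives up t1 to w2, which is exactly the kind of rerouting the reverse residual entries allow.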

Minimal Node Cover

A node cover in a digraph G is a set of nodes ( cover ) from G.setOfNodes such that for any arc a of G.setOfArcs this must be true: (a.fromNode in cover) or (a.toNode in cover) .

A minimal node cover is the smallest possible set of nodes in the graph that is still a node cover. König’s theorem states that in a bipartite graph, the size of the maximum matching on that graph is equal to the size of the minimal node cover, and it suggests how the node cover can be recovered from a maximum matching:

Suppose we have the bipartition bipartition and the maximum matching matching . Define a new digraph H , H.setOfNodes=G.setOfNodes , the arcs in H.setOfArcs are the union of two sets.

The first set is arcs a in matching, with the change that if a.fromNode in bipartition.firstPart and a.toNode in bipartition.secondPart then a.fromNode and a.toNode are swapped in the created arc (give such arcs an a.datum.inMatching=True attribute to indicate that they were derived from arcs in a matching).

The second set is arcs a NOT in matching , with the change that if a.fromNode in bipartition.secondPart and a.toNode in bipartition.firstPart then a.fromNode and a.toNode are swapped in the created arc (give such arcs a a.datum.inMatching=False attribute).

Next, run a depth first search (DFS) starting from each node n in bipartition.firstPart which is neither n == a.fromNode nor n == a.toNode for any arc a in matching. During the DFS, some nodes are visited and some are not (store this information in a n.datum.visited field). The minimal node cover is the union of the nodes {a.fromNode for a in H.setOfArcs if ((a.datum.inMatching) and (a.fromNode.datum.visited))} and the nodes {a.toNode for a in H.setOfArcs if ((a.datum.inMatching) and (not a.toNode.datum.visited))}. Because arcs in the matching are directed from bipartition.secondPart to bipartition.firstPart in H, the first set picks visited nodes of the second part and the second set picks unvisited nodes of the first part.

This can be shown to lead from a maximum matching to a minimal node cover by a proof by contradiction: take some arc a that was supposedly not covered and consider all four cases regarding whether a.fromNode and a.toNode belong (whether as toNode or fromNode) to any arc in matching matching. Each case leads to a contradiction due to the order that DFS visits nodes and the fact that matching is a maximum matching.

Suppose we have a function to execute these steps and return the set of nodes comprising the minimal node cover when given the digraph G , and the maximum matching matching :
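A minimal sketch of such a function is shown below, using plain labels and (u, v) pairs instead of digraph objects; that representation and the name min_node_cover are assumptions. Every pair is oriented from the first part to the second, and matching is assumed to be a maximum matching.

```python
def min_node_cover(first_part, edges, matching):
    """edges and matching hold (u, v) pairs with u in the first part."""
    matched_first = {u for u, v in matching}
    # Alternating arcs: non-matching edges run first -> second,
    # matching edges run second -> first.
    out = {}
    for u, v in edges:
        if (u, v) in matching:
            out.setdefault(v, []).append(u)
        else:
            out.setdefault(u, []).append(v)
    # Iterative DFS from every unmatched node of the first part.
    visited = set()
    stack = [u for u in first_part if u not in matched_first]
    while stack:
        n = stack.pop()
        if n in visited:
            continue
        visited.add(n)
        stack.extend(out.get(n, []))
    # Visited nodes of the second part plus unvisited nodes of the first part.
    return ({v for u, v in matching if v in visited}
            | {u for u, v in matching if u not in visited})

# l1 is matched to r1 and also adjacent to r2: the cover must be {l1}.
cover_a = min_node_cover(["l1"], [("l1", "r1"), ("l1", "r2")], {("l1", "r1")})
# l1 and l2 both want r1: the cover must be {r1}.
cover_b = min_node_cover(["l1", "l2"], [("l1", "r1"), ("l2", "r1")], {("l1", "r1")})
```

In both examples the cover has the same size as the matching, as König’s theorem requires.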

The Linear Assignment Problem

The linear assignment problem consists of finding a maximum weight matching in a weighted bipartite graph.

Problems like the one at the very start of this post can be expressed as a linear assignment problem . Given a set of workers, a set of tasks, and a function indicating the profitability of an assignment of one worker to one task, we want to maximize the sum of all assignments that we make; this is a linear assignment problem .

Assume that the number of tasks and workers are equal, though I will show that this assumption is easy to remove. In the implementation, I represent arc weights with an attribute a.datum.weight for an arc a .

Kuhn-Munkres Algorithm

The Kuhn-Munkres Algorithm solves the linear assignment problem. A good implementation can take O(N^{4}) time (where N is the number of nodes in the digraph representing the problem). An implementation that is easier to explain takes O(N^{5}) (for a version which regenerates DiGraphs) and O(N^{4}) (for a version which maintains DiGraphs). This is similar to the two different implementations of the Edmonds-Karp algorithm.

For this description, I’m only working with complete bipartite graphs in which half the nodes are in one part of the bipartition and the other half in the second part. In the worker, task motivation this means that there are as many workers as tasks.

This seems like a significant condition (what if these sets are not equal?) but it is easy to fix this issue; I talk about how to do that in the last section.

The version of the algorithm described here uses the useful concept of zero weight arcs . Unfortunately, this concept only makes sense when we are solving a minimization (if rather than maximizing the profits of our worker-task assignments we were instead minimizing the cost of such assignments).

Fortunately, it is easy to turn a maximum linear assignment problem into a minimum linear assignment problem by setting the weight of each arc a to M-a.datum.weight where M=max({a.datum.weight for a in G.setOfArcs}). The solution to the original maximizing problem will be identical to the solution of the minimizing problem after the arc weights are changed. So for the remainder, assume that we make this change.
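A small brute-force check illustrates why the conversion preserves the optimal assignment. The profit matrix is made up for illustration (rows are workers, columns are tasks), and enumerating permutations is only viable for tiny instances.

```python
from itertools import permutations

# Hypothetical profits: profit[worker][task].
profit = [[5, 3, 1],
          [4, 1, 2],
          [2, 2, 6]]

M = max(max(row) for row in profit)
cost = [[M - w for w in row] for row in profit]   # weight -> M - weight

def total(matrix, assignment):
    """assignment[worker] is the task given to that worker."""
    return sum(matrix[worker][task] for worker, task in enumerate(assignment))

candidates = list(permutations(range(len(profit))))
best_by_profit = max(candidates, key=lambda p: total(profit, p))
best_by_cost = min(candidates, key=lambda p: total(cost, p))
print(best_by_profit, best_by_cost)  # (1, 0, 2) (1, 0, 2)
```

Every complete assignment uses exactly one arc per worker, so its cost is 3M minus its profit; maximizing one is the same as minimizing the other.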

The Kuhn-Munkres algorithm solves minimum weight matching in a weighted bipartite graph by a sequence of maximum matchings in unweighted bipartite graphs. If we find a perfect matching on the digraph representation of the linear assignment problem, and if the weight of every arc in the matching is zero, then we have found the minimum weight matching, since every node in the digraph has been matched by an arc with the lowest possible cost (no cost can be lower than 0, based on prior definitions).

No other arcs can be added to the matching (because all nodes are already matched) and no arcs should be removed from the matching because any possible replacement arc will have at least as great a weight value.

If we find a maximum matching of the subgraph of G which contains only zero weight arcs , and it is not a perfect matching , we don’t have a full solution (since the matching is not perfect ). However, we can produce a new digraph H by changing the weights of arcs in G.setOfArcs in a way that new 0-weight arcs appear and the optimal solution of H is the same as the optimal solution of G . Since we guarantee that at least one zero weight arc is produced at each iteration, we guarantee that we will arrive at a perfect matching in no more than |G.setOfNodes|^{2}=N^{2} such iterations.

Suppose that in bipartition bipartition, bipartition.firstPart contains nodes representing workers, and bipartition.secondPart contains nodes representing tasks.

The algorithm starts by generating a new digraph H with H.setOfNodes = G.setOfNodes. The arcs of H are built in two reduction passes.

In the first pass, each node n in bipartition.firstPart generates an arc b for each arc a in bipartition.G.setOfArcs where a.fromNode = n or a.toNode = n, with b=Arc(a.fromNode, a.toNode, ArcWeightDatum(a.datum.weight - z)) where z=min(x.datum.weight for x in G.setOfArcs if ( (x.fromNode == n) or (x.toNode == n) )).

In the second pass, each node n in bipartition.secondPart applies the same reduction to the arcs produced by the first pass: for each such arc a where a.fromNode = n or a.toNode = n, b=Arc(a.fromNode, a.toNode, ArcWeightDatum(a.datum.weight - z)), where z is now the minimum weight among the first-pass arcs incident on n. The arcs produced by the second pass form H.setOfArcs, so every node in H has at least one zero weight arc incident on it.
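The two reductions can be sketched on a cost matrix, where row i holds the arcs at worker i and column j holds the arcs at task j. This is a minimal sketch under that assumption, not the DiGraph-based code from this post; the function name and the numbers are hypothetical.

```python
def reduce_rows_then_columns(cost):
    """Subtract each row's minimum, then each column's minimum of the result."""
    rows = [[c - min(row) for c in row] for row in cost]
    col_mins = [min(col) for col in zip(*rows)]
    return [[c - m for c, m in zip(row, col_mins)] for row in rows]

# Hypothetical costs: cost[i][j] is the cost of giving task j to worker i.
cost = [[4, 2, 8],
        [4, 3, 7],
        [3, 1, 6]]
reduced = reduce_rows_then_columns(cost)

# After both passes every row and every column contains at least one zero.
rows_have_zero = all(0 in row for row in reduced)
cols_have_zero = all(0 in col for col in zip(*reduced))
```

Note that the column minimums are taken from the row-reduced matrix, not from the original costs, so the two reductions accumulate.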

KMA: Next, form a new digraph K composed of only the zero weight arcs from H , and the nodes incident on those arcs . Form a bipartition on the nodes in K , then use solve_mbm( bipartition ) to get a maximum matching ( matching ) on K . If matching is a perfect matching in H (the arcs in matching are incident on all nodes in H.setOfNodes ) then the matching is an optimal solution to the linear assignment problem .
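To make the zero weight subgraph step concrete, here is a small sketch that finds a maximum matching using only zero entries of a reduced matrix. It swaps in a simple depth-first augmenting path search in place of the solve_mbm function used in this post, and it assumes a square list-of-lists matrix; the function name and the sample matrix are hypothetical.

```python
def max_matching_on_zeros(weights):
    """Maximum matching that uses only zero-weight (worker, task) pairs."""
    n = len(weights)
    match_of_task = [None] * n  # match_of_task[j] is the worker matched to task j

    def try_assign(worker, seen):
        # Look for a free task, or one whose current worker can move elsewhere.
        for task in range(n):
            if weights[worker][task] == 0 and task not in seen:
                seen.add(task)
                current = match_of_task[task]
                if current is None or try_assign(current, seen):
                    match_of_task[task] = worker
                    return True
        return False

    for worker in range(n):
        try_assign(worker, set())
    return {w: t for t, w in enumerate(match_of_task) if w is not None}

# Hypothetical reduced weights with zeros at (0,1), (1,0), (1,1), (1,2), (2,1).
reduced = [[1, 0, 2],
           [0, 0, 0],
           [1, 0, 1]]
matching = max_matching_on_zeros(reduced)   # maps worker -> task
is_perfect = len(matching) == len(reduced)
```

Here workers 0 and 2 both have their only zero at task 1, so the matching cannot be perfect and the weights need another adjustment.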

Otherwise, if matching is not perfect, generate the minimal node cover of K using node_cover = get_min_node_cover(matching, bipartition(K)). Next, define z as the smallest weight among the arcs that have neither endpoint in the cover: z=min({a.datum.weight for a in H.setOfArcs if ( (a.fromNode not in node_cover) and (a.toNode not in node_cover) )}). Define nodes = H.setOfNodes and three sets of arcs:

  • arcs1 = {Arc(a.fromNode,a.toNode,ArcWeightDatum(a.datum.weight-z)) for a in H.setOfArcs if ( (a.fromNode not in node_cover) and (a.toNode not in node_cover) )}
  • arcs2 = {Arc(a.fromNode,a.toNode,ArcWeightDatum(a.datum.weight)) for a in H.setOfArcs if ( (a.fromNode not in node_cover) != (a.toNode not in node_cover) )}
  • arcs3 = {Arc(a.fromNode,a.toNode,ArcWeightDatum(a.datum.weight+z)) for a in H.setOfArcs if ( (a.fromNode in node_cover) and (a.toNode in node_cover) )}

The != symbol in arcs2 acts as an XOR operator, so arcs2 holds the arcs with exactly one endpoint in the cover. Then arcs = arcs1.union(arcs2.union(arcs3)). Next, H=DiGraph(nodes,arcs). Go back to the label KMA. The algorithm continues until a perfect matching is produced. This matching is also the solution to the linear assignment problem.
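The reweighting step can be sketched on a matrix as follows. This is an illustration only: the cover is passed in by hand as two sets of indices (in this post it comes from get_min_node_cover), and the function name and the sample values are hypothetical.

```python
def adjust_weights(weights, covered_workers, covered_tasks):
    """Lower doubly-uncovered entries by z and raise doubly-covered entries by z."""
    n = len(weights)
    # z is the smallest weight with neither its worker nor its task in the cover.
    z = min(weights[i][j] for i in range(n) for j in range(n)
            if i not in covered_workers and j not in covered_tasks)
    adjusted = [row[:] for row in weights]
    for i in range(n):
        for j in range(n):
            endpoints_covered = (i in covered_workers) + (j in covered_tasks)
            if endpoints_covered == 0:
                adjusted[i][j] -= z      # plays the role of arcs1
            elif endpoints_covered == 2:
                adjusted[i][j] += z      # plays the role of arcs3
            # exactly one endpoint covered: unchanged (arcs2)
    return adjusted

# Hypothetical reduced weights; worker 1 and task 1 together cover every zero.
reduced = [[1, 0, 2],
           [0, 0, 0],
           [1, 0, 1]]
adjusted = adjust_weights(reduced, covered_workers={1}, covered_tasks={1})
```

In this example the step creates new zeros at (0,0), (2,0) and (2,2), which is enough for a perfect matching on zero entries, for instance worker 0 to task 0, worker 1 to task 2, and worker 2 to task 1.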

This implementation is O(N^{5}) because it generates a new maximum matching matching at each iteration. As with the two earlier implementations of Edmonds-Karp, this algorithm can be modified so that it keeps track of the matching and adapts it intelligently at each iteration. When this is done, the complexity becomes O(N^{4}). A more advanced and more recent version of this algorithm (requiring some more advanced data structures) can run in O(N^{3}). Details of both the simpler implementation above and the more advanced implementation can be found at this post which motivated this blog post.

None of the operations on arc weights modify the final assignment returned by the algorithm. Here’s why: since our input graphs are always complete bipartite graphs with equally sized parts, a solution must match each node in one part to exactly one node in the other part, via the arc between these two nodes. Every operation performed on the arc weights adds or subtracts a constant on all arcs incident on a particular node: the initial reductions subtract a per-node minimum, and the node cover step is equivalent to subtracting z at every uncovered node in bipartition.firstPart and adding z at every covered node in bipartition.secondPart. Because a perfect matching uses exactly one arc at each node, each such operation changes the total weight of every perfect matching by the same amount, so the ranking of perfect matchings by total weight never changes.

Thus when the algorithm terminates at a perfect matching made up only of zero weight arcs, that matching has total weight zero in the final digraph. No arc weight is ever negative, so no perfect matching can be cheaper, and since the ranking has not changed, the same matching is also the cheapest under the original weights. This means that the solution generated is indeed the same as the solution from the original linear assignment problem without any modification of arc weights.
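One way to check this kind of claim on a small case is to shift all weights at one worker and at one task and compare every perfect matching by brute force. The sketch below assumes a list-of-lists cost matrix; the helper names, the matrix, and the shift amounts are all hypothetical.

```python
from itertools import permutations

def total(matrix, perm):
    """Total weight when worker i is matched to task perm[i]."""
    return sum(matrix[i][perm[i]] for i in range(len(matrix)))

def optimal(matrix):
    """Cheapest perfect matching, found by trying every permutation."""
    return min(permutations(range(len(matrix))), key=lambda p: total(matrix, p))

cost = [[2, 4, 8],
        [7, 5, 3],
        [6, 1, 0]]

shifted = [row[:] for row in cost]
for j in range(3):
    shifted[0][j] -= 2   # subtract 2 from every arc at worker 0
for i in range(3):
    shifted[i][1] += 5   # add 5 to every arc at task 1

# Every perfect matching uses one arc at worker 0 and one at task 1,
# so every total moves by the same amount (-2 + 5).
differences = {total(shifted, p) - total(cost, p) for p in permutations(range(3))}
unchanged = optimal(cost) == optimal(shifted)
```

Since all totals move together, the cheapest matching before the shift is still the cheapest after it.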

Unbalanced Assignments

It seems like the algorithm is quite limited since, as described, it operates only on complete bipartite graphs whose two parts are the same size (half the nodes are in one part of the bipartition and the other half in the second part). In the worker-task motivation, this means that there are as many workers as tasks (seems quite limiting).

However, there is an easy transformation that removes this restriction. Suppose that there are fewer workers than tasks. We add some dummy workers (enough to make the two parts the same size). Each dummy worker has an arc directed from the worker to each of the tasks. Each such arc has weight 0 (placing it in a matching gives no added profit). After this change the graph is a complete bipartite graph which we can solve for. Any task assigned a dummy worker is not initiated.

Suppose instead that there are more workers than tasks. We add some dummy tasks (enough to make the two parts the same size). Each dummy task has an arc directed to it from each worker. Each such arc has a weight of 0 (placing it in a matching gives no added profit). After this change the graph is a complete bipartite graph which we can solve for. Any worker assigned to a dummy task is not employed during the period.
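The padding can be sketched on a rectangular weight matrix, where rows are workers and columns are tasks. This is a minimal sketch under that assumption; the function name pad_to_square and the sample weights are hypothetical and are not taken from this post's code or example.

```python
def pad_to_square(weights, fill=0):
    """Add dummy workers (rows) or dummy tasks (columns) of weight `fill`."""
    n_workers, n_tasks = len(weights), len(weights[0])
    size = max(n_workers, n_tasks)
    # Dummy tasks: extend every existing row to `size` columns.
    padded = [row + [fill] * (size - n_tasks) for row in weights]
    # Dummy workers: append whole rows of `fill`.
    padded += [[fill] * size for _ in range(size - n_workers)]
    return padded

# Hypothetical weights for four workers and three tasks: one dummy task is added.
more_workers = pad_to_square([[1, 2, 3],
                              [4, 5, 6],
                              [7, 8, 9],
                              [1, 1, 1]])

# Hypothetical weights for two workers and three tasks: one dummy worker is added.
more_tasks = pad_to_square([[1, 2, 3],
                            [4, 5, 6]])
```

After solving the padded problem, any pair that involves a dummy row or column is simply dropped from the reported assignment.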

A Linear Assignment Example

Finally, let’s do an example with the code I’ve been using. I’m going to modify the example problem from here . We have 3 tasks: we need to clean the bathroom , sweep the floor , and wash the windows .

The workers available to us are Alice, Bob, Charlie, and Diane. Each of the workers gives us the wage they require per task. Here are the wages per worker:

If we want to pay the least amount of money, but still get all the tasks done, who should do what task? Start by introducing a dummy task so that the two parts of the digraph representing the problem are the same size (four workers and four tasks).

Supposing that the problem is encoded in a digraph G, then kuhn_munkres( bipartition(G) ) will solve the problem and return the assignment. It’s easy to verify that the optimal (lowest cost) assignment will cost $5.

Here’s a visualization of the solution the code above generates:

That is it. You now know everything you need to know about the linear assignment problem.

You can find all of the code from this article on GitHub .


Dmitri Ivanovich Arkhipov

Irvine, CA, United States

Member since January 23, 2017

