Permutations in Excel

  • Buffer
  • Sharebar
  • Buffer
jethro's picture

I posted an article about Permutations in Excel once before, back in 2005. I actually made a mistake in my example, and was corrected by a reader in a comment- thanks MathBoss.

I have since had another reader, Dam ask a question that I am not sure of the answer of. So I am opening it up to you Excel experts out there.

Here is the question as Dan wrote it.

How in Excel can I create ALL permutations of a given data set. for example, imagine 5 columns. Each column has a differnet (sic) number of unique entries. Column 1 can be A, B. Column 2 could be A, B, C, D, E and so on. I want to combine 1 item of each 5 columns and keep doing it until I create all possible combinations. The number of permutations (PERMUT) may be very large so I need to automate this. I'd like to avoid building a macro since I am sick of building them and I like the static spreadsheet challenge more.

Add your answers in the comments.

Share this post:

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.
Elias Hedberg's picture

Maybe the following solution

Maybe the following solution could be a start, it will give you all the combinations (but not necessarily all permutations) of two columns and can be extended to more columns by extending the functions (fairly easily but not absolutely trivially, you need to understand what you are doing).

SETUP
• Place the first data set in column A, without blanks.
• Place the second data set in column B, without blanks.

CAVEAT
• The solution will not work completely if there are blanks in the data sets (since the formulas count the number of rows with a value, lets call it "n", and then assumes the data only stretches from row 1 to n; if your data has 8 blanks then it will stretch from row 1 to (n+8) and the solution will miss the last 8 rows).
This can be solved by sorting the data, then cells with values will be grouped together, separated from the empty cells, and can be place at the top of the column.

SOLUTION
Place the following formulas in columns next to each other and fill down as far as necessary.

  1. =IF(ROW()<=COUNTA(A:A)*COUNTA(B:B);INDIRECT("A"&CEILING(ROW()/COUNTA(B:B);1));"")
  2. Place the following formula in a column next to the previous column:
    =IF(ROW()<=COUNTA(A:A)*COUNTA(B:B);INDIRECT("B"&(ROW()-(CEILING(ROW()/COUNTA(B:B);1)-1)*COUNTA(B:B)));"")

I will post a detailed explanation of the solution below.

Elias Hedberg's picture

EXPLANATION of the solution •

EXPLANATION of the solution
• The IF(ROW()<=COUNTA(A:A)*COUNTA(B:B);;"") code in both formulas is there to make sure the list stops when all combinations have been listed. Strictly speaking you could leave it out and only use INDIRECT("A"&CEILING(ROW()/COUNTA(B:B);1)) and INDIRECT("B"&(ROW()-(CEILING(ROW()/COUNTA(B:B);1)-1)*COUNTA(B:B))), respectively.
    What the IF clause does is that it checks if the present row ("ROW()") is less than or equal to ("<=") the total number of combinations ("COUNTA(A:A)*COUNTA(B:B)"), where "COUNTA(A:A)" counts the number of values in column A (excluding any empty cells) and "COUNTA(B:B)" does the same for column B, if it is, then the calculation is performed, if not, then an empty string ("""") is inserted in the cell.

• Breaking down the first formula:
INDIRECT("A"&CEILING(ROW()/COUNTA(B:B);1)
    ▸ INDIRECT(…) lets us reference the cell by its code or name (e.g. "A1"), this also lets us update the cell reference as we move along.
    ▸ "A"&… references a cell in column A and on the row specified by the formula following the "&".
    ▸ CEILING(ROW()/COUNTA(B:B);1) specifies the row and makes sure that the first column shows the values in column A as many times as there are values in column B before moving to the next value.
        Example: if the present row number ("ROW()") is 4 and the total number of items in column B ("COUNTA(B:B)") is 9, then ROW()/COUNTA(B:B) = 4/9 = 0.444…, rounding that value up ("CEILING(…;1)") yields "1", but as soon as we have passed row 9 we will get a value larger than one which will move us to the next cell in column A (and the IF-clause around the entire formula ensures that no more values are listed when we have listed all combinations).

• Breaking down the second formula:
INDIRECT("B"&(ROW()-(CEILING(ROW()/COUNTA(B:B);1)-1)*COUNTA(B:B)))
    ▸ INDIRECT(…) lets us reference the cell by its code or name (e.g. "B1"), just as in the first formula.
    ▸ "B"&… references a cell in column B and on the row specified by the formula following the "&", just as in the first formula (see above).
    ▸ ROW()-(CEILING(ROW()/COUNTA(B:B);1)-1)*COUNTA(B:B)) specifies the row, making sure we loop through the values in column B over and over.
    It does this by taking the present row ("ROW()") and subtracting the number rows we are beyond the last value, rounded down(!) (yes, in spite of using "CEILING(…)", see the example at the end) to the nearest whole multiple of the number of values in column B.
        Example: if we have 12 values in column B and the present cell is "D28", then the returned value for the row will be 28-2*12 = 4 (since 12 fits 2 times in 28).
        Using the actual formula with 12 values in column B and being positioned in cell "D28" we would get:
        ▸ ROW() = 28
        ▸ COUNTA(B:B) = 12
        Hence:
        ▸ ROW()/COUNTA(B:B) = 28/12 = 2.333…
        ▸ (CEILING(ROW()/COUNTA(B:B);1) = CEILING(2.333…;1) = 3
        From this we subtract one ("(CEILING(ROW()/COUNTA(B:B);1)-1)"), since we do not want to subtract the number of values in column B from the present row number the first time we loop through the values in column B (i.e. while we are at rows 1–12), before we multiply with the number of values in column B (its "size" so to say; "(CEILING(ROW()/COUNTA(B:B);1)-1)*COUNTA(B:B)").
        But couldn’t we have rounded "ROW()/COUNTA(B:B)" down instead of up? Yes and no. If we had rounded down instead of up we would have had a problem where the number of values in column B is a multiple of the present row, basically we would be looping rows 0–11 instead of rows 1–12.
        This can, however, be compensated by subtracting one from the present row: INDIRECT("B"&(ROW()-(FLOOR((ROW()-1)/COUNTA(B:B);1)*COUNTA(B:B))), so in either case you have to do some subtraction to get it right.

jethro's picture

Awesome thanks Elia - i will

Awesome thanks Elia - i will have to try this out
Cheers very much
Tim