I need some help here. But you could consider it a MathQuiz. I have been searching for an algorithm that does the following:
Given a set A and a subset of it B. The elements of A have a total order. Determine the index of the B in the sequence of all subsets of A ordered lexicographically.
The difficulty is in doing so without creating the sequence explicitly, but to calculate the index from A and B alone.
Example:
A = {1,2,3,4}
B = {1,2,3}
The ordered sequence of subsets of A is
This is useful if you have some property associated with each element of the sequence, but do not what to store the key and search for it (especially if the property is just a single or a few bits). In this case you can just calculate the index directly and do an array-lookup.
Interesting special case (maybe this is easier): We are interested only in subsets of a certain length:
Given a set A and a subset of it B. The elements of A have a total order. Determine the index of the B in the sequence of all subsets of the same cardinality of A ordered lexicographically.
Example:
A = {1,2,3,4,5}
B = {2,4}
The ordered sequence of subsets of A is
I already know, that at least the latter special case should be solvable by evaluating a polynomial on the index of the elements in their total order.
The 2-element subset case looks as follows:
A ={1,2,...,n) B ={b1,b2}
index = n*(n-1)/2 - (n-b1)*(n-b1+1)/2 + (b2-b1)
This is useful if you have some property associated with each element of the sequence, but do not what to store the key and search for it (especially if the property is just a single or a few bits). In this case you can just calculate the index directly and do an array-lookup.
If this is the only thing you need the subset indices for, then it seems that you don't really need the indices from a lexicographic order; any order on the subsets will do just fine. If A has N elements, then there is a simple mapping between subsets of A and unsigned integers with at most N bits. If you have an ordered array containing the N elements of A, and a possibly unordered array containing the M elements of B, you can compute the index of B in O(M log N) operations. (For each element of B, you need to look up its index in A, then do a bit shift and an addition.) -- JosephDale
You are right, for the general case the bitset encoding is the better choice if you want to achieve the stated objective (save memory). I knew the set-encoding, but didn't think about, because I'm actually more interested in the special case. I cannot also use the set-encoding there, because it creates too much overhead. What I need is e.g. all 8-of-32 sets. -- .gz
Ah, I didn't realize that that was what you were looking for. Here's an idea. Suppose that x1 < x2 < x3, and we want the index of the set {x1, x2, x3} in the lexicographic ordering of all 3-of-N sets. The number of sets that precede {x1, x2, x3} in the ordering is equal to:
That's exactly, what I have come up with by now too. I will post an algorithm shortly. One optimization: I think, that the binominal coefficients may be related and so we can factor out some multiplications. By the way: Do you know an otherwise efficient binominal coefficient calculation for java? -- .gz
This was my initial code (java):
public static int indexOfASubSet(int setMaxElement, int[] orderedSubSet1Based) { int setSize = orderedSubSet1Based.length; int indexAcc = 0; for ( int posInSubSet = 1; posInSubSet <= setSize; posInSubSet++ ) { int a = combinationsOf(setSize - posInSubSet + 1, setMaxElement - (setSize - posInSubSet) - (posInSubSet == 1 ? 0 : orderedSubSet1Based[posInSubSet - 1 - 1])); int b = combinationsOf(setSize - posInSubSet + 1, setMaxElement - (setSize - posInSubSet) - (orderedSubSet1Based[posInSubSet - 1] - 1)); indexAcc += a - b; } return indexAcc; } public static int combinationsOf(int numberOfElements, int setMaxElement) { long accu = 1; for ( int i = 0; i < numberOfElements; i++ ) { accu *= (setMaxElement + i); } for ( int i = 2; i <= numberOfElements; i++ ) { accu /= i; } return (int) accu; }Yes, the func "numberOfElements" is in fact a binomial coeff. And the two call for it can be combined.
I searched the web for an algorithm or even a mathematical description, but could'nt find any. Does anybody know, if this IndexOfaSubSet algorithms has some special name? -- .gz