Custom Collation Order

From discussion on EvidenceDiscussion about FunctorVsClosure and the use of HigherOrderFunctions and LambdaExpressions in general, the following challenge is set:

"Given a list of SKUs, sort them so that those starting with letters are sorted alphabetically, the numeric ones have to be sorted numerically (and appear strictly after all the alphabetic ones) including the ones with a dash in column six but they have to appear just before the 'P'-coded SKUs (i.e., those that start with a 'P')."

In other words, given the following unsorted list of SKUs:

 1029381 
 1390283 
 2393840 
 99934
 5932
 1394821 
 2398221 
 B34982
 P348293
 A348924
 Z398402
 P393482
 13902-85 
 23938-43 
 13948-22 
 23982-27
...they should emerge as follows:
 A348924
 B34982
 13902-85
 13948-22
 23938-43
 23982-27
 P348293
 P393482
 Z398402
 5932
 99934
 1029381
 1390283
 1394821
 2393840
 2398221
A Java 8 Solution
 import java.util.Arrays;
 import java.util.Collections;
 import java.util.List;

public class Demo2 {

// Return s as Long if s is numeric. Return null if it isn't. private static Long obtainNumber(String s) { try { return Long.parseLong(s); } catch (NumberFormatException e) { return null; } }

// Return Long representation of s if it's numeric with a dash in a given position. Return null if it isn't. private static Long obtainNumberFromNumericWithDashInPosition(String s, int position) { if (s.indexOf('-') != position) return null; String sWithoutDash = (new StringBuilder(s)).deleteCharAt(position).toString(); return obtainNumber(sWithoutDash); }

final static int maxNumericLength = Long.toString(Long.MAX_VALUE).length();

// Obtain an internal string representation of an SKU for sorting purposes // such that SKUs starting with letters are sorted alphabetically, // the numeric ones have to be sorted numerically (and appear strictly after // all the alphabetic ones) including the ones with a dash in column six // but they have to appear just before the 'P'-coded SKUs (i.e., those that // start with a 'P'). static String getSorter(String sku) { Long number = obtainNumberFromNumericWithDashInPosition(sku, 6 - 1); if (number != null) return "B" + String.format("%0" + maxNumericLength + "d", number); number = obtainNumber(sku); if (number != null) return "Z" + String.format("%0" + maxNumericLength + "d", number); if (sku.charAt(0) < 'P') return "A" + sku; return "C" + sku; }

public static void main(String[] args) { List<String> list = Arrays.asList( "1029381", "1390283", "2393840", "99934", "5932", "1394821", "2398221", "B34982", "P348293", "A348924", "Z398402", "P393482", "13902-85", "23938-43", "13948-22", "23982-27" );

Collections.sort(list, (str1, str2) -> getSorter(str1).compareTo(getSorter(str2)));

for (String s : list) System.out.println(s); }

}
Constructing a string on which to sort -- which is the usual way to handle this sort of problem in SQL -- seems a bit dirty, but it's simple.

What's this demonstrating exactly?

You proposed SQL and other approaches as alternatives to HOFs on SummaryOfHofExamples. On EvidenceDiscussion, you wrote, "[a]nyhow, without exploring a specific specification and written pattern, it's difficult to say which is the ideal here. There are many ways to skin a cat with two birds."

So, here's a specific specification of an example of a typical problem encountered in custom business application development. I've solved it using a Java 8 LambdaExpression passed to the built-in Collections.sort HigherOrderFunction. Perhaps you can show how some of the alternatives to HOFs that you've suggested compare to it?

What does the HOF get you? It's looks like a one-off kind of problem. (Note that if you have several millions of these, then doing it in RAM may cause problems, but I'll ignore that issue for time being.)

The HOF enables you to inject the specification of the custom collation order into the general-purpose sort algorithm implementation, which makes it a practical demonstration of the HofPattern. It means handling a "one-off kind of problem" like this doesn't require writing a new one-off 'sort' algorithm implementation or altering an existing one.

Yes, RAM is a limitation in this particular example.

I'm sorry, I'm not seeing what the HOF is getting one. Your explanation is not clear to me in terms of practical issues. I don' know what you are comparing it to in your mind that is worse in production. And I don't want to make an SQL version because we'd be comparing SQL to Java, which creates other language-related comparison issues besides HOF's

What the HOF is "getting one" is the ability to inject a customisation (a custom collation order) into a generic algorithm (sort). See HofPattern.

What I'm hoping to compare this to is your list at the bottom of SummaryOfHofExamples, where you suggest a number of alternatives to HOFs but don't explain how they're alternatives. I've compared FunctorObjects with HigherOrderFunctions/LambdaExpressions on FunctorVsClosure, which presumably covers your "objects" alternative, but what about the rest?


Mechanic: "You should add a tri-flux capacitor to your car. It gives injection customization."

Customer: "But why would I need injection customization?"

Mechanic: "Because you car is better with injection customization than without."

Customer: "How so in practical terms?"

Mechanic: "Because you'd have a tri-flux capacitor in your car and other cars don't. Therefore, it's better than the other cars."

Huh? What does that have to do with this page?

If you have an alternative to using HigherOrderFunctions/LambdaExpressions (as you've suggested on SummaryOfHofExamples) to solve this problem, please demonstrate it.


Here is an SQL version. Can't guarantee it or something close works in all dialects. Most of the common dialects have some kind of "CASE" statement(s) that can function as an IF/ELSE kind of construct, but the syntax varies. -t

 SELECT target, sortfield FROM
 (
 SELECT target, 
   CASE
   WHEN instr(target || 'XXXXXX',6,1) = '-' THEN  // sku with a dash [1]
     'B:' || target          // first letter controls general grouping
   WHEN instr(target,1,1) BETWEEN 'A' AND 'O' THEN
     'A:' || target
   WHEN instr(target,1,1) BETWEEN 'P' AND 'Z' THEN
     'C:' || target
   ELSE
     'D:' || right('00000000000000000000' || target, 20)  // pad with zeros [2]
   END CASE AS sortfield
   FROM mytable
 )
 ORDER BY sortfield
[1] The X's are only to avoid index-out-of-range error by guaranteeing min length of 6

[2] If the equivalent of "maxNumericLength" (Java version) is available in a dialect, then we may be able to avoid an assumed max.

Cool. There are a couple of bugs: When there's a dash in the SKU, it should be treated as numeric. I.e., remove the dash and do the right('00000000000000000000' || target, 20) on it. The other bug is that if the first character in 'target' is, say, a '*', it will pad the target with zeros resulting in an incorrect sort. However, these are mere quibbles. Thanks for creating it. In an application where a SQL DBMS is already in use or readily available, it's certainly a reasonable alternative to using Collections.sort(). Of course, there are many more customisable algorithms than those that can be easily implemented in a SQL query.

Now, how about the other alternatives to HOFs at the bottom of SummaryOfHofExamples?

I never said all were always replacements in all circumstances. Perhaps I should clarify that on that topic.

Until you created the SQL example above -- which is appropriate for one circumstance -- and until I created FunctorVsClosure to illustrate FunctorObjects as an alternative to HOFs/lambdas, it wasn't clear how any were replacements in any circumstances. It's still not clear how EVAL is an alternative to HOFs/lambdas (though I can imagine), and it's not clear how "CASE/IF" statements are an alternative to HOFs/lambdas.

Examples are nice to have, we both agree. I believe that both EVAL and CASE/IF were described before, but the scenarios given by the "HOF side" were not fleshed out enough to give further details or sufficiently useful comparisons of worth beyond suggesting possibilities to explore.

I'm not sure what you mean by "the scenarios given by the 'HOF side'", but the scenario on this page was clearly "fleshed out enough" to permit creation of Java and SQL examples. Could you show how it would be implemented using EVAL and CASE/IF?

Other than possibly proving it can be done ("runs"), I don't see any benefit in using those other techniques for this particular example so far.

The benefit is that it would illustrate how EVAL and CASE/IF are alternatives to HOFs, as you suggest at the bottom of SummaryOfHofExamples. If EVAL and CASE/IF are only beneficial alternatives in certain cases, perhaps those cases should be identified and explained.

I don't remember all of the examples that influenced those suggestions. The "weather" example may have been one of them, but since it's allegedly protected by somebody's non-disclosure agreement, we cannot explore the PRACTICAL trade-offs further.


Alright, fine, here's a draft OOP mini-sample:

 listFoo = list(3, 2, 8, 9, 4, etc);
 class sortIt extends Sorter {
   main(listFoo);  // constructor
   comparer(x, y) { return y.someProperty - x.someProperty; }  // method
 }
[How's this work, exactly? If that's the complete code snippet you'd need to sort a list, it doesn't correspond to the usual OO semantics of any programming language I've seen. What does all this syntax mean? -DavidMcLean?]

It's not much different than the pseudo-code found in the NodeJsAndHofGuiDiscussion topics.

[True. The semantics of your 'class' construct weren't well-explained there either. What's it do? It's clearly not the same thing 'class' does in any current language. -DavidMcLean?]

Indeed. Speculating wildly, it appears your 'class' construct actually creates a prototype instance, and a method header with a body defines/overrides that method whilst a method header without a body invokes it, and a constructor is always named 'main'??? Is that what you have in mind? If so, how do you define a custom constructor and then invoke it?

If we're assuming a conventional OO language of the C#/Java variety, it could look like this:

Sort sorter = new Sort<String>() {
            public int compareTo(String str1, String str2) {
                return getSorter(str1).compareTo(getSorter(str2);
            }
        };
        sorter.sort(list);
This example creates an anonymous subclass of Sort with the 'compareTo' method overridden. Assume getSorter() and 'list' exist and are defined as shown at the top of this page. 'Sort' is a generic class that defines sorting a specified collection (of a specified parametrised type) via the 'sort' method. A custom comparison operator can be defined by overriding the default 'compareTo' method.

You could have the constructor do the sorting and eliminate the invocation of 'sorter.sort()', but it's generally considered bad practice to put significant processing in a constructor. See ErrorsInConstructor. However, if we use that approach, it will look like this:

new Sort<String>(list) {
            public int compareTo(String str1, String str2) {
                return getSorter(str1).compareTo(getSorter(str2));
            }
        };
For comparison's sake, the Java LambdaExpression equivalent -- using the Collections.sort() Java 8 code from the top of this page but reformatted to make the functionality of each line roughly correspondent across these three examples -- is:
Collections.sort(list,
            (str1, str2) -> 
                getSorter(str1).compareTo(getSorter(str2))
        );
The Java/C# style is far from the ideal in my opinion. I prefer OO languages that blur the distinction between class and object. Prototype-based OOP tends to be this way. Either way, the "code size" is dependent on specific language design.

Prototype-based OOP is eminently reasonable. However, your example above of what I presume to be prototype-based OOP is unclear. See the comment associated with it.


EditText of this page (last edited October 5, 2014) or FindPage with title or text search