Coding Problem Smalltalk Solution

Here is the ExtremeProgramming approach to solving this problem in Smalltalk.

When we start any kind of programming, we first write a test case. We make a SegmenterTestCase? subclass of TestCase, and implement the simplest possible test.

SegmenterTestCase?>>testOneOne

| input result |
input := #(1).
result := Segmenter segment: input into: 1.
self should: [result = #((1))]

In implementing the test case, we decide we want a Segmenter object, since there is no obvious other home for this method. In compiling the method, we are prompted for the meaning of "Segmenter". We tell the system to make a new class for us. Now we run the test case. It fails, because the method #segment:into: isn't defined. We push the "Define" button on the error window, popping us to a browser with a template for the method.

Segmenter class>>segment: aCollection into: aNumber

^self new segment: aCollection into: aNumber

Now we press "Continue" in the error window, popping up another error window, from which we define:

Segmenter>>segment: aCollection into: aNumber

^Array with: aCollection

Hey, it's the simplest thing that could possibly work for our given test cases. The test case now works. We write another.

testTwoOne

| input result |
input := #(1 2).
result := Segmenter segment: input into: 1.
self should: [result = #((1 2))]

It works, too. We write another:

testTwoTwo

| input result |
input := #(1 2).
result := Segmenter segment: input into: 2.
self should: [result = #((1) (2))]

It breaks. Now we have to really decide on the algorithm. We'll make a stream on aCollection, and take off "the right number" of elements from the stream. The size of the collection divided by the number of segments is a good enough measure for this test case.

segment: aCollection into: aNumber

| reader |
reader := aCollection readStream.
^(1 to: aNumber) collect: [:each | reader next: aCollection size / aNumber]

Now we come to the real problem, different sized segments. First, we write the test case:

testThreeTwo

| input result |
input := #(1 2 3).
result := Segmenter segment: input into: 2.
self should: [result = #((1 2) (3))]

Now we borrow a trick from error diffusion halftoning. We pick an integer number of elements at each step, but carry along the error between the "ideal" number of elements and the real number of elements, feeding it back in at every step.

Right, we all reach now into our kit bag and pull error diffusion halftoning right off the top ... ;->

segment: aCollection into: aNumber

| reader error |
reader := aCollection readStream.
error := 0.
^(1 to: aNumber) collect: 
[:each | 
| elements |
elements := (aCollection size / aNumber + error) rounded.
error := (aCollection size / aNumber + error) - elements.
reader next: elements]

Now we can extract a temporary variable to explain the idea of the ideal number of elements.

segment: aCollection into: aNumber

| reader error |
reader := aCollection readStream.
error := 0.
^(1 to: aNumber) collect: 
[:each | 
| elements idealElements |
idealElements := aCollection size / aNumber + error.
elements := idealElements rounded.
error := idealElements - elements.
reader next: elements]

As a final check, here is Ron's original testcase.

testRon

| input result |
input := #(1 2 3 4 5 6 7 8 9 10).
result := Segmenter segment: input into: 3.
self should: [result = #((1 2 3) (4 5 6) (7 8 9 10))]

It doesn't work, because our algorithm chooses the second segment to have four elements. Re-reading the specification, it doesn't say how the error is distributed, so we change the expected value.

testRon

| input result |
input := #(1 2 3 4 5 6 7 8 9 10).
result := Segmenter segment: input into: 3.
self should: [result = #((1 2 3) (4 5 6 7) (8 9 10))]

Now it works.

I'm still working on cleaning up the code. I like Michael's idea of reducing the problem to creating an array of segment sizes. I'll try that when I get a minute.

I'm back. I couldn't figure out how to make the change with pure refactorings, so I did it by hand. By changing #segment:into: a little, we can return a collection of segment sizes:

Segmenter>>segmentSizes: aCollection into: aNumber

| error |
error := 0.
^(1 to: aNumber) collect: 
[:each | 
| elements idealElements |
idealElements := aCollection size / aNumber + error.
elements := idealElements rounded.
error := idealElements - elements.
elements]

The original method simplifies to:

Segmenter>>segment: aCollection into: aNumber

| reader sizes |
reader := aCollection readStream.
sizes := self segmentSizes: aCollection into: aNumber.
^sizes collect: [:each | reader next: each]

Much nicer. I still get the feeling there is some cool trick with inject:into: that would simplify segmentSizes:into: further.

--KentBeck


Well, hmph. There's this interesting fact that you can fairly divide a number into, say, 5 shares by taking 1/5 of the original number, 1/4 of what's left, 1/3 of what's left, ..., 1/1 of what's left. But the code isn't any more clear, really. You wind up (after factoring the collection out of Kent's #segmentSizes:into:) with:

 segmentSizes: aNumber into: sharesNumber 
| remainder share |
remainder := aNumber.
^((0 to: sharesNumber - 1)
inject: OrderedCollection new
into: 
[ :sum :each |
share := remainder // (sharesNumber - each).
remainder := remainder - share.
sum add: share; yourself]) asArray

To me, that's no better. It seems so simple ... but how to make it really clear? Oh well, it's encapsulated, but I'm afraid this one's gonna need some MethodCommenting. --RonJeffries

You can use collect: instead of inject:into:. Otherwise I like this one better than the one with error feedback.

segmentSizes: aNumber into: sharesNumber

| remainder share |
remainder := aNumber.
^(0 to: sharesNumber - 1) collect: 
[ :each |
share := remainder // (sharesNumber - each).
remainder := remainder - share.
share]


I think an extreme programmer would have to say that the halftoning solution is adding unnecessary complexity to the solution that wasn't called for in the original problem. A simpler solution is to have the first "remainder" collections take an extra element:

segment: aCollection into: aNumber

| reader remainder segmentSize |
reader := aCollection readStream.
remainder := aCollection size \\ aNumber.
segmentSize := aCollection size // aNumber.
^(1 to: aNumber) collect:
[:each |
each <= remainder
ifTrue: [reader next: segmentSize + 1]
ifFalse: [reader next: segmentSize]]

--JohnBrant

Yes! That is what I was doing with "for (int i = 0; i < K; i++) Distrib [i] = N/K + (i < N % K ? 1 : 0);" -- MichaelFeathers

I like splitting the computation into two parts, but here is a segment size calculation using John's idea combined with Gary's idea about simple expressions and lots of temps. How much more documentation do you need? --KentBeck


segmentSizes: aCollection into: aNumber

| segmentSize biggerSegmentCount smallerSegmentCount biggerSegments smallerSegments |
segmentSize := aCollection size // aNumber.
biggerSegmentCount := aCollection size \\ aNumber.
smallerSegmentCount := aNumber - biggerSegmentCount.
biggerSegments := Array new: biggerSegmentCount withAll: segmentSize + 1.
smallerSegments := Array new: smallerSegmentCount withAll: segmentSize.
^biggerSegments , smallerSegments

Interesting, that's about what CodingProblemSolutionByProof ended up with, only without the separate segmentSizes:into: method.


OK, here's a compact solution, but maybe too obscure. It does the FairShare? thing, dividing by N, N-1, etc, all in one whack. I feel like if you don't know that trick, you may not be able to figure this one out. But it's so compact. (Putting on propellor beanie.) What do y'all think? A couple of years ago I'd have left it in as is. Now, after all that Kent Beck torture ... --RonJeffries

 split: aCollection into: anInteger
| reader remaining numberToRead |
reader := aCollection readStream.
remaining := aCollection size.
^(anInteger to: 1 by: -1)
inject: OrderedCollection new
into: 
[ :sum :each |
numberToRead := remaining // each.
remaining := remaining - numberToRead.
sum add: (reader next: numberToRead); yourself]

BTW, with this one, you get this to compare with Kent's above ...

 segmentSizes:aCollection into: aNumber
| segmentSize remaining |
remaining := aCollection size.
^(anInteger to: 1 by: -1)
inject: OrderedCollection new
        into: 
            [ :sum :each |
segmentSize := remaining // each.
remaining := remaining - segmentSize 
sum add: segmentSize; yourself]


I've got to make a silly observation now. In the text above, OrderedCollection is a link in Wiki. What would life be like if compilers and interpreters accepted HTML and development environments used Wiki bases for source editing and communication among team members? I've seen that Wiki is written in HyperPerl but that link is down so I haven't seen the code or its hyper-ness.. could mean that what I'm talking about is old news. Anyone think along these lines? -- MichaelFeathers


Our Smalltalk browsers are set up so that we can highlight any word (typically a class or method name) and browse it. Clicking a class takes us to a browser for the whole class, a method goes to a browser of all implementors of the method. It's nearly good. (Another reason to learn Smalltalk, eh, Michael?) I'd like to see the wiki built in to what we do, though there can be no doubt that the verbal communication we use is better where it's possible. --RonJeffries

Or you could just buy Metrowerks Codewarrior, which does the same thing for Java, C/C++, and Pascal :) --RussellGold


Why is it legitimate in running testRon, above, to change the test to match what the code does? That seems like cheating, to me - and forces future changes to keep the same algorithm. In such cases, I have tended to make my tests forgiving of discrepencies in algorithms, but that is more work... -- RussellGold

It was the simplest thing that could possibly work. Re-reading the specification it became clear to me that the test didn't reflect the specification, so changing it wasn't cheating.

I now am using JohnBrant's generic test. I still like writing specific tests with specific names better than a MonkeyTest? (like testing all segment sizes of all collections up to size 20). Reading the tests reminds what I was thinking of when I wrote the code, and (often more important) what I wasn't thinking of. --KentBeck

Changing the test was legit in this case, as all we wanted was an equitable partitioning. I had to outlaw dealing, in fact, just to get to the interesting part of the problem, which (I think) was finding a clear way to do the front-to-back partition. John's generic test is better, though harder to write. OTOH, writing it might have helped one generate alternative algorithms, by breaking the connection to any particular order of the over- and under-sized partitions. --RonJeffries


A fully general test (that goes through and makes sure the array matches the criteria without regard to where the shorter/longer arrays end up) isn't necessary at first. When one is interested in creating one algorithm and making sure that it doesn't regress, the direct tests are just fine and much easier to produce. DoTheSimplestThingThatCouldPossiblyWork. Now that we're comparing algorithms, the generic tests are better.


EditText of this page (last edited November 7, 2008) or FindPage with title or text search