Coupling And Cohesion

Given two lines of code, A and B, they are coupled when B must change behavior only because A changed.

They are cohesive when a change to A allows B to change so that both add new value.

The difference between CouplingAndCohesion is a distinction on a process of change, not a static analysis of code's quality. But there are plenty of indicators and BestPractices...

Does "must change behavior" refer to source code changes, or run-time results?
It refers to explicit modification of behavior - i.e. the source code. Automatically adapting run-time results in a value-added way more closely matches the concept of cohesion. 'Coupling' is more readily identified by the way things break. Consider: the only reason your B "must" change behavior because A changed is that the change to A broke the behavior of B.
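A minimal Python sketch of "B must change only because A changed". All names here (InventoryA1, InventoryA2, report_coupled, report_decoupled, survives_change) are hypothetical, invented for illustration. One version of B reads A's private representation and breaks when that representation changes; the other uses only A's public method and survives the same change.

```python
class InventoryA1:
    """Unit A, version 1: stores items as a list of (name, qty) pairs."""

    def __init__(self):
        self._items = [("bolt", 3), ("nut", 5)]

    def total(self):
        return sum(qty for _, qty in self._items)


class InventoryA2:
    """Unit A, version 2: same public behavior, but a dict inside."""

    def __init__(self):
        self._items = {"bolt": 3, "nut": 5}

    def total(self):
        return sum(self._items.values())


def report_coupled(inv):
    # Unit B, tightly coupled: depends on A's private representation.
    return sum(qty for _, qty in inv._items)


def report_decoupled(inv):
    # Unit B, loosely coupled: depends only on A's public method.
    return inv.total()


def survives_change(report):
    """True if `report` gives the same answer before and after A changes."""
    try:
        return report(InventoryA1()) == report(InventoryA2())
    except (TypeError, ValueError):
        # The change to A broke B: this is the coupling showing itself.
        return False
```

Here survives_change(report_decoupled) is True and survives_change(report_coupled) is False: the coupled report has to be edited for no reason other than that A changed.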

It's my summary judgement that "change" is too open-ended to make this a rigorous concept, as discussed below. There's an effectively infinite number of ways any given module can "change". One has to first "tame" and classify "change" as a prerequisite to rigor-tizing C-and-C if it's tied to the term "change". --top

The rubric is chosen to give the term the most value in an AgileSoftwareDevelopment context: hardening code against bugs by pushing coupling (C1) down and cohesion (C2) up.

It would be difficult to escape the notion of probability here. There are often myriad potential changes and related cross-influences, but each change path is rarely equal. In practice, we are often weighing probabilities of different kinds of changes, sort of a cost/benefit/probability analysis of each potential change. Most authors who talk about C&C seem to bypass the issue of probability, making questionable or untested assumptions in the process as a substitute. Related: ChangePattern. --top

Probability of change is a non-issue for CouplingAndCohesion, but is important for ChangePatterns. Consider: two units may be tightly coupled even if the probability of change (and consequential breakage) is very low. Because of this, you must distinguish between the coupling and the probability of change.

Almost everything is "coupled" at least by the fact that the names must match.

Uh... unless you mean to say "Almost everything that is 'coupled' via a shared name dependency is 'coupled' by the fact that the names must match" (which is trivially true) you are very much incorrect. For example, the code I'm (supposed to be) working on right now certainly isn't coupled to the code you're working on, and there is no need at all for names to match.

Thus, "coupling" is a matter of degree.

While it is true that coupling is a matter of degree, I can make no sense at all of the main statement just above. Does anyone have any idea what it might mean? I am thinking that it might be equivalent to the claim that programs that have no dependency at all are not coupled (have zero coupling), but I cannot figure out how that leads to "coupling is a matter of degree" unless 0 is a degree, and then I am not sure what the point was. Help wanted here; I think others may also be confused. - ANNON

While your logic leading to it is baffling, I agree with this conclusion. That is why we discuss the desire for "low" external coupling and "high" internal cohesion.

And to calculate a "coupling" score, it would make sense to consider probability of change rather than just the existence of a potential breakage due to change.

Incorrect. It would not make sense to consider the probability of change. But it wouldn't be a bad idea to consider the probability of breakage given a change. Mathematically, that is P(BREAK|CHANGE) = P(BREAK AND CHANGE) / P(CHANGE). In layman terms, this (loosely) is the probability that code unit 'B' will 'break' after an unknown change to code unit 'A'. The closer this probability is to 100%, the higher the coupling from 'B' to 'A'.
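A rough sketch of how P(BREAK|CHANGE) might be estimated from observations. The log format and the function name p_break_given_change are assumptions made for illustration; nothing above prescribes how the data would be gathered.

```python
def p_break_given_change(history):
    """Estimate P(BREAK | CHANGE) from (a_changed, b_broke) observations.

    Only observations where A actually changed are counted, which is
    the same as P(BREAK AND CHANGE) / P(CHANGE) over the log.
    """
    outcomes = [b_broke for a_changed, b_broke in history if a_changed]
    if not outcomes:
        return None  # A never changed, so there is nothing to estimate
    return sum(outcomes) / len(outcomes)


# Hypothetical log: (A changed in this commit, B broke afterwards).
history = [
    (True, True),
    (True, False),
    (True, True),
    (False, False),
    (True, True),
]
```

With this made-up log, A changed four times and B broke after three of them, so the estimate is 0.75. How often A changes does not enter the result, which is the point being argued.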

Perhaps we should make a distinction between calculating coupling and calculating the actual effects of coupling.
Indeed. The above wouldn't truly aim to calculate coupling, but rather to measure it. All 'measurements' are performed through 'effects'. Even distance measurements are based on the effect of distances on, say, reflected light or sound, or on diminishing covering arcs for objects of known height or width, etc. In this case, observing or scoring coupling based on its effects - how it is associated with breakage between modules - is, indeed, a measurement rather than a calculation. Good catch.

We may find that things A and B have coupling points C,D,E; and things J and K have coupling points L,M,N. If we merely count the quantity of coupling points, then the coupling between A-B and J-K would be considered the same. However, in practice, changes to L,M,N may be much less likely than C,D,E. Thus, the utility of measuring coupling is greatly enhanced by considering probability of occurrence. Otherwise, we'd optimize our code for what could be very infrequent events over frequent ones. --top
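The argument can be put in numbers with a small sketch. The probabilities below are invented, and the sketch assumes that any change at a coupling point causes breakage; both are assumptions for illustration only. It reports the raw count and the probability-weighted figure as two separate numbers.

```python
def coupling_count(points):
    """Raw coupling: how many coupling points exist."""
    return len(points)


def expected_breakage(points):
    """Expected number of coupling points hit by a change in one period.

    `points` maps a coupling point to its (assumed) probability of
    changing in that period; the sum is the expected count of changes.
    """
    return sum(points.values())


# Hypothetical per-period change probabilities for each coupling point.
a_b = {"C": 0.5, "D": 0.25, "E": 0.25}
j_k = {"L": 0.0625, "M": 0.0625, "N": 0.125}
```

Both pairs have a coupling count of 3, but the expected breakage is 1.0 for A-B and 0.25 for J-K under these made-up numbers.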

We should optimize our code for ChangePatterns that have been anticipated based on past experience. But coupling is not the same thing as change patterns. Distinguishing between them is useful for providing or understanding advice such as: "Reduce coupling to details or features that are subject to frequent change." Indeed, the utility of 'measuring coupling' is enhanced by such distinction. If we don't distinguish between 'coupling' and 'frequency of change', then there is little purpose in having both phrases.

The distinction between practical couplings and theoretical couplings can make a huge practical difference. For example, a medical application may be coupled to the assumption that a typical human has two arms because it asks the doc to inspect both arms of the patient. In theory, if an alien race with 3+ arms immigrated to Earth, the application could break or become useless. But in practice you'd be labeled silly to dwell on such. However, there are plenty of "anal" or GoldPlating developers out there who take some advice too literally and may build a convoluted indirection layer to reduce the impact of more arms. To avoid getting tangled in our own underwear, we have to know where to draw the line, and change probability is a key tool for that.

These are some of the better-defined qualities that separate good software from bad software. Although they were formalized during the invention of StructuredProgramming, they apply exactly as well to ObjectOrientedProgramming as to any other kind.

Cohesion of a single module/component is the degree to which its responsibilities form a meaningful unit; higher cohesion is better.

Someone had vague reference to decomposability here. Clarification?
How about: 'Cohesion is inversely proportional to the number of responsibilities a module/component has.'

Coupling between modules/components is their degree of mutual interdependence; lower coupling is better.

size: number of connections between routines
intimacy: the directness of the connection between routines
visibility: the prominence of the connection between routines
flexibility: the ease of changing the connections between routines

A first-order principle of software architecture is to increase cohesion and reduce coupling.

Cohesion (interdependency within module) strength/level names : (from worse to better, high cohesion is good)

Coincidental Cohesion : (Worst) Module elements are unrelated
Logical Cohesion : Elements perform similar activities as selected from outside module, i.e. by a flag that selects operation to perform (see also CommandObject).
- i.e. body of function is one huge if-else/switch on operation flag
Temporal Cohesion : operations related only by general time performed (e.g. initialization() or FatalErrorShutdown())
Procedural Cohesion : Elements involved in different but sequential activities, each on different data (usually could be trivially split into multiple modules along linear sequence boundaries)
Communicational Cohesion : unrelated operations except need same data or input
Sequential Cohesion : operations on same data in significant order; output from one function is input to next (pipeline)
Informational Cohesion: a module performs a number of actions, each with its own entry point, with independent code for each action, all performed on the same data structure. Essentially an implementation of an abstract data type.
- i.e. define structure of sales_region_table and its operators: init_table(), update_table(), print_table()
Functional Cohesion : all elements contribute to a single, well-defined task, i.e. a function that performs exactly one operation
- get_engine_temperature(), add_sales_tax()
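Three of the levels above, sketched in Python. The function add_sales_tax and the sales-region table are taken from the list; the 8 percent rate, the do_operation function and the method names are made up for illustration.

```python
TAX_PERCENT = 8  # hypothetical rate


# Logical cohesion: unrelated jobs in one body, selected by a flag.
def do_operation(op, value):
    if op == "tax":
        return value + value * TAX_PERCENT // 100
    elif op == "shout":
        return str(value).upper()
    raise ValueError(f"unknown operation: {op}")


# Functional cohesion: the function performs exactly one operation.
def add_sales_tax(amount_cents):
    return amount_cents + amount_cents * TAX_PERCENT // 100


# Informational cohesion: several entry points, one data structure.
class SalesRegionTable:
    def __init__(self):                 # plays the role of init_table()
        self._sales = {}

    def update(self, region, amount):   # plays the role of update_table()
        self._sales[region] = self._sales.get(region, 0) + amount

    def render(self):                   # plays the role of print_table()
        rows = sorted(self._sales.items())
        return "\n".join(f"{region}: {amount}" for region, amount in rows)
```

A caller of do_operation has to know the flag values and gets back unrelated types depending on the flag; a caller of add_sales_tax only needs to know what sales tax is.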

Coupling (interdependence between modules) level names: (from worse to better, high coupling is bad)

Content/Pathological Coupling : (worst) When a module uses/alters data in another
Control Coupling : 2 modules communicating with a control flag (first tells second what to do via flag)
Common/Global-data Coupling : 2 modules communicating via global data
Stamp/Data-structure Coupling : Communicating via a data structure passed as a parameter. The data structure holds more information than the recipient needs.
Data Coupling : (best) Communicating via parameter passing. The parameters passed are only those that the recipient needs.
No data coupling : independent modules.
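A sketch of four of these levels in Python. Every name below (format_name, CURRENT_DISCOUNT, Customer, the shipping_zone functions) is hypothetical. Content coupling and the fully independent case are not shown.

```python
from dataclasses import dataclass


# Control coupling: the caller steers the callee's logic with a flag.
def format_name(first, last, formal):
    return f"{last}, {first}" if formal else f"{first} {last}"


# Common (global-data) coupling: behavior depends on shared global state.
CURRENT_DISCOUNT = 10  # percent; any module could reassign this


def price_with_global_discount(price):
    return price - price * CURRENT_DISCOUNT // 100


@dataclass
class Customer:
    name: str
    email: str
    postcode: str
    credit_limit: int


# Stamp coupling: receives the whole record but needs one field.
def shipping_zone_stamp(customer):
    return customer.postcode[:2]


# Data coupling: receives exactly the value it needs.
def shipping_zone_data(postcode):
    return postcode[:2]
```

Both shipping_zone functions compute the same thing, but shipping_zone_stamp can be affected by any change to Customer, while shipping_zone_data only cares about what a postcode looks like.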

As usual with software source code metrics, these qualities are difficult (but not necessarily impossible) to reduce to quantitative data that can be meaningfully compared across different projects or organizations, despite their value as qualitative measures.

These measures of quality arose in the context of structured procedural programming, but also apply to other paradigms, including OO; the best OO practices can be seen in these same principles. This should not be surprising; OO did not evolve in a vacuum.

Can coupling be summed up as answering three questions "yes" or "no", thus having 8 possible outcomes? The questions are "share meaning (are related)?", "share (similar) algorithm?" and "share data?" - Mila

No, coupling cannot be summed up as answering those three questions. Sharing data is, however, one form of coupling.

Quoted from "BoulderPatternsGroupMinutesOld?" (not otherwise a page to inspire browsing):

the correct terminology is "tight internal cohesion" and "loose external coupling". This basically means that each method in a class should have one task and the class as a whole should have one major responsibility (tight internal cohesion) and that other classes should not depend on the inner workings of this class but should be designed to the "interface" of the class (loose external coupling). See a recent post by AlanShalloway on this: http://groups.yahoo.com/group/dpexplained/message/108
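One possible reading of "designed to the interface", sketched in Python with typing.Protocol. The Clock, FixedClock and Greeter names are invented for this example and do not come from the quoted minutes.

```python
from typing import Protocol


class Clock(Protocol):
    """The interface other classes are allowed to depend on."""

    def now_hour(self) -> int: ...


class FixedClock:
    """One implementation; its inner workings stay private."""

    def __init__(self, hour):
        self._hour = hour

    def now_hour(self):
        return self._hour


class Greeter:
    """One responsibility: pick a greeting. Knows only the Clock interface."""

    def __init__(self, clock: Clock):
        self._clock = clock

    def greeting(self):
        if self._clock.now_hour() < 12:
            return "Good morning"
        return "Good afternoon"
```

Greeter never touches FixedClock._hour, so any object with a now_hour() method can be substituted without editing Greeter.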

[The following question in DecouplingObjects inspired the creation of this page.]

I am somewhat surprised to find little in-depth discussion of CouplingAndCohesion on the Wiki - that is, to be honest, nothing I can readily identify as in-depth discussion of the two measures per se as opposed to second-order principles of design that are asserted to be beneficial "because they reduce coupling" or "because they promote cohesion". I have an "intuitive" understanding of the notions, gained mainly by osmosis and reading texts which refer to them, but I have a feeling that my knowledge might be incomplete. Or are the concepts so straightforward that no such discussion is warranted? Or did I miss some obvious pointers? -- LaurentBossavit

Someone in TheStructureOfScientificRevolutions claimed that knowledge of these concepts has largely been forgotten in OOP. That is most unfortunate, if true.

As an experienced programmer facile with OO and structured methods, I think that concepts of cohesion and coupling absolutely have their place in OOP/OOD. There are many chunks of OO and/or XP wisdom that fundamentally boil down to those two concepts, appropriately framed.

A class is a coupled blob. The methods and properties must agree with each other on how the class is put together, or else there's little purpose in your class. That is, the internal interface of the class is what couples the implementation to it.

A wise OO practitioner wants that internal interface to be as small as reasonable. When a class gets large or complex, it's a center of coupling and begs to be refactored.

Cohesion represents unity of purpose. We have long ago learned that the most internally unified unit of work is to formally accept inputs and compute outputs. This is the practice espoused by FunctionalProgramming, which takes FunctionalCohesion? to the limits of reasonability - and beyond!

We also know that internal unity is not the same goal as global unity. We object-users (Classists?) want to express unity on as many levels as possible. Zillions of disorganized but simple functions do not express that unity. A well-designed OO program expresses the unity of purpose for groups of related methods by stuffing them in a class together. That cohesion is not about behavior in the atomic sense. It does apply to the boundaries of the interface which the class implements. Large classes or hierarchies can endanger that quality at their respective levels of abstraction.

Perhaps the great thing about OO technology is that it makes design problems so clear. Still we must learn to read the writing on the wall. When you have a solid grounding in phonics, you can learn to read English with far less trouble than without, and you can do so on your own from reading books. CouplingAndCohesion are like the phonics of CodeSmells. By analogy, perhaps books on OO Design focus on the shape of whole words without teaching the letters first. This yields learning if and only if the prerequisites are met.

-- IanKjos

In modular and functional programming coupling is the level of "dependency" between functions. And cohesion is a measure of how closely lines or groups of lines within a given function relate to each other: are they all "doing the same thing" or "contributing to a single goal?" One generally frowns upon global variables and parameters which are flags or codes.

In OO programming, I think we've created hierarchy of encapsulation, each level of which can be subjected to CouplingAndCohesion measures:

Each individual method should have high cohesion and low coupling with other methods.
Each individual class should have high cohesion within the class and low coupling with other classes.

I recall, as a modular programmer learning OO, that I was very impressed that practically every Smalltalk article I saw had discussions of how they improved methods in classes - in ways that improved cohesion within the resulting methods and reduced coupling between methods. Smalltalk culture clearly promoted appropriate CouplingAndCohesion, far more than even the most ardent modular programming texts, and the OO developers didn't even talk about it; it was just "the way things are done." -- JeffGrigg

I am finding in my current project that there is a tendency if one is not careful to swing the pendulum too much one way or the other. When they are too coupled, logic and functionality may be dispersed across several components. Both of these lead to a proto AntiPattern: how does PendulumOfCouplingAndCohesion? sound?

I think that the main issue is that what is "intuitive" to one person is not the same as what is to another. Also, it is difficult to correctly break up the functionalities of a system without error. This is why we RefactorMercilessly right? I sometimes find myself feeling dirty when I'm coding. To me this is a clear indication that this CodeSmells. I am not enamoured of using scents to describe code but my colleagues understood immediately what I was saying when I told them "I feel dirty."

Does this make sense or am I just rambling? -- IainLowe

What I find interesting is that I don't see coupling and cohesion as opposite quantities, such that decreasing the one automatically increases the other. What kind of coding tactics do you find yourself switching between as "the pendulum swings", as you put it?

I think I was a bit DazedAndConfused? when I spoke about cohesion up there. There is a relationship between the two, though: when modules are very de-coupled it stands to reason that they are cohesive. If they were not cohesive, they would require greater insight into the inner workings of the other modules in the system. So there is a pendulum effect but probably something more like HalfPendulumOfCouplingAndCohesion? since you are correct in stating that cohesion will reach a "sweet-spot" beyond which you cannot make a module more cohesive. I need to think about this a bit more... I still haven't pin-pointed the tactic-switching you mention above. -- IainLowe

It seems to me that the case is that a module which is cohesive is necessarily de-coupled. It cannot have unity of purpose if other classes accomplish half of that purpose. That is, while the module may be focused on one task it would not cover all of that task. However, being de-coupled does not imply cohesion. One module which manages three entirely distinct tasks can still be de-coupled from all others. -- James Ferguson

Look at it from a different perspective: increasing coupling increases dependence (thus increasing future instability), and increasing cohesion increases stability. The Holy Grail of CouplingAndCohesion is the BlackBox, which offers the minimum dependence for the maximum stability for its user.

This dictates against PrematureGeneralization. After all, PrematureGeneralization increases an object's dependencies needlessly.
This supports DontRepeatYourself. If I am too interdependent, I've probably repeated myself needlessly and should RefactorMercilessly.
This supports OnceAndOnlyOnce. DontRepeatYourself recommends OAOO whenever possible.
This wants to support YAGNI. After all, if YouArentGonnaNeedIt, why head toward more instability?

-- WyattMatthews

I smell a false dichotomy. Maybe. Coupling is connection crossing a boundary. Cohesion is connection which doesn't cross a boundary. This suggests that the ideal system is a single global space, and the worst is a highly modular space. Then there's geometric complexity which is a measure of dependency range (How many elements are involved in any one transaction). It is all rather more complicated than a single 'rule of thumb'. I suspect we are applying aesthetic judgement as much as any analytical principle. -- RichardHenderson

My comment was not intended to imply a single global space as the ideal, but to imply individual spaces should have a minimum of dependence upon another individual space that it was not derived from. Patterns such as AbstractInteractions increase coupling at one level (the interactions are now required maintenance for the dependent class), but expose less dependence upon other spaces because they deliberately allow the substitution of the object previously required for its execution.

I know. The global thing is the implication of the definitions of coupling and cohesion with a rule to maximize one and minimize the other. I'm not trying to contradict the thesis of the page. I am a great believer in the basic principle. I just think that there is a whole load of expert intuition involved in applying this HeuristicRule, suggesting additional factors are involved.

Perhaps we can identify these forces more appropriately. For example, I would think that Coupling is necessary between parent and child classes, but only cohesion is necessary for siblings.

To me coupling implies direct maintenance costs and makes me think of welding. Breaking the coupling could result in the loss of established quality (short term) or function.
1. Good coupling is seen in a well constructed inheritance tree. I'll term this DeepCoupling because the child classes are exposed to the majority of the parent class (you're welding the entire surface area where the parts join).
2. Bad coupling requires a specific implementation of Class B to run Class A with no inheritance. This situation BEGS for refactoring or an ORB-style implementation. I'll term this as ShallowCoupling because it is done through "ConcreteInteractions".
3. Good coupling is also seen in some ORB? contexts (AbstractInteractions are used). I'll term this as SurfaceCoupling because this coupling is like welding only the outsides of a joint together.
Similarly, cohesion implies the ability to exchange objects with similar interfaces. This concept makes me think of NutsAndBolts or HooksAndSlots.
1. DeepCohesion uses AbstractInteractions.
2. ShallowCohesion uses "ConcreteInteractions".
3. SurfaceCohesion implies nearly impossible communications between the participants.

Any thoughts on my breakdown of these two forces?

-- WyattMatthews

It took me a long time to make sense of this word cohesion which I kept hearing. There seem to be many people who use this word without being able to tell me what it means, or show me code examples of cohesion. I have finally come to some level of understanding. First, if a class has a lot of different behaviour which doesn't naturally go together, it lacks a sense of cohesion. Similarly, if in order to alter some behaviour, one must go fiddle with many different classes, this might be a hint that the behaviour is spread out, rather than being in one cohesive unit. Second, if you look at a class, and everything you need to know about some behaviour is right there, and there isn't any thing else there to clutter up what that class does, then that class has a high sense of cohesion.

Not just what cohesion means, but also coupling. At the top, it says: Coupling applies to any relationship between software components. Perhaps it should say: Coupling is the amount of relationship between software components? (Improvements welcome.) -- ChrisDailey?

Oh. How about from http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?query=coupling&action=Search - "The degree to which components depend on one another." -- ChrisDailey?

Relationships are what make software powerful. "Coupling" is not something to get rid of, but to manage. Encapsulation is a form of coupling, for example. If two things have NO relationship at the time of writing, then yes, perhaps they don't belong together, nor should they be connected (at the time).

Some references:

[Kudos to JeffGrigg for collecting a significant number of relevant pointers.]

"Coupling and Cohesion" slide in a presentation (the bullets above are mainly a restatement from that slide) -- http://www.cc.gatech.edu/computing/classes/cs2390_97_summer/lectures/rdd/slide7.html
ISBN 0136907695 "Practical Guide to Structured Systems Design" (Yourdon Press Computing Series), by Meilir Page-Jones

"Measuring Coupling and Cohesion: An Information-Theory Approach" - a paper from the November 1999 IEEE International Symposium on Software Metrics, by Edward B. Allen and Taghi M. Khoshgoftaar -- http://csdl.computer.org/comp/proceedings/metrics/1999/0403/00/04030119abs.htm (payment required for access)
"Coupling and cohesion in object-oriented design and coding" at ACM Annual Computer Science Conference, by Joel Henry and Donald Gotterbarn -- http://dev.acm.org/pubs/citations/proceedings/csc/228329/p149-henry/ (?)
calculator -- http://141.215.8.244/ccc/coupling.asp (gone since at least 2003-06-22)
http://www.sfcc.spokane.cc.wa.us/bladek/Bladek/CS2w01/COUP_COH.HTM (gone since at least 2003-06-22)
http://www2.umassd.edu/CoursePages/SoftwareEngineering/lectureMat/couplingcohesion.html#cnc (gone since at least 2003-06-22)
- (Apparently was part of "CIS 311 - Software Engineering" course -- http://www2.umassd.edu/CoursePages/SoftwareEngineering/ (gone since at least 2003-06-22) but may not be part of lecture notes any more. ;-)
http://www.cpsc.ucalgary.ca/~jonesb/seng/613/groupwork/sasd/report.html (gone since at least 2003-06-22)

Larry L. Constantine and Ed Yourdon. Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design. 1978.

: Pre-OO. Essentially the origin of those terms in software design, devoting a chapter to each.

Nope! I have a 1975 book that covers all of this! "Reliable Software Through Composite Design", Glenford J. Myers. This might be the earliest book appearance. But it cites an earlier article, "Structured Design", L.L. Constantine, G.J. Myers, W.P. Stevens, IBM Systems Journal, Vol. 13, No. 2, 115-139 (May 1974). Note that the article is coauthored by the book author, Myers, and also by Constantine, who coauthored the later 1978 book mentioned above.
And things had been percolating unpublished for several years prior: it also cites, as the origin of some of the coupling/cohesion terms, the unpublished manuscript from 1971, "Fundamentals of Program System Design", L.L. Constantine again.
So if anyone is looking to give credit, it looks like the lion's share goes to Constantine, and the rest to Myers and Stevens.

[Yes, I did invent the concepts and the original metrics of coupling and cohesion, with first publication in 1968 ("Segmentation and Design Strategies for Modular Programming." In T. O. Barnett and L. L. Constantine, eds., Modular Programming: Proceedings of a National Symposium. Cambridge, Mass.: Information & Systems Press, 1968.) Glen Myers and Wayne Stevens were students and colleagues of mine at IBM's Systems Research Institute where I was on the faculty from 1968-1974. --Larry Constantine]

Links from Chris:

"Reducing Coupling" by Martin Fowler http://martinfowler.com/ieeeSoftware/coupling.pdf
"Dynamic Coupling And Cohesion Metrics For Java Programs" by Aine Mitchell http://www.cs.may.ie/~ainem/contents.html
Unfortunately, it's difficult to find links to papers by David Lorge Parnas. http://www.cas.mcmaster.ca/sqrl/parnas.homepg.html
- Here is one of the best : If you have not read it you should http://www.cs.umd.edu/class/spring2003/cmsc838p/Design/criteria.pdf -MarcGrundfest
- See OnDecomposingSystems
UNC Comp145: http://www.cs.unc.edu/~stotts/COMP145/modules.html

Most examples of CouplingAndCohesion seem to be device-driver-like examples. If one does not deal with device drivers, then those examples are not very good sales material. Some also depend on a pro-subtype viewpoint. Those of us who think ThereAreNoTypes would like to see something else. If the concept depends on subtyping-based ChangePerception?, that is fine by me. I just want to clarify if this is the case. -- top

Types in OO land aren't really all that much different from "domains" in relational-land. If you don't understand that, you need to go re-read ChrisDate.

Types maybe. But it is subtypes that are the issue.

A few of the places coupling and cohesion is discussed: NatureOfOrderDiscussion ExtractMethod CodeNormalization BenefitsOfOo SoftwareMetrics MaintainAbility GoodCode LargeExtremeProgramming RefactoringAndRewriting GradyBooch FundamentalFlawsInProceduralDesigns DontRefineExceptions CppUtxOverview (in regard to testing)

The basic idea is that stuff that works together, should be together.

Put things that fit really well together right next to each other, in one module - seek strong cohesion.
Between modules, don't let them fiddle around with each other's private parts too much - seek low coupling.

It's all about how you cut up your program, into pieces called modules.

To think that you can win the game by putting everything into one module is just as silly as saying, "What, Global Variables are bad? Okay! I'll just put everything into one gigantic structure then, and pass it around. Yay me!" You have missed the point.

Yep! And that "yay me!" solution is the one called "Stamp Coupling". It has enjoyed a resurgence in popularity in FunctionalProgramming, where it is represented as a Monad, as far as I can tell, but outside of FunctionalProgramming, it's not a good idea.

In ObjectOrientedProgramming, the main vehicle of Coupling is Polymorphism, and the main vehicle of Cohesion is Encapsulation.

Eh? I see lots of coupling in ObjectOrientedProgramming implemented by direct instantiation of one class by another. Often these are chained until virtually the entire system is coupled together.
I concur. AbstractConstructor or AbstractFactory patterns could reduce this coupling, but are often painful to use (LanguageSmell). GiladBracha?'s NewspeakLanguage? has a restrictive module system to help with this problem.
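A small Python sketch of the difference. It uses a plain factory callable passed to the constructor, which is a simpler stand-in for the AbstractFactory pattern rather than the pattern itself. All class names are hypothetical.

```python
class SmtpMailer:
    def send(self, to, body):
        # Stand-in for real network I/O.
        raise RuntimeError("would contact a real mail server")


class HardWiredSignup:
    """Coupled by direct instantiation: always builds an SmtpMailer."""

    def __init__(self):
        self.mailer = SmtpMailer()

    def register(self, email):
        self.mailer.send(email, "welcome")


class RecordingMailer:
    """A substitute that only records what would have been sent."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


class Signup:
    """Receives a factory, so the caller picks the concrete mailer class."""

    def __init__(self, mailer_factory):
        self.mailer = mailer_factory()

    def register(self, email):
        self.mailer.send(email, "welcome")
```

Signup(RecordingMailer) can be exercised without a mail server, while HardWiredSignup drags SmtpMailer (and whatever SmtpMailer instantiates in turn) along with it.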

Besides "stuff that works together", another way to cut the pie might be "stuff that changes together", per the SingleResponsibilityPrinciple. Is that a useful way to find or measure cohesion? JLS

Because so much of programming now is basically drawing pipes between modules, it seems to me that StampCoupling has become more common and accepted. (...since, if you do StampCoupling, you don't have to draw in new pipes when you need something- it's just automatically available to you.)

I don't see what pipes have to do with it. Pipes are Sequential Cohesion, and needn't have anything to do with Stamp Coupling. When the total environment is shoved into a data structure, it's common to pass around a pointer to it, but I've never seen it passed around as a gigantic chunk in a pipeline. And that would only work on the read-only part of the data, at that, since pipes pass copies.

If you do only DataCoupling?, then every time you need to get a new wire from module A to module D, then you need to massage A->B, B->C, and C->D. But if you're stamping it, you need far less massaging. You just pull what you need from the wire, no need to change a bunch of code.
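A guess at what this might look like in Python, with four hypothetical stages a, b, c, d and an invented `locale` value that only stage d needs. The dict-based "stamp" is one interpretation of the term, not necessarily the one intended above.

```python
# Data coupling: each stage names every value explicitly, so giving
# `locale` to stage d means editing the signatures of b and c as well.
def d_data(text, locale):
    return f"[{locale}] {text}"


def c_data(text, locale):
    return d_data(text.strip(), locale)


def b_data(text, locale):
    return c_data(text.lower(), locale)


def a_data(text):
    return b_data(text, locale="en")


# Stamp coupling: one context dict travels down the wire; stage d pulls
# what it needs and b and c never mention `locale`.
def d_stamp(ctx):
    return f"[{ctx['locale']}] {ctx['text']}"


def c_stamp(ctx):
    return d_stamp({**ctx, "text": ctx["text"].strip()})


def b_stamp(ctx):
    return c_stamp({**ctx, "text": ctx["text"].lower()})


def a_stamp(text):
    return b_stamp({"text": text, "locale": "en"})
```

Both pipelines return the same string for the same input. The stamp version needs less massaging when a new value is added, at the price that every stage can now see, and silently come to depend on, everything in ctx.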

Programs written in the "loose" languages, like Python and what not, seem to promote this sort of StampCoupling. Then, as things become formalized, efficient, rigid, and secure, a piece of software moves towards DataCoupling?, and is implemented in stricter languages.

Perhaps you could give an example, since your train of thought isn't obvious yet.

Okay, so, I think the confusion rests in different understandings of the phrase "Stamp Coupling."

As such, it's probably best continued on that page: StampCoupling.

Re: "The basic idea is that stuff that works together, should be together."

This is often impossible. Factors often interweave such that there is no one perfect grouping. The real world is multi-dimensional but textual code is limited to 1 or 2 dimensions (2D on small scale). The best we can do is find the best compromise, and which is the best compromise is often a source of HolyWars (for example, grouping by nouns versus grouping by verbs). See GroupRelatedInformation.

True, but unhelpful. You're talking about problems that can arise even when some version of best practice is followed. CouplingAndCohesion primarily addresses problems that arise when best practice is not followed.

It's better for an attempt to have been made to group code based on it working together, than not to do so, even in cases where there is no single perfect answer to grouping.

Except maybe for newbies, almost everyone uses some kind of grouping approach beyond random.

Yes, but it's not a binary issue, because it's not the case that almost everyone follows best practice, even if they don't follow worst practice. For instance, LogicalCohesion? is second-worst on the list, and is not at all uncommon.

Newer methodologies that attempt to address difficulties in grouping, such as GenericProgramming, AspectOrientedProgramming, and perhaps IntentionalProgramming, do not in any way contradict the principles of CouplingAndCohesion, so far as I am aware.

HolyWars over, e.g., OO versus GenericProgramming do not change this: although the two methodologies can contradict each other, each individually is in alignment with CouplingAndCohesion.

[I agree with the person who said, "This is often impossible." However, just because some "theoretical purity" is unknowable or unattainable, doesn't mean that the theory's a bunch of bunk. The purpose of this idea isn't to give you a perfect algorithm to show you the one right way to do something. The purpose of the idea is to help you recognize patterns, and help you think about things. Breaking a problem down into smaller interconnected pieces is pretty universal.]

Because we have to as humans, not because it necessarily fits actual reality. Grouping stuff is an attempt to find the most UsefulLie. One often groups by their perceived probabilities of future changes, but it is often hard to agree on the most likely change patterns.

True, but it pays to keep trying to do better, and gradually over time computer science is learning more about what does and doesn't work well. CouplingAndCohesion is an example of something that was first invented a long time ago, back in the 1970s, but tends to be neglected, when it should be one of the tools in everyone's arsenal.

CouplingAndCohesion has been implemented as actual objective automated metrics quite a few times, but such metrics tend to suffer from the general weakness of the state of the art in automated software analysis. Still, I expect this state of the art to continue to improve, and perhaps eventually the whole topic will become significantly less subjective.

Where are these "automated metrics"? And we have to be careful to make sure the metrics translate into real benefits and that there are not counter-metrics that are being skipped. For example, small "code size" is good, but if it's the only metric we judge on, then really compressed code that might score high on compactness may be difficult to read in practice.

Until then, one does the best one can.

I believe that SoftwareDevelopmentIsGambling. One evaluates the horses relative to each other and then picks the best guesses. There is not always one right answer for every circumstance. Saying "X is always bad" is always bad :-)

It's one thing to say that there isn't always one concrete correct answer; it's another thing to claim that there isn't any objective means possible to measure the degree of correctness of answers in the abstract.

The latter would mean that software is of necessity doomed to forever be an art, not a branch of engineering and science. Some would agree, but this seems a risky proposition in an age where we are beginning to understand even the rules of aesthetics (via evolutionary psychology and via tentative systems like Alexander's).

One does not know if they made the right choice until things actually do change. That is how we test our change-handling concepts in the end. However, this does not provide any formalism. Further, as I have learned after many heated debates with OO proponents, people perceive (likelihoods of) change differently from one another. Since we cannot produce the actual real world changes during the discussion to see whose design is the most change-friendly, we can only base our designs on how we *expect* the future will change, and this is where the differences in change pattern perceptions make the process very messy. However, a consolation prize would be to better document one's assumptions and describe why alternative change scenarios were ranked lower. But most software engineers don't seem to have this introspection skill yet. People don't really know how to properly question their own model of reality, but such is needed to document change-related assumptions properly. -t
I do know I made the right choice. I deny your right to claim the general case. I prove it every time the change does not force me to rewrite the system. So do others. The fact that what I do is not possible suggests that I am undercharging :) -AnonymousDonor
You predicted the nature of future domain changes? That may be a different issue. If we know the future, then our design choices are of course going to be a lot easier. I find the domain future difficult to predict, even after years of experience. New markets, new fads, new management personalities, new presentation technologies, etc. just require too powerful a crystal ball. -t

It would also mean that these characteristics of software inherently cannot be correctly embedded in a mathematical metric space, even in principle, which seems exceedingly rash.

Back to brass tacks: name a situation where CouplingAndCohesion gives an incorrect answer to whether something is good or bad. (The issue of objectively measuring such was already addressed above your comments, not by your comments, so that's a different issue.) -- DougMerritt

Example? Indirection. Indirection reduces coupling, but can also complicate a given design. Having a complex design may make it harder to change because there is more code to read and more code to change. Also, how "bad" each form of coupling is, is subject to subjective rankings.

Analysis by reductio ad absurdum.

Coupling

It is my understanding that coupling can only be reduced, not eliminated. A program with zero coupling could have at most one machine instruction (not source statement). Any more than that and there would be dependencies between the action of one instruction and the action of the next.

Not necessarily. The system isn't inherently mapped to a metric space, but if it were, there are workable definitions of a zero in non-trivial programs.

So, by inference, coupling is indicated by the degree of dependency between the various components of a system. Of course, by their very nature, modules depend on one another to perform their respective functions properly, but we are concerned with how much one module depends on the implementation of another module. The ideal is not at all, and we can come very close, but never arrive.

Cohesion

By the same argument used above to show that coupling cannot be eliminated, a certain level of cohesion is also inherent in the sequential nature of electronic computer instructions. (All bets are off on quantum computing.) The goal is to maximize it.

Not necessarily, as above.

I won't bother pontificating on this because an item above covered it very well:

Coupling is connection crossing a boundary. Cohesion is connection which doesn't cross a boundary.

That only applies in G. Spencer-Brown's first order system. And he didn't explain his higher order system. And his followers who did take stabs at it came up with systems where the above doesn't really apply, just as binary logic of one bit leads to limitations that are not true of systems with a large number of binary bits.

The only addition I would make is to emphasize that the connection need not be explicit, it can be implicit. It could be knowledge as innocent as knowing that certain values passed are 'bit flags'.
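A minimal sketch of the 'bit flags' case, to make the implicit/explicit distinction concrete. The flag names, bit positions, and function names are all hypothetical and chosen only for illustration:

```python
# Hypothetical flag layout; the names and bit positions are illustrative only.
FLAG_READ = 0b001
FLAG_WRITE = 0b010
FLAG_EXEC = 0b100


def describe_permissions(flags: int) -> list[str]:
    """Decode a permission integer using the shared flag constants."""
    names = []
    if flags & FLAG_READ:
        names.append("read")
    if flags & FLAG_WRITE:
        names.append("write")
    if flags & FLAG_EXEC:
        names.append("exec")
    return names


def caller_with_implicit_knowledge() -> int:
    # Implicit connection: the literal 3 only means "read + write" as long as
    # the bit positions above never change. Nothing on this line refers to
    # the constants, so a search for FLAG_READ would not find it.
    return 3


def caller_with_explicit_reference() -> int:
    # Explicit connection: the same dependency exists, but it is visible.
    return FLAG_READ | FLAG_WRITE
```

Both callers produce the same value today. If the bit positions were reassigned, only the second would keep working without a manual edit, which is one way to see why the implicit form is the harder one to find.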

Well I'm out of time. So long.

-- BobBockholt

Some examples would be nice. English is insufficient it seems. For example, "contributing toward the same goal" can get into some sticky philosophical discussions. It just seems another case where people end up modeling their own internal view of the world and thus nobody agrees. FuzzFlag??

This stuff was invented in 1974, but it's been partially forgotten, and seems to be taught at only, I dunno, 20% of colleges and universities these days, even though it is not obsolete; it's still quite important.

I took the trouble to add definitions for every single level of coupling and cohesion recently, which had never been on this page before, and I did this by doing a bunch of web searching and trying to pick out the pages that seemed to have more coherent definitions. Then I dug up the actual origin of the terms, since that was misquoted here.

I understand why you would say that the result is still insufficient, but you know what, I'm kind of tired...how about if now you do some similar searching and find some nice examples and add them to this page? Google is your friend, too, not just mine. :-)

I did not mean to have you carry the entire load.

I think there are (at least) two things to look at when investigating "links" between things:

Does a link reflect an actual domain link (real-life association), or is it a software artifact? If the second, then it should perhaps be a higher alert level. It makes sense that if we are modeling two things that are linked as a domain requirement, then our software/database is also going to contain a link of some kind to reflect that.
- Related is that the strength level of the domain link should also be reflected in the software/database. If the link is tenuous, then it should be easier to remove from the software also. Of course this may take some estimating skills to apply appropriately.
How expensive is it to add or remove links? What kind of DiscontinuitySpike do such create? For example, I often fuss about the expense of converting is-a relationships to has-a relationships.

-- top

How could ResponsibilityDrivenDesign help with CohesionAndCoupling?

UseCases protect and give answers to the concerns and interests of all stakeholders of a given system.

All stakeholders of a given system are represented through the Actors in their interactions with such a system.

For every goal that a given Actor has with a given system, this system must answer every such goal through its responsibilities.

This system should become a catalyst for such business, because its purpose is to open new roads to facilitate this business. This catalyst metaphor fits perfectly with TheSimplestThingThatCouldPossiblyWork.

To design such a system, we must understand the business this system must service, with this catalyst idea in mind.

To start with, we look at the business terms or entities that are involved in the business. We are looking into the problem domain of the business. In this domain, all business entities involved are classes.

Only a subset of the problem domain classes will show up in our system. Which ones will depend on the scope of our system.

Besides, for the group of problem domain classes that will show up in our system, some of the business entities will turn into system classes, while others will become just properties of these classes. Whether a problem domain class becomes a system class or a property of another system class will also depend on the scope of our system.

For instance, let's take a look at the ZipCode? business entity (problem domain class). If we are designing a typical SalesOrderProcessing? system, in most cases the ZipCode? will become a property of some system class(es).

But what if we are designing a system for FedEx or UPS? In such a case, the problem domain ZipCode? class might become a system class.

(YAGNI and NIAGNI help us define partial, temporal projections into the scope of a given system. We may start with a very narrow scope in the first iteration, like a WalkingSkeleton of a system, and gradually evolve into a wider scope, all the way down to the final scope of such a system).

So, for every problem domain class we must fully understand what it has to know and what it has to do in order for it to answer to all its responsibilities (the word responsibility comes from Latin and it means "to give answer to").

How do we do that? Simply by asking for the BusinessRules involved in the problem domain.

This is the real initial point in our problem domain exploration: BusinessRules will help us identify all business entities (problem domain classes), and also everything that a given class must know and must do.

BusinessRules are atomic and elementary always, if we state them in the proper way.

BusinessRules will also tell us how classes interact (collaborate) among themselves.

Through these collaborations among classes to fulfill business goals (UseCase goals), we could discover how to aggregate knowledge and behavior of every problem domain class to identify its responsibilities.

In such a manner, we can get highly cohesive problem domain classes. Cohesion is the middle ground between atomic BusinessRules and lumpier UseCase goals.

If we build a cross-collaboration matrix among all problem domain classes, we will find lumps or clusters of classes with stronger collaboration ties.

These clusters tell us what classes belong to a given logical (problem domain) package or component.

If we apply this very same grouping to our software components, we could have loosely coupled components.
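The cross-collaboration matrix idea can be sketched roughly as follows. This is only one possible reading: the class names and collaboration counts are invented, and "cluster" is taken here to mean a connected group of classes once ties below a chosen threshold are ignored.

```python
# Hypothetical collaboration counts between problem-domain classes
# (how often each pair collaborates across the use cases examined).
collaborations = {
    ("Order", "OrderLine"): 9,
    ("Order", "Customer"): 7,
    ("OrderLine", "Product"): 6,
    ("Shipment", "Carrier"): 8,
    ("Shipment", "Route"): 5,
    ("Order", "Shipment"): 1,  # weak tie between the two groups
}


def find_clusters(collaborations, threshold):
    """Group classes connected by ties of at least `threshold`."""
    graph = {}
    for (a, b), weight in collaborations.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if weight >= threshold:
            graph[a].add(b)
            graph[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk over the strong ties reachable from `start`.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters
```

With a threshold of 3 the weak Order/Shipment tie is ignored and two candidate packages fall out; with a threshold of 1 everything merges into one. The choice of threshold is itself a judgment call, which echoes the earlier discussion about subjectivity.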

Problem Domain Analysis helps us determine the classes involved in it, as well as the public interface of these classes.

Since we are dealing with the what of a business and not the how of that business, this analysis helps us design for the public interface of classes, and not for any particular implementation ("Responsibilities = Public Interface").

This is consistent with the first principle of object-oriented design stated in the DesignPatternsBook (page 18):

"Program to an interface, not an implementation".

-- GastonNusimovich

"Fuzzy Metric" Complaint

In a usenet debate, RobertMartin suggested that separate case-statement lists are "coupled" because they are allegedly likely to change together.

   functionA(...) {
     ...
     select on x
     case 'aa': {asd()}
     case 'bb': {jgusss(...)}
     case 'cc': {j7()}
     otherwise...
   }
   functionB(...) {
     ...
     select on x
     case 'aa': {balasdf()}
     case 'bb': {nib()}
     case 'cc': {zork(...)}
     otherwise...
   }

Robert implies a kind of "existential" coupling between these two lists, because some changes, such as adding a new item to each list, may affect both. It is not a "hard" coupling, because there are no existing references between the two lists (other than "x", which does not change in the scenarios usually used).

But the case lists might also drift apart. We may add a "dd" to one, but don't want it for the other, for example. There is no guarantee they will change in lock-step. Even if you feel they are likely to change together, the "coupling" still depends on probability. There are change probabilities that require polymorphic classes to change in lock-step also, the classic being adding a new method to every shape sub-class. (See SwitchStatementsSmell for discussion on change impact and case statements.)
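Both halves of that argument can be sketched in a few lines. Everything here is hypothetical: the functions, the case values, and the shape classes are stand-ins, and the return strings exist only so the behavior can be observed.

```python
# All names here are made up for illustration.

def function_a(x):
    if x == "aa":
        return "A handles aa"
    elif x == "bb":
        return "A handles bb"
    elif x == "dd":            # added to this list only
        return "A handles dd"
    return "A default"


def function_b(x):
    if x == "aa":
        return "B handles aa"
    elif x == "bb":
        return "B handles bb"
    return "B default"         # "dd" deliberately falls through here


class Shape:
    def area(self):
        raise NotImplementedError

    def perimeter(self):       # newly added operation
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

    def perimeter(self):
        return 4 * self.side


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius ** 2
    # perimeter() has not been added yet, so the new operation fails here.
```

The two case lists have drifted apart without anything breaking, while the new `perimeter` operation needs a matching edit in every subclass. Which of the two change patterns is more common is exactly the probability question being argued.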

Thus, "coupling" is drifting from what may have been a clear-cut metric to something that depends on probability estimates and personal judgment. It is not the "magic metric" that some paint it as.

An example of objective coupling is a method in a class:

  class foo {
    method bar{....}  // location A
    ...
  }
  .....
  x = new foo(...);
  x.bar(...)  // location B

Here, the method call in location B is "coupled" to location A because if we remove method "bar" at location A, then the method call at location B is no longer valid. (Changing parameter signatures can be a similar issue.) This kind of thing is where such a metric is useful. However, the link between two CASE lists that may or may not change in lock-step is subjective, or at least dependent on PerceptionOfChange and SoftwareDevelopmentIsGambling.

The "bar" example does not assume any external knowledge or experience. One can look at the code and only the code. In fact, an algorithm could probably be written to draw lines between coupled portions of code. The two CASE list issue could not be done this way unless we make up-front assumptions about the likelihood of change, but people (like me) can question such assumptions, wanting more evidence beyond someone else's anecdotes.
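As a rough illustration that such an algorithm is feasible, here is a sketch using Python's standard `ast` module. It is deliberately naive: it matches calls to definitions by method name only, with no type inference, and the sample source and the function name `find_call_links` are assumptions made for the example.

```python
import ast

# Sample source to analyze; a Python rendering of the foo/bar example.
SOURCE = '''
class Foo:
    def bar(self):      # location A
        return 1

x = Foo()
x.bar()                 # location B
x.missing()             # no definition anywhere in this source
'''


def find_call_links(source):
    """Return (method_name, definition_line, call_line) triples."""
    tree = ast.parse(source)

    # Pass 1: record the line of every method defined inside a class.
    defined = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    defined[item.name] = item.lineno

    # Pass 2: find attribute-style calls (obj.name(...)) to those methods.
    links = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            name = node.func.attr
            if name in defined:
                links.append((name, defined[name], node.lineno))
    return sorted(links)
```

The result for the sample is a single link from the call to the definition of `bar`. Nothing comparable can be computed for two parallel CASE lists without first supplying an assumption about how they will change.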

-- top

That's true, but there is no truly magic metric, and IMHO, CouplingAndCohesion does better in that department than most suggested measures.

How can it when it is not measuring any "hard links"? It is purely a human perception thing, and we know where relying on that leads as far as reaching agreement or defining good design rules. And the few clearer ones risk SovietShoeFactoryPrinciple.

Similarly with the rest of your comments: perhaps true in the absolute, but replacing switch statements with class-based method dispatch typically produces real, tangible benefits.

The most common exception is probably your fond example of tables where neither rows nor columns have natural precedence over each other, so that neither is appropriate as the relative root. In pure OO realms people work around this with DoubleDispatch, which is often good enough if not perfect, and in impure realms, multi-methods/generic functions, which choose a method based on multiple parameter types/classes simultaneously, solve this very nicely.

On the other hand, I won't dispute that it can also be solved nicely without OO and without multimethods, using good old-fashioned data-driven programming, which pre-dates OO but has philosophical similarities with it. In this case, one might have e.g. a 2-D table indexed by appropriate manifest constants, retrieving the appropriate function pointer to use. This amounts to implementing multi-methods by hand.
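A small sketch of that data-driven approach, assuming two made-up shape kinds and made-up handler functions. The manifest constants index a 2-D table whose cells are function references:

```python
# Hypothetical manifest constants identifying the kind of each operand.
CIRCLE, SQUARE = 0, 1


def circle_circle(a, b):
    return "circle/circle"


def circle_square(a, b):
    return "circle/square"


def square_circle(a, b):
    return "square/circle"


def square_square(a, b):
    return "square/square"


# Rows are indexed by the first operand's kind, columns by the second's.
DISPATCH = [
    [circle_circle, circle_square],
    [square_circle, square_square],
]


def collide(kind_a, kind_b, a=None, b=None):
    """Pick the handler from the table and call it with both operands."""
    return DISPATCH[kind_a][kind_b](a, b)
```

Adding a third kind means adding one constant, one row, and one column, all in one place. The table has to be kept rectangular by the programmer, which is the bookkeeping a language with built-in multi-methods would otherwise do.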

I am not sure what you mean "by hand". A good table-oriented language makes it a snap, perhaps easier than code-based approaches because you can key it into a nice grid (TableBrowser) rather than fiddle with hard-to-read syntax. Perhaps in C it is a pain, but CeeIsNotThePinnacleOfProcedural.
[It's "by hand" because a table (VeeTable for example) of function pointers is what OO-supporting languages use to implement dispatch (efficiently); when you build your own table of function pointers for the purposes of single or MultipleDispatch, you are reimplementing a feature of other OO languages by hand in your own code. Implementing the table "by hand" is intrinsically going to be more work than not implementing the table at all, which is an option if your language of choice already implements multimethods (or simply methods, if you don't need MultipleDispatch). It may be fairly straightforward work, especially if you're familiar with working in tables, but it's clearly not less work than not implementing a dispatch table at all. -DavidMcLean?]
Perhaps for simple stuff, but not if you wanted to do query-like dispatch, such as run all processes for cities that have a population between 1 mil and 3 mil with an average humidity below 40%. And tables are easier to read or can be re-projected to be easier to read than OOP code attributes all jammed up in unnatural ways.
[Querying in order to "run all processes for cities that have a population between 1 mil and 3 mil with an average humidity below 40%" is not an example of dispatch. Dispatch is fundamentally a system whereby the caller of some area of code does not determine the specific behaviour invoked, but one or more callees (arguments). Querying for a particular dataset and then running "all processes" on it implies either that the "all processes" will be exactly the same for every row of the dataset, or that the "all processes" are polymorphic in some way---but in that case the dispatch occurs in the "all processes", not in the query to collect data for them. As for tables being easier to read than "OOP code attributes", the relevant properties of code when it comes to dispatch are the signatures of methods, which in most languages are quite easy to read excepting pathological cases (methods with tens of arguments). A single DoubleDispatch method in a multimethod-supporting language gives you just a list of method signatures, listing off the objects the method'll work on; a single triple-dispatch method in a multimethod-supporting language takes exactly the same form. Each DoubleDispatch method implemented by hand with relational tools must have its own table, with rows corresponding to the first argument and columns to the second (or vice versa); a triple-dispatch method implemented in relational requires three dimensions, which tables do not afford. (In addition, the multimethod version does not require you to create a bunch of extra named functions. With a relational table, you need to store a name or some other identifier in the table for each possible method implementation.) The point is, a (non-pathological) implementation of multimethods as part of language core will be able to solve dispatch-class problems with much less work from the programmer than any implementation of the same functionality by the programmer, whether it's in relational tables or some other form. 
-DavidMcLean?]
I mean a table something like this ControlTable:

       city.....pop_m..avg_humid...function or expr.
       ---------------------------------------------
       ST. LOUIS..4.2.......50.....foo()
       SALT LAKE..3.1.......42.....bar()
       MAUI.......2.6.......65.....zaz() + foo()
       //Etc. (Simplified for illustration purposes. Joins may be used in practice to get average, etc.)

That same info, including attribute values, as OOP sub-classes is harder to read in my opinion. And it generally requires programmers to make changes instead of merely power users. But maybe we are wandering off topic.
[Disregarding for a moment that calculating any information about a specific city by calling some global function foo() isn't particularly logical, what properties would you calculate in that way, and can not the expressions for deriving such properties be derived from the existing established properties of the cities themselves? Defining a varying expression for calculating "something" on a per-row basis genuinely is best done by providing the expression in the row directly, but is defining an expression on a per-row basis really the best way to calculate that something in the first place? (Using code that will be directly evaluated in the row, as opposed to some "safe" representation like a domain-specific language that will be evaluated in a sandbox, naturally raises concerns of security in the application, but that's neither here nor there.) -DavidMcLean?]
Not sure what you mean by not being logical. Similar factoring questions often arise from long, repetitious CASE lists also, but it's often heavily dependent on the domain. As far as "security", that would be an issue with just about any dynamic tool, and is to be evaluated on a situational basis. But that's the main point: domain issues and domain changes shape the tradeoffs such that saying "always use code pattern X" is silly.
[I claim it's not logical to calculate a property of a specific city (say, the tax rate in Townsville) by calling a global function like foo(), since that global function has no way of knowing what it's supposed to be doing in this specific situation. Security is not necessarily an issue with all dynamic tools, since even when using a dynamic tool one is not required to evaluate arbitrary language expressions received as input; as noted, one could devise a safe DomainSpecificLanguage (one with no side-effects for example) and use InterpreterPattern to evaluate its expressions with no security risk, or alternatively use some form of sandboxing in conjunction with a "normal" language. As for the major claim, if the problem necessitates that arbitrary expressions be evaluated at runtime then naturally evaluating arbitrary expressions at runtime is the only real solution. Why would it, though? If the objective is to allow power-users to modify functionality, asking them to write actual expressions in your programming language is asking them to be programmers; what other use case involves modification of functionality that couldn't be done just as easily in the code? -DavidMcLean?]
Typically the function(s) would have access to its row's attributes. Perhaps I should have shown "foo(thisRow)" or something. (I sometimes make the current row array a global or outer variable to the called functions' scope to simplify the expression syntax.) And power-users are often comfortable with spreadsheet-like formulas, but may not be programmer material. It's not unrealistic to have a tool that's half spreadsheet and half application. Some spreadsheets outgrow Excel and need SOME database-like parts. These are usually internal department-only apps with a small set of users and power-users.
[I'd argue that if you plan to treat the fields like Excel formulæ rather than like full arbitrary language expressions, you should be interpreting the content of those fields using InterpreterPattern on a formula-specific DSL, rather than technically accepting any arbitrary language expression. Nonetheless, you make a reasonable point regarding spreadsheets as a whole. -DavidMcLean?]
When you add libraries to the application, you create a dependency between the library and the code-base that may complicate future maintenance. For example, when the language is upgraded, something in the interpreter library or add-on might break. And it creates another library for the maintenance programmer to learn. Using built-in EVAL-like functionality generally avoids this problem, being the most KISS. For many environments, what you describe is SafetyGoldPlating.
[Sorry, when did I suggest adding libraries to the application? I don't recall arguing that additional external libraries should be added, although I hardly think avoiding external dependencies simply because they are external dependencies is wise. -DavidMcLean?]
InterpreterPattern. A decent interpreter would be library-sized, and be reinventing what Eval() does out of the box. Anyhow, it's a shop's tradeoff design decision. LetTheReaderDecide.
[A decent interpreter for a specialised DSL intended to allow entry of particular mathematical formulæ and nothing else will not be library-sized. Nor will it be a reinvention of what eval() does, especially if the application language used does not feature the same expression syntax as is desired for formulæ (for instance, if it's Lisp). Implementing a complete sophisticated sublanguage would approach library size, but we don't want a complete sophisticated sublanguage. -DavidMcLean?]
A simple recursive-descent parser for canonical formulæ is tiny, often consisting of a single class and one method per terminal or non-terminal in the grammar.
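To give a sense of scale, here is a sketch of such a parser for arithmetic formulas over a row's attributes. The grammar, the class name `FormulaParser`, and the column names in the usage note are assumptions for illustration; it handles only numbers, column names, `+ - * /`, unary minus, and parentheses.

```python
import re

# One token per match: a number, a name, or any other single character.
TOKEN = re.compile(r"\s*(?:(\d+\.?\d*)|([A-Za-z_]\w*)|(.))")


class FormulaParser:
    """expr   := term (('+' | '-') term)*
       term   := factor (('*' | '/') factor)*
       factor := NUMBER | NAME | '(' expr ')' | '-' factor"""

    def __init__(self, text, row):
        self.row = row          # mapping of column name -> numeric value
        self.pos = 0
        self.tokens = []
        for number, name, op in TOKEN.findall(text):
            if number:
                self.tokens.append(("num", float(number)))
            elif name:
                self.tokens.append(("name", name))
            elif op.strip():    # skip stray whitespace
                self.tokens.append(("op", op))

    def _peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("end", None)

    def _next(self):
        token = self._peek()
        self.pos += 1
        return token

    def parse(self):
        value = self.expr()
        if self._peek()[0] != "end":
            raise ValueError(f"unexpected token {self._peek()[1]!r}")
        return value

    def expr(self):
        value = self.term()
        while self._peek() in (("op", "+"), ("op", "-")):
            _, op = self._next()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self._peek() in (("op", "*"), ("op", "/")):
            _, op = self._next()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        kind, val = self._next()
        if kind == "num":
            return val
        if kind == "name":
            if val not in self.row:
                raise ValueError(f"unknown column {val!r}")
            return float(self.row[val])
        if (kind, val) == ("op", "("):
            value = self.expr()
            if self._next() != ("op", ")"):
                raise ValueError("expected ')'")
            return value
        if (kind, val) == ("op", "-"):
            return -self.factor()
        raise ValueError(f"unexpected token {val!r}")
```

For example, `FormulaParser("(avg_humid - 10) / 4", {"avg_humid": 50}).parse()` evaluates the formula against one row. Because names can only resolve to row columns, an input such as `__import__('os')` is rejected rather than executed, which is the safety argument made above. Whether that is worth the extra code compared with a built-in eval remains the shop-level tradeoff already discussed.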

Even the latter is typically superior to just using switch statements, though. An example that I've frequently run across in my own work is in interpreters and compilers full of switch statements on the types of objects being manipulated. Some of them do a type-specific print of a value, some do type-specific code generation, etc, and in practice, not theory, these switch statements suck and suffer bit rot and get out of sync with changes, etc, and replacing them with any of the above approaches - including table-oriented methods - has always resulted in sharply better, more maintainable, more readable code.

I agree that systems software may be able to make better use of "subtypes". (See OopBizDomainGap). In your compiler example, "types" are hard-wired into the language definition, and thus are relatively immutable. (It may be a case of modeling types with types, which is kind of a self-fulfilling prophecy.) However, I have yet to find too many immutable counter-parts in the biz domain, where hierarchical taxonomies or mutually-exclusive lists are simply a poor model of real change patterns. Show me actual specimens of biz case lists that go awry and I may change my mind. Also in practice I rarely find something that fits a double-dispatch version. A clean one-to-one matrix of factors-to-behavior is too regular to fit the messy biz world. The only practical examples seem to be things such as modem drivers (yet more SystemsSoftware). -- top

I don't agree with everything RobertMartin says, but in this case, I believe he is right on target. Most of the time, switch statements have worse CouplingAndCohesion than OO/generic/data-driven approaches, and one aspect of this is not fuzzy: the switch statements are scattered all over the code, whereas the other approaches centralize things in such a way that lack of synchronized changes becomes less likely, and in fact the language can to some extent actually assist. -- DougMerritt

The issue at hand is whether similar case statements are "coupling", not whether they are "good". The goodness issue is taken up in SwitchStatementsSmell. Note that Polymorphism scatters the "method list" all over the place in a similar fashion. It is not a free lunch. It generally seems to boil down to a probability estimate, and OO fans see different change probability distributions than I do (PerceptionOfChange). It may be a domain-specific thing, or a personality thing. I have asked for an example of a biz domain case list that changes in lock-step as claimed and have yet to receive one (other than something that should be a table instead). Thus, I have good reason to remain skeptical. I also find IF statements easier to adjust the GranularityOfVariation on than polymorphism. For example, it's not uncommon in the biz domain for the change request to ask for both features instead of an either/or choice. It's usually less re-coding to change a CASE block into an IF block than it is to de-mutually-exclusivize polymorphism.

{PayrollExampleTwo benefits from polymorphism. It would be quite dire with CASE blocks or IF statements.}
ItDepends on what the future change patterns actually are. See PayrollExampleTwoDiscussion for my comments on it.
{Yes, I recall those. The polymorphic version was clearly superior to the CASE-based equivalent in every respect.}
Clear? Really? It didn't show objective metrics, only unverifiable claims that may not necessarily apply to all domains. One participant on the other "side" stated, "I suppose the only way to fairly and objectively resolve this debate would be to experimentally test a series of maintenance actions against the OO version and an equivalent CASE-based procedural version, and measure error rates, implementation time, and developer perception."
"Clear" to me would be counting errors made, number of key-strokes needed, number of eye-movements needed, minutes needed to make a given change, etc. However, in practice we won't get such info. But, we can use proxies for such, such as a hypothetical "mind dump" of the steps a typical maintainer will take to find and fix stuff along the lines of "He/she sees that the URL has "foo.prog" in it so looks for the file called "foo.prog" in the code folder, then opens the file in the editor and looks for the string "glob date" because the form called the field "glob date"...".

Doug, would you agree that a switch statement in a class may be acceptable if all other dispatch systems were ruled out for some reason provided they did not exit that class?

Sticky problems in CouplingAndCohesion: CouplingAndCohesion, much like layers of abstraction, tends to be excellent for making the programming easier... and awful for optimizations. Many of the most powerful optimizations are those that cross boundaries that very clearly constitute 'unnecessary coupling' and might even be called 'utterly evil coupling'. Furthermore, many of these optimizations cannot be performed by modern optimizers (or even next-generation optimizers...). And these include real, algorithmic optimizations... not just coefficient cost reduction (though you get that, too).

I suggest that future research should consider the possibility of aspect-oriented coupling for optimization purposes. Until we have something that can cross boundaries for us, we'll be doing it by hand, BestPractices be damned.

Example of subjectivity in metrics:

 routineA(...) {
    x = routineB(a, b);
 }

 routineB(x, y) {
    [...]
 }

Is routineA coupled to routineB and vice versa?

Note that if we change routineB to this:

 routineB(x, y, z) {  // add a new parameter
    [...]
 }

in many languages this would "break A" because A now calls B with too few parameters. In languages where different parameter counts are allowed such that "z" would return blank or Null, the effects are harder to determine. We'd have to dig around in the algorithm details.
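Python happens to show both outcomes, so a short sketch may help. The three versions of routineB below are hypothetical, and their bodies are placeholders since the original elides them:

```python
def routine_b_v1(x, y):
    return x + y


def routine_b_v2(x, y, z):          # new required parameter
    return x + y + z


def routine_b_v3(x, y, z=None):     # new parameter with a default
    if z is None:
        return x + y
    return x + y + z


def routine_a(routine_b):
    # routineA still passes only two arguments, as before the change.
    return routine_b(1, 2)
```

Calling `routine_a(routine_b_v2)` raises a `TypeError` at the call site, which is the visible breakage. Calling `routine_a(routine_b_v3)` succeeds, and whether the result is still what A needs depends entirely on what B does when `z` is absent.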

What counts as "coupling" and what doesn't, and whether all instances are weighed equally, is usually subjective. Some situations are even non-deterministic, or at least very expensive to discern because all possible calculation paths can fan out approaching infinity.

It is possible to make it clean by ignoring certain relationships, but this gets back to objective metrics versus useful objective metrics (SovietShoeFactoryPrinciple). If we don't count stuff just because some of the metric calculations are too expensive, we may make a nearly useless metric, or at least diminish its practical utility.

--top

You could also break routineA by deleting routineB, or by making routineB return semantic garbage, or by making routineB return some type-unsafe value for routineA, et cetera. You'll just confuse yourself if you get hung up on petty details like whether the new 'z' parameter can possess a default. It is quite clear that there is a very 'hard' (measurable, provable, graphable) dependency from routineA to routineB, but it is not at all clear what your claim of 'subjectivity' happens to be in this example (are you making this another vector for your ObjectivityIsAnIllusion mantra?). As to the 'vice versa'? Dependencies aren't necessarily bi-directional. And coupling is about mutual interdependence - i.e. you need to demonstrate the 'vice versa' before you can show there is any coupling at all. Doing so would be impossible with the little information you've provided above.
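As a sketch of what "graphable" could mean here, assume the call relationships have already been extracted into a plain dictionary (the dictionary and the helper names below are hypothetical). A one-way dependency and a mutual one then become a simple reachability question:

```python
def reachable(graph, start):
    """Return every node reachable from start by following call edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def depends_on(graph, a, b):
    return b in reachable(graph, a)

def mutually_dependent(graph, a, b):
    return depends_on(graph, a, b) and depends_on(graph, b, a)

# Call graph for the example above: routineA calls routineB, nothing else.
calls = {"routineA": ["routineB"], "routineB": []}

a_needs_b = depends_on(calls, "routineA", "routineB")
b_needs_a = depends_on(calls, "routineB", "routineA")
both_ways = mutually_dependent(calls, "routineA", "routineB")
```

With only the information given in the example, the graph shows the dependency from routineA to routineB and nothing in the other direction.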

Focusing on the motivation for your spiel:

There may be some sort of 'soft' coupling if routineB is being continuously modified to meet the needs of routineA, in much the same way that helper-routines are coupled to the procedures that need the help. This coupling wouldn't necessarily show up in a dependency graph of function calls; rather, it would show up in the commit history for the project. This is something you might call 'existential' coupling. It would be unreasonable to claim it doesn't exist (people can obviously point at examples of it, therefore it must exist). But one might question its ultimate relevance. Hard coupling causes hard problems like DllHell. This 'soft' or existential coupling does not. Thus, whether 'high existential coupling' is even a problem should be evaluated independently. I do not believe it would be a problem. Heck, you might as well claim that all domain-based classes and routines in a project are 'existentially coupled' in the form of RobertMartin's example - after all, they are all part of the same project, thus if the requirements of the project change they tend to change also. But it doesn't cause problems (or provide benefits), and therefore it isn't relevant. And, completely independent of whether they are easy or difficult to measure, irrelevant things simply don't need to be measured.
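If one wanted to look for this 'existential' coupling in a commit history, a crude sketch might count how often two units change in the same commit. The history below is invented for illustration, and representing a commit as a list of changed unit names is an assumption, not the format of any real version-control tool.

```python
from collections import Counter
from itertools import combinations

def co_change_counts(commits):
    """Count, for each pair of units, how many commits touched both."""
    counts = Counter()
    for changed in commits:
        for pair in combinations(sorted(set(changed)), 2):
            counts[pair] += 1
    return counts

# Hypothetical history: each entry lists the units changed in one commit.
history = [
    ["routineA", "routineB"],
    ["routineA", "routineB"],
    ["routineB", "logger"],
    ["routineA"],
]

counts = co_change_counts(history)
```

A high count says only that the two units tend to change together. Whether that is a problem is the separate question raised above.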

Is our goal to solve and prevent problems, or merely measure something? If the first, then why would "soft" and "existential" coupling be omitted? You seem to half-agree that soft/ext. coupling can cause problems. (Only insofar as need for change ever causes problems.) And Martin makes an issue of it. The opening definition seems to include soft coupling. Perhaps we just need a better classification system for "coupling" (or at least "dependency"). Here's a rough draft:

References or associations that can be determined by inspecting the code without studying the algorithms involved. For example, subroutine calls may be coupled by their parameter signatures:

           foo(1, 2, 3);
           ....
           function foo(a, b, c) {.....}
           // We can see that the def of foo requires 3 parameters
           // (Assuming the target language does not auto-nil missing ones).

References or associations that depend on run-time behavior (or at least behavior too complex to be detected by automated basic code analysis).

          function X() {
            fileName = "myfile.txt";
            a = foo(fileName, currentDateTime());
            if (a.length > 0) {
               fileName = alternativeName(a);
            }
            contents = readFileContents(fileName, settings);
            // Function X is coupled to existence of "myfile.txt" if "a" is blank.
            // (let's assume readFileContents() requires file existence.)
          }

References or associations that depend on some possible future change or requirement.
- Example: the potential case-list item addition example given above, near the "fuzzy metrics" title.
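For the first category, here is a small Python sketch of checking a call against a signature without running or reading the algorithm. The helper call_is_compatible is hypothetical; it uses the standard library's inspect.signature(...).bind, which matches arguments to parameters without executing the function body.

```python
import inspect

def foo(a, b, c):
    return a + b + c

def call_is_compatible(func, *args):
    """True if func could be called with these positional arguments."""
    try:
        inspect.signature(func).bind(*args)  # binds only, never calls func
        return True
    except TypeError:
        return False

three_args_ok = call_is_compatible(foo, 1, 2, 3)
two_args_ok = call_is_compatible(foo, 1, 2)
```

Nothing comparable works for the second category: whether X ends up needing "myfile.txt" depends on what foo returns at run time, which a signature check cannot see.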

Didactic examples would be welcome, top.

Added. See above examples. --top

External Links:

http://www.computing.dcu.ie/~renaat/ca421/LCOM.html

CategoryInfoPackaging, CategoryModelingLawsAndPrinciples