Capitalization Rules

There are many different rules used for scrunching together multiple words. These apply in many programming languages, and in Wiki, and possibly some other places.

The purpose of this page is to define the commonly accepted usages of some terms. In keeping with WikiNature, please add to or edit this list to indicate your own impression of common usage. Some rules have more than one definition; life's messy that way.

UpperCase: All letters capitalized. THIS IS AN EXAMPLE, SOISTHIS.
LowerCase: All letters lowercase. this is an example, soisthis.
BumpyCase, MixedCase, NerdCaps, HumpBackNotation, BiCapitalized, InterCaps, StudlyCaps: Words capitalized and run together. The first letter may or may not be capitalized. ThisIsAnExample, soIsThis. Sometimes called CamelCase, although that use is debated. "NerdCaps" is sometimes used derogatively to describe product names dreamed up by people in marketing. Also called medial capitals; see http://www.askoxford.com/asktheexperts/faq/aboutwords/medial?view=uk
CamelCase, LowerCamelCase: First letter lowercase, but after that all words capitalized and run together. thisIsAnExample. A special case of StudlyCaps. See notes on CamelCase.
PascalCase, UpperCamelCase, Proper case, StudlyCaps: All words capitalized and run together. First letter is capitalized. ThisIsAnExample; ThisIsATest. See notes on CamelCase.
WikiCase: Initial letters capitalized and words run together (like PascalCase), but each capital letter must be followed by a lower case letter (hence one-letter words are not accommodated). ThisIsAnExample; ThisIsNotAWikiCaseWord (because of the "AW"). On this wiki, WikiCase words become HyperLinks.
EmbeddedUnderscore: includes UpperCase with underscores and LowerCase with underscores.
UpperCase with underscores: All letters capitalized, words separated by underscores. The CeeLanguage convention of using this case for constants has been adopted by many programmers for other languages as well. THIS_IS_AN_EXAMPLE
LowerCase with underscores, snake_case: All letters lowercase, words separated by underscores. this_is_an_example
UpperCase with hyphens: All letters uppercase, words separated by hyphens. THIS-IS-AN-EXAMPLE.
LowerCase with hyphens: All letters lowercase, words separated by hyphens. this-is-an-example. Also known as KebabCase.
Title case: Initial letters capitalized, but words separated by spaces. After the first word, articles, conjunctions, and prepositions not more than five letters long are all lower case. This Is an Example, So Is This
Sentence case: First letter of first word capitalized, all others lower case except words that are unconditionally capitalized, such as proper nouns and "I". This is an example, So is this.
Mixed case: Most generally, a mixture of upper and lower case letters. Sometimes used synonymously with BumpyCase or CamelCase. This is an example of Mixed Case.
German case: First letter of first word capitalized, as well as all nouns. This is also the case style used in the U.S. Constitution and other 18th- and early 19th-century documents in English. This is an Example, And often the Verbs will also at the End of the Sentence come.
StudlyCaps: Same as UpperCamelCase, but may be carried out in a more random fashion, or taken to its extreme (alternating every letter). Originates from bulletin-board systems, where it was used along with numeric or symbolic substitution (l1|<3 th15) and other devices to convey apparent coolness. Now generally used only facetiously. tHiS iS aN eXaMpLe.
CanonicalStudlyCaps: More often than not oNLy VoWeLS aRe LoWeRCaSeD. Used by the dyslexic and ToTaLLy KeWL DuDeS (read, useless gits).
BiCapitalized: Similar to BumpyCase but digits (and punctuation?) may be present.
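
The definitions above can be made concrete with a small converter. This is only a sketch, not a standard: the names split_words, convert, and is_wiki_case are made up for illustration, only ASCII letters are handled, and the requirement that a WikiCase word contain at least two capitalized parts is an assumption rather than something the definition above states.

```python
import re

# A word boundary falls between a lowercase letter or digit and a capital
# ("so|Is"), or between a capital and a capital that starts a new word
# ("A|Test"), so one-letter words survive the split.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def split_words(name):
    """Split a name written in any of the styles above into lowercase words."""
    spaced = _BOUNDARY.sub(" ", name)
    # Underscores, hyphens, and whitespace all count as separators.
    return [w.lower() for w in re.split(r"[\s_\-]+", spaced) if w]


def convert(name, style):
    """Rewrite name in the given style (hypothetical style keys)."""
    words = split_words(name)
    if not words:
        return ""
    if style == "lower_camel":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    if style == "pascal":
        return "".join(w.capitalize() for w in words)
    if style == "snake":
        return "_".join(words)
    if style == "upper_snake":
        return "_".join(words).upper()
    if style == "kebab":
        return "-".join(words)
    raise ValueError(f"unknown style: {style}")


# Assumption: two or more parts, each one capital followed by lowercase letters.
_WIKI_CASE = re.compile(r"(?:[A-Z][a-z]+){2,}")


def is_wiki_case(name):
    """True if every capital letter is followed by a lowercase letter."""
    return _WIKI_CASE.fullmatch(name) is not None


print(convert("this_is_an_example", "pascal"))  # ThisIsAnExample
print(convert("ThisIsATest", "snake"))          # this_is_a_test
print(is_wiki_case("ThisIsNotAWikiCaseWord"))   # False, because of the "AW"
```

Splitting into words first and re-joining afterwards keeps each style down to one line; the cost is that acronyms lose their capitals on a round trip.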

Discussion

Many languages don't have the concept of upper case and lower case. Languages based on Latin, Cyrillic, and Greek alphabets do, but many others do not. Native speakers of those languages sometimes have difficulty in understanding the CamelCase derivatives which use only capitalization to separate words.

There are, in fact, a good number of people in the world who use languages not constructed from these alphabets. I raised the point only because I have had others correct me for not being very inclusive by assuming that everyone easily and comfortably parsed capitalization. When people correct me for not being inclusive, I try very hard to learn from it; thus my raising the point. I suppose that if a JavaLanguage program were being written by a Chinese-speaking developer, with keywords (necessarily) in English, but variables and documentation in Chinese (since Java allows UniCode source), then CamelCase would be a poor variable-naming standard. -- MichaelChermside

There are a good number of people, true, but how large a fraction of the programmers? If you program at all, then you will have had some exposure to English, I guess. I may be wildly wrong, of course. -- OleAndersen

Poor compared with what? If the alternative is lowercaseruntogether, that is much, much harder for non-native speakers to parse. Were we to optimize for non-western programmers, I'd be inclined to run with CAPITAL_LETTERS_AND_UNDERSCORES, as capital letters are the canonical forms of the letters. But in practice, at least for Java, I think this is a RedHerring; if a Java program is being written in a language with no letter case, then they obviously can ignore the standard. And if the programmers are non-native speakers writing Java in a roman character set, then they might as well buckle down and learn to deal with CamelCase, as the Java libraries use it extensively. -- William Pietri

On the other hand, lowercase letters are easier to read than uppercase letters (even for native speakers of English), so I'd be inclined to run with lower_case_letters_and_underscores. -- DavidCary

Presumably if the programmer is familiar enough with the alphabet to be able to use the names, he will have picked up on the difference between upper and lower case. [Languages aren't constructed from writing, as transliteration and spelling variation show, and capitalization did not make its way into our alphabets until the Middle Ages. It didn't get standardized until the 19th century or so, and even then it still varies from place to place - standard German will capitalize every Noun.] [Actually the lowercase is the innovation]

It's arguments like this that make me believe that case sensitivity is one of the worst-ever ideas in software architecture. Just have the compiler warn (or the IDE auto-correct) if a variable has multiple capitalisations. Put the _ key front and center, or just don't use the space character as a delimiter. And now, with Unicode making case-insensitivity harder to implement, we'll be stuck with this forever.
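
For what it's worth, the "warn if a variable has multiple capitalisations" check is easy to sketch. The function name case_collisions is made up for illustration, and the use of str.casefold() as the notion of "same name" is an assumption; it is one reasonable choice among several, which is part of why Unicode makes this harder than it looks.

```python
from collections import defaultdict


def case_collisions(identifiers):
    """Group identifiers that differ only in capitalisation.

    Returns a dict mapping each case-folded name to the sorted list of
    distinct spellings seen, for names with more than one spelling.
    """
    groups = defaultdict(set)
    for name in identifiers:
        # casefold() is a more aggressive lower(), intended for caseless
        # comparison of Unicode text.
        groups[name.casefold()].add(name)
    return {
        folded: sorted(spellings)
        for folded, spellings in groups.items()
        if len(spellings) > 1
    }


names = ["userName", "username", "UserName", "total"]
for folded, spellings in case_collisions(names).items():
    print(f"warning: {folded!r} is spelled {len(spellings)} ways: {spellings}")
```

A real compiler or IDE would run a check like this per scope rather than over a flat list of names.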

Strictly defining these terms in a consistent way may make it much easier to specify and follow consistent standards within a programming group. For instance, the standard (okay, that's debatable, but it's what Sun uses) JavaLanguage capitalization pattern could be specified as follows:

: LocalVariables and InstanceVariables are in (lower) CamelCase. MethodNames are also in lower CamelCase, but followed by parentheses. ClassNames are in PascalCase. PackageNames are in LowerCase with no separators between words. UpperCaseWithUnderscores may be used for constant names if desired.
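
Once the terms are pinned down, a standard like the one above can be checked mechanically. The following is a rough sketch only: the CONVENTIONS table, its keys, and the follows_convention function are hypothetical, the patterns cover ASCII letters and digits only, and they are one reading of the rule above, not anything published by Sun.

```python
import re

# One regular expression per kind of name in the standard above.
CONVENTIONS = {
    # lower CamelCase: thisIsAnExample
    "variable_or_method": re.compile(r"[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)*"),
    # PascalCase: ThisIsAnExample (one-letter words allowed, as in ThisIsATest)
    "class": re.compile(r"(?:[A-Z][a-z0-9]*)+"),
    # LowerCase with no separators: thisisanexample
    "package": re.compile(r"[a-z][a-z0-9]*"),
    # UpperCase with underscores: THIS_IS_AN_EXAMPLE
    "constant": re.compile(r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*"),
}


def follows_convention(name, kind):
    """True if name matches the capitalization pattern for the given kind."""
    return CONVENTIONS[kind].fullmatch(name) is not None


print(follows_convention("thisIsAnExample", "variable_or_method"))  # True
print(follows_convention("this_is_an_example", "class"))            # False
print(follows_convention("THIS_IS_AN_EXAMPLE", "constant"))         # True
```

The patterns overlap on purpose: a single lowercase word is both a valid package name and a valid variable name, so a checker like this needs to be told what kind of name it is looking at.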

Moved from CisClusion

Organic chemists and field ecologists know that cis- and trans- are opposite prefixes.

Trans means across, and cis means on this (the near) side. A cismontane environment is one up on this side of the mountains, and a transmontane one is on the FarSide of the divide.

This page tries to begin to differentiate what WardsWiki created with WikiCase from what TedNelson's Xanadu strives to do with much longer strings of text.

I think we're using cisclusion when we use WikiCase to suggest a term which may or may not have already been discussed. If a word group ends up being followed by a QuestionMark, it implies that eventually someone will click on it and begin to describe that concept, as did the 'P*PatternRepository?' for so many variations on P*: Portland, Plimpton, Plumpton, ... Thank Ward that we were finally able to delete pages.

Historical notes

Upper and lower case

The term "upper case" refers to the capital letters being kept in the upper tray (case) of the lead type storage, and "lower case" refers to the minuscule letters being kept in the lower tray. -- Eljay

StudlyCaps

I recall TomChristiansen using the term "StudlyCaps" (disparagingly, of course) on comp.lang.perl.misc several years ago at least. -- JohnDouglasPorter
The GNU Emacs StudlyCaps implementation ("studly.el") has the comment "This is in the public domain, since it was distributed by its author without a copyright notice in 1986". "PossIblY thE esoTeric SiGnifiCanCe of sTuDlycaPsificaTion escapes you; thAt is, YoU Suffer From autostudlycapsiFiBogotiFicatioN. too bAd."
This offers a bit of insight: http://www.catb.org/~esr/jargon/html/S/studlycaps.html

From a purely linguistic point of view, is capitalization part of the language, or part of the presentation? I am inclined to reflexively answer that it is part of the language. However, it may not be that much different from bold, italics, or underline. It's just a way of presenting the content; in almost all cases, capitalization doesn't change meaning. Comments?

Comment: Caps can alter meaning in a limited number of cases: March v. march, Polish v. polish, etc.

To make this on-topic, relate it to character set design. If you were going to throw EBCDIC, AsciiCode, UniCode, etc. out the window and start over, would your character set differentiate between lower-case and upper-case? Would your character set differentiate between bold and not-bold? Remember, these issues can be handled by rich editors and appropriate fonts.

See UnderscoreVersusCapitalAndLowerCaseVariableNaming, LinkPattern

Capitalization, and the notion of spelling, generally, is both part of language and presentation. Conceptually and linguistically this is known as orthography. -- Greg Lewis