Logging Discussion

[refactored from FirstRuleOfLogging, 9/8/01, SteveHowell]


Alternatives to Logging

Most things that seem to require log files, don't:


Problem diagnosis actually is important

We put lots of bugs in our code, so we really do need log files -- to keep track of all of them! ;->

Violent and total disagreement. Your world is very simplistic. Yes, we do have bugs in our system. Duh. We try not to but the system is very complicated involving 200+ distributed nodes and multiple hardware and software vendors. Now we can accept that and work with it, or put our head in the sand. A system fails after 4 weeks? Why? With your approach you have no way of diagnosing the problem or fixing it. Customers don't like that. Send an email, so what? You have no clue what happened. The site is a day away by plane. Now what?


A Modest Proposal

Mmmmm... Let's get all idealistic then... the end user runs your code in a runtime system that - in the event of certain kinds of exception - makes a clone of the program state and sends you an email about it. You can then connect to that clone via remote debugging over the Internet to inspect the stack, step forward and test a few things in their live environment without their permission, etc. Oh bliss. -- DanielEarwicker


More objections

I get the feeling someone here has never worked on a distributed system. How can every top level exception be a bug? Certainly, a down sensor or other device is not a bug.

Then again, doesn't it fall under the "environmental" category?


What about the customer?

This is too extreme. What about application logging for the customer? What about for a commercial product that generates a log to aid CustomerSupport? in doing their job? There are many reasons for log files that go beyond the tips of our noses.

Also, log files can be useful tools to help identify a bug. Just don't add them until there's a tough bug that needs them.

Then it's way late. You can't do a build in the field and rerun. The conditions were likely very specific and very hard to reproduce. You need a log that was collected while the problem happened or you are SOL.


Log files give context

I disagree. Log files often are (a) required as a feature by the customer, (b) show the sequence of events between separate processes that triggers an error condition, info that cannot appear on a stack trace, and (c) are often the only output mode available to server/daemon processes. -- PeteHardie

Log files sometimes reveal patterns which are only visible with a lot of data, more than you can absorb running a program interactively.


Log transactions when money is involved

I'd like databases to log transactions so that when the bank has a power outage in the middle of a transaction, I don't lose $1000. Logging is just another interface to the user. If you don't use logging, you need another interface to report status. If you have one, gravy. -- SunirShah


Don't forget auditing

What about auditing? How do you do that without logs?

E.g. Sensitive data is leaked by a user. The leak is detected some time later. Without logs how do you find who was reading the data at that point?

My point of view is that in any distributed system or any system that runs without constant user supervision (daemons, etc.), a lack of logging is a CodeSmell. More than that, is a design flaw.

-- NatPryce



Ok, maybe we do need some logging

Tentative summary - an identified need for logging is a good indicator that your system is a) complex, b) sensitive to untrusted accesses, or c) subject to applications of data mining after millions of uses. If your system has none of these characteristics, consider excessive reliance on log files to be a serious smell.


A second round of objections to logging

(a) "not completely trivial" would be a better description

Most useful systems are not trivial. A number of useful systems are not complex.

(b) auditing is required whether users are trusted or not. Most security breaches in business systems are performed by "trusted" users.

That's "trusted" in the sense of "all functionality available without particular authentication tokens", as in the Wiki which has no untrusted users.

(c) in a distributed system you need to correlate events in different parts of the network to determine the cause of faults. That requires logs, whether you have ten uses or a million uses.

I dunno, HTTP seems to work well enough without needing to correlate what happens in the browser with what happens in the server. If a distributed system is sensitive to small deviations in behaviour on the part of one of its participants, it seems to me that the presence or absence of detailed logs will be less useful than increased robustness in the face of small deviations.

An HTTP server is a centralized system that uses a stateless protocol. It *doesn't need* to correlate what happens in the browser with what happens in the server because (a) the protocol does not synchronize state between the two processes and (b) the protocol considers each request as completely context free - all the necessary context is sent in each request. This is about as simple as you can make a distributed system. In fact, an HTTP server has the same logical structure as a minicomputer running a bank of terminals, which is not what people usually class as a distributed system.

When you have multiple processes running on widely distributed, heterogeneous nodes that must collaborate by communicating with a mixture of protocols and networks (X.25, Inmarsat, GSM, TCP/IP, reliable UDP/IP multicast, CORBA RPC, etc.), then correlating events on different nodes is very important in order to discover what has happened to cause a fault.


Accusations of Naivete

You seem think logging is of little use, but it sounds like you have never had to administer a system or write a system that is administered by others. When you have done either of those without a logging system you'll change your mind! I did.

Does having written three chat servers and having been involved with a multimillion-hit Apache cluster count ? I am most emphatically not asserting that there are no known uses for logging. In particular, if there is a user requirement for logging certain events, esp. for data mining (HTTP servers are a good example) then we'd better be ready to do it.


More discussion

A case in point is the JRun application server's very nice, feature-rich logging system. Perhaps not coincidentally, JRun is a seriously defective piece of software. -- LaurentBossavit

What the FirstRuleOfLogging says - as I read it - is "If you're thinking of adding logging to your system because the blasted thing isn't doing what it should be doing - don't."

That may be good advice, but I don't see any justification for it above. A logfile provides a different perspective on a program's execution than a debugger, which can't capture the past. Why is this bad? -- DanielKnapp

I would like to extend the above comment. Logging is only a persistent form of on screen debugging. If you program perfectly, you don't need either one, but when I screw up, I rely on whatever tools are at hand to resolve the problem. -- WayneMack

Also, is logging really any different than catching and displaying exceptions on screen to the user? -- wm

I seem to make good use of apache's access and error logs. I don't quite understand how you would debug a wide class of problems without these logs. -- Kraken

Marketing departments need HTTP access logs to show advertisers statistics of "hit counts", "eyeballs", "stickyness" and all the other buzzwords they like to see. They need to record the result of normal execution of the server.


Reasons to Log

Logging covers a multitude of different functions:

An application may need to do tracing or journaling, but the other kinds of logging are often handled by middleware such as CiCs or EjbServers.

What are the reasons not to log?


Wiki Logging Example

Logging is useful whenever you simply want information about what happens while your program is running... I mean, there's no real need to log things like "uh oh, there seems to be a bug here" but it would be nice to be able to have some sort of record of the things you didn't expect and plan for.. Like someone trying to circumvent a security measure in your program. You can rely on thinking of everything, or you can put something in place that lets you know what you missed.

Ironically, while I was saving this page, an error occurred in wiki and it said "this information is being logged."

Possibly the best argument that could be made against the FirstRuleOfLogging. Then again, consider - how do you know that message isn't there just to reassure you? ;>

I've seen that Wiki error message dozens of times, for over a year. That bug has never been fixed. Possibly the best argument that could be made for the FirstRuleOfLogging.

Huh? You are saying you kept a personal log of a certain event, and by examination of that log you discover that the event (a) isn't once off, and (b) has been occurring for a long time. Sounds like your log (of that "logged" message) turns out to be useful, and that is somehow an argument that logging isn't useful?


Logging is a code smell

I still think that logging can be a code smell. All too often, I see error logging used in place of real exception handling. I've seen this idiom (hmmm... AntiIdiom?) countless times:

  try {
foo.bar();
  }
  catch (SomeException? e) {
Log.exception("couldn't bar the foo", e);
  }
Irritation at this idiom is what sparked this page: In the spirit of the first of the RulesOfOptimization, what I was trying to say (poorly), was, "If you think you need to logging mechanism, QuestionConventions and see if there's another way." Sometimes it's a good thing to ButcherSacredCows?... judging from the response, I succeeded at that!

That said, I think there were a number of good points raised on this page about when logging is necessary. -- JimLittle

I HaveThisPattern also. I've just wasted the last seven months of my life struggling to debug a mess of an application, where the more I dug into it, the more bugs and design flaws I found. The existing logs (which were huge) were almost completely useless in identifying these latent bugs; they obviously hadn't been reviewed for years, by anyone. I had to comment out dozens of mindless logging statements (almost every intermediate calculation and assignment was logged, double spaced) to get the log down to where I could made sense of it, and then add my own key debugging statements to find and fix the real problems. (How can anyone anticipate what they'll need to have logged in order to solve a bug in the future? If you can anticipate that, you can fix the bug preemptively!)

In my experience, the core functionality in most software only takes about 10 or 20 percent of the code; the rest is necessary to handle all the possible things that can and will go wrong, the exceptional cases. This means you have to spend more than a little mental effort considering what can go wrong, and how to handle it intelligently. This should be handled as a major, integral part of your system design, not an afterthought, and it should be centered on your users' needs. Talk to them about it; ask hard questions, like "What if the system can't get the interest rates from the auxiliary database; what should it do then?" (Users don't particularly like to think about this stuff, but if they don't think about it now, they'll have to think about it down the road when something goes wrong...)

So I would say it this way: Logging may or may not be useful, but it is rarely sufficient by itself, and should not be used in lieu of proper error and exception handling. -- AndyMoore


Removing logging ins an AntiPattern

The code sample that you just gave says nothing about its correctness, it might be the correct thing to do, or it might not. Logging and handling exceptions are not perfectly orthogonal, but let's say they are quite different concerns. Sometimes the correct handling of an exception is to log it. -- CostinCozianu


Logging not really a solution

While I'm not in favour of this thing, the following problem we all know seems to indicate logging is not really a solution...

The WikiWiki Server Can't Process Your Request

Can't open /home/httpd/cgi-bin/wiki.db for update .

This information has been logged. We are sorry for any inconvenience.

Neither are exceptions. Logging and exceptions are both indications that something unexpected happened. Log files, however, are easier to e-mail than exceptions.

I look at my logs. I knew when that message started occurring. I knew when it showed up six times a week. I knew when it grew to six times a day and then six times in a row. I watched the problem grow. I had statistics. But I was never able to establish why. The logs had nothing to say about that. The trouble with logging is that there is no way to ask a follow-up question. -- WardCunningham

Yes there is. Log more. Log until you have data but no information. Then you can start asking the logs questions by sifting through the data to create information. -- SunirShah


Use Logging In A Well-Targeted Fashion

It's certainly true that logging is used too much.

I've seen systems with big files full of junk that no-one ever reads.

I've seen people spend ages designing logging frameworks that never get used.

But sometimes well-targeted logging can be really useful. I've used it as follows:

Unattended distributed stuff. There you are, running your distributed calculation overnight across a whole bunch of machines. Something goes wrong overnight. Because you have logging, you can see it happened to several machines at the same time. So that points you to the network backup grabbing files they need. Very useful.

Verifying Configuration Settings. We have a library framework that can pull in a lot of runtime modules. The setup is very flexible, which means it's a potential source of error and confusion. But we need all the flexibility. So at startup we log what the configurations are collectively asking for and show success/failure in loading the modules. So when over at someone's machine, 9 times out of 10 you can solve any configuration problems by examining the log.

Optional recording/playback. We allow people to switch recording on/off for calls into the library framework. We can play this back on another machine and trap problems in the debugger. So I can sort out a Java client's problem in a purely C++ debugging session. Also, I can compare inputs and outputs for test runs and report call-by-call differences.


It's certainly true that logging is used too much.

Not sure if that is true. Anytime I have a problem with a program I hope there is a log to help debug the problem, but there rarely is.

I've seen systems with big files full of junk that no-one ever reads.

Anything can be done badly. That doesn't mean don't do it.

I've seen people spend ages designing logging frameworks that never get used.

Somehow I doubt this, unless an age for you is a day or two.


Logging is good.

''Your exception/error handling code should also log. This is especially true when you expose an API to programmers who are beyond your control. If they write bad code which swallows your errors, you still need to be able to support your customers. -- DominicCronin

Seconded. I have saved at least couple on-site debugging visit because my module logs when unexpected values (such as nulls) are obtained from another module (especially the flaky ones). Whenever there is a bug reported to me, the first thing I ask for is the log. In some cases I can reply within minutes that XYZ module is returning nulls for calls to method foo(). -- OliverChung


Without a specific need for logging identified by the customer, adding a logging framework is taking time away from solving more pressing problems (See YouArentGonnaNeedIt). Transaction history and audit trail are legitimate examples of needs for a log. Make the customer identify that need before expending the resources to develop them.

"Just in case" logging, like error-detection code, is more prone to introducing errors than functional code.

Disagree strongly. The only side effect of logging code should be its effect on the log itself, and it should always terminate. In some languages this is enforceable; in any case, bugs caused by logging are IME extremely rare.

The details may not be well-defined, or are based only on progammer hunches. Hunches may be incorrect and misleading in an actual bug hunt. Then you have to debug the debug code. Great. Your time will be better spent testing the code for robustness.

Case-in-point: I took over a project in which the system crashed from filling the log.

Limiting the size of a log is hardly rocket science.

Whatever useful information might have been captured in the log was overshadowed by flooding the log with vague system state information ("powerup", "init complete", and "shutdown"). Because the system was power-cycled somewhat frequently, the log overflowed during normal, error-free operation. A bug in the logging code caused the system to crash.

The existence of a bug in one particular logging implementation is not an argument against logging.

Don't gold-plate the code if the customer only needs aluminum. Save logging for after the customer requests it. -- ChrisFay

The customer will probably not request it. Logging is most valuable to developers, not directly to customers.


Why not let the exception itself do the logging? It's always struck me as a waste the way exceptions are usually used... i.e., if you can resolve the exception at some level, why not let the exception object know about it? The exception seems to be a natural choice... it has the potential to know what happened, why it happened, and what's been done about it.


You know, I would not have thought there would be a debate about this. I was on here looking for something else and this discussion caught my eye because my logging architecture is one of the most important pieces of code in my systems. I wondered how others were doing it. The fact that there are people who would argue against logging is a real curiosity.

One has to consider many things when developing software. Small utilities that I write generally don't have logging per se. However, even they have a little message sub-system.

When code grows beyond a certain size, it is near impossible to maintain or even bring to production without logging. I have done most of my work in financial and telecommunications environments, so perhaps standards are a little stricter. However, even when doing my own stuff, logging is an integral part of the system.

Logging serves many important purposes and I am hard-pressed to think of an example of non-trivial software that would not benefit from it. Those purposes include:

I have seen logs grow to enormous sizes in production -- 200MB a day back when the disk space available was not much more than 10GB in total. However, those logs in particular were the only way errors would have been caught, identified and fixed on that particular system (built by nearly 200 people and serving millions).

They do place some kind of a burden on the system and they do take time to build and to care for. However, they definitely pay for themselves. A non-trivial system *with* extensive logging, is definitely cheaper to build, run and maintain than one without. It really is not a contest. Virtually all systems, trivial or not, leave some kind of record of their activities, however small and short lived. It is not a question of whether or not you should have logging. Arguably, you do already. The question is: is your logging adequate? Chances are, it is not.

Earlier on this page, Ward says that his logging system does not tell him how to fix the problem it logs. He says "The trouble with logging is that there is no way to ask a follow-up question." Well... if you really think about it, logging is all about asking a follow-up question. The fact that his logging is not complete enough to do that is not an argument for *less* logging, it is an argument for more. My logging systems give a complete context within which failure happens and they emit enough records to allow rejection of co-incidences and isolation of actual root causes. In fact, really good logging should not require you to ask a follow-up question. It should tell you what happened, why it happened and how to fix it so it won't happen again.

All non-trivial software has defects. If software is *very* buggy, you *must* have some method of dealing with them right away. You really need logging. If software is very clean, then when a defect does arise, you will need a lot of information to find it. It is likely subtle and only manifests under relatively rare conditions. If years have passed, the logging may be the only clue to where and why the system failed. Do those scenarios happen? Yes.

To be clear what I am talking about, I have taken a sketch from the company's internal wiki to describe the general logging system we use:

Given that just about every non-trivial system *has* logging these days, I would say that saying "you don't need logging" is an exceptional claim. Exceptional claims require exceptional proof. People arguing *for* logging have given plenty of examples for why it is a good thing. People arguing *against* logging give reasons that are weak on their face or no reason at all.

Somebody said they hated logging because they had to wade through all kinds of junk that should not have been logged. It is certainly possible to log the wrong things. It is certainly possible to place logging in ill-considered locations that generate a lot of logging. It is certainly possible to have inadequate logging such as a naked message like 'Impossible Condition' with no indication of what that condition is, where it came from and when it happened. The fact that you have shoes that do not fit your feet is not an argument against shoes. Not a good one, anyway. It is an argument for getting shoes that fit.

Finally, (phew), although I realize this is a community predicated upon a certain way of building software, I would respectfully like to submit that the notions of building software here are not entirely realistic. If you think about it, most software that anybody actually uses is not built using the principles that I see on this wiki.

I might have alluded to it somewhere else, but my observation is that software is actually built according to something I call TheReceivedMethodology. I have an attractive graphic for this, but will attempt to sketch it in broad strokes under a page called: TheReceivedMethodology.


Thank you for the effort you have put in to contribute three pages. My guess is that you have been pondering this for a long time. I have a couple of quick comments, one about this community and one about a comparison.

-- JohnFletcher


John -- Thanks for the kind comment. I actually have quite a bit to say about software development. As you might be able to gather from the text, I have rather strong opinions about some of it.

This is an fairly literate community with the strangest blind-spots. Although they have done well to articulate a lot about design patterns and what I might term social development philosophies, there is a lot of bread and butter programming stuff that is missing.

I'm old-school, so I generally do what I call 'object based' rather than 'object oriented' programming. There are some things that can't be done reasonably without OOP, but there are a lot of things that are done more surely and simply with procedural programming. Before going to fool-blown OOP, it is best, IMO to go 'object based'. Object based just means that data and methods go together with decent cleavage lines between objects. Hierarchies and inheritance, if they exist at all, are simple. Object based stuff can be done easily enough in just about any language. The better the scoping regime (in the language and the code), the easier it is to do. If you have a working 'object based' solution, it ports much more cleanly to other languages. In practice, in even medium size real-world systems you have more than one language and it nice if they can use the same paradigm and models.

I would like to see a little more about bread and butter programming that addresses non-trivial working production systems rather than toy examples.

I am a big fan of 'pair programming' and in my experience it has always been done off and on with successful developers. I like the fact that it has been articulated, defended and allowed to come out of the closet. However, pair programming is only one of the host of techniques used to build large systems. When it takes a hundred people to build a system, basing everything on pair programming or any other single paradigm is a recipe for failure.

If I had my choice, I would like to see a little more time given to the wonderful traditions and cultural 'ethic' of the original Unix guys -- K&R&P&P, etc. Similarly, people like Knuth, Wirth, Dijkstra, etc. have contributed stuff that continues to resonate with me. Their work may have had its flaws, but I still am impressed with the spare beauty that is the C language.

Anyway, I have pledged elsewhere to use my RealName on things and publish more. This was a part of living up to that promise.

BobTrower


As someone who makes his living administering computers rather than programming them, I read this page with bewilderment. I need log files to operate my systems successfully, because:

I certainly hope that none of the programmers who write the programs I use read this page and decide that logging is a bad thing. Yes, logs have to be managed, both by the developer and by the admin (whether that be another programmer, an administrator, or the occasional unfortunate nontechnical user), but to me the only real alternative to log files is wild guesses when things go wrong.

CraigPutnam


Craig -- Well said. It is good to get another perspective on this.


See also LoggingBestPractices, LoggingToaQueue, PatternsForLoggingDiagnosticMessages


CategoryLogging


EditText of this page (last edited June 16, 2014) or FindPage with title or text search