Gets Is Dangerous

The gets() system call is part of the C programming language's standard I/O library. It takes a char pointer as its only input, and will try to fill the buffer the pointer presumably points to with a line of text. It is widely considered to be a bad idea, since it will gladly overflow any buffer it is passed (BoundsChecking is not performed, in other words, and this can lead to MemoryCorruption?). Most C programmers regard use of gets() as a sign of general cluelessness on the part of whoever wrote the code.

fgets() is widely advocated as a drop-in replacement to gets(), as fgets() also takes as a parameter the size of the buffer. But this is not entirely satisfactory, either, since if the input exceeds the size of the buffer fgets() will simply null-terminate the string right where the buffer ends and exit, leaving unread characters on standard input.

To put things simply, the C standard I/O library is broken and individual programmers must work to overcome its deficits. Below are some attempts to do just that, often with justifications and/or rants on how inadequate the existing tools are. A lot of them are based around StateMachines? that do all I/O via fgetc() and malloc() and realloc() buffers so they expand to fit the actual input. It isn't a complex concept, but it is vexing that none of this has reached the ears of any relevant standards committee.

Some links on the issue:

I agree that it's not "satisfactory", but what alternative is there? What would you have the standards committee do? What happens when the input exceeds not only all RAM but all (virtual memory) hard drive space as well? If the subroutine doesn't eventually "terminate the string and exit, leaving unread characters on standard input" like fgets(), or "make the program fail in strange an unexpected ways" like gets(), what alternative is there? -- DavidCary


Well, let's see, I've pretty much abandoned the gets() family of functions and taken to writing my own input loops using variations of getc() or getchar() depending on what's available. This has allowed me to write bit-bucket discards to overflow situations.

I gave up on gets(), fgets(), scanf(), fscanf() and anything that smelled like them 20 years ago. I control the input stream and the buffers, and I don't get buffer overflows, ever. If my buffer is 24k and your input is 30k, the last 6k of it is going to get chucked or at best fetched after the first buffer full is safely tucked away.

-- GarryHamilton

So, you have a function that eventually terminates the string, and then exits. It sounds to me like you re-implemented fgets(). Is this function in *any* way better than the standard fgets() function, other than having a name that doesn't smell like the unusable gets() function ? -- DavidCary

Not that I lay any claim to C guruhood, but it sounded like the issue with fgets() was that it simply stopped reading when it had enough, which may well cause issues with later input. In many cases, the proper answer is to indeed clip the input to the size of the buffer, but also to pretend like the whole thing was read. Poor example: Reading a single xml tag into a fixed size buffer off a stream (10 character buffer, for the sake of the argument): <HMM Foo="<BAR Malf<ormed=XML>>>>"<SOMETHING></SOMETHING></HMM>. Now, what happens? Do we read the next tag from "<BAR..." and blow up? Or do we continue reading the tag (even though we can't process it) up to it's logical conclusion? Such is the problem with fgets() (in my understanding at least).

--WilliamUnderwood


CategoryPitfall


EditText of this page (last edited August 9, 2005) or FindPage with title or text search