Smalltalk Is Slow

We had an amusing problem at C3 this week. A data supplier sent us bad records for 1300 employees and told us after we had run the payroll (but before we had cut the checks and EFTs, thank you very much).

Export in the monthly C3 takes a long time to run, so we wanted to rerun the 1300 after fixing the data, and then just export that 10% and put the files back together outside the system.

The export file is on the mainframe and we thought maybe we would write a program to strip out the records, then concatenate the files, etc. But we'd have to be reading both files at once, so the sort utilities couldn't really do it, and it would have taken a COBOL program.

The file is 10,000 12,000-byte records, so we couldn't possibly do the job in Smalltalk.

ChetHendrickson and I FTP'd the file to the production Sun machine. We wrote a tiny Smalltalk object that reads a file, pulls out the key fields (10,000 strings), puts them in a dictionary, with the value being the file position in the read stream, and the readstream itself. Approximately

  position := stream position.
  key := self getkey: stream nextLine.
  dictionary
    at: key
    put: (Array with: stream with: position)
Another method on the object goes through the dictionary, positions whatever stream it finds to the position given, reads the record there, and writes it to a new file.

We read the old file. We read the new file, replacing the corresponding old entries with pointers to the new file. We output the file.

It took three minutes on the Sun. Just for fun, we sorted the file on the mainframe, which we'd have needed to do before starting the merge. It took two minutes.

Guess we're going to have to optimize our object, because SmalltalkIsSlow.

-- RonJeffries


CategorySmalltalk


EditText of this page (last edited March 25, 2010) or FindPage with title or text search