Stuart J. Whitmore on Nostr: OK, that wasn't the OCR's fault, the PDF to TIFF conversion by ImageMagick was ...
OK, that wasn't the OCR's fault; the PDF-to-TIFF conversion by ImageMagick was terrible. When I did the conversion manually in GIMP and then ran tesseract-ocr, the result was much more usable:
"...it was simpler to call it home. He had only lifted off from Earth's surface two years ago, and he knew his assignment would end after two more, but he was already considering an extension..."
(That's from the same paper source, same scan. The only thing I did here was remove a couple line breaks.)
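For anyone who wants to script that line-break cleanup instead of doing it by hand, here is a minimal Python sketch. The function name `unwrap_ocr_text` and the hyphen-rejoining rule are assumptions for illustration, not what was actually done for the excerpt above.

```python
import re


def unwrap_ocr_text(text: str) -> str:
    """Join hard-wrapped OCR lines into paragraphs (hypothetical helper).

    Blank lines are treated as paragraph breaks and kept; single line
    breaks inside a paragraph are replaced with a space.
    """
    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split by a hyphen at the end of a line ("exten-\nsion").
    # Assumption: this also merges real hyphenated compounds that happen to
    # wrap at the hyphen, so proofread the output.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Split into paragraphs on one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text)
    # Collapse all remaining whitespace runs (including line breaks) to one space.
    cleaned = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)
```

Running the OCR output text through `unwrap_ocr_text` before pasting it into a manuscript should leave only the paragraph breaks to check by eye.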
#amwriting