Greek scansion tool

1/20/2024

Fact: there are a lot of Ancient Greek texts available in pdf documents online. The heyday of scholarly activity for many of the pseudepigraphical texts, for instance, was in the late 1800s or early 1900s (though many are picking up again). That means that the standard editions for many of these texts, or at least good copies of them, are available for free online (thank you !).

There is also the Patrologia Graeca, which has an estimated 50 million+ words of Greek text, many different editions of the NT, and so forth. In other words, a veritable bounty of Greek texts is available for free online. Problem: Of course, as anyone who has tried to read a long pdf already knows, pdfs have their difficulties, especially when it comes to small screens. What can be done to read all these different texts in a more readable format?

Solution: Enter OCR, or Optical Character Recognition. If you don’t know what this is, it is simply the means by which a computer “grabs” text from an image and converts it into editable text. OCR works great for English and is easy to get for free. What about Ancient Greek? Well, that is a bit more of a problem. I recently set out to find a good way to OCR Ancient Greek text.

Here I set out a comparison of three different means of OCR-ing Ancient Greek texts to see which one gives the most reliable and useful results. The results were surprising, to me at least. My initial set was two OCR engines that have been trained specifically for working with Ancient Greek.

Ancient Greek OCR: Ancient Greek OCR is free software (it’s called gImage Reader) to accurately convert scans of printed Greek into Unicode text and PDF files, using the Tesseract OCR engine trained for Ancient Greek typography, syntax, and vocabulary.
Here is a description of how Tesseract was trained for Ancient Greek and the difficulties involved.
Antigrapheus: basically an online version of the above software which “aims to be no better or worse than you would get from downloading, installing, and configuring Tesseract, but without the need to do all those things.”

These two recommend themselves as OCR engines which have been trained specifically for OCR-ing Ancient Greek text. I’ll cover the third OCR tool later. For an updated take on using a newer version of Tesseract and its seriously good results (my new default OCR tool), check out the Power of Tesseract.

To test the OCR engines I fed in one picture of clean, neat, very legible Greek text (taken with my phone) and one more difficult example copied from a pdf of the Patrologia Graeca.

John Moschus, “Spiritual Meadows,” chapter 7

The first text is obviously cleaner and easier to read. I did nothing to clean up the quality of the pictures, as is generally suggested for better OCR results. Let’s take a look at the results turned out by these two. To put it bluntly, and with all due respect to those who have done the work in getting these OCR engines together, the results were underwhelming.

So, after my initial results, I entered a dark-horse candidate into the running. In a stroke of necessity-fueled genius I thought, “Why not try a Modern Greek OCR engine?” Since all the letters are the same and there is more impetus to design a high-functionality OCR engine for Modern Greek, it seemed that it would at least give the words accurately (for the most part) and might even do better than the results shown above.
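This post judges OCR output by eye, but one way to put a number on "reliable" is to score each engine's output against a hand-checked transcription. The sketch below is only an illustration under stated assumptions: it computes a character error rate (edit distance divided by reference length), a common OCR metric that this post does not claim to have used. The function names and the sample strings are hypothetical and are not taken from the actual test images. The `ignore_diacritics` switch is there because a Modern Greek engine may get the letters right while missing breathings and accents.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop accents, breathings and iota subscripts, keeping base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_error_rate(ocr_text: str, reference: str,
                    ignore_diacritics: bool = False) -> float:
    """Edit distance divided by the length of the reference transcription."""
    # Normalize to NFC so precomposed and decomposed forms compare equal.
    ocr_norm = unicodedata.normalize("NFC", ocr_text)
    ref_norm = unicodedata.normalize("NFC", reference)
    if ignore_diacritics:
        ocr_norm = strip_diacritics(ocr_norm)
        ref_norm = strip_diacritics(ref_norm)
    if not ref_norm:
        raise ValueError("reference text must not be empty")
    return edit_distance(ocr_norm, ref_norm) / len(ref_norm)


# Hypothetical sample: a hand-checked line and an invented OCR output
# that has the right letters but no breathings or accents.
reference = "ἐν ἀρχῇ ἦν ὁ λόγος"
ocr_output = "εν αρχη ην ο λογος"

strict_cer = char_error_rate(ocr_output, reference)
letters_only_cer = char_error_rate(ocr_output, reference, ignore_diacritics=True)
print(f"strict: {strict_cer:.3f}  letters only: {letters_only_cer:.3f}")
```

Reporting both figures separates "wrong letters" from "right letters, wrong diacritics", which is the distinction that matters when weighing a Modern Greek engine against ones trained for Ancient Greek.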
