I mentioned my plans for WorldWarIICasualtyProject.org in another post. In short, the U.S. National Archives has scanned pages from books that list American casualties of World War II. I was curious to find out more, but discovered that those records did not exist in searchable form. I thought it would be interesting to figure out how to take the scans (GIF files, really!) and read the data with OCR.
I decided to go ahead with my initial plan: use a Python OCR module to read the scans. However … the Python module I tracked down (pytesseract) also required PIL (another module, one of its dependencies) and strongly suggested I install the Python science packages. I figured I would need them at some point, so I installed numpy, scipy, matplotlib, scikit-image, scikit-learn, ipython, and pandas. (See https://www.learnopencv.com/install-opencv3-on-macos/)
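For reference, here is roughly what the OCR step could look like once everything is installed. This is a minimal sketch, not the project's actual code: the function name ocr_page and the file name casualty_page.gif are made up for illustration, and it assumes both the pytesseract package and the Tesseract program itself are installed.

```python
def ocr_page(image_path, lang="eng"):
    """Return the text Tesseract recognizes in one scanned page."""
    # Imported here so the function can be defined even before the
    # OCR dependencies are installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as page:
        # Convert to grayscale; GIF scans are usually palette-based.
        gray = page.convert("L")
        return pytesseract.image_to_string(gray, lang=lang)


# Example call (hypothetical file name):
# text = ocr_page("casualty_page.gif")
# print(text)
```

Note that this sketch only needs PIL and pytesseract; OpenCV would come in later, if the scans need cleaning up before OCR.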
At this point, I paused. I found several pages that suggested OpenCV be installed with Homebrew. That’s not a big deal because I use Homebrew for Python 2/3. It gets confusing here. At one time, OpenCV was kept in a specialized tap named “homebrew/science”, but it was moved to “homebrew/core”. I’m told “homebrew/science” is empty, so there should be no reason to tap it. We’ll see.
Note: use ‘> brew tap’ to list all the taps currently connected to Homebrew.
Also note: opencv3 does not exist anymore. I think it has been renamed to opencv. opencv2 has been renamed ‘opencv@2’. … So confusing …
Then there’s the question of linking OpenCV to “… Homebrew Python’s site-packages directory”. What? See https://www.learnopencv.com/install-opencv3-on-macos/
I’m sticking with these instructions: https://robferguson.org/blog/2017/10/06/how-to-install-opencv-and-python-using-homebrew-on-macos-sierra/ with two exceptions: I’m skipping the step that taps homebrew/science (it doesn’t exist anymore), and I’m installing opencv instead of opencv3 (it’s been renamed).
I installed OpenCV through Homebrew. Lots of dependencies were installed. Interestingly enough, I can see OpenCV from the default Homebrew python3 install, but not from the virtual environment I created for custom work. In other words, this works in the default Homebrew python3:
>>> import cv2
However, when I go to the virtual environment set up for ww2cp, I don’t see it.
> source ww2venv/bin/activate
(ww2venv) > python3
>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'cv2'
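A quick way to see what is going on is to ask each interpreter where it lives and whether it can find cv2 at all. The snippet below is a small sketch using only the standard library; the helper name describe_module is my own invention. Run it once in the Homebrew python3 and once inside the venv and compare the output.

```python
import importlib.util
import sys


def describe_module(name):
    """Return where a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Which interpreter is running, and which environment does it belong to?
print("interpreter:", sys.executable)
print("prefix:     ", sys.prefix)

# None here means this interpreter cannot see OpenCV.
print("cv2 found at:", describe_module("cv2"))
```

If the two runs print different prefixes and only one of them finds cv2, the module is installed in one site-packages directory and not the other.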
So, following the instructions here (https://robferguson.org/blog/2017/10/06/how-to-install-opencv-and-python-using-homebrew-on-macos-sierra/), I set up a symbolic link from the site-packages folder inside the ww2 venv to Homebrew’s OpenCV install.
Now it works!
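The linked instructions do this with a shell command, but the same idea can be sketched in Python. Treat this as an illustration only: the function name link_cv2 is made up, and the paths in the example call are placeholders, since the real location and file name of the compiled cv2 module depend on the Homebrew and Python versions.

```python
from pathlib import Path


def link_cv2(cv2_source, venv_site_packages):
    """Symlink Homebrew's cv2 module into a venv's site-packages.

    Returns the path of the link. An existing link or file is left alone.
    """
    source = Path(cv2_source)
    if not source.exists():
        raise FileNotFoundError(f"cv2 module not found: {source}")

    # The link keeps the same file name as the original module.
    target = Path(venv_site_packages) / source.name
    if target.exists() or target.is_symlink():
        return target

    target.symlink_to(source)
    return target


# Example call (both paths are hypothetical placeholders):
# link_cv2(
#     "/path/to/homebrew/site-packages/cv2.so",
#     "ww2venv/lib/python3.x/site-packages",
# )
```

Because it is only a link, the venv keeps using whatever OpenCV build Homebrew has installed.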