In late 2020 or early 2021, for a number of different reasons I decided I wanted to get a better understanding of Natural Language Processing (NLP) and Python's Natural Language Toolkit (NLTK). Maybe it's vestiges of my career in journalism, but the idea of converting a mess of text into something the machine can understand and do something useful with appeals to me.
I was also pretty interested at the time in how government resumes are evaluated, and how much matching the language of the job posting matters. So given those converging interests, it just made sense for me to try NLP to compare the language in a couple of documents -- like a resume and a job post.
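For a sense of what that kind of comparison involves, here's a minimal sketch using NLTK: tokenize both documents, drop stopwords, and score the keyword overlap with Jaccard similarity. This isn't Pear's actual implementation -- the sample texts and the scoring approach are just illustrative of the general shape of the idea.

```python
# Minimal keyword-overlap comparison with NLTK.
# Illustrative only -- not Pear's actual implementation.
import nltk
from nltk.corpus import stopwords

# One-time downloads (tokenizer models and the stopword list):
# nltk.download('punkt')
# nltk.download('stopwords')

STOP = set(stopwords.words('english'))

def keywords(text):
    """Lowercase, tokenize, and keep alphabetic non-stopword tokens."""
    tokens = nltk.word_tokenize(text.lower())
    return {t for t in tokens if t.isalpha() and t not in STOP}

def jaccard(a, b):
    """Jaccard similarity: shared keywords over all keywords."""
    ka, kb = keywords(a), keywords(b)
    return len(ka & kb) / len(ka | kb) if ka | kb else 0.0

# Made-up sample texts.
resume = "Built Flask dashboards and automated reporting in Python."
posting = "Seeking a Python developer with Flask and reporting experience."

print(f"Overlap score: {jaccard(resume, posting):.2f}")
print("Shared terms:", keywords(resume) & keywords(posting))
```

A real tool would want lemmatization, phrase handling, and term weighting, but even something this naive surfaces shared language quickly.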
And that was the beginning of Pear. Over time I've rewritten it a couple of times, built a nicer front-end for it, and added functionality. I may keep doing that.
I've also found I keep reaching for it, because similar commercial systems seem highly intrusive. I'm always wary of providing data or text to most apps, especially when they seem pushy about wanting me to do so. Are they harvesting and selling information? Are they doing something else with it? What happens to my doc when I hit submit? I like having something that's non-commercial and clearly isn't in it for any worrisome reason.
## Why did I do it?
To learn, and to meet a simple need: comparing two documents.
## What did I learn?
- A whole lot about Natural Language Processing. Even at my self-taught level this stuff is fascinating.
- Alpine.js and HTMX -- I'd been wanting to play with these mini-JS libraries and came away quite impressed. Would use again, and recommend.
- More Flask. As a dyed-in-the-wool Django person, it was really interesting learning more about Flask's approach to things.
## Where is it?
Published