Building DSCovery

How I built DSCovery, the civic tech jobs aggregator, and lessons learned.

I was asked today how I built DSCovery, the aggregator I put together a while back to help people find jobs in civic tech.

I wish it was a better story.

The reality is that on a technical level it's a not-very-complex Django app. I just love Django for this sort of thing. You can get from zero to something in no time at all. Rails has the same advantages, but I know Django much better, and the Django admin made my life a lot easier in this case. It was kind of the killer feature for this project.

But, to back up a bit, aside from Django I already had experience with two things that would be key to DSCovery:

Because I'd spent so much time with applicant tracking systems (ATSs) as a hiring manager, I knew there weren't that many of them out there, and that most folks congregated around just a few. That meant that if I wrote an importer for one company on a given ATS, I'd effectively written it for every company on that ATS.
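To make the "one importer per ATS" idea concrete, here is a minimal sketch of how that dispatch might look. It is not DSCovery's actual code: the `Job` fields, the `register` decorator, the `example_ats` platform name, and the company entries are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Job:
    company: str
    title: str
    url: str


# An importer takes (company name, jobs page URL) and returns the jobs found.
Importer = Callable[[str, str], list[Job]]

# Hypothetical registry: one importer per ATS platform, shared by every
# company that uses that platform.
IMPORTERS: dict[str, Importer] = {}


def register(platform: str):
    """Decorator that files an importer under its platform name."""
    def decorator(func: Importer) -> Importer:
        IMPORTERS[platform] = func
        return func
    return decorator


@register("example_ats")  # hypothetical platform name
def import_example_ats(company: str, jobs_url: str) -> list[Job]:
    # A real importer would fetch and parse jobs_url here.
    return [Job(company, "Placeholder role", jobs_url)]


# Hypothetical companies: (name, platform, jobs page URL).
COMPANIES = [
    ("Example Co", "example_ats", "https://example.com/jobs"),
    ("Another Co", "example_ats", "https://example.org/careers"),
]


def import_all(companies=COMPANIES) -> list[Job]:
    jobs: list[Job] = []
    for company, platform, jobs_url in companies:
        importer = IMPORTERS.get(platform)
        if importer is None:
            continue  # no importer written for this platform yet
        jobs.extend(importer(company, jobs_url))
    return jobs
```

With this shape, adding a company on an already-supported platform is one new row in the list rather than new code.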

So I started going down the list of DSC firms using the most common platforms and building an importer for each platform, as simple as possible:

  1. For each company known to use this ATS platform...
  2. Use the Requests library to get the HTML from the jobs page.
  3. Use BeautifulSoup to parse the jobs out.
  4. Tuck them away in a Python data structure.
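Those four steps can be sketched roughly like this. It is an illustration only, not the real importer: the CSS selector, the dictionary keys, and the function names are assumptions, since every ATS has its own markup.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Step 2: get the HTML from the jobs page."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def parse_jobs(html, company):
    """Steps 3 and 4: parse the jobs out and keep them as plain dicts."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    # Assumed markup: <div class="job"><a class="job-title" href="...">
    for link in soup.select("div.job a.job-title"):
        jobs.append({
            "company": company,
            "title": link.get_text(strip=True),
            "url": link.get("href"),
        })
    return jobs


def import_platform(companies):
    """Step 1: loop over (company, jobs_page_url) pairs for one ATS."""
    jobs = []
    for company, url in companies:
        jobs.extend(parse_jobs(fetch_html(url), company))
    return jobs
```

Keeping the fetch and the parse in separate functions means the parser can be checked against a saved copy of a jobs page without hitting the network.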

This got easier as I added more companies, because more often than not I could re-use an importer I'd already written. When I was fairly happy with the first importers, I built a view to wrap them and do a few things:

From the Django admin I can then keep an eye on the roles, assign them to different practices, mark them as featured, and so on.

And really, that's about it. There are a few basic display views, but not a whole lot more. I've got some more things I'd like to do when I find time, but right now it's still pretty simple.

I recall doing some almost-interesting things with includes in the templates to make them more reusable, and to allow me to more easily use Alpine.js and/or HTMX, but if I'm being honest I can't recall now what it was that I was doing with Alpine.js or HTMX, and I'm not sure if I ever implemented it publicly.

The challenges

There were a few challenges along the way, to be sure, and some of them have persisted.

For example, some of the ATSs are really difficult to parse, usually because they use client-side rendering (side note to ATS vendors: a jobs page is nearly the last place client-side rendering makes any sense). This borks BeautifulSoup because it sees the page as it was delivered from the server, not as it's hydrated client-side. Or, in other words, it sees an empty page.

In some of those cases I've been able to find the raw JSON it's hydrating the page from. In those cases I sidestep BeautifulSoup, grab the JSON and parse that. But in other cases the JSON is also unavailable and I'm stuck. Workable, for example, just seems unparseable, so I can't yet import jobs from Blue Tiger, Friends from the City, or Pixel Creative. That bums me out. They're good folks.
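When the raw JSON is reachable, the parsing side of that workaround might look something like the sketch below. The payload shape (a top-level `jobs` list with `title` and `url` keys) is purely an assumption for illustration; each ATS uses its own structure, and the payload itself would come from something like `requests.get(...).text`.

```python
import json


def parse_jobs_json(payload, company):
    """Parse jobs straight from the JSON a client-rendered page hydrates from.

    Assumes a hypothetical payload shaped like:
    {"jobs": [{"title": "...", "url": "..."}, ...]}
    """
    data = json.loads(payload)
    jobs = []
    for item in data.get("jobs", []):
        title = item.get("title")
        url = item.get("url")
        if not title or not url:
            continue  # skip incomplete entries rather than store junk
        jobs.append({"company": company, "title": title, "url": url})
    return jobs
```

The output matches the shape an HTML importer would produce, so whatever stores the jobs doesn't need to care which route they came in by.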

And some ATSs are just completely stupid and awful to work with. Usually, but not always, these are old-school "enterprise" offerings that feature lousy markup and horrific user interfaces: ADP, ApplicantPro, UKG, that sort of thing. I really feel for the companies saddled with these systems.

In some cases (Ad Hoc, for example) I think the company is important enough that I really want to include them, but their ATS is so incredibly bad that it's just impossible. All I can do is pull from LinkedIn. It's not ideal. To all the DSC companies (and anyone else who'll listen), here's my advice: don't buy these "enterprise" ATSs. They are garbage, and anyone trying to convince you to switch to one should be ignored.

There are also some cases where the ATS is too niche, or the companies are so small, that I'm waiting for some more jobs to import before going through the effort of writing a new importer.

At any rate, I continue to iterate, add more firms, and more functionality.

The whole thing is hanging together pretty well so far, which makes me happy. It'd been a while since I'd coded up much of anything, really, so I was glad to see I could still do it.

If you'd like to read more about DSCovery and why I built it, I wrote that up a while back. If you'd like to take a look under the hood, all the code is public.


Photograph is of a Side Corridor on a Dirigible, ca. 1933, from the National Archives
