Welcome to the fifth edition of the Data Liberation Project’s newsletter. Inside: Two major datasets liberated, FMCS database documentation, the latest batch of FOIA requests, and a volunteering form.
The US Environmental Protection Agency’s Risk Management Program rule requires facilities handling “certain hazardous substances” to submit “risk management plans” at least every five years, detailing the chemicals they use, the risks that usage poses, what the facility is doing to minimize accidents, and a five-year accident history.
The EPA collects these filings into a central database, which I've mentioned in past dispatches: after the DLP filed a FOIA request for it ... after the request was granted ... and after I received the CD containing the data.
Since then, DLP volunteers and I have worked to understand, document, and process the (fairly complex) dataset. There’s still more work to be done, but we’ve gotten to the point where (I think) we have a decent handle on the data. So, I’m happy to say: it’s now yours to download and explore.
The best place to start is the main documentation. There you’ll find essential context, as well as links to:
I’m eager to help you work with the data: to help you use it to inform your communities, to help you analyze it, to help you build public-interest tools with it, and more. (To get that help, all you need to do is reply to this email.)
The Animal Welfare Act sets minimum standards of animal care by four main types of licensees: commercial animal dealers, exhibitors (such as zoos), research facilities, and transporters. The USDA’s Animal and Plant Health Inspection Service (APHIS) checks whether licensees meet those standards, and issues citations when they do not.
The agency provides an online portal containing its inspections but, frustratingly, no option to download the full dataset. The structured information provided through the interface also lacks important details, such as the type of inspection and the list of species inspected — information that is available only in the inspection report PDFs.
In the very first DLP Dispatch, I raised the prospect of scraping the APHIS portal to liberate that data. As it turns out, Big Local News’s Ben Welsh was also working on an APHIS scraper. A mutual friend connected the dots a couple of months ago. Since then, Ben and I have been collaborating on code to fetch the 80,000+ published inspection reports, parse the PDFs, and make the records all-around more useful.
Although there’s still (and always) more work to do, we think we now have something useful to share. So, yesterday, we published our APHIS-scraping code and the data it has gathered. Among the main resources you’ll find there (and also automatically updated in the biglocalnews.org portal):
We’ve also uploaded all the inspection report PDFs to a public, searchable project on DocumentCloud.
We’re eager to see what you do with these records, and we’re eager to help you use them. If you have any questions or feedback, don’t hesitate to get in touch.
Earlier this month, the agency’s FOIA office responded with a “partial grant,” providing just one documentation file, and withholding all other records.
The DLP intends to appeal the decision. In the meantime, I’ve uploaded the one file the agency did provide: An Excel spreadsheet named “Prod Schema.xlsx”. Although FMCS did not provide any context for the file when providing it, it appears to be the database schema for the agency’s case management system, or at least a portion of it. It’s fairly detailed, with 5,900+ columns across 100+ tables. I’m hoping it can be useful to the public in (a) understanding how FMCS tracks its cases, and (b) filing high-precision FOIA requests for the system’s records.
A copy of the “comprehensive national water use inventory” mandated by the Secure Water Act of 2009 (also co-requested with Varner).
You can read more about each request via the links above. If you have any questions about them, please do ask.
On the DLP website’s Get Involved page, I’ve linked to a form where you can express your interest in volunteering. It provides a bit more structure and information than just emailing (which you're still welcome to do).
That’s all for now! Thank you for reading, and don’t hesitate to reply.