US Activist Group Makes FOI Request for Billions of Digital Images to US National Archives

David Ferriero, National Archives and Records Administration (NARA)

Reclaim The Records, a non-profit activist group of genealogists, historians, journalists, teachers, and open government advocates based in the United States, says it has filed its biggest ever Freedom of Information Act (FOIA) request, and possibly a world-record-setting one, for billions of records in text and digital images from the US National Archives and Records Administration (NARA).

In the request made under the US FOI Act on October 14, 2020, Reclaim The Records is asking NARA for billions of digital images and their associated text metadata created through NARA’s Digitization Partnership Program. It is also asking to be treated as a “media requester” for the purposes of calculating the fees for the FOIA request, arguing that “We are a non-profit organization, not a commercial entity. We do not charge for copies of any of the tens of millions of records we have already acquired from government agencies and released to the public.”

Reclaim The Records said it acquires genealogical and historical databases and images from government sources, including government archives, often through the use of FOI laws. It then uploads them to the Internet without any copyright or usage restrictions or paywalls, making them freely available to the public and returning the taxpayer-funded materials to the public domain.

According to the organization, NARA has for several years managed an innovative public-private partnership program to digitize many of the important historical documents it holds, particularly records that would be useful for family history research.

Such records, it said, include multiple enumerations of the US Federal Census (through 1940), immigration and naturalization records, military and veteran records, tax assessment lists, and more.

The organization stated that more than 400 of these important historical record sets have been digitized so far under this long-running partnership program, with each of the record sets containing hundreds of thousands, or more often millions, of individual documents.

It noted that a “likely-incomplete listing of these record sets” is available on the NARA web page “Microfilm Publications and Original Records Digitized by Our Digitization Partners”, which it said is located at www.archives.gov/digitization/digitized-by-partners.

Reclaim The Records said: “The total number of unique historical documents digitized and transcribed through this program is probably in the billions.”

It claimed that in exchange for having private corporations and non-profit organizations agree to become “partners” and digitize the historical records from their original paper or microfilm formats – a massive task that would be largely cost-prohibitive for NARA to conduct on its own – NARA agreed to let the partners have the exclusive use of the newly-digitized materials on their own websites for a certain amount of time, an “embargo period”.

According to the organization, “This grant of supposedly exclusive entitlement to public records was meant to induce these partners to spend their time and money to conduct the digitization and transcription of the records at their own expense, instead of at the taxpayer’s expense. But while well-intentioned, it also meant that these original historical records were often completely removed from public access while the companies worked on them, making the records functionally unavailable to researchers, sometimes for years.”

It also claimed that “even once the digitization and transcription work was finally completed, the exclusivity period for each newly-created digital recordset was also supposed to be time-limited. After the stated embargo period would end for each unique record set, usually within five years but sometimes in three years, NARA would then be able to freely disseminate the now-digitized versions of these public documents, both the images and the text metadata that accompanied them.”

It referred to NARA’s own policies, which state that the agency could and would publish the digital copies through its own website, its official online Catalog, its official API, or other means. Item number two of the “NARA Principles for Partnerships to Digitize Archival Materials”, located at www.archives.gov/digitization/principles.html, states: “After an agreed-upon period of time, otherwise known as an embargo period, NARA gains unrestricted rights to the digital copies and the associated metadata transmitted to NARA by the partner, including the right to give or sell digital copies in whole or part to other entities if NARA so chooses. If resources permit, we will try to make the digital materials available in our online catalogue within the same year they are no longer in the embargo period.”

Reclaim The Records alleged, however, that “in practice, this simply hasn’t happened. NARA has never actually posted online the vast majority of these records that were digitized through their partnership program, not to their Catalogue nor indeed anywhere else where the public might be able to freely access and download the now-digital records. This remains the case today, even when the embargo periods for many of these record sets have been expired for more than a decade, sometimes two decades.”

It said: “literally billions of these historical American records remain solely in the hands of NARA’s primary digitization program partner, Ancestry.com. Ancestry is a private corporation, previously co-owned by a private equity firm and the government of Singapore’s sovereign wealth fund, until they were sold to a different private equity firm for $4.7 billion in August 2020.”

According to the organization, “the vast majority of the billions of records digitized through NARA’s partnership program are now available only behind Ancestry’s subscription paywall, or through companies now owned by Ancestry with their own additional subscription paywalls. Annual subscriptions to these websites can cost hundreds of dollars per year per person.”

Reclaim The Records alleged that “the end result of NARA’s digitization partnership program has been that billions of important American historical documents were successfully digitized and transcribed — but then were mostly not made available to the public for decades in any way other than by requiring the public to buy expensive annual data subscriptions benefiting private corporations, primarily a single multi-billion-dollar conglomerate, whose previous owners included a foreign government.”

Reclaim The Records is therefore requesting copies of “every single record created under NARA’s public-private digitization partnership” with the entities Ancestry.com, Fold3.com (formerly known as Footnote, now owned by Ancestry), Archives.com (now owned by Ancestry), and FamilySearch (a non-profit organization).

These include all of the digital images, in their original, full-size, uncompressed, and non-watermarked versions; all of the associated text metadata (names, dates, places, etc.) also created under the partnership agreement, which go along with the images, making them searchable; all copies of finding aids, training materials, handbooks, checklists, formatting guidelines, data dictionaries, data templates, data lists, or other internal documentation that explain more about the digitization of the images and the transcription and compilation of their associated text metadata, and how they relate to each individual data set; and any records that were digitized under NARA’s partnership program that may not have been properly delivered or returned to NARA after their digitization was completed.

NARA has 20 business days to respond, as required by the US FOI Act.