NDF2019 has ended
Wednesday, November 20 • 1:35pm - 1:45pm
POSTER TALK. Machine learning for automated transcription of handwritten documents

Posters will be available for viewing throughout the conference. Come and listen to the speakers present their paper from 1:35pm to 1:45pm, and make sure you check out the poster if you can't make the talk.

A focus of our long-term strategy is “taking archives to the people”, which means a shift from users having to find us and understand our systems to pushing information out through channels that are relevant to different communities. Our archival holdings include many documents and registers written in 19th-century cursive handwriting. To find a piece of information that may be held within a document or register (for example, a name or an event), people need to visit one of our offices, where the item is retrieved from the shelves for them. That’s the easy part. The hard part is then reading through the document or register, sometimes hundreds of pages, to find the information. This can be a time-consuming, painful process. Our indexes can point a researcher towards a likely document or register, but there is no way to search within the document other than reading it all. Items are often fragile, and handling risks wear or damage. For an increasing number of people, reading 19th-century handwriting is also difficult.

Imagine if there were a way to search this content online from anywhere, anytime. Transcribing this type of content manually is slow and resource-intensive, and making digital images available does not make the content searchable.

We wanted to explore whether machine learning could automatically transcribe these handwritten archives, so that their content can be easily searched and users have access to a transcription that takes less time to produce and is easier for them to use.

We tested Transkribus, a cloud-based machine learning platform designed for archives and humanities researchers that supports the automated recognition, transcription and searching of historical handwritten documents.

We found that machine learning can help make our records easier to find and use, particularly for 21st-century users, without relying on resource-intensive manual transcription.

Speakers
Vivienne Cuff

Archives New Zealand
Joanne Koreman

Archives New Zealand


Wednesday November 20, 2019 1:35pm - 1:45pm NZDT
Oceania - Food Hall, Meeting Point and Admin Desk, Museum of New Zealand Te Papa Tongarewa, Te Aro, Wellington 6011, New Zealand