Despite years of investing in better storage and analytics, many organizations, especially in government, still struggle to make use of their data. As Daniel Castro writes in GovTech, agencies too often have an abundance of “dark data”: data that is undiscovered, underutilized or otherwise untapped. Even agencies that have fully embraced digitization, for example by converting all paper-based forms to electronic ones, face a persistent challenge. Much of their valuable data is trapped in documents such as contracts, invoices, policies and meeting minutes, and they have no effective way to extract it and put it to use.
Artificial intelligence offers organizations a new way to make better use of the data in their documents. Using natural language processing, deep learning and other methods, AI can help recognize and categorize the data in a document and then mark it up, turning unstructured text into a structured document. For example, NASA and the National Science Foundation have partnered with AI startup Docugami to explore how its technology can automatically scrape, structure and categorize documents and their elements.