At one of the sessions during PresQT, we discussed the value of auto-selecting metadata terms. Someone brought up the problem of getting researchers, lawyers, basically anyone, to fill out DMS profiles, choose tags and complete the other tasks associated with attaching metadata. Those groups want to use the metadata after the fact, when they are searching, but they don’t want to take the time to add good metadata up front. This means that auto-selecting metadata is at least worth talking about.
First, here is another point where academia intersects with the corporate/law firm world.
We finally got to the point of AI. We were discussing having machines read documents, assign metadata and leave the ‘fun’ work to the humans. This is a good solution, when and if it works, but what about language? Someone brought up patents, which are written to talk around the subject. If someone wants to patent a bottle, the word ‘bottle’ is never used. The question was: how would a machine know the precise meaning of certain language constructions?
This came back to mind when I read an article by Bob Ambrogi about Judicata, a start-up legal research system that purports to be better than WestlawNext and Lexis Advance. In the article, Ambrogi writes “It does this, he explains, by mapping the legal genome — that is, mapping the law with extreme accuracy and granularity.”
Hhmm.
I am very interested in this process and how it can supplant the mundane, time-sucking work of lawyers and law firm professionals by automating tasks through AI. This process, theoretically, leaves the value-added work to the humans. Ravel has done this to a certain degree. Now Judicata is claiming to be good enough at the process to rival Wexis.
The process was compared to the technology used to guide driverless cars and was more fully explained when the article continued with ‘…driverless cars require highly detailed, three-dimensional computerized maps that can pinpoint a car’s location and understand its surroundings.
Judicata has been trying to build that kind of a map for law. “This is different than what you might be thinking about given all the hype around AI,” he explains. “AI and machine learning only do as well as the data that goes into them. We’ve focused on creating better data.”’
Better data. Yes. That is what we need all around. We can’t always get it by having people fill out profiles. We need a combination of humans and machines. Humans get bored and frustrated and just want to accomplish their immediate task so they can go have a beer with friends. Humans don’t always think far enough ahead to the point when they will need to retrieve their data in the future.
I am looking at this from an access point of view. In my world it is possible to make all information accessible. We just need the right tools and the will to do it. If Judicata can do this for law, perhaps it can be done using similar technology for other disciplines as well. Granted, the law has special needs and requirements for retrieving information, but researchers need what they need in their own disciplines as well.
Perhaps presenting people (researchers, lawyers, academics) with a machine-produced profile and allowing them to modify it or overlay it with their own terms would be a start to improving accessibility? Ambrogi writes “The curse is that the mapping can be only partially automated and requires a significant level of human effort.” This is absolutely true, and that human effort requires money. I would love to see researchers be able to provide some effort in that area based on the papers they publish. This is my idealized view of the world where everyone shares and wants universal access to all information. In my dreams, right? Perhaps PresQT can start us down that road and Judicata can take it further.
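To make the "machine-produced profile with a human overlay" idea a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the function names `suggest_terms` and `overlay` are hypothetical, the "machine" is nothing more than naive keyword counting against a controlled vocabulary, and it is in no way how Judicata or any real system works.

```python
import re
from collections import Counter


def suggest_terms(text, vocabulary, top_n=3):
    """Hypothetical machine step: return up to top_n vocabulary terms
    found in the text, most frequent first (naive keyword counting)."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w in vocabulary)
    return [term for term, _ in counts.most_common(top_n)]


def overlay(machine_terms, added=(), removed=()):
    """Hypothetical human step: drop terms the person rejects, then
    append the person's own terms that are not already present."""
    kept = [t for t in machine_terms if t not in removed]
    extras = [t for t in added if t not in kept]
    return kept + extras


# Example: a patent-style text that never says "bottle".
text = ("The vessel holds liquid. The patent covers the vessel "
        "and the vessel closure, says the patent.")
machine_profile = suggest_terms(text, {"patent", "vessel", "liquid"})
final_profile = overlay(machine_profile, added=["bottle"], removed=["liquid"])
```

The machine can only suggest words that actually appear in the document, which is exactly the patent problem raised at PresQT. The human overlay is where a term like "bottle" gets added.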