Business processes require information to produce the desired outcomes. This information comes in various forms and is used in a variety of ways which cannot always be known in advance. However, there is a class of documents where it is necessary to control the format and its use in order to meet compliance requirements. Capabilities to manage these types of documents are needed today just as they were a few decades ago.
However, in today's world of big data and artificial intelligence (AI), managing documents is not seen as important, and some would even say it is no longer necessary. In fact, some technology enthusiasts have recently proclaimed that electronic/enterprise document management (EDM) is dead or will be in the near future. They argue that the approaches and technologies used in the past no longer work (if they ever did) and should be abandoned in favor of newer technologies.
Many reasons are typically given (several of them well justified) for why EDM has not provided the promised benefits. One key reason is that users still cannot find the documents they need, even when using EDM technology. EDM has traditionally relied on indexing documents with a classification scheme in order to locate them. Developing and managing classification schemes is considered too costly, error prone, and unnecessary because you can just search the content within the documents directly.
In Part 1 of this blog post, we will look at this assertion, the state of EDM, and the reasons why it has not delivered on its objectives. In Part 2, we will consider how to address these shortcomings and outline how EDM can be successfully implemented using existing technologies.
HISTORICAL LESSONS
To start, it is helpful to remind ourselves that similar statements about EDM have been made before. This tends to happen whenever newer technologies enter the marketplace. This was the case when full-text search was first introduced in the 90s. Leading vendors at that time advocated doing away with classification schemes. Their message was: forget about trying to manage data, because we can find the data for you using our search technology. Today we hear the same argument from those who promote big data and artificial intelligence.
Using full-text search to discover information is useful and needed when looking for information stored in vast amounts of content. However, a critical problem with these technologies is that they assume the data is self-describing, meaning that data about the data (which we call metadata) is contained within the object we are looking for. If this is not the case, it becomes almost impossible to locate relevant information.
It is worth noting that after many years of using the web and searching on content alone, we are now investing a significant amount of time and money in Search Engine Optimization (SEO) to improve search results. There are still problems with false positives and search accuracy. We are now inserting classification (i.e. metadata) back into documents in the form of keywords and tags so that the content we are looking for can be found.
Google will get you close but not close enough, which is a serious risk to compliance-based processes. You can just imagine the consequences of retrieving and using the wrong procedure because the search engine returned a list of close but not exact matches to your query.
Managing document classifications is necessary when the purpose is to deliver exactly the correct document to the correct person at the correct time. This is still something that search engines alone cannot provide and one of the key reasons why you still need EDM.
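To make the difference concrete, here is a minimal Python sketch. It is only an illustration: the Doc record, its fields, the sample documents, and both function names are assumptions made up for this example and do not come from any EDM product. It compares a naive content search with a lookup against controlled classification metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical document record used only for this illustration."""
    doc_id: str
    doc_type: str   # controlled classification, e.g. "Procedure"
    process: str    # business process the document supports
    text: str       # document content


# Made-up sample documents that all mention the same phrase.
DOCS = [
    Doc("PRC-001", "Procedure", "lockout",
        "Lockout procedure for the packaging line."),
    Doc("TRN-007", "Training", "lockout",
        "Slides that explain the lockout procedure."),
    Doc("MEM-113", "Memo", "lockout",
        "Reminder: review the lockout procedure this week."),
]


def full_text_search(docs, query):
    """Return every document whose text contains all of the query words."""
    words = query.lower().split()
    return [d for d in docs if all(w in d.text.lower() for w in words)]


def classified_lookup(docs, doc_type, process):
    """Return only documents whose controlled metadata matches exactly."""
    return [d for d in docs if d.doc_type == doc_type and d.process == process]


hits = full_text_search(DOCS, "lockout procedure")
exact = classified_lookup(DOCS, "Procedure", "lockout")

print([d.doc_id for d in hits])
print([d.doc_id for d in exact])
```

In this toy data set the content search returns all three documents that mention the lockout procedure, while the classified lookup returns only PRC-001. The user of a compliance process needs the second kind of answer.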
WHAT IS EDM?
EDM is simply a system to manage documents and is considered part of the overall domain of Enterprise Content Management (ECM).
EDM manages the class of documents that need to be controlled because they are inputs to critical business processes, in the same way that raw materials are controlled in manufacturing processes. We find that these documents are still mostly unstructured, which requires the data describing them to be controlled and managed outside the document itself.
EDM also provides other capabilities to manage important aspects that are critical to compliance which have largely been forgotten. It is common when talking about documents to take a reductionist view and lump them in the general bucket of data. This perspective unfortunately removes important distinctions that characterize the nature of documents which can be seen when considering the following definition for a document:
This definition suggests several characteristics that a document must have in order to be considered as evidence or as a record. These include:
Unalterable
Bi-temporal
Structured
Intentional
And so on
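As a rough illustration of the first two characteristics, the Python sketch below shows one possible reading of them. It is an assumption on our part, not a definition: "unalterable" is modeled as an immutable record with a content digest, and "bi-temporal" as two separate timestamps, one for when the content takes effect and one for when it was recorded. The names RecordEntry, make_entry, and is_unaltered are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class RecordEntry:
    """Hypothetical record entry; field names are assumptions."""
    content: bytes
    valid_from: datetime    # when the content takes effect for the business
    recorded_at: datetime   # when the system stored this entry
    digest: str             # SHA-256 of the content at recording time


def make_entry(content, valid_from, recorded_at):
    """Create an entry and fix its digest at the moment of recording."""
    digest = hashlib.sha256(content).hexdigest()
    return RecordEntry(content, valid_from, recorded_at, digest)


def is_unaltered(entry):
    """Check that the stored content still matches the recorded digest."""
    return hashlib.sha256(entry.content).hexdigest() == entry.digest


entry = make_entry(
    b"Torque the bolt to the value in the work instruction.",
    valid_from=datetime(2024, 3, 1),
    recorded_at=datetime(2024, 2, 20),
)

# Simulate tampering: same digest, different content.
tampered = RecordEntry(b"Torque the bolt by feel.", entry.valid_from,
                       entry.recorded_at, entry.digest)

print(is_unaltered(entry), is_unaltered(tampered))
```

Keeping the two timestamps apart lets you answer both "what was in effect on a given day" and "what did the system know on a given day", which are different questions during an audit.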
It is very common these days to hear companies use the concept of a "living" document to describe their documents. These documents are constantly changing and edited in place, and only the latest version should be used.
This description defines one particular use case for how documents are edited and retrieved. However, the notion of living documents is seldom if ever used in compliance processes, where what is critical is that the user uses the "latest approved" version or, more correctly, the one that he/she was trained on. The use cases for which version should be used are more nuanced, for example:
The latest official release
The latest approved version
The latest approved version in the training system
The latest work in progress version
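The use cases above can be sketched as a small selection function in Python. This is a minimal sketch under stated assumptions: the Version record, the state names ("draft", "approved", "released"), the in_training flag, and the intent keys are all hypothetical, and we assume that a released version also counts as approved.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Version:
    """Hypothetical version record; states and fields are assumptions."""
    number: int
    state: str          # "draft", "approved", or "released"
    in_training: bool   # True if the training system references this version


# Made-up version history of a single document.
HISTORY = [
    Version(1, "released", True),
    Version(2, "released", True),
    Version(3, "approved", False),
    Version(4, "draft", False),
]

# One predicate per use case. Assumption: released implies approved.
RULES = {
    "official_release": lambda v: v.state == "released",
    "approved": lambda v: v.state in ("approved", "released"),
    "trained": lambda v: v.state in ("approved", "released") and v.in_training,
    "work_in_progress": lambda v: v.state == "draft",
}


def latest(versions: List[Version],
           predicate: Callable[[Version], bool]) -> Optional[Version]:
    """Return the highest-numbered version that satisfies the predicate."""
    matches = [v for v in versions if predicate(v)]
    return max(matches, key=lambda v: v.number) if matches else None


def select_version(versions: List[Version], intent: str) -> Optional[Version]:
    """Pick the version that fits the stated intent, or None if none fits."""
    return latest(versions, RULES[intent])


for intent in RULES:
    print(intent, select_version(HISTORY, intent).number)
```

With this sample history, the four intents resolve to versions 2, 3, 2, and 4 respectively. "The latest version" is a different document depending on who is asking and why.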
To effectively manage documents it is necessary to first understand what a document will be used for. It is in managing these intentions where EDM shines. This is very different from how content on the web is used. Content on the web typically is for a single use case and seldom has support for different uses of a document.
EDM systems will have many more capabilities to support what is needed to preserve the integrity of documents across various uses to satisfy business and compliance requirements. These will include:
Life Cycle Management (or workflow)
Metadata Management
Versioning
Electronic Signatures
Markup / Annotations
Multiple Formats
Office Integration
Relationship Management
Release Management
Digital Rights Management
Navigation / Search
The power of EDM comes from managing all the dependent relationships with related information. In this way, EDM is more like a database than a file server. These relationships describe the intention for each document and are therefore essential from a compliance perspective. For example, a document is:
A Work Instruction, or Policy, or Standard
Effective for the next 24 hours
Superseded by the current version
Controlled or Uncontrolled
The latest approved version
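The database-like nature of these relationships can be sketched in Python as follows. This is a hedged illustration only: the ControlledDocument record, its field names, the sample identifiers, and the current_replacement helper are assumptions made for this example and are not taken from any particular EDM system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass(frozen=True)
class ControlledDocument:
    """Hypothetical record holding a document's type and relationships."""
    doc_id: str
    doc_type: str                                # e.g. "Work Instruction"
    controlled: bool                             # controlled or uncontrolled
    effective_from: datetime
    effective_until: Optional[datetime] = None   # None means open ended
    superseded_by: Optional[str] = None          # doc_id of the replacement

    def is_effective(self, at: datetime) -> bool:
        """True if the document is in effect at the given moment."""
        if at < self.effective_from:
            return False
        return self.effective_until is None or at < self.effective_until


def current_replacement(registry: Dict[str, ControlledDocument],
                        doc_id: str) -> ControlledDocument:
    """Follow superseded_by links until a document with no successor."""
    seen = set()
    doc = registry[doc_id]
    while doc.superseded_by is not None:
        if doc.doc_id in seen:
            raise ValueError("cycle in supersession chain")
        seen.add(doc.doc_id)
        doc = registry[doc.superseded_by]
    return doc


start = datetime(2024, 1, 1, 8, 0)
old = ControlledDocument("WI-10-A", "Work Instruction", True,
                         effective_from=start,
                         effective_until=start + timedelta(hours=24),
                         superseded_by="WI-10-B")
new = ControlledDocument("WI-10-B", "Work Instruction", True,
                         effective_from=start + timedelta(hours=24))
registry = {d.doc_id: d for d in (old, new)}

print(old.is_effective(start + timedelta(hours=1)))
print(current_replacement(registry, "WI-10-A").doc_id)
```

None of this information lives inside the file itself. It is held alongside the document, which is why a plain file server cannot answer these questions.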
THE STATE OF EDM
EDM has always suffered from an identity crisis. EDM started out as purpose-built applications that used a relational database back end with an attached file store. This evolved to be more object oriented and over time transitioned to a platform offering in an attempt to become a "content" version of traditional database systems. API standards were developed to address proprietary interfaces and implementations. However, before these could gain traction, the web took over. This would in many ways diminish the advances that EDM had provided up until then. It was very much three steps forward and two steps back.
The introduction of the web and later content management did furnish a needed level of standardization, along with enabling the shift from client/server technologies to web-based architectures. While this was good, it sacrificed functionality specific to managing documents.
After many years of using HTML, creating web pages, and managing web content, most people consider managing content to be synonymous with managing documents. Intranet platforms have for the most part replaced document management systems, not in terms of capability but in terms of mind share. Many EDM vendors have been sidelined or have pivoted to become content management providers. Some of them are doing both.
One of these vendors is Microsoft with its SharePoint platform, which is used in many organizations. SharePoint is an intranet platform that has over time added document and record management capabilities. SharePoint is worth mentioning because it has also become the de facto repository for documents in many companies. However, instead of documents being controlled using EDM paradigms, we find that:
1. Documents are managed as files
Metadata is not used, managed, or controlled
New documents are created for every version
Life cycles are implemented as folders where files are duplicated
2. Documents are managed as web content
Minimal life cycle management
Minimal relationship or link management
Minimal release management
Minimal security
3. Document management is left to each business process owner
IT is not involved
Lack of consistent practices
Lack of expertise and best practices
4. Documents are stored in communication channels
E-mail
Messaging
5. Documents are stored in collaboration platforms
File servers
Intranets
Cloud Applications
While data awareness and capabilities have to some degree improved over the years, these improvements have been limited to what can be done using spreadsheets and what can be done using content management on intranets. Unfortunately, both of these tools are inadequate to effectively control and manage data and documents.
The hope that content management would catch up to EDM still has not materialized. Many have waited for approaches such as the semantic web and RDF to create self-describing data; however, these have not advanced far enough to fill in the gaps.
In the meantime, information technology has moved on. Enterprise IT is now preoccupied with moving to the cloud. Application developers are deconstructing workflows and redoing them for mobile. Cloud providers are racing to become the preferred repository for all your data but are mostly agnostic about how you use this information. Information technology has, for all intents and purposes, abandoned the domain of controlled documents and EDM.
WHAT CAN BE DONE?
Given the limited resources available to companies, many are struggling to manage documents needed to support their business processes. Many technologies exist to help but have been largely forgotten, misunderstood, or otherwise neglected. The good news is that the steps to improve the management of documents have largely stayed the same and include:
Identify which documents are critical for compliance.
Conduct a document inventory to locate each document and determine how it is used.
Establish a standardized approach to managing these documents.
Leverage existing technologies to manage the document life cycle.
Automate management processes to embed evidence of compliance, streamline approvals, and manage document security.