20 Metadata Experts Reveal the Most Common Mistakes Organizations Make When It Comes to Metadata Management
If you’re using digital asset management software (or thinking about it), understanding metadata and how to use it to your advantage is paramount. There are several types of metadata, the most commonly used being structural metadata, administrative metadata, and descriptive metadata, and each plays an important role in making your digital assets discoverable via the robust search your DAM offers.
While metadata is crucial, not all companies have a firm handle on how to best manage metadata. To help you overcome common obstacles when it comes to metadata management, we reached out to a panel of DAM experts and data pros and asked them to answer this question:
“What are the most common mistakes organizations make when it comes to metadata management (and how can they avoid them)?”
Read on to find out what our panel had to say about the most common metadata management faux pas and how you can avoid making these mistakes.
Metadata gives description and context to data, which helps to organize, find, and understand that data.
Here are the top mistakes that organizations frequently make when it comes to metadata management.
Not analyzing metadata requirements: Defining the objectives and goals to be accomplished is the vital first step toward a clear picture of the future scope and applications. Measurable, well-defined objectives are imperative and will guide the course of the metadata management process.
Selecting metadata tools without conducting a thorough analysis of the options available: This requires an in-depth analysis in order to understand the different offerings and their pros and cons.
Not having a repository team responsible for metadata control: A repository team performs two very essential tasks: it collects and collates the data, and it provides access to the existing data. The absence of a dedicated repository team results in data mismanagement.
Not going for automation: Maintaining metadata becomes a struggle over time. As the organization grows, so does the data inflow and outflow, making manual maintenance very difficult.
Difficulty in accessing the metadata repository: If data access and retrieval requires too much effort, it defeats the purpose of metadata management.
Not fully capturing all potential metadata to begin with, so its full potential isn’t realized. For example, a typical business setting uses digital alongside paper document processes. In most cases, paper documents hold meaningful data too. It might be an account number, demographic information, item number, or other data point. Businesses should fully collect information from physical documents. Mature technologies, such as optical character recognition (OCR), exist specifically for this purpose. Using OCR to help manage metadata on physical documents is essential for anyone using a document management solution.
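Once OCR has turned a scanned document into text, pulling the meaningful data points into structured metadata can be a small scripting task. The sketch below assumes the OCR output is already available as plain text and that the field labels shown are hypothetical; real documents would need patterns tuned to their layouts.

```python
import re

# Hypothetical sample of OCR output from a scanned invoice.
ocr_text = """
Account No: 4417-8821
Item #: SKU-10042
Name: J. Doe
"""

# Patterns for the fields worth promoting into searchable metadata.
patterns = {
    "account_number": r"Account No:\s*([\d-]+)",
    "item_number": r"Item #:\s*(\S+)",
}

# Extract each field from the raw OCR text into a metadata record.
metadata = {
    field: match.group(1)
    for field, pattern in patterns.items()
    if (match := re.search(pattern, ocr_text))
}
print(metadata)
```

In practice the regular expressions would be maintained per document type, but the principle holds: the data was always on the page, and capturing it is what makes the asset discoverable.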
“Selecting a metadata tool without conducting an evaluation…”
All of the major metadata vendor tools maintain and control the repository in a different manner. Finding the tool that best suits your company requires careful analysis. An educated consumer will be the most satisfied one because they understand exactly what they’re buying and what they’re not buying.
“One of the most common mistakes we see made when it comes to metadata management is…”
The absence of a framework for defining and managing the actual metadata values that are applied to assets. This creates a situation where applying metadata becomes something subjective and inconsistent between each person involved in the process. While it often seems like a wide-open approach to applying metadata dimensions brings with it unlimited opportunities, the reality is often a mess of unusable data that doesn’t scale effectively. Without consistency in how metadata is applied, the metadata itself becomes irrelevant since apples-to-apples comparison between assets is not possible.
“The most common mistake that organizations make in regard to metadata management is…”
An inconsistent data model. Without the correct documentation, the same attribute can have different names in different places. For instance, if you are describing objects and you want to refer to some dimension, you can call that either the length or the size. That is a seemingly logical decision since the words are synonyms. But when you have to process the data later, or someone wants to query it, then that is where you discover the problem. The dimension needs to be named either length or size: both need to be named the same thing to be manageable. Hours are wasted in the attempt to retroactively force conformity into the data model. A great way to avoid this kind of mistake is thoughtful design of an extensible data model at the outset.
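One lightweight way to enforce that agreement is an alias table that rewrites every name in the wild to one canonical attribute. The mapping below is a hypothetical illustration of the length-versus-size example above.

```python
# Hypothetical alias table: every attribute name that appears in source
# systems maps to the one canonical name agreed on at design time.
CANONICAL_FIELDS = {
    "length": "length",
    "size": "length",  # synonym that crept into another system
    "len": "length",
}

def normalize_record(record: dict) -> dict:
    """Rewrite keys to their canonical names so queries hit one attribute."""
    return {CANONICAL_FIELDS.get(key, key): value for key, value in record.items()}

print(normalize_record({"size": 42, "color": "red"}))
```

Applying this at ingestion is far cheaper than the retroactive conformity effort described above.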
Companies do not track, trend, or save all of their metadata. Our company is focused on the procurement space, and countless customers track only the name and dollar amount of purchases. Lost are the suppliers, make, model, color, etc. This can be detrimental and introduce risk to the business.
“Not including copyright, website link, and author metadata on images, videos, and graphics…”
If your original artwork is picked up or syndicated, you want to have your metadata embedded to maintain the rights and accrue the benefits of artwork syndication on the web. Also, for search engine results, new standards allow you to write 320 characters’ worth of metadata that shows up in search engine results listings.
Having too many manual processes in the architecture of their metadata integration. This process should be as automated as possible, as it makes loading and maintaining the metadata repository a much swifter operation. Having to input the metadata manually is an extremely arduous and time-consuming job. It’s not uncommon for metadata to need manual attention for capturing information, but investing some time in analyzing your process to implement more automation will save a lot of time. Keying in business metadata manually can make it harder and harder to maintain the repository as time goes on.
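A practical first step toward automation is harvesting the technical metadata the file system already knows, rather than keying it in. This is a stdlib-only sketch; a real pipeline would feed these records into the repository loader.

```python
import datetime
from pathlib import Path

def harvest_file_metadata(root: str) -> list[dict]:
    """Collect basic technical metadata automatically instead of manually."""
    records = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            stat = path.stat()
            records.append({
                "name": path.name,
                "extension": path.suffix.lower(),
                "size_bytes": stat.st_size,
                "modified": datetime.datetime.fromtimestamp(
                    stat.st_mtime
                ).isoformat(),
            })
    return records
```

Business metadata will still need human judgment, but automating everything the machine can already see reduces the manual burden the paragraph above describes.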
Luke Wester, Digital Marketing Analyst at Miva, Inc.
“The most common mistake organizations make with metadata management is naming conventions…”
Without strong naming conventions in place, your data becomes exposed to a host of reporting inaccuracies. The best way to prevent this issue is to first solidify how your organization will define its lexicon. Second, you need to get your organization to commit to the naming conventions. This will help to properly categorize your metadata and give you more accurate reporting. Basically, make sure your organization uses the same naming conventions for metadata in order to reduce duplicates and missing data.
Adam Bowers, JD, eDiscovery Professional at DoeLEGAL
“A major issue with metadata management in any organization is that…”
Some forget to define their projects. That is to say, “What is the purpose of this project, what metadata do we need to preserve, and how do we measure a successful outcome?” Metadata is all around us, and humans create millions of pieces of metadata per day and even per hour. To understand how to manage metadata, it’s important to step back and see the forest, not just the trees. Is the purpose of storing, collecting, and producing metadata well defined? If so, does the team agree on simple things like definitions?
In the context of litigation, metadata is used to fuel eDiscovery software. When legal and litigation support professionals talk about metadata management, we are concerned with the preservation, collection, and transmittal of defined metadata fields. Litigation support professionals use metadata fields, such as hash values, to certify that a collection has been done properly, but they also use metadata to ensure things like a proper chain of custody has been maintained.
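Hash verification of a collection can be sketched in a few lines of stdlib Python: compute a digest per file and compare it against a manifest recorded at collection time. The manifest format here is a hypothetical simplification of what eDiscovery tooling actually exchanges.

```python
import hashlib

def file_hash(path: str, algorithm: str = "sha256") -> str:
    """Hash a file in chunks so large collections don't exhaust memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_collection(manifest: dict[str, str]) -> list[str]:
    """Return the paths whose current hash no longer matches the manifest."""
    return [
        path for path, expected in manifest.items()
        if file_hash(path) != expected
    ]
```

An empty result from `verify_collection` is the evidence that nothing in the collection was altered between capture and review.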
Some of the most common mistakes we see in metadata management typically rank as follows:
Failure to identify/protect data in a timely fashion (therefore allowing more time for metadata to be lost or destroyed)
Failure to properly collect data
Failure to document processes and procedures
As an industry, eDiscovery deals with fairly consistent metadata types, and there is not much ambiguity as to the importance of the metadata, but that does not mean that organizations do not have their problems. Metadata can be altered, destroyed, or even completely ignored if the requesting party does not take the time to ensure that the project is well defined. For any metadata management project, one must define what a successful project would look like and document the processes that are to be used. When proper protocols, controls, and procedures are not followed precisely – metadata management fails.
Another way metadata management can fail is when practitioners do not document the actual work being done and what did not function well. This all goes back to good communication skills, which is something the IT industry has historically not been well known for. Metadata management is truly a study of human nature, a product of experience. To avoid the common and costly mistakes often found in metadata management, the advice is sound: define, document, and decode the results of every project to ensure your process produces successful and defensible outcomes.
Not maintaining your own internal database and backing it up as often as possible. Rather than just relying on the KPIs your provider offers you, having your own database allows you to come up with your own actionable insights, create dashboards, and even use machine learning for large quantities of data.
“Metadata management is often viewed as a panacea for organizations…”
They fall into the “if only” trap. If only I had a handle on my metadata, I could…what? Organizations need to define the “what” first.
Step back and take a dispassionate view of what organizational objectives or improvements could be accomplished if metadata extraction, correlation, and visibility could feed into the organization’s broader objectives. Too often, organizations jump on a marketing message only to find that they failed to maximize the promise of the new technology.
One can avoid this trap by realizing that metadata isn’t new – it has been hiding in your files, folders, and file systems for decades. Metadata also lacks discipline: ask any of today’s data scientists how much time they spend on data normalization versus making data connections, and you will find that it’s not so easy to determine whether DR stands for Doctor (a profession), Drive (a street), or Data Reduction (an acronym).
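The DR ambiguity only resolves with context. A toy illustration of that kind of normalization pass is below; the two rules are purely illustrative assumptions, and real disambiguation would need far richer context than a regex can see.

```python
import re

def expand_dr(text: str) -> str:
    """Toy context-based disambiguation of the abbreviation 'DR'."""
    # 'DR' before a capitalized surname is probably the title 'Doctor'.
    text = re.sub(r"\bDR\.? (?=[A-Z][a-z])", "Doctor ", text)
    # 'DR' following a lowercase street name is probably 'Drive'.
    text = re.sub(r"(?<=[a-z] )DR\b", "Drive", text)
    return text

print(expand_dr("DR Smith lives at 12 Oakwood DR"))
```

Even this toy version shows why normalization consumes so much of a data scientist’s time: every expansion rule is a judgment call about context.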
I had to learn this the hard way: the metadata problem isn’t just tools and training; it’s also about how your organization interacts with the users of its data, including the workflow strategy and philosophy governing that interaction. In short, it’s about whether the people in your organization handle this information ethically and correctly.
You have some people who try to document every data table possible to the point of needlessly wearing themselves out, and on the other end, you have people that try to cram all their information into one data record which will end up driving you and anyone else who has to deal with the metadata crazy. You should aggregate like data and split data groups into different sources.
This seems to be an issue when it comes to metadata management for a lot of businesses. I’ve had to look at my own metadata and figure out what’s going on. When using the metadata parser, we tend to trust it to do the job perfectly, but at the end of the day, human review is the only true way to really understand what is good and bad metadata – the parser is just an aid. Make sure you’re looking at indentation problems and unrecognized, misplaced, missing, or empty elements.
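For XML-based metadata, a small script can surface the empty or unfilled elements that a parser happily accepts but a human should review. This sketch uses the standard library parser; the sample record is hypothetical.

```python
import xml.etree.ElementTree as ET

def find_empty_elements(xml_text: str) -> list[str]:
    """Flag elements with no text and no children — often a metadata
    field that was created but never filled in."""
    root = ET.fromstring(xml_text)
    empty = []
    for element in root.iter():
        if len(element) == 0 and not (element.text or "").strip():
            empty.append(element.tag)
    return empty

sample = "<asset><title>Q3 report</title><author></author></asset>"
print(find_empty_elements(sample))
```

The parser confirms the document is well-formed; a check like this, followed by human review, is what catches metadata that is technically valid but practically useless.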
Having the right tools to correctly describe your files is very important for effective metadata management. If you don’t do this properly, other systems won’t be able to import and read your asset’s information correctly. To combat this, be sure you’re using a DAM solution that can successfully write metadata for various formats, that writes the data correctly so it can be read the same way it’s written, and that supports a wide variety of metadata standards.
“There are three common mistakes I see organizations make when it comes to metadata management…”
First, the definition of metadata is often poorly understood, which leads to metadata and transactional data being mixed up.
Second, there is often a lack of executive support for metadata management, which leads to issues with timeliness, completeness, and quality of the data as all organization departments need to be on-board with maintaining and updating metadata.
Third, there is often either non-existent or insufficient strategy and governance around metadata management, which leads to a poor understanding of the business benefits and impact of metadata, and the appropriate measures to put in place to ensure quality, completeness, and timeliness of the data.
The best way to avoid these common mistakes is to identify your objectives and use cases for metadata and incorporate them into an information management and analytics strategy. It is very important to obtain executive buy-in, and it can be helpful to hire an expert, such as a consulting firm, who can guide effective metadata management for your organization.
“The most common mistake that organizations make when it comes to metadata management is…”
Not using a common schema. A common schema refers to labeling an object uniformly across multiple databases.
An example of not using a common schema would be labeling a house as home in one database, house in another, domicile in another, and shelter in yet another. The problem is that when the databases are queried for, say, house, only a fraction of the relevant results will be returned.
Another common example is labeling a location slightly differently in different databases or programs, such as USA, U.S., US, and United States.
Organizations can prevent this problem by agreeing on common labels for searchable terms in all of their databases. Additionally, ETL tools, like Logstash, can be used to make data uniform. For instance, an ETL tool can convert USA, U.S., US, and United States all to a common US in a central database.
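The country-label conversion just described can be sketched in pure Python as a stand-in for what an ETL tool does: map every variant of a label onto one canonical value before loading it into the central database.

```python
# Alias table for the country example above: every known variant maps to
# the one canonical value used in the central database.
COUNTRY_ALIASES = {
    "usa": "US",
    "u.s.": "US",
    "us": "US",
    "united states": "US",
}

def normalize_country(value: str) -> str:
    """Return the canonical label, passing unknown values through unchanged."""
    return COUNTRY_ALIASES.get(value.strip().lower(), value)

print([normalize_country(v) for v in ["USA", "U.S.", "US", "United States"]])
```

Passing unknown values through unchanged, rather than dropping them, keeps the normalization step safe to run repeatedly while the alias table grows.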
“One of the most common mistakes I run across is the idea that metadata doesn’t carry any weight when it comes to SEO…”
SEO encompasses so much, including UX. While a meta description might not have a place in RankBrain, it has an enormous place in CTR from SERPs to your website. How much do you want someone to click through to your site? Your metadata better reflect that.
Company usernames embedded in authorship. This can be files sent outside the organization or on the website. WordPress, for example, defaults new blog posts to the poster’s username as the author. Another common issue is if an employee’s username is default across internal and external systems. Something as simple as creating a Word document can embed a username on the document. All of this can make a hacker’s/scammer’s job easier.
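In a .docx file, the author lives in the `docProps/core.xml` entry of the zip archive, inside a `<dc:creator>` element. The sketch below scrubs that element from such an XML fragment; it is a simplified illustration, and a real tool would rewrite the entry inside the archive with the `zipfile` module rather than operate on a string.

```python
import re

def scrub_creator(core_xml: str, replacement: str = "Redacted") -> str:
    """Replace the contents of <dc:creator> so the username never
    leaves the organization embedded in the document."""
    return re.sub(
        r"(<dc:creator>).*?(</dc:creator>)",
        rf"\g<1>{replacement}\g<2>",
        core_xml,
        flags=re.DOTALL,
    )

sample = "<cp:coreProperties><dc:creator>jsmith01</dc:creator></cp:coreProperties>"
print(scrub_creator(sample))
```

Running a scrub like this on outbound documents is a cheap defense against the reconnaissance value of leaked usernames.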