MerlinOne’s NOMAD™: The First of the “Fifth Generation” Visual AI Systems

Ever since I was a kid I have been fascinated by military aircraft. Their design choices, their capabilities, and the way some of them changed history all interest me (an interest revived recently by the release of the new “Top Gun: Maverick” movie). A few years back I happened upon a YouTube recording of a panel that tried to explain what “fifth-generation fighters” (like the F-22 and F-35) bring to the table.

The panel was composed of elite US military pilots who grew up in the fourth-generation world (F-15, F-16, F-18) and who then transitioned to fifth-generation aircraft. They explained that in the fourth-generation world you judged a winning fighter by how fast it could fly, how well it could turn in a fight, and how many weapons it could carry. One speaker went on to say that the fifth-generation F-22 is incredibly fast, can turn in ways no one had ever seen before, and can carry a large weapons load. But those are the things lowest on the list of what makes it a great aircraft: the old criteria are obsolete!

F-22 (U.S. Air National Guard photo by Tech. Sgt. Steven Tucker)

What fifth-generation fighters bring to the table are two leaps forward: they are stealthy and can survive going where no other fighter can go (and thus find information that was previously undiscoverable), and they are incredible data platforms. They not only have amazing sensors that work together in “sensor fusion” but can also network in real time with other friendly aircraft, so everyone on that side can rapidly collaborate and make better decisions because they get great information in an easy-to-understand way.

All the pilots said having this wealth of easy-to-use information at their fingertips is a huge game changer. In fighter pilot speak, they are better defensively, and have much greater lethality. Indeed, most of the pilots on the panel said the aircraft should not be referred to as “fighters” anymore: they are in an entirely new category.

How does this play out in the real world? It so happens that a cousin of mine was the Boeing project manager for the F-22, and when the first prototypes were ready they took them to Nellis Air Force Base and ran exercises against the best pilots in a fourth-generation aircraft, the F-16. They would send up four F-16s (or more) against each F-22. My cousin was having lunch that week in the cafeteria and one of the F-16 pilots sat across from him, so he asked the pilot how his week was going. The pilot replied that it had been really easy: he gets up in the morning, straps into the fighter, goes into the test area and flies around looking for an F-22. After a few minutes a light on his panel comes on that tells him he is dead. He lands and has lunch, then repeats the same thing in the afternoon. He never once saw an F-22. For fourth-generation fighters the lesson was “You fly, you die”. Stay on the ground if a fifth-generation fighter is in the area. It is that huge a leap forward.

What can this possibly have to do with systems that house still and video images (not just DAM systems but all sorts of applications)? Turns out, a lot.

The Fifth-Generation Fighters of AI Systems

Conventional systems that handle visual objects like still photos and video clips are typically called “DAMs” (for Digital Asset Management), and “fourth-generation” DAMs are generally judged on search speed, the number of concurrent users they can support, and interfaces to other business systems such as content management systems (CMS) and Adobe applications.

Visual AI-enabled systems are the “fifth-generation fighters” when it comes to handling visual objects. They can “sense” images and videos in your collection that you cannot possibly find any other way (like when you have zero metadata: they go where no other search system can go). They too use “sensor fusion” to enable multi-modal searching, matching your text queries with a deep understanding of visual content. AI systems can even find previously undiscoverable objects you need but did not know you had! AI systems make your objects findable in an easy-to-understand way, no longer dependent on you guessing the exact right word from arbitrary (or scarce) metadata. They help you work collaboratively, letting you instantly either find what you are looking for, or jointly examine choices and refine the concept of what you need.
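For readers curious how a text query can be matched against images that carry no metadata, here is a minimal sketch of one common approach: represent both the query and each image as a vector (an “embedding”) and rank images by how closely their vectors point in the same direction. This is an illustration only, not a description of how NOMAD™ works internally. The file names, the three-number vectors, and the `search` helper are all made up for the example; a real system would get its vectors from a trained model and they would have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def search(query_vector, image_index, top_k=2):
    """Rank images by similarity to the query and return the best file names."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in image_index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical, hand-written embeddings standing in for model output.
# Note that no captions or keywords are stored anywhere.
IMAGE_INDEX = {
    "beach_sunset.jpg": [0.9, 0.1, 0.0],
    "city_skyline.jpg": [0.1, 0.9, 0.1],
    "fighter_jet.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a text query such as "military aircraft in flight".
QUERY = [0.1, 0.1, 0.95]

results = search(QUERY, IMAGE_INDEX)
print(results)
```

With these toy numbers the jet photo ranks first because its vector points in nearly the same direction as the query, even though nobody ever typed a keyword for it.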

Visual AI enablement is a game changer. Just describe what you want, with no hoops to jump through. Tasks that previously took hours are now handled in seconds. These capabilities are a huge leap forward.

And there is much more. Interestingly, all the fifth-generation pilots drew parallels between their aircraft and the iPhone as it was when Steve Jobs first launched it. At that event Jobs told us the iPhone would do three things for us:

  • It was a phone
  • It could reach the Internet
  • It could play music

And while those things were remarkable, as new ideas came along the iPhone proved to be an incredible platform for all sorts of new needs. Fitness guru? Carpenter’s level? Banking system? Navigation system? Calculator? Facebook? Instagram? You bet, all those and more. Similarly, fifth-generation aircraft can be extended by software to do all sorts of things their designers never imagined.

In the same way we are just scratching the surface of Visual AI capabilities: they can be massively extended to satisfy new needs.

Long story short: designed correctly, AI-enabled visual systems that deal with still and video objects are a game changer, massively increasing system adoption and user satisfaction while reducing the time sinks and cost of getting the job done. Welcome to “fifth-generation” DAM!


David Tenenbaum
MerlinOne, Inc.