New Research Sorts Through Big Data to Quickly Identify Threats

New Research from Sandia National Laboratories in the U.S.

It is now even more obvious how important it is for our security agencies to be able to collect and analyze large blocks of data, identify patterns and point investigators in the right direction before events unfold on the ground. The recent attacks in Paris point to the need for better intelligence, sharing of data, analysis of data and the ability to quickly make connections and generate leads to follow.

In this information age, national security analysts often find themselves searching for a needle in a haystack. The available data is growing much faster than analysts’ ability to observe and process it. Sometimes they can’t make key connections, and often they are overwhelmed, struggling to use data for predictions and forensics.

Analysts gather large amounts of data

Sandia National Laboratories manager Kristina Czuchlewski says Pattern Analytics to Support High-Performance Exploitation and Reasoning, known as PANTHER, has developed software that can represent remote sensor images, couple them with additional information and present them in a searchable form. 

(Photo by Randy Montoya)

Sandia National Laboratories’ Pattern Analytics to Support High-Performance Exploitation and Reasoning (PANTHER) team has made a number of breakthroughs that could help solve these problems. They’re developing solutions that will enable analysts to work smarter, faster and more effectively when looking at huge, complex amounts of data in real-time, stressful environments where the consequences might be life or death.

PANTHER’s accomplishments include rethinking how to compare motion and trajectories; developing software that can represent remote sensor images, couple them with additional information and present them in a searchable form; and conducting fundamental research on visual cognition, said Kristina Czuchlewski, PANTHER’s principal investigator and manager of Sandia’s Intelligence Surveillance and Reconnaissance Systems Engineering and Decision Support.

The PANTHER team looked at raw data and ways to pre-process and analyze it to make it searchable and more meaningful. The project’s fundamental research in cognitive science will inform the design of software and tools to help those viewing the data and make information of interest or trends easier to uncover.

PANTHER, which was funded by Sandia’s Laboratory Directed Research & Development program, is gleaning deeper insights from complex data sets in minutes instead of months, and covering hundreds of square miles instead of dozens.

“PANTHER developed the foundation for transforming how massive, complex data sets can be quickly analyzed to provide the nation’s decision-makers with new perspectives on situations and circumstances,” said Anthony Medina, director of Sandia’s Radio Frequency & Electronic Systems Center.

If an analyst is collecting information on a specific location over time and learns that something of interest might be occurring there, they probably don’t have the tools they need to quickly gather and analyze information from all relevant data sets that might corroborate the forecast. But PANTHER is probably the nation’s best bet right now to get to that point quickly.

Tracktable code automates observation of motion, trajectories

Mark Rintoul, a Sandia data scientist, developed the Tracktable code along with Sandia researcher Andy Wilson and others to automate the observation of motion and trajectories. The code could be applied to any problem that involves movement, such as airliners, ships or people.

Current approaches to getting meaningful information from trajectories focus on comparing one trajectory to another. If you have millions of trajectories to consider, that could mean trillions of comparisons, which takes a lot of time and computer power, Rintoul said.

U.S. air traffic map
This award-winning image by Sandia researcher Andy Wilson shows PANTHER’s geometric and temporal trajectory analyses of air traffic patterns from 43,000 flights over the continental United States on April 4, 2014. In this image, which is far more intricate than what we see from the ground, white lines represent level flight, orange lines indicate ascent and blue lines show descent. Around the edges are smaller views of most of the busiest airports to show the wide variety of traffic patterns. The image was runner-up in an international contest sponsored by IEEE Visualization and Graphics Technical Committee in 2014. (Image by Andy Wilson)

“We’ve developed a way to store and represent trajectories so that computers can compare them all at once in a very fast and effective manner,” he said. Instead of trillions of comparisons, the software does the same job in millions of comparisons, which is manageable.
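To see why the difference matters, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the figure of two million trajectories is a made-up example, and the assumption that a feature-based approach needs roughly one summary step per trajectory is a simplification, not a description of how Tracktable works internally.

```python
import math

def pairwise_comparisons(n):
    """Number of distinct pairs when every trajectory is compared to every other."""
    return math.comb(n, 2)

def per_trajectory_summaries(n):
    """Work needed if each trajectory is summarized once (assumed simplification)."""
    return n

# Hypothetical data set size, chosen only for illustration.
trajectory_count = 2_000_000

print(f"pairwise comparisons: {pairwise_comparisons(trajectory_count):,}")
print(f"per-trajectory summaries: {per_trajectory_summaries(trajectory_count):,}")
```

With a couple of million trajectories, the pairwise count lands in the trillions while the per-trajectory count stays in the millions, which matches the scale Rintoul describes.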

An analyst concerned about the number of airliners stuck in holding patterns could ask Tracktable about aircraft trajectories that made a certain pattern of turns. Tracktable then calculates geometric features, such as the number of 90-degree turns an aircraft flew or the length of a straight line. By associating a type of motion with these features and assigning a number to each feature, the computer can quickly group flights that behave in similar ways and show them to the viewer for interpretation.
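The idea of turning each trajectory into a few numbers and grouping on those numbers can be sketched as follows. This is not Tracktable's code or API. The function names, the flat x/y coordinates, the 15-degree tolerance and the bin size are all assumptions made for the example, and "length of a straight line" is approximated here by the longest single segment between consecutive points.

```python
import math
from collections import defaultdict

def heading(p, q):
    """Heading in degrees of the segment from p to q on a flat x/y plane."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))

def turn_angles(points):
    """Absolute heading change at each interior point, in the range 0..180."""
    headings = [heading(p, q) for p, q in zip(points, points[1:])]
    angles = []
    for h1, h2 in zip(headings, headings[1:]):
        diff = abs(h2 - h1) % 360.0
        angles.append(min(diff, 360.0 - diff))
    return angles

def features(points, tolerance=15.0):
    """Summarize a trajectory as (count of roughly 90-degree turns, longest segment)."""
    sharp_turns = sum(1 for a in turn_angles(points) if abs(a - 90.0) <= tolerance)
    longest_leg = max(math.dist(p, q) for p, q in zip(points, points[1:]))
    return sharp_turns, longest_leg

def group_by_features(trajectories, leg_bin=5.0):
    """Group trajectory names whose features fall into the same bin."""
    groups = defaultdict(list)
    for name, points in trajectories.items():
        turns, leg = features(points)
        groups[(turns, int(leg // leg_bin))].append(name)
    return dict(groups)

# Hypothetical flights: two box-shaped holding patterns and one straight run.
flights = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)],
    "B": [(1, 1), (11, 1), (11, 11), (1, 11), (1, 1)],
    "C": [(0, 0), (12, 0), (24, 0)],
}

groups = group_by_features(flights)
print(groups)
```

Each trajectory is read once to produce its feature tuple, and flights that share a tuple end up in the same group without ever being compared to each other directly.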

If you have millions of trajectories and you’re not interested in precise comparisons, but in general groupings of them, this is very effective.

PANTHER also examined the predictive capability of the information buried in data. If an analyst looks at the first half of a flight, considers historical data about similar flight paths and then looks at the second half of the flight, any deviation from the pattern might cue an analyst to take a closer look. Finding that outlier from millions of flights that have flown before takes about a second with Tracktable, Rintoul said. The analyst is alerted because PANTHER team members are using the advances in cognitive science to design visual results that will highlight the odd behavior of the single aircraft. By studying how analysts use visual data, Sandia researchers are figuring out ways to make an outlier pop out of a screen full of detail to demand an analyst’s attention.
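One simple way to picture "deviation from the pattern" is a z-score check against historical flights. The sketch below uses that standard statistical technique as a stand-in. The article does not say which method PANTHER or Tracktable actually uses, and the feature choice, the threshold of 3 and all sample numbers are assumptions for illustration.

```python
import math
import statistics

def zscores(history, observed):
    """Distance of each observed feature from the historical mean, in standard deviations."""
    scores = []
    for column, value in zip(zip(*history), observed):
        mean = statistics.fmean(column)
        spread = statistics.pstdev(column)
        if spread == 0:
            # No historical variation: any difference at all is maximally unusual.
            scores.append(0.0 if value == mean else math.inf)
        else:
            scores.append(abs(value - mean) / spread)
    return scores

def is_outlier(history, observed, threshold=3.0):
    """Flag a flight if any feature lies more than `threshold` deviations from the norm."""
    return any(score > threshold for score in zscores(history, observed))

# Hypothetical second-half features (turn count, distance flown) for past
# flights whose first half resembled the flight being watched.
history = [(1, 400.0), (1, 410.0), (2, 395.0), (1, 405.0), (1, 390.0)]

typical_flight = (1, 402.0)
odd_flight = (6, 640.0)

print(is_outlier(history, typical_flight))
print(is_outlier(history, odd_flight))
```

A flagged flight is only a cue for the analyst to look closer, in keeping with the article's point that these tools make suggestions rather than decisions.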

The team is now looking at integrating motion and trajectories into a system called GeoGraphy.

GeoGraphy helps analysts search for items of interest, shows changes over time

GeoGraphy, initially funded by the National Nuclear Security Administration, is a software system that converts remote sensing images expressed in pixels into nodes and edges in a graph to show changes over time and make the data searchable, said Randy Brost, a Sandia computer scientist who led the team that developed the software. Nodes are analogous to the beige hubs in Tinkertoys, while edges are the colored connecting rods.

GeoGraphy breaks the images into categories, such as buildings, trees or rivers. This pre-processing creates a graphic resembling a complex paint-by-number that shows the categories of everything in the image. The program uses nodes and edges to describe relationships between objects, such as distance or time, Brost and Czuchlewski said.

In addition to the imagery, the software package could include such information as phone books or county records, producing a single searchable database of all the information that shows what’s changed over time.

For example, to find a high school, the analyst tells the program to search for large buildings near regions that look like parking lots, football fields and tennis courts and defines those items. The analyst then can choose from among the results the computer provides.
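A search like this can be pictured as a query over a small graph. The sketch below is not GeoGraphy's data model or interface. The `Node` fields, the category names, the distance and area thresholds, and the sample scene are all hypothetical, chosen only to show how nodes, edges and a "large building near these features" query might fit together.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    category: str   # e.g. "building", "parking_lot" (hypothetical labels)
    area: float     # square metres
    x: float        # centre position in metres
    y: float

def build_edges(nodes, max_distance):
    """Connect every pair of nodes whose centres lie within max_distance."""
    edges = {node.name: set() for node in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if math.hypot(a.x - b.x, a.y - b.y) <= max_distance:
                edges[a.name].add(b.name)
                edges[b.name].add(a.name)
    return edges

def find_candidates(nodes, edges, min_area, required_neighbors):
    """Return large buildings that have every required category among their neighbours."""
    by_name = {node.name: node for node in nodes}
    hits = []
    for node in nodes:
        if node.category != "building" or node.area < min_area:
            continue
        nearby = {by_name[n].category for n in edges[node.name]}
        if required_neighbors <= nearby:
            hits.append(node.name)
    return hits

# Hypothetical scene extracted from an image.
scene = [
    Node("b1", "building", 6000.0, 0.0, 0.0),
    Node("p1", "parking_lot", 3000.0, 50.0, 0.0),
    Node("f1", "football_field", 5400.0, 0.0, 120.0),
    Node("t1", "tennis_court", 600.0, 80.0, 80.0),
    Node("b3", "building", 300.0, 10.0, 10.0),
    Node("b2", "building", 7000.0, 1000.0, 1000.0),
    Node("p2", "parking_lot", 2500.0, 1040.0, 1000.0),
]

edges = build_edges(scene, max_distance=200.0)
school_like = find_candidates(
    scene, edges, min_area=5000.0,
    required_neighbors={"parking_lot", "football_field", "tennis_court"},
)
print(school_like)
```

The query returns candidates rather than answers, so the analyst still chooses among the results, as the article describes.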

The system is hierarchical, so once analysts identify high schools, they can ask the program to find high schools the next time without describing them. And should they doubt that something is a high school, the software makes the raw data available so they can verify the results, Brost said.

“The purpose of these codes — GeoGraphy and Tracktable — is to assist humans, not to replace them or to automatically do their jobs. It’s to enhance their ability to do their jobs well and to allow them to be more effective in dealing with large sets of evidence,” Brost said.

In the end, basically they are suggestion systems that say, ‘Hey, based on what you told me you’re interested in, you ought to look here, here and here.’

The PANTHER team also included researchers focused on enhancing the viewer experience. Researcher Laura Matzen and others are conducting cognitive science experiments to learn how analysts’ expertise affects their visual cognition and to create a model of how top-down visual attention — when a user approaches an image with a goal in mind — works. The researchers hope to use the answers they find to such fundamental cognitive science questions to inform the design of new tools that will improve interactions between humans and computers, Matzen said.

The prototype products and ideas developed under PANTHER are ready for the next step in their development: to be tested in real-world environments, Czuchlewski said.

Sandia researchers have proposed research into new problems illuminated by PANTHER, while other agencies are solidifying the foundation PANTHER has developed. Other projects will use PANTHER’s ideas to address real-world problems, the researchers said.

“We went into PANTHER thinking we were going to do one thing: we’re going to improve the lives of image analysts,” Czuchlewski said. “And, in the research process, we did a whole lot more.”


Featured Image by Andy Wilson