Improving Big Data Querying and UX:

How We Created DIVER

The problem

At FINRA, it’s critical not only to collect, process, and manage data, but also to analyze it. If someone reports potential manipulation, does the market data support the claim? Should there be an investigation? Dedicated analysts look through transaction data to help decide. Making sense of this data isn’t easy: analysts need information on specific stocks and trading timeframes from the markets. To pull that information, analysts have relied on one tool, Web Integrated Audit Trail (WIAT), for over a decade.

Yet WIAT couldn’t always solve the problem. “It had limits on how much data you could pull in a single request,” said Matt Cardillo, a Senior Director in FINRA Technology. “If the user asked a question that was too large, it would only return with an error.” For larger questions, analysts had to break up the query into smaller requests, making the process time-consuming.
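To make that workaround concrete, here is a minimal sketch of what splitting one large question into smaller requests can look like. It assumes the split is made by date, which is only one possible approach; the function name, the dates, and the four-day window are hypothetical and are not taken from WIAT.

```python
from datetime import date, timedelta

def split_date_range(start, end, max_days):
    """Yield (window_start, window_end) pairs that cover start..end inclusive,
    each spanning at most max_days calendar days."""
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    current = start
    while current <= end:
        # Cap each window at the overall end date.
        window_end = min(current + timedelta(days=max_days - 1), end)
        yield current, window_end
        current = window_end + timedelta(days=1)

# Hypothetical example: a ten-day question split into windows of four days.
windows = list(split_date_range(date(2015, 1, 1), date(2015, 1, 10), max_days=4))
for window_start, window_end in windows:
    print(window_start, "to", window_end)
```

Each window would then be submitted as its own request and the results stitched back together by hand, which is where the lost time comes from.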

Part of the issue was the hardware: the data WIAT queried was stored in static data appliances with a predefined capacity, so neither storage nor computing power could scale. Expanding either one required an additional appliance, an expensive way to support incremental growth in usage.

In addition, the user experience was difficult to understand. “I’d asked analysts what different checkboxes of various transactions were for,” Matt said. “There were transactions within their areas of specialty they never used, and some were told to ‘avoid these transactions.’” The application had a steep learning curve because it required a deep understanding of the underlying data to figure out which transaction to use and which checkbox to select.

The old application also functioned as a downloading client, pulling data from a physical server and using a spreadsheet for analytics. For larger-scale data, however, the spreadsheet couldn’t handle the millions of rows that analysts needed to use. These limitations forced additional workarounds, which hampered productivity.

With analysts in various departments needing to pull larger and larger requests every day, something needed to change. “It was time to rethink user analytics for accessing the market data,” Matt said.

Solution: using AWS and Hive to build DIVER

Instead of updating the older application, Matt and his team began looking at ways to create a new application, DIVER. They wanted to create a solution that could access unprecedented amounts of data for analytics while providing a better experience. “We wanted an architecture that could pull millions of rows for analysts,” Matt explained, “and a user experience that was simple and more intuitive.”

To begin, various technology leaders inside FINRA began to look at potential technologies to use for the new system. “We wanted to move from a physical database to the cloud,” Matt said, focusing on the cloud’s ability to provide flexibility and scale for both storage and performance needs. DIVER created unique challenges for cloud storage and performance. “FINRA is working with an amount of data unlike most cloud customers,” said Matt. FINRA Technology leaders wanted to be sure that the technologies chosen would fit with our unique big data needs.

“I’ve replaced a lot of systems, but none that had this kind of constraint.”

In the end, Amazon Web Services was chosen as the best cloud service for FINRA’s needs. With Amazon’s S3 storage service, it’s possible to easily store the firm order data along with the data from the various exchanges every day, a daily volume measured in the billions. On top of this data, Matt and his application development team built DIVER. They chose Hive to query the data residing in S3, and Amazon Redshift to create private data marts that end users can run their analysis against.


Although DIVER is still in the pilot phase of the product lifecycle, numerous benefits have already been realized.

Large data retrieval

Previously, retrieving more than 100k rows of market events for a single transaction was not possible. Analysts were forced to run smaller queries, wasting time retrieving information instead of understanding it. Today, analysts using DIVER can pull millions of rows of data without any issue. This result surprised even Matt: “I never would have imagined the application would be able to handle pulling back hundreds of millions of rows.”

Powerful data query and analytics

To make analytical work easier, DIVER was created to help analysts reduce massive amounts of data and find irregularities. DIVER does this with tools such as post filtering, aggregation, pivoting, and charting, which provide summarized results. Users can find initial answers to their queries and then work with more manageable data sets for analysis.
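As a rough illustration of how post filtering, aggregation, and pivoting can shrink a result set, here is a small sketch using only the Python standard library. It is not DIVER’s implementation; the event records, field names, and the quantity threshold are all invented for the example.

```python
from collections import defaultdict

# Hypothetical market events; field names and values are invented for illustration.
events = [
    {"symbol": "ABC", "event_type": "order", "quantity": 100},
    {"symbol": "ABC", "event_type": "trade", "quantity": 60},
    {"symbol": "XYZ", "event_type": "order", "quantity": 500},
    {"symbol": "ABC", "event_type": "order", "quantity": 40},
    {"symbol": "XYZ", "event_type": "cancel", "quantity": 500},
]

def post_filter(rows, predicate):
    """Keep only the already-retrieved rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def pivot_sum(rows, row_key, col_key, value_key):
    """Aggregate value_key into a {row_value: {col_value: total}} pivot table."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_key]][row[col_key]] += row[value_key]
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {r: dict(cols) for r, cols in table.items()}

# Post filter: drop small events, then pivot total quantity by symbol and event type.
large_events = post_filter(events, lambda e: e["quantity"] >= 50)
summary = pivot_sum(large_events, "symbol", "event_type", "quantity")
print(summary)
```

The point of the pattern is that a handful of summarized cells is far easier to scan for irregularities than the raw rows behind them.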

DIVER’s functionality enhances the analysts’ experience: finding pertinent data becomes easier and faster. This allows analysts to do their work more easily and build strong case files for regulating aspects of the American financial markets.

Reduced costs

Even though DIVER has more computing power, it has reduced costs because the system is more flexible. Due to WIAT’s static nature, it was always running at the same power, regardless of demand. DIVER can provide more power during core business hours and then scale back during off hours, like adjusting your thermostat to save energy when you’re not home. This flexibility allows for more powerful queries while reducing cost.
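The thermostat idea can be sketched as a simple schedule rule. This is an assumption-laden illustration rather than how DIVER is actually configured: the function name, the node counts, and the 8:00 to 18:00 business window are hypothetical, and weekends and holidays are ignored.

```python
from datetime import time

def desired_nodes(now, peak_nodes=10, off_peak_nodes=2,
                  business_start=time(8, 0), business_end=time(18, 0)):
    """Return how many compute nodes to run at the given time of day.

    All defaults are hypothetical values chosen for illustration.
    """
    # Run at full capacity only inside the core business window.
    if business_start <= now < business_end:
        return peak_nodes
    return off_peak_nodes

print(desired_nodes(time(10, 30)))  # mid-morning, inside business hours
print(desired_nodes(time(22, 0)))   # late evening, outside business hours
```

A static appliance is the equivalent of always returning the peak value, whether or not anyone is running queries.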

Beyond lowering processing costs, DIVER has also boosted productivity. Previously, large requests had to be broken into smaller ones, which was time-consuming and tedious for analysts. In addition, many relied on IT staff to pull data directly via SQL. Today, DIVER breaks down the barrier to this big data and makes it easy to access, so analysts spend less time querying and rely less on technical support. With fewer SQL requests to handle, IT staff are freed to focus on other projects.

Just the beginning

While the results are impressive, the application development team knows there is more work to do. That work includes the full transition from WIAT to DIVER, which will occur over the next year. Ensuring DIVER can handle the variety of analysts’ tasks is critical to long-term success. Today, DIVER is focused on the needs of the Equities analysts.

However, the long-term goal is for DIVER to be used across multiple products in FINRA, such as Fixed Income and Options. To get there, Matt and his team will continue to look into new big data technology to make DIVER faster and able to support even more users. At the same time, the team is working to keep the user experience simple. Matt explains that they’re “building a Swiss army knife that has a feature for everyone.” With a more intuitive user experience, analysts can get started more quickly and spend more time on reporting and less time wrestling to get at the data.

For now, fine-tuning DIVER’s user experience involves frequent collaboration with various teams. In these sessions, analysts test DIVER against the work they have today, and Matt and his team work to relate DIVER’s features to the analysts’ daily needs. This collaboration is working: user response so far has been positive.

Of course, some analysts are hoping for further updates to other applications, but that’s a story we’re only just beginning.