Hadoop is one of today’s IT buzzwords, and for good reason: it allows an organization to get meaning and actionable analysis out of “big data” that was previously unusable because it was simply too large to process. This technology certainly solves a lot of problems, but...
What happens if your problem doesn’t easily fit into the Hadoop framework?
Most of the work that we do in the financial sector falls into this category. It just doesn’t make sense to rewrite existing code to fit the Hadoop paradigm. Example case study here and blog post here.
As in any business, new ideas lose their ‘edge’ the longer they sit on the shelf or stall in the execution stage, primarily because of opportunity costs and the growing chance that a competitor builds a product around the same idea. The faster a concept can be brought to market, the larger the advantage for its creator. This is especially true in the financial trading tech sector, where advancements are measured in minutes, hours, or days rather than weeks or months. Because of this, we’re always looking for new and creative ways to solve data and “big data” problems more quickly.
Enter Apache Drill
One of the more interesting articles we came across recently focused on a new Apache project that aims to reduce the time it takes to get answers out of a large dataset. The project is named Apache Drill, and here is a quick overview slide deck.
The Apache Drill project aims to create a tool similar to Google’s Dremel to facilitate faster queries across large datasets. Here is another take on the announcement from Wired. We’re excited about this because it will directly impact our work, specifically the workloads that require real-time or near real-time answers.