Apache Hadoop is a software framework that supports data intensive distributed applications under a free-to-use license. Hadoop has been the driving force behind the Big Data industry. It enables applications to work over massive quantities of data with built-in features enabling enhanced reliability, scalability, and functionality over old style data tools.
If your legacy data storage methods are costly, or if you need to access your data fast, Hadoop's Distributed File System (HDFS) may be for you. Hadoop’s ability to implement new MapReduce methods may be exactly what you need if you have to make sense of large quantities of data. Storing and interacting with data using familiar SQL-type commands and traditional tools are also made easier by Hadoop framework’s hBase and Hive.
Most programs will want these and other tools provided in a way where they are all configured and proven to work together. The free-to-use Cloudera Distribution featuring Apache Hadoop (CDH) is the most widely used method to achieve this goal.
Please join CTOvision.com publisher Bob Gourley as he provides context on the emerging Big Data discipline and discusses the genesis of the MapReduce concept. Omer Trajman, VP of Technology Solutions will follow with an update on CDH and Cloudera Enterprise, two of the most popular capabilities for fielding Hadoop in production environments.
Contact Name: Edward Walinsky
Sign up for our newsletter.
Nazzic Keene has led SAIC for just over one year and she shares how her early priorities continue to drive the company forward, plus how COVID-19 is complicating everything.
In this episode of the Project 38 podcast, Senior Staff Writer Ross Wilkers talks with executives from the big 3 of government telecom about how they have kept agencies and people connected during the coronavirus pandemic.