Here is a dictionary of the big data technologies I most commonly run into. It is by no means a complete list of all big data terms.

| Name | Type | Definition |
| --- | --- | --- |
| Voldemort | Database | A NoSQL key/value database, developed at LinkedIn and modeled on Amazon's Dynamo. It focuses on fast lookups for large distributed clusters. |
| Spark | Framework | A framework, written in Scala, for in-memory computing across computer clusters. |
| Solr | Search | An enterprise search platform, built on the Lucene Java library. |
| Scala | Language | A computer programming language used with big data applications. |
| Riak | Database | A NoSQL database, written in Erlang, that focuses on high availability and fault tolerance. |
| Redis | Database | A NoSQL, in-memory database that uses a key/value store. |
| Python | Language | A programming language used for analytical computation with big data technologies. |
| Pig | Service | A high-level platform, designed for Hadoop, for writing data-flow programs in a scripting language called Pig Latin. |
| Oozie | Service | A workflow scheduler for managing Hadoop jobs. |
| NoSQL | Definition | A term meaning “non SQL” or “not only SQL” for databases that have models differing from relational databases. |
| MongoDB | Database | A NoSQL, document-oriented database with JSON-like objects. |
| MapReduce | Programming Model | A programming model, and its implementation, for processing large data sets in parallel across a distributed computing cluster. |
| MapR | Company | A company that provides a commercial distribution of Hadoop for enterprises. |
| Lucene | Search | A Java library for indexing and searching documents. |
| Kudu | Service | A columnar storage engine for the Hadoop ecosystem. |
| Kafka | Service | A service for handling a large number of events related to real-time data feeds. |
| JSON | Definition | JavaScript Object Notation, a text format for storing and exchanging data objects. |
| Java | Language | A computer programming language often used for building and connecting big data technology. |
| Impala | Service | A SQL query engine, developed at Cloudera, that runs on Hadoop. |
| Hypertable | Database | A NoSQL database, based on Bigtable and written in C++, that focuses on performance. |
| Hortonworks | Company | A public company that provides a commercial distribution of Hadoop. |
| Hive | Service | A data warehouse interface, developed at Facebook, that supports a SQL-like query language (HiveQL). |
| Heroku | Server | A cloud platform for hosting web applications that focuses on scalability and ease of deployment. |
| HBase | Database | A NoSQL database, written in Java and modeled after Bigtable, that uses a data structure of keys, column families, and column names. |
| Hadoop Distributed File System | File System | Commonly called HDFS. A file system written in Java for the Hadoop framework. |
| Hadoop | Framework | A framework, written in Java, for distributed storage and processing of data across computer clusters. It consists of the Hadoop Distributed File System for storage and MapReduce for data processing. |
| Google App Engine | Server | A cloud platform, owned by Google, for hosting web applications. |
| Flume | Service | A service that focuses on information gathering for log data. |
| Elasticsearch | Search | A search engine platform, built on Lucene, that focuses more on web applications. |
| EC2 | Server | A cloud platform, owned by Amazon, that allows users to rent virtual computers for running large-scale applications. Stands for Elastic Compute Cloud. |
| DynamoDB | Database | A proprietary NoSQL database developed at Amazon. It is offered as part of the Amazon Web Services portfolio. |
| CouchDB | Database | A NoSQL, document-oriented database with a JavaScript interface that uses JSON to store data. |
| Cloudera | Company | A company that provides a commercial distribution of Hadoop along with support, services, and training to customers. |
| Cassandra | Database | A NoSQL database, originally developed at Facebook, built around a distributed key/value store. |
| Bigtable | Database | A proprietary datastore developed by Google, accessible through the Google App Engine. |
| Azure | Server | A cloud platform, owned by Microsoft, for running large-scale applications. |
| Accumulo | Database | A NoSQL database, developed and open sourced by the National Security Agency, that provides cell-level access labels. |
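
To make the MapReduce entry a little more concrete, here is a minimal single-machine sketch of the idea in plain Python. It is not Hadoop's API and it does not run on a cluster; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. It only shows the three steps the model is usually described with: map each input to key/value pairs, group the pairs by key, and reduce each group to a result.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    # On a real cluster each document (or block) would be mapped on a
    # different machine; here the map calls simply run one after another.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["Hadoop stores data", "Spark processes data", "hadoop and spark"]
    print(word_count(docs))
```

In a real framework the map and reduce calls are spread across many machines and the shuffle step moves data over the network, but the shape of the computation is the same.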

Something missing? Let me know.