Big Data Research


Big Data and NoSQL

Big data is characterized by four V’s: volume (huge quantity), velocity (high speed), variety (rich formats), and veracity (uncertain data). It usually exceeds the processing capability of any single current computing and storage architecture. My research in big data focuses on two major fields: NoSQL and heterogeneous platforms. The latter aims to build a platform that provides both big data and cloud computing services.

NoSQL (or Not Only SQL) refers to databases that do not necessarily store data under relational constraints. NoSQL covers a large family of databases that are widely used in big data. A large subset of them are column-store databases, whose notable feature is that they store data by column instead of by row. This design results in faster reads and higher compression ratios than traditional row-based databases. However, optimizing write operations on column-store databases is a well-known challenge. Most existing work on write performance optimization focuses only on in-memory environments, which is hardly applicable to big data.
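
The column-store trade-off described above can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the table, the column names, and the helper functions (to_columns, run_length_encode, append_row) are hypothetical and chosen for illustration, and real column stores use far more elaborate on-disk formats. The sketch shows that an aggregate reads a single column, that a column of repeated values compresses well with run-length encoding, and that appending one row has to touch every column.

```python
from itertools import groupby

# Hypothetical toy table with columns (user_id, country, clicks).
NAMES = ("user_id", "country", "clicks")
rows = [
    (1, "US", 10),
    (2, "US", 7),
    (3, "US", 3),
    (4, "DE", 8),
    (5, "DE", 2),
]


def to_columns(rows, names):
    """Pivot a list of row tuples into a dict of per-column lists."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def append_row(columns, names, row):
    """Append one row; note that every column list is modified."""
    for name, value in zip(names, row):
        columns[name].append(value)


columns = to_columns(rows, NAMES)

# Read path: the aggregate scans only the "clicks" column.
total_clicks = sum(columns["clicks"])

# Compression: values of one column sit together, so repeats collapse.
encoded_country = run_length_encode(columns["country"])

# Write path: a single new row is scattered across all three columns.
append_row(columns, NAMES, (6, "FR", 5))
```

In a row layout the same append would be one contiguous write, while here it is one write per column. This is a simplified view of why writes are the harder side of the column-store design.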