首页 > 解决方案 > Why are Column based Databases faster to query?

问题描述

I am researching on the approach to use for a new database which will be used for our new application. The amount of data won't be huge like 100 million records but will be about 10 million record before it gets archived. Will using column based approach for this much data be very useful than row based database? Why are column based database faster to query?

标签: database

解决方案


Essentially, the main difference between row oriented and column oriented DBs is the model they use to serialize data. Row oriented DBs store the data in each row as contiguous blocks. Column oriented DBs store the data of each column in contiguous blocks. Further, those blocks are typically compressed and sorted.

So, the difference in performance really depends on the type of query. Queries that use less I/O seeks will perform faster and reading from a contiguous block is a single seek. So if your query relies on aggregations or sorting on a column (or multiple columns) then a column store can perform much faster because reading all values in the column can be a single seek under optimal conditions. If your query is searching for a handful of customer records in a huge customer table based on an indexed column like a customer ID then a row store DB might actually be faster.


推荐阅读