Vertica

Industry: Database management & Data warehousing
Founded: 2005
Founders: Andrew Palmer and Michael Stonebraker
Headquarters: Cambridge, MA, United States
Key people: Colin Mahony (SVP and General Manager)
Products: Vertica Analytics Platform Enterprise Edition, Vertica SQL on Hadoop, Vertica Analytics Platform Community Edition
Parent: Micro Focus
Website: www.vertica.com

Vertica Systems is an analytic database management software company.[1][2] Vertica was founded in 2005 by database researcher Michael Stonebraker and Andrew Palmer. Palmer was the founding CEO. Ralph Breslauer and Christopher P. Lynch served as later CEOs.

Lynch joined as Chairman and CEO in 2010 and was responsible for Vertica's acquisition by Hewlett Packard in March 2011.[3][4] The acquisition expanded the HP Software portfolio for enterprise companies and the public sector group.[5] As part of the Micro Focus-Hewlett Packard Enterprise merger, Vertica joined Micro Focus in September 2017.[6]

Products

The column-oriented Vertica Analytics Platform was designed to manage large, fast-growing volumes of data and to provide fast query performance for data warehouses and other query-intensive applications. The vendor claims that the product greatly improves query performance over traditional relational database systems and provides high availability and exabyte scalability on commodity enterprise servers. Vertica is infrastructure-independent, supporting deployments on multiple cloud platforms (AWS, Google Cloud, Azure), on-premises, and natively on Hadoop nodes. Vertica in Eon Mode separates compute from storage, using low-cost S3 object storage and applying compute capacity to variable workloads as needed. Eon Mode is available on the AWS, Google Cloud, and Microsoft Azure clouds, and on-premises with Pure Storage FlashBlade, Scality RING, Dell EMC ECS, and other S3-compatible hardware. Vertica claims that Eon Mode makes it the only analytics platform that brings the advantages of a cloud architecture with separated compute and storage to on-premises data centers.[7]

Its design features include:

  • Column-oriented storage organization, which increases performance of sequential record access at the expense of common transactional operations such as single record retrieval, updates, and deletes.[8]
  • Massively parallel processing (MPP) architecture to distribute queries on independent nodes and scale performance linearly.
  • Standard SQL interface with many analytics capabilities built-in, such as time series gap filling/interpolation, event-based windowing and sessionization, pattern matching, event series joins, statistical computation (e.g., regression analysis), and geospatial analysis.
  • In-database machine learning including categorization, fitting and prediction to enhance processing speed by eliminating the need for down-sampling and data movement. Vertica offers a variety of in-database algorithms, including linear regression, logistic regression, k-means clustering, Naive Bayes classification, random forest decision trees, XGBoost, and support vector machine regression and classification. It also allows deployment of ML models to multiple clusters.
  • High compression, possible because columns of homogeneous data type are stored together and because updates to the main store are batched.[9]
  • Automated workload management, data replication, server recovery, query optimization, and storage optimization.
  • Native integration with open source big data technologies like Apache Kafka and Apache Spark.
  • Support for standard programming interfaces, including ODBC, JDBC, ADO.NET, and OLEDB.
  • High-performance and parallel data transfer to statistical tools and built-in machine learning algorithms.[10][11]
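
The link between column-oriented storage and compression in the list above can be illustrated with a small sketch. This is a toy model, not Vertica's storage engine: the sample table, the helper names (`to_columns`, `rle_encode`, `rle_decode`), and the choice of run-length encoding as the example compression scheme are all assumptions made for illustration.

```python
from itertools import groupby

# Hypothetical sample table in row-oriented form: one dict per record.
rows = [
    {"region": "east", "status": "ok", "amount": 10},
    {"region": "east", "status": "ok", "amount": 12},
    {"region": "east", "status": "ok", "amount": 7},
    {"region": "west", "status": "ok", "amount": 3},
    {"region": "west", "status": "late", "amount": 9},
    {"region": "west", "status": "late", "amount": 4},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def rle_encode(values):
    """Run-length encode a sequence as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


columns = to_columns(rows)

# Values of one type sit next to each other, so repeated values form
# long runs that a simple encoding can collapse.
encoded_region = rle_encode(columns["region"])

# An aggregate over one column reads only that column's values,
# never the other fields of each record.
total_amount = sum(columns["amount"])
```

Here the six `region` values collapse to two runs, and the sum touches only the `amount` list. Fetching one complete record, by contrast, would mean reading from every column, which is the transactional cost the first bullet describes.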

Vertica's specialized approach aims to significantly increase query performance in data warehouses, while reducing the total cost of ownership by reducing the hardware footprint.[12]

In late 2011, the Vertica Analytics Platform Community Edition was made available for free with certain limitations, such as a maximum of one terabyte of raw data, a cluster of at most three nodes (servers), and community-based support.[13]

At the Vertica Unify event in July 2021, Vertica Accelerator,[14] a software-as-a-service version of Vertica, was announced, initially on Amazon Web Services (AWS) only. Vertica Accelerator differs from similar analytics database services such as Snowflake, Amazon Redshift, and Google BigQuery in that it is software only: the infrastructure is not bundled in with a built-in markup. Organizations are expected to already have an Amazon account and to pay Amazon directly at whatever rate they have negotiated. They pay Vertica only for the software and the management service.

Optimizations

Several of Vertica’s features were originally prototyped within the C-Store column-oriented database, an academic open source research project at MIT and other universities. The system's architecture is described in a 2012 VLDB paper.[15]

The Vertica Analytics Platform runs on clusters of Linux-based commodity servers. It is also available on Amazon Web Services, Microsoft Azure, Google Cloud Platform, and Alibaba Cloud, which avoids infrastructure or platform lock-in. The product integrates with Hadoop[16] to leverage HDFS via External Tables with ORC and Parquet Readers, and can be installed on Hadoop nodes in a co-located manner as Vertica for SQL on Hadoop (a separate offering, priced per node). These combined capabilities allow users to choose where to analyze their data, including across multiple data lakes.

In 2018, Vertica introduced Vertica in Eon Mode, an architecture that separates compute from storage, which is available on the Amazon, Google, and Microsoft clouds. The Eon architecture allows compute capacity to be increased and decreased elastically as workloads require. It also allows instantiation of multiple isolated sub-clusters dedicated to different workloads while maintaining a single shared data repository. It operates on shared object storage in the cloud, and also runs on compatible object storage hardware on-premises for private cloud implementations. On-premises object storage supported by Vertica in Eon Mode includes hardware such as Pure Storage FlashBlade, Dell EMC ECS, and Scality RING, as well as object storage simulation software such as MinIO on commodity servers.

Recent versions of Vertica, 10.1.1 and 11,[17] have introduced Docker containerization and Kubernetes StatefulSets support, with tested containerized versions now released on Docker Hub and GitHub. This allows more automated deployment, testing, and elastic scaling, and broadens deployment support.

A range of BI, data visualization, and ETL tools are certified to work with the Vertica Analytics Platform. Vertica also offers a certified and secure interface with the Kafka message bus, allowing streaming data ingestion. Combined with Vertica's analytics performance, this capability supports use cases such as Internet of Things, edge analytics, and near-real-time fraud prevention.

Open Source

Vertica has a history of creating, contributing to, and integrating with open source technologies. There is an "awesome-vertica" curated list of Vertica open libraries, tools, and resources.

In 2018, Uber handed over management of vertica-python to Vertica. It replaced Vertica's previous proprietary Python client, and became their first officially supported open source database client.[18]
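
A connection through vertica-python might look like the following minimal sketch. The `vertica_python.connect` call and cursor methods follow the client's documented usage, and 5433 is Vertica's default client port. The helper names `build_conn_info` and `fetch_rows`, and the host, user, and database values in the usage note, are hypothetical placeholders.

```python
def build_conn_info(host, user, password, database, port=5433):
    """Assemble the keyword arguments passed to vertica_python.connect.

    5433 is Vertica's default client port; every other value is
    supplied by the caller.
    """
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": password,
        "database": database,
    }


def fetch_rows(conn_info, query):
    """Run a query and return all result rows as a list."""
    # Imported here so the module loads even where the third-party
    # vertica-python package is not installed.
    import vertica_python

    with vertica_python.connect(**conn_info) as connection:
        cursor = connection.cursor()
        cursor.execute(query)
        return cursor.fetchall()
```

With a reachable database, usage would be along the lines of `fetch_rows(build_conn_info("localhost", "dbadmin", "secret", "VMart"), "SELECT version()")`, where all four connection values are placeholders for a real deployment's settings.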

VerticaPy is a Python library that exposes scikit-like functionality for conducting data science projects on data stored in Vertica. It uses the vertica-python client, and so takes advantage of Vertica's speed and its built-in analytics and machine learning capabilities.

Recent versions of Vertica, 10.1 and 11, have introduced Docker containerization and Kubernetes StatefulSets support, with tested containerized versions now released on Docker Hub and GitHub.

In 2021, Vertica contributed a new Spark connector to open source, replacing their previous proprietary connector.

Other Vertica-sponsored open source projects include integrations with Grafana, Helm, Go, and Distributed R, among others. The Vertica GitHub repository also contains several machine learning and geospatial examples.

Company events

In January 2008, Sybase filed a patent-infringement lawsuit against Vertica.[19] In January 2010, Vertica prevailed in a preliminary hearing,[20] and in June 2010, Sybase and Vertica resolved the suit, with the court dismissing all infringement claims.[21] Under the leadership of Colin Mahony, Vertica has sponsored various technological events in the database industry.[22]

In August 2013, Vertica held its first Big Data conference[23] in Boston, MA, USA. The event was held again in 2014, 2015, 2016, and 2017, and virtually in 2020 due to the COVID-19 pandemic. In 2021, the event was renamed Vertica Unify.

In 2016, Vertica published, with O'Reilly, the book The Big Data Transformation: Understanding Why Change is Actually Good for Your Business.

References

  1. Network World staff: "New database company raises funds, nabs ex-Oracle bigwigs", LinuxWorld, February 14, 2007
  2. Brodkin, J: "10 enterprise software companies to watch", Network World, April 11, 2007. Archived 2007-05-18 at the Wayback Machine
  3. HP News Release: "HP to Acquire Vertica: Customers Can Analyze Massive Amounts of Big Data at Speed and Scale" Feb. 2011
  4. HP News Release: "HP Completes Acquisition of Vertica Systems, Inc." March 22, 2011.
  5. ComputerWorld.com: "Update: HP to buy Vertica for analytics." Kanaracus. Feb. 2011.
  6. SiliconAngle: "Vertica survives software industry turmoil to emerge as key cloud and big data player" Albertson.
  7. Press Release: "Micro Focus Announces Vertica in Eon Mode for Pure Storage" Sept 17, 2019
  8. Monash, C: "Are row-oriented RDBMS obsolete?", DBMS2, January 22, 2007
  9. Monash, C: "Mike Stonebraker on database compression – comments", DBMS2, March 24, 2007
  10. Gagliordi, Natalie. "HP adds scale to open-source R in latest big data platform". ZDNet. Retrieved 17 February 2015.
  11. Prasad, Shreya; Fard, Arash; Gupta, Vishrut; Martinez, Jorge; LeFevre, Jeff; Xu, Vincent; Hsu, Meichun; Roy, Indrajit (2015). "Enabling predictive analytics in Vertica: Fast data transfer, distributed model creation and in-database prediction". ACM SIGMOD International Conference on Management of Data.
  12. One Size Fits All? Part 2: Benchmarking Results (sect. 3.1)
  13. "Vertica Announces Community Edition Version of Vertica Analytic Database". Archived from the original on July 4, 2015. Retrieved August 17, 2016.
  14. PR Newswire: "Vertica Announces Early Access of Vertica Accelerator" Micro Focus. June 15, 2021.
  15. "The Vertica Analytic Database: C-Store 7 Years Later" (PDF). VLDB. August 28, 2012.
  16. "Vertica-Hadoop integration". DBMS2. October 12, 2010.
  17. Vertica Blog: "Vertica 10.1.1 Goes Beyond Analytics with Support for Azure Cloud, Kubernetes, and Containers" Healey. April 30, 2021
  18. Vertica Blog: "vertica-python Becomes Vertica's First Officially Supported Open Source Database Client" Wall. August 14, 2018
  19. Sybase, Inc. v. Vertica Systems, Inc. (Texas Eastern District Court January 30, 2008).
  20. Monash, C: "Vertica slaughters Sybase in patent litigation", DBMS2, January 14, 2010
  21. Vertica Press Release, "Vertica Resolves Sybase Patent Lawsuits" http://www.vertica.com/news/press/vertica-resolves-sybase-patent-lawsuits/
  22. http://www.vertica.com/news/events/
  23. HP Vertica Big Data Conference 2013 http://www.vertica.com/hp-vertica-big-data-conference-2013/
