Abstract
This paper presents Polybase, a feature of SQL Server PDW V2 that allows users to manage and query data stored in a Hadoop cluster using the standard SQL query language. Unlike other database systems that provide only a relational view over HDFS-resident data through the use of an external table mechanism, Polybase employs a split query processing paradigm in which SQL operators on HDFS-resident data are translated into MapReduce jobs by the PDW query optimizer and then executed on the Hadoop cluster. The paper describes the design and implementation of Polybase along with a thorough performance evaluation that explores the benefits of employing a split query processing paradigm for executing queries that involve both structured data in a relational DBMS and unstructured data in Hadoop. Our results demonstrate that while the use of a split-based query execution paradigm can improve the performance of some queries by as much as 10X, one must employ a cost-based query optimizer that considers a broad set of factors when deciding whether or not it is advantageous to push a SQL operator to Hadoop. These factors include the selectivity factor of the predicate, the relative sizes of the two clusters, and whether or not their nodes are co-located. In addition, differences in the semantics of the Java and SQL languages must be carefully considered in order to avoid altering the expected results of a query.



Screenshot of new Lite pager option in Views

The Views Litepager module solves a problem of scalability for sites with large amounts of content. Drupal's core pagination system creates a pager navigation that shows exactly how many pages of content exist for the content list. This requires that a COUNT query be executed based on the query used to generate the list.
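To make the cost concrete, here is a minimal sketch of a counting pager, written in Python against an in-memory SQLite database rather than Drupal and MySQL. The `node` table, its columns, and the `full_pager` function are hypothetical names chosen for illustration; this is not Drupal's actual pager code.

```python
import math
import sqlite3


def full_pager(conn, page, page_size):
    """Return (rows, total_pages) the way a counting pager would."""
    # The extra COUNT query is the price of knowing the total page count.
    total = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]
    total_pages = math.ceil(total / page_size)
    # The second query fetches only the rows for the requested page.
    rows = conn.execute(
        "SELECT nid, title FROM node ORDER BY nid LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()
    return rows, total_pages


# Hypothetical sample data: 25 rows in an illustrative "node" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO node (title) VALUES (?)",
    [(f"Post {i}",) for i in range(1, 26)],
)

rows, total_pages = full_pager(conn, page=0, page_size=10)
print(len(rows), total_pages)
```

Every page view issues two queries here, and the first one has to count all matching rows no matter which page is requested.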

While unfiltered COUNT queries are very fast on tables that use MySQL's MyISAM engine, which stores an exact row count, they are painfully slow on InnoDB tables, and InnoDB is the recommended engine for high-traffic Drupal sites. InnoDB has to scan an index to count rows, so COUNT queries get slower as a table grows.

The Views Litepager module solves this problem for Views pagination by providing a pager option that does not require a COUNT query to be executed. This "Lite" pager is only slightly less useful than Drupal's core pager: it does not let you navigate to the "last" page and does not show how many total pages of content there are. For large sites, this small cost in features is worth the performance gained by eliminating the painfully slow (and sometimes crippling) COUNT queries.
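One common way to page without a COUNT query is to fetch one row more than the page displays and use that extra row only as a signal that a next page exists. The sketch below shows the idea in Python with SQLite. The `node` table and the `lite_pager` function are hypothetical, and the module's own implementation may work differently.

```python
import sqlite3


def lite_pager(conn, page, page_size):
    """Return (rows, has_previous, has_next) without running COUNT."""
    # Ask for one row more than is displayed; if it comes back,
    # there is at least one more page after this one.
    rows = conn.execute(
        "SELECT nid, title FROM node ORDER BY nid LIMIT ? OFFSET ?",
        (page_size + 1, page * page_size),
    ).fetchall()
    has_next = len(rows) > page_size
    has_previous = page > 0
    # Drop the probe row so callers only see one page of results.
    return rows[:page_size], has_previous, has_next


# Hypothetical sample data: 25 rows in an illustrative "node" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO node (title) VALUES (?)",
    [(f"Post {i}",) for i in range(1, 26)],
)

first_rows, first_prev, first_next = lite_pager(conn, page=0, page_size=10)
last_rows, last_prev, last_next = lite_pager(conn, page=2, page_size=10)
print(len(first_rows), first_prev, first_next)
print(len(last_rows), last_prev, last_next)
```

The pager can still render "previous" and "next" links, but it has no total to build a "last" link or a page count from, which matches the trade-off described above.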


Views preview with explain info appended.

Views Explain is a very simple module that runs EXPLAIN on the query generated by Views and appends that information to the Views preview.
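The same idea can be sketched outside Drupal: take the SQL a query builder produced, prefix it with an explain statement, and show the plan next to the results. The example below is an illustration in Python that uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for MySQL's `EXPLAIN`. The `node` table and the `explain` helper are hypothetical and are not part of the module.

```python
import sqlite3


def explain(conn, sql):
    """Return the query plan for `sql` as a list of human-readable strings."""
    # SQLite's EXPLAIN QUERY PLAN is used here in place of MySQL's EXPLAIN.
    # The description of each plan step is the last column of each row.
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in plan_rows]


# Hypothetical table standing in for the one a view would query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")

generated_sql = "SELECT nid, title FROM node WHERE nid = 5"
plan = explain(conn, generated_sql)
for step in plan:
    print(step)
```

The exact wording of the plan depends on the database engine and version, so it is best treated as diagnostic output to read, not to parse.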
