SQL

This article, "F1: A Distributed SQL Database That Scales" by Srihari Srinivasan, is republished with permission from a blog you really should follow: Systems We Make - Curating Complex Distributed Systems.

With both the F1 and Spanner papers out, it's now possible to understand their interplay a bit more holistically. So let's start by revisiting the key goals of both systems.

Abstract
This paper presents Polybase, a feature of SQL Server PDW V2 that allows users to manage and query data stored in a Hadoop cluster using the standard SQL query language. Unlike other database systems that provide only a relational view over HDFS-resident data through the use of an external table mechanism, Polybase employs a split query processing paradigm in which SQL operators on HDFS-resident data are translated into MapReduce jobs by the PDW query optimizer and then executed on the Hadoop cluster. The paper describes the design and implementation of Polybase along with a thorough performance evaluation that explores the benefits of employing a split query processing paradigm for executing queries that involve both structured data in a relational DBMS and unstructured data in Hadoop. Our results demonstrate that while the use of a split-based query execution paradigm can improve the performance of some queries by as much as 10X, one must employ a cost-based query optimizer that considers a broad set of factors when deciding whether or not it is advantageous to push a SQL operator to Hadoop. These factors include the selectivity factor of the predicate, the relative sizes of the two clusters, and whether or not their nodes are co-located. In addition, differences in the semantics of the Java and SQL languages must be carefully considered in order to avoid altering the expected results of a query.

Link to the paper
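
To make the pushdown trade-off in the abstract concrete, here is a toy sketch of a cost comparison. It is not Polybase's optimizer: the cost formula, the constants, and every name in it (`PushdownInputs`, `estimate_costs`, `should_push_to_hadoop`) are assumptions made up for illustration. It only uses the three factors the abstract names: predicate selectivity, relative cluster sizes, and co-location.

```python
from dataclasses import dataclass


@dataclass
class PushdownInputs:
    """Hypothetical inputs for a toy pushdown decision (not from the paper)."""
    rows: int            # rows in the HDFS-resident table
    selectivity: float   # fraction of rows that pass the predicate (0..1)
    hadoop_nodes: int    # size of the Hadoop cluster
    pdw_nodes: int       # size of the relational cluster
    co_located: bool     # True if both clusters share the same machines


def estimate_costs(p, job_startup=30.0, scan_per_row=1e-6, transfer_per_row=2e-6):
    """Return (push_cost, pull_cost) in made-up time units.

    All constants are illustrative assumptions, not measured values.
    """
    # Assumption: moving rows between co-located nodes is half as expensive.
    transfer = transfer_per_row * (0.5 if p.co_located else 1.0)

    # Push: pay a fixed job startup, filter on Hadoop, ship only matching rows.
    push = (job_startup
            + p.rows * scan_per_row / p.hadoop_nodes
            + p.rows * p.selectivity * transfer)

    # Pull: ship every row to the relational cluster, then filter there.
    pull = p.rows * transfer + p.rows * scan_per_row / p.pdw_nodes
    return push, pull


def should_push_to_hadoop(p):
    push, pull = estimate_costs(p)
    return push < pull


big_selective = PushdownInputs(rows=1_000_000_000, selectivity=0.01,
                               hadoop_nodes=16, pdw_nodes=16, co_located=False)
small_unselective = PushdownInputs(rows=1_000_000, selectivity=0.9,
                                   hadoop_nodes=16, pdw_nodes=16, co_located=False)
```

In this toy model, a highly selective predicate over a large table favors pushing the filter to Hadoop, while a small table with a weak predicate does not, because the fixed job startup cost dominates. That is the general shape of the argument for a cost-based decision rather than always pushing.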

Original author: Stack Exchange

This Q&A is part of a weekly series of posts highlighting common questions encountered by technophiles and answered by users at Stack Exchange, a free, community-powered network of 100+ Q&A sites.

Dokkat appears to think that databases are overused. "Instead of a database, I just serialize my data to JSON, saving and loading it to disk when necessary," he writes. "All the data management is made on the program itself, which is faster AND easier than using SQL queries." What is missing here? Why should a developer use a database when saving data to a disk might work just as well?

See the original question here.
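
For readers who want to see the two approaches side by side, here is a minimal sketch using only Python's standard library. The record layout, the file names, and the function names are assumptions for illustration; they do not come from the original question. SQLite stands in for "a database" simply because it ships with Python.

```python
import json
import os
import sqlite3
import tempfile


def save_json(path, records):
    # The whole data set is rewritten on every save.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f)


def adults_from_json(path):
    # Everything is loaded into memory, then filtered by the program itself.
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    return sorted(r["name"] for r in records if r["age"] >= 18)


def save_sqlite(path, records):
    conn = sqlite3.connect(path)
    try:
        with conn:  # one transaction: commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS users (name TEXT, age INTEGER)")
            conn.executemany(
                "INSERT INTO users (name, age) VALUES (?, ?)",
                [(r["name"], r["age"]) for r in records])
    finally:
        conn.close()


def adults_from_sqlite(path):
    # The database does the filtering and ordering; only matches are returned.
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT name FROM users WHERE age >= ? ORDER BY name",
            (18,)).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()


def demo():
    records = [
        {"name": "Linus", "age": 28},
        {"name": "Bob", "age": 12},
        {"name": "Ada", "age": 36},
    ]
    with tempfile.TemporaryDirectory() as d:
        json_path = os.path.join(d, "users.json")
        db_path = os.path.join(d, "users.db")
        save_json(json_path, records)
        save_sqlite(db_path, records)
        return adults_from_json(json_path), adults_from_sqlite(db_path)
```

Both versions return the same answer for a tiny data set, which is the asker's point. The differences show up elsewhere: the JSON version rewrites and reloads the entire file each time, while the SQLite version wraps the write in a transaction and lets the engine filter and sort without loading every row into the program.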

Original author: Stack Exchange

Ankit works in J2SE (core Java). During code reviews, he's frequently asked to reduce his lines of code (LOC). "It's not about removing redundant code," he writes. To his colleagues, "it's about following a style." Style over substance. Ankit says the readability of his code is suffering due to the dogmatic demands of his code reviewers. So how can he find the right balance between brevity and readability?

See the original question here.
