Azure HDInsight Performance Benchmarking: Interactive Query, Spark, and Presto

Community Manager

Fast SQL query processing at scale is often a key consideration for our customers. In this blog post we compare HDInsight Interactive Query, Spark, and Presto using the industry-standard TPC-DS benchmark. The benchmarks are run with out-of-the-box default HDInsight configurations, with no special optimizations. Customers who want to run these benchmarks themselves can follow the easy-to-use steps outlined on GitHub (https://github.com/hdinsight/tpcds-hdinsight).

 

The TPC Benchmark DS (TPC-DS) is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. According to the TPC (http://www.tpc.org/tpcds/), the benchmark provides a representative evaluation of performance as a general-purpose decision support system. A benchmark result measures query response time in single-user mode, query throughput in multi-user mode, and data maintenance performance for a given hardware, operating system, and data processing system configuration under a controlled, complex, multi-user decision support workload. The purpose of TPC benchmarks is to provide relevant, objective performance data to industry users. TPC-DS Version 2 enables emerging technologies, such as big data systems, to execute the benchmark. Please note that these are unaudited results.
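To make the two query measurements above concrete, here is a minimal Python sketch of how single-user response time and multi-user throughput could be timed. It is an illustration only: `run_query`, `time_queries`, and `measure_throughput` are hypothetical names, the caller is assumed to supply a function that submits one SQL statement to the engine under test, and the result is a plain queries-per-second figure rather than the official TPC-DS metric. It is not the harness used for the results in this post.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def time_queries(run_query: Callable[[str], None],
                 queries: Dict[str, str]) -> Dict[str, float]:
    """Single-user mode: run each query once, in order, and
    return its response time in seconds keyed by query name."""
    timings = {}
    for name, sql in queries.items():
        start = time.perf_counter()
        run_query(sql)  # hypothetical: blocks until the engine returns
        timings[name] = time.perf_counter() - start
    return timings


def measure_throughput(run_query: Callable[[str], None],
                       queries: Dict[str, str],
                       streams: int) -> float:
    """Multi-user mode: run `streams` concurrent sessions, each
    executing every query, and return completed queries per second."""
    def one_stream(_: int) -> None:
        for sql in queries.values():
            run_query(sql)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=streams) as pool:
        # Consume the iterator so any exception in a stream is raised here.
        list(pool.map(one_stream, range(streams)))
    elapsed = time.perf_counter() - start
    return streams * len(queries) / elapsed
```

In practice `run_query` would wrap whatever client the engine exposes (for example a JDBC or ODBC connection), and each measurement would be repeated several times to smooth out caching and cluster warm-up effects.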

 

[Image: 54731fa9-f742-4b80-8fbf-0c43f2e55987.png]

 

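Read more about it on the Azure blog (https://azure.microsoft.com/en-us/blog/hdinsight-interactive-query-performance-benchmarks-and-integration-with-power-bi-direct-query/).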
