Crunch, Data Conference, October 16-18, 2019, Budapest
Piotr Findeisen

Co-founder at Starburst

Bio:

Piotr is a Software Engineer at Starburst and a member of the company's founding team. He contributes to the Presto code base and is also active in the community. He has been involved in significant features such as the cost-based optimizer, spill to disk, and correlated subqueries, as well as a plethora of smaller enhancements. Before Starburst, Piotr worked at Teradata and became the top external Presto committer of the year. Prior to that, he was a Team Leader at Syncron (a provider of cloud services for supply chain management), responsible for the product's technical foundation and performance. Piotr holds an M.S. in Computer Science (and a B.Sc. in Mathematics) from the University of Warsaw.

Workshop:

Presto: SQL-on-Anything, hands-on workshop

Topics:
data engineering
BI
SQL
ETL
data warehouse
analytics

Level:
Beginner

Description

Presto has become the ubiquitous open source software for SQL on anything. Presto is heavily used by Facebook, Netflix, Airbnb, LinkedIn, Twitter, Uber, and many others for low-latency querying of large amounts of data, wherever it resides (Hadoop, AWS S3, Cassandra, Postgres, etc.). Presto was engineered from the ground up for fast interactive SQL analytics against disparate data sources ranging in size from GBs to PBs.

Join Wojciech Biela for this full-day workshop to learn about Presto’s concepts and architecture, and to explore its many use cases and the best practices you can implement today. Learn how to set up and use Presto through various hands-on exercises (those who don’t want to participate in the exercises can follow along).

Target audience

Roles: data engineers, data architects, software engineers, and those in IT

Prerequisite knowledge

A basic understanding of SQL, databases, Hadoop, and distributed systems.
Basic command line (Bash) skills.

Materials or downloads needed in advance

A laptop with a browser.

Agenda

Rough outline of the training, including slides and labs (hands-on exercises):

    • Presto architecture and technical concepts
    • Lab 1 - Manual Presto deployment
    • Presto query execution
    • Presto Ecosystem, Connectors and Connectivity
    • Migrating from Hive
    • Administering Presto
    • Presto in cloud environments
    • Lab 2 - Query S3 Data using Presto
    • Lab 3 - Query PostgreSQL using Presto
    • Lab 4 - Query Federation using Presto
    • Instructor lab demonstrations:
      • Lab 5 - Using Presto with AWS Glue Data Catalog
      • Lab 6 - Scaling Presto on AWS
    • Lab 7 - Presto and BI tools (connecting from Superset)
    • Query Performance, Cost-Based Optimizer
    • Lab 8 - Cost-Based Optimizer in Action
    • Security in Presto
    • Joining the Presto community