WORKSHOPS

Patrick Hall

Senior Director of Product at H2O.ai

Practical Techniques for Interpretable Machine Learning

Topics:
machine learning
data science
prediction
interpretability
h2o
python
Level:
Intermediate

Transparency, auditability, and stability of predictive models and results are typically key differentiators in effective machine learning applications. Patrick will share tips and techniques learned through implementing interpretable machine learning solutions in industries like financial services, telecom, and health insurance. Using a set of publicly available and highly annotated examples, he teaches several holistic approaches to interpretable machine learning. The examples use the well-known University of California Irvine (UCI) credit card dataset and popular open source packages to train constrained, interpretable machine learning models and visualize, explain, and test more complex machine learning models in the context of an example credit-risk application. Along the way, Patrick draws on his applied experience to highlight crucial success factors and common pitfalls not typically discussed in blog posts and open source software documentation, such as the importance of both local and global explanation and the approximate nature of nearly all machine learning explanation techniques.
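The distinction between local and global explanation can be sketched with a toy linear credit-risk model. This is a minimal illustration, not the workshop's material: the coefficients, feature names, and applicant rows below are all invented, and no data from the UCI dataset is used. For a linear model the per-feature contributions are exact; for more complex models, comparable explanations are generally approximations.

```python
import math

# Hypothetical coefficients for a toy logistic credit-risk model.
# Feature names and values are invented for illustration only.
INTERCEPT = -1.0
COEFS = {"utilization": 2.0, "late_payments": 0.8, "age_years": -0.02}

# Hypothetical applicants (not rows from the UCI credit card dataset).
APPLICANTS = [
    {"utilization": 0.9, "late_payments": 3, "age_years": 25},
    {"utilization": 0.2, "late_payments": 0, "age_years": 50},
    {"utilization": 0.5, "late_payments": 1, "age_years": 35},
]


def feature_means(rows):
    """Average value of each model feature across the rows."""
    return {name: sum(r[name] for r in rows) / len(rows) for name in COEFS}


def predict_proba(row):
    """Predicted probability of default from the logistic model."""
    logit = INTERCEPT + sum(COEFS[name] * row[name] for name in COEFS)
    return 1.0 / (1.0 + math.exp(-logit))


def local_explanation(row, means):
    """Each feature's contribution to this row's log-odds, relative to the average row."""
    return {name: COEFS[name] * (row[name] - means[name]) for name in COEFS}


def global_importance(rows, means):
    """Mean absolute local contribution of each feature across all rows."""
    totals = {name: 0.0 for name in COEFS}
    for row in rows:
        for name, contribution in local_explanation(row, means).items():
            totals[name] += abs(contribution)
    return {name: total / len(rows) for name, total in totals.items()}


means = feature_means(APPLICANTS)
print("P(default), applicant 0:", predict_proba(APPLICANTS[0]))
print("Local explanation, applicant 0:", local_explanation(APPLICANTS[0], means))
print("Global importance:", global_importance(APPLICANTS, means))
```

The local explanation answers "why did this applicant get this score?", while the global importance summarizes which features drive the model overall; the two can rank features differently.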

Who is this presentation for?

Researchers, scientists, data analysts, predictive modelers, business users and other professionals, and anyone else who uses or consumes machine learning techniques

Prerequisite knowledge

A working knowledge of Python, widely used linear modeling approaches, and machine learning algorithms.

Materials or downloads needed in advance

A laptop with a recent version of the Firefox or Chrome browser installed. (This tutorial will use an Aquarium environment.) As a backup, tutorial materials are available on GitHub: https://github.com/jphall663/interpretable_machine_learning_with_python

What you'll learn

The audience will learn several practical machine learning interpretability techniques and how to use them with Python. They will also learn the best way to use these techniques and common pitfalls to avoid when applying them.


Zoltan C. Toth

CTO at Datapao

Hands on Deep Learning with Keras, TensorFlow, and Apache Spark™ (Official Databricks Workshop)

Topics:
deep learning
data science
spark
tensorflow
keras
python
Level:
Intermediate

This course offers a thorough, hands-on overview of deep learning and its integration with Apache Spark. It covers the fundamentals of neural networks and how to build distributed TensorFlow models on top of Spark DataFrames. Throughout the class, you will use Keras, TensorFlow, Deep Learning Pipelines, and Horovod to build and tune models. The course is taught entirely in Python.

Objectives

Upon completion, students will be able to:
Build a neural network with Keras
Explain the difference between various activation functions and optimizers
Track experiments with MLflow
Apply models at scale with Deep Learning Pipelines
Perform transfer learning
Build distributed models with Horovod
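As a rough illustration of what "the difference between various activation functions" means, the sketch below applies three common activations to the same pre-activation value of a single neuron. It is plain Python rather than Keras, and the inputs, weights, and bias are hypothetical numbers chosen for the example. In Keras, these functions are available under the activation names "relu", "sigmoid", and "tanh".

```python
import math


def relu(z):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, z)


def sigmoid(z):
    """Logistic sigmoid: squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Hyperbolic tangent: squashes any real value into (-1, 1)."""
    return math.tanh(z)


def dense_unit(inputs, weights, bias, activation):
    """One neuron: weighted sum of the inputs plus a bias, then an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical inputs and parameters; the weighted sum here is negative.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.6]
bias = 0.1

for fn in (relu, sigmoid, tanh):
    print(fn.__name__, dense_unit(inputs, weights, bias, fn))
```

With a negative weighted sum, ReLU outputs zero while the sigmoid and tanh outputs stay nonzero, which is one reason the choice of activation affects how gradients flow during training.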

Audience

Primarily directed towards the practicing data scientist who is eager to get started with deep learning and its integration with Apache Spark

Prerequisites

Python (numpy and pandas)
Apache Spark™ for Machine Learning and Data Science or equivalent experience

Elena Verna

Growth Advisor (prev. SurveyMonkey, Malwarebytes)

Why, What, How, and When behind growth teams (Half day workshop)

Topics:
growth
metrics
organisation
success
Level:
General

This workshop covers why so many companies take the plunge and start a Growth team and invest in a Growth mindset. We will brainstorm whether your business knows what Growth means for you (acquisition, conversion, engagement, or retention?); it is all about the data. We will look at how to tell which metrics are doing well and which need improvement, what Growth means within marketing and product teams, and what a Growth team means for your organizational structure. We will also cover how to make a Growth team successful (leadership buy-in, data availability, alignment) and, most importantly, when it is the right time to start one (or not).

Piotr Findeisen

Co-founder at Starburst

Wojciech Biela

Co-founder at Starburst

Presto: SQL-on-Anything, hands-on workshop

Topics:
data engineering
BI
SQL
ETL
data warehouse
analytics
Level:
Beginner

Description

Presto has become the ubiquitous open source software for SQL on anything. Presto is heavily used by Facebook, Netflix, Airbnb, LinkedIn, Twitter, Uber, and many others for low-latency querying of large amounts of data, wherever it resides (Hadoop, AWS S3, Cassandra, Postgres, etc.). Presto was engineered from the ground up for fast interactive SQL analytics against disparate data sources ranging in size from GBs to PBs.
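The idea of one SQL query spanning separate data stores can be illustrated with a small analogy. The sketch below does not use Presto: it uses Python's built-in sqlite3 module, with two attached in-memory databases standing in for two independent sources. The table names, the `crm` alias, and the rows are hypothetical. In Presto, tables are addressed as catalog.schema.table, and a single query can join tables from different catalogs in a similar way.

```python
import sqlite3


def federated_totals():
    """Join tables that live in two separate SQLite databases with one query."""
    conn = sqlite3.connect(":memory:")  # first store, addressed as "main"
    try:
        # Second, independent in-memory database (hypothetical alias "crm").
        conn.execute("ATTACH DATABASE ':memory:' AS crm")

        conn.execute("CREATE TABLE main.orders (customer_id INTEGER, amount REAL)")
        conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")

        conn.executemany(
            "INSERT INTO main.orders VALUES (?, ?)",
            [(1, 20.0), (1, 5.0), (2, 12.5)],
        )
        conn.executemany(
            "INSERT INTO crm.customers VALUES (?, ?)",
            [(1, "Ada"), (2, "Grace")],
        )

        # One query that reads from both stores.
        return conn.execute(
            """
            SELECT c.name, SUM(o.amount) AS total
            FROM main.orders AS o
            JOIN crm.customers AS c ON c.id = o.customer_id
            GROUP BY c.name
            ORDER BY c.name
            """
        ).fetchall()
    finally:
        conn.close()


print(federated_totals())
```

The analogy is limited: SQLite runs in a single process over local files or memory, whereas Presto distributes query execution across a cluster and reaches external systems through connectors.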

Join Wojciech Biela for this full-day workshop to learn about Presto’s concepts and architecture and to explore its many use cases and best practices you can implement today. Learn how to set up and use Presto through various hands-on exercises (those who don’t want to participate in the exercises can follow along).

Target audience

Roles: data engineers, data architects, software engineers, and those in IT

Prerequisite knowledge

A basic understanding of SQL, databases, Hadoop, and distributed systems.
Basic command line (Bash) skills.

Materials or downloads needed in advance

A laptop with a browser.

Agenda

Rough outline of the training, including slides and labs (hands-on exercises):

    • Presto architecture and technical concepts
    • Lab 1 - Manual Presto deployment
    • Presto query execution
    • Presto Ecosystem, Connectors and Connectivity
    • Migrating from Hive
    • Administering Presto
    • Presto in cloud environments
    • Lab 2 - Query S3 Data using Presto
    • Lab 3 - Query PostgreSQL using Presto
    • Lab 4 - Query Federation using Presto
    • Instructor lab demonstrations:
      • Lab 5 - Using Presto w/ AWS Glue Data Catalog
      • Lab 6 - Scaling Presto on AWS
    • Lab 7 - Presto and BI tools (connecting from Superset)
    • Query Performance, Cost-Based Optimizer
    • Lab 8 - Cost-Based Optimizer in Action
    • Security in Presto
    • Joining the Presto community


András Fülöp

Solutions Architect at Datapao

Apache Spark™ Overview (Official Databricks Workshop)

Topics:
data-engineering
big data
open source
etl
sql
streaming
Level:
Beginner

This 1-day course is for data engineers, analysts, architects, data scientists, software engineers, IT operations staff, and technical managers interested in a brief hands-on overview of Apache Spark.

The course provides an introduction to the Spark architecture, some of the core APIs for using Spark, SQL and other high-level data access tools, as well as Spark’s streaming capabilities and machine learning APIs. The class is a mixture of lecture and hands-on labs.

Each topic includes lecture content along with hands-on labs in the Databricks notebook environment. Students may keep the notebooks and continue to use them with the free Databricks Community Edition offering after the class ends; all examples are guaranteed to run in that environment.
