Transparency, auditability, and stability of predictive models and results are typically key differentiators in effective machine learning applications. Patrick will share tips and techniques learned through implementing interpretable machine learning solutions in industries like financial services, telecom, and health insurance. Using a set of publicly available and highly annotated examples, he teaches several holistic approaches to interpretable machine learning. The examples use the well-known University of California Irvine (UCI) credit card dataset and popular open source packages to train constrained, interpretable machine learning models and to visualize, explain, and test more complex machine learning models in the context of an example credit-risk application. Along the way, Patrick draws on his applied experience to highlight crucial success factors and common pitfalls not typically discussed in blog posts and open source software documentation, such as the importance of both local and global explanation and the approximate nature of nearly all machine learning explanation techniques.
Who is this presentation for?
Researchers, scientists, data analysts, predictive modelers, business users, and anyone else who uses or consumes machine learning techniques
Prerequisite knowledge
A working knowledge of Python, widely used linear modeling approaches, and machine learning algorithms
Materials or downloads needed in advance
A laptop with a recent version of the Firefox or Chrome browser installed. (This tutorial will use an Aquarium environment.) As a backup, tutorial materials are available on GitHub: https://github.com/jphall663/interpretable_machine_learning_with_python
What you'll learn
The audience will learn several practical machine learning interpretability techniques and how to use them with Python, along with best practices for applying these techniques and common pitfalls to avoid.
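To give a flavor of the local-versus-global distinction mentioned above, here is a minimal sketch (not the tutorial's own code) using scikit-learn on synthetic data; the tutorial itself works with the UCI credit card dataset. Permutation importance stands in for a global explanation, and a LIME-style local linear surrogate stands in for a local one. Both are, as the description notes, approximations.

```python
# A minimal sketch of global vs. local explanation using scikit-learn on
# synthetic data (illustrative only; not taken from the tutorial materials).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Global explanation: permutation importance ranks features by their
# average effect on model performance across the whole dataset.
global_imp = permutation_importance(model, X, y, n_repeats=5, random_state=0)

# Local explanation (LIME-style sketch): fit a simple linear surrogate to
# the model's predicted probabilities in a small neighborhood of one row.
# The surrogate's coefficients approximate the model's behavior near that
# row only -- they can disagree with the global ranking.
row = X[0]
neighborhood = row + np.random.RandomState(0).normal(scale=0.1, size=(200, 5))
surrogate = LinearRegression().fit(
    neighborhood, model.predict_proba(neighborhood)[:, 1])

print(global_imp.importances_mean.round(3))  # one global score per feature
print(surrogate.coef_.round(3))              # one local coefficient per feature
```

A useful habit the tutorial emphasizes: check both views, since a feature that matters globally may be irrelevant to an individual prediction, and vice versa.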
This course offers a thorough, hands-on overview of deep learning and its integration with Apache Spark.
This course covers the fundamentals of neural networks and how to build distributed TensorFlow models on top of Spark DataFrames. Throughout the class, you will use Keras, TensorFlow, Deep Learning Pipelines, and Horovod to build and tune models. This course is taught entirely in Python.
Objectives
Upon completion, students will be able to:
Build a neural network with Keras
Explain the difference between various activation functions and optimizers
Track experiments with MLflow
Apply models at scale with Deep Learning Pipelines
Perform transfer learning
Build distributed models with Horovod
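As a taste of the activation-function comparison in the objectives above, here is a minimal NumPy sketch (not course material; in Keras these are available as the built-in "relu", "sigmoid", and "tanh" activations on any layer):

```python
# A minimal NumPy sketch of three common activation functions.
import numpy as np

def relu(x):
    # Zeroes out negatives; a common default for hidden layers.
    return np.maximum(0.0, x)

def sigmoid(x):
    # Squashes values into (0, 1); typical for binary-classification outputs.
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
print(relu(x))      # [0. 0. 2.] -- negatives clipped to zero
print(sigmoid(x))   # values squashed into (0, 1)
print(np.tanh(x))   # values squashed into (-1, 1)
```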
Audience
Primarily directed towards the practicing data scientist who is eager to get started with deep learning and its integration with Apache Spark
Prerequisites
Python (numpy and pandas)
Apache Spark™ for Machine Learning and Data Science or equivalent experience
In this workshop, we will cover why so many companies decide to take the plunge, start a Growth team, and invest in a Growth mindset. Let's brainstorm whether your business knows what Growth means for you (acquisition, conversion, engagement, retention?): it is all about the data! How do you tell which metrics are doing well and which need improvement? What does Growth mean within marketing and product teams? What does a Growth team mean for your organizational structure? How do you ensure a Growth team is successful: leadership buy-in, data availability, and alignment. And, most importantly, when is the right time to start one (or not)?
Description
Presto has become the ubiquitous open source software for SQL on anything. Presto is heavily used by Facebook, Netflix, Airbnb, LinkedIn, Twitter, Uber, and many others for low-latency querying of large amounts of data, wherever it resides (Hadoop, AWS S3, Cassandra, Postgres, etc.). Presto was engineered from the ground up for fast, interactive SQL analytics against disparate data sources ranging in size from GBs to PBs.
Join Wojciech Biela for this full-day workshop to learn about Presto’s concepts and architecture and to explore its many use cases and the best practices you can implement today. Learn how to set up and use Presto through various hands-on exercises (those who don’t want to participate in the exercises can follow along).
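To illustrate the kind of interactive, ANSI-SQL analytics the workshop covers, here is a purely illustrative query, executed against Python's built-in SQLite (not Presto) so the snippet is self-contained; the table and column names are made up. In Presto, the same statement could target Hive, S3, Cassandra, or Postgres tables, wherever the data resides.

```python
# Illustrative only: a standard SQL aggregation of the sort you would run
# interactively in Presto, executed here against in-memory SQLite so the
# snippet needs no server. Table/column names are hypothetical.
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE orders (region TEXT, amount REAL);
    INSERT INTO orders VALUES ('EU', 120.0), ('EU', 80.0), ('US', 300.0);
""")
rows = conn.execute("""
    SELECT region, SUM(amount) AS total
    FROM orders
    GROUP BY region
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('US', 300.0), ('EU', 200.0)]
```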
Target audience
Roles: data engineers, data architects, software engineers, and those in IT
Prerequisite knowledge
A basic understanding of SQL, databases, Hadoop, and distributed systems.
Basic command line (Bash) skills.
Materials or downloads needed in advance
A laptop with a browser.
Agenda
Rough outline of the training, including slides and labs (hands-on exercises):
This 1-day course is for data engineers, analysts, architects, data scientists, software engineers, IT operations staff, and technical managers interested in a brief hands-on overview of Apache Spark.
The course provides an introduction to the Spark architecture, some of the core APIs for using Spark, SQL and other high-level data access tools, as well as Spark’s streaming capabilities and machine learning APIs. The class is a mixture of lecture and hands-on labs.
Each topic includes lecture content along with hands-on labs in the Databricks notebook environment. Students may keep the notebooks and continue to use them with the free Databricks Community Edition offering after the class ends; all examples are guaranteed to run in that environment.