Time: Thursday 2-Jul-2020 16:00 (This is a past event.)
Motivation / Abstract
Why should you attend this talk? Using RAPIDS and GPUs, users can see their data science models run 100x faster or more, with little to no code changes required.
- The RAPIDS suite of open-source software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs.
- Seamlessly scale from GPU workstations to multi-GPU servers and multi-node clusters with Dask.
- Accelerate your Python data science toolchain with minimal code changes and no new tools to learn.
- Increase machine learning model accuracy by iterating on models faster and deploying them more frequently.
- Drastically improve your productivity with more interactive data science tools like XGBoost.
- RAPIDS is an open-source project. Supported by NVIDIA, it also relies on Numba, Apache Arrow, and many more open-source projects.
- Introduction to GPUs and how it is possible to get such large speedups with minimal code changes.
- Overview of popular RAPIDS tools such as the GPU-accelerated counterparts to pandas (cuDF) and scikit-learn (cuML).
- Guidance on how and where to get started.
- Understand GPU parallel-computing performance metrics.
- RAPIDS performance on large-scale datasets.
- RAPIDS syntax is similar to pandas syntax, which makes it easy for data scientists to transition.
- GPU-accelerated capabilities in Spark 3.0.
- Resources to learn RAPIDS and Spark 3.0.
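As a rough illustration of the syntax similarity mentioned above, the sketch below runs the same group-by aggregation through either cuDF or pandas. It is not taken from the talk: the function name `total_sales_by_city`, the column names, and the values are made up for illustration, and it assumes cuDF is installed on a machine with a supported NVIDIA GPU (otherwise it falls back to pandas).

```python
import pandas as pd

try:
    # cuDF is the RAPIDS GPU DataFrame library; importing it needs a working GPU setup.
    import cudf as xdf
except Exception:
    # Assumption for this sketch: fall back to pandas when cuDF cannot be loaded.
    xdf = pd


def total_sales_by_city(frame_lib):
    """Build a small DataFrame with the given library and sum sales per city."""
    df = frame_lib.DataFrame(
        {
            "city": ["A", "B", "A", "C", "B"],
            "sales": [10.0, 20.0, 30.0, 40.0, 50.0],
        }
    )
    # The same groupby/sum/sort_index calls are used for both pandas and cuDF.
    totals = df.groupby("city")["sales"].sum().sort_index()
    # cuDF objects live in GPU memory; copy back to pandas before converting to a dict.
    if hasattr(totals, "to_pandas"):
        totals = totals.to_pandas()
    return totals.to_dict()


if __name__ == "__main__":
    print(total_sales_by_city(xdf))
```

Only the import changes between the two libraries here; the DataFrame code itself is identical. Real workloads may still need adjustments wherever cuDF does not cover a pandas feature.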