[RAPIDS] Massively Accelerated Modern Data-Science with RAPIDS.ai

Time: Thursday 2-Jul-2020 16:00 (This is a past event.)

Discussion Facilitator:


Motivation / Abstract
Why should you attend this talk?  

Using RAPIDS and GPUs, users can see their data science models run 100x faster or more, with little to no code changes required.

- The RAPIDS suite of open-source software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. 

- Seamlessly scale from GPU workstations to multi-GPU servers and multi-node clusters with Dask.

- Accelerate your Python data science toolchain with minimal code changes and no new tools to learn.

- Increase machine learning model accuracy by iterating on models faster and deploying them more frequently. 

- Drastically improve your productivity with more interactive data science tools like XGBoost.

- RAPIDS is an open-source project. Supported by NVIDIA, it also relies on Numba, Apache Arrow, and many other open-source projects.
Questions Discussed
- Introduction to GPUs and how it is possible to get such incredible speedups with minimal code changes.
- Overview of popular RAPIDS tools such as cuDF (GPU-accelerated pandas) and cuML (GPU-accelerated scikit-learn).
- Guidance on how and where to get started.
Key Takeaways
- Understand the performance metrics of GPU parallel computing.
- RAPIDS performance on large-scale datasets.
- RAPIDS syntax is similar to pandas syntax, which makes it easy for data scientists to transition.
- GPU-accelerated capabilities in Spark 3.0.
- Resources to learn RAPIDS and Spark 3.0 
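
The takeaway about RAPIDS syntax resembling pandas syntax can be illustrated with a minimal sketch. It is not taken from the talk: the column names and values are made up, and the helper `mean_fare_by_city` is hypothetical. The sketch assumes that cuDF exposes the same `DataFrame`, `groupby`, and `mean` calls as pandas for this simple case, and it falls back to pandas when cuDF or a usable GPU is not available.

```python
# Sketch: the same dataframe code written once, run on cuDF (GPU) or pandas (CPU).
try:
    import cudf as xdf  # RAPIDS GPU dataframe library; needs an NVIDIA GPU
    backend = "cudf"
except Exception:
    # cuDF is not installed or cannot initialise a GPU: use pandas instead.
    import pandas as xdf
    backend = "pandas"


def mean_fare_by_city(frame_lib):
    """Hypothetical helper: average fare per city using pandas-style calls."""
    # Made-up illustrative data, not from the talk.
    df = frame_lib.DataFrame({
        "city": ["NYC", "SF", "NYC", "SF"],
        "fare": [10.0, 20.0, 30.0, 40.0],
    })
    means = df.groupby("city")["fare"].mean()
    # A cuDF Series is copied back to host memory before conversion to a dict.
    if hasattr(means, "to_pandas"):
        means = means.to_pandas()
    return means.to_dict()


result = mean_fare_by_city(xdf)
print(backend, result)
```

Only the import line differs between the two backends here. Real pipelines may hit pandas features that cuDF does not cover, so this should be read as an illustration of the transition path rather than a guarantee of full compatibility.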

Stream Categories:
 ML Engineering and Ops