Mastering Large Datasets with Python: Parallelize and Distribute Your Python Code, by John T. Wolohan. Published by Manning. English | 296 pages | Formats: true PDF, EPUB, MOBI, code files | ISBN: 978-1617296239
Description of Mastering Large Datasets with Python
With an emphasis on clarity, style, and performance, author J.T. Wolohan guides you through implementing a functionally influenced approach to Python coding. You’ll get familiar with Python’s functional tools, including the functools, operator, and itertools modules, as well as the toolz library.
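As a rough illustration of the functional style these standard-library modules support (this is not an excerpt from the book, and the sample data and variable names are invented for the example), a nested collection can be flattened and folded into a single value like this:

```python
from functools import reduce
from itertools import chain
import operator

# Hypothetical sample data: order totals grouped by day.
daily_orders = [[12.5, 3.0], [7.25], [], [20.0, 1.25]]

# itertools.chain.from_iterable lazily flattens the nested lists
# into one stream of numbers without building an intermediate list.
all_orders = chain.from_iterable(daily_orders)

# functools.reduce folds the stream into one value; operator.add
# is the function form of the + operator. 0.0 is the starting value.
total = reduce(operator.add, all_orders, 0.0)

print(total)
```

Because each step consumes an iterator, the same pattern works whether the input is a short list or a stream too large to hold in memory.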
Mastering Large Datasets teaches you to write readable, scalable Python code that can efficiently process large volumes of structured and unstructured data. By the end of this guide, you’ll have a solid grasp of the tools and methods that will take your code beyond the laptop and your data science career to the next level.
Python is a data scientist’s dream-come-true, thanks to readily available libraries that support tasks like data analysis, machine learning, visualization, and numerical computing.
What’s more, Python’s high-level nature makes for easy-to-read, concise code, which means speedy development and easy maintenance, both valuable benefits in the multi-user development environments so prevalent in big data analysis. Python also suits this kind of work through features like its map and reduce functions.
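A minimal sketch of the map and reduce pattern, assuming a small invented list of sentences. Note that in Python 3, map is a built-in while reduce lives in the functools module. The helper name count_tokens is hypothetical.

```python
from collections import Counter
from functools import reduce
import operator

# Hypothetical input: a few short "documents".
sentences = [
    "big data is big",
    "python handles big data",
    "map then reduce",
]

def count_tokens(sentence):
    """Map step: turn one sentence into a Counter of its words."""
    return Counter(sentence.split())

# map applies count_tokens to each sentence lazily.
per_sentence = map(count_tokens, sentences)

# Reduce step: adding two Counters merges their counts, so folding
# with operator.add yields word frequencies over all sentences.
word_freq = reduce(operator.add, per_sentence, Counter())

print(word_freq.most_common(2))
```

The map step treats every sentence independently, which is what makes this pattern straightforward to parallelize later.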
What you will learn
- An introduction to functional and parallel programming
- Data science workflow
- Profiling code for better performance
- Python multiprocessing
- Practical exercises including full-scale distributed applications
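
To give a flavor of the multiprocessing topic listed above, here is a hedged sketch using the standard library's multiprocessing.Pool. It is not taken from the book; the function names word_count and total_words and the sample documents are assumptions made for illustration.

```python
from multiprocessing import Pool

def word_count(text):
    """Count whitespace-separated words in one document."""
    return len(text.split())

def total_words(texts, processes=2):
    """Count words across documents using a pool of worker processes."""
    # Pool.map distributes the inputs across worker processes and
    # returns the results in the original input order.
    with Pool(processes=processes) as pool:
        counts = pool.map(word_count, texts)
    # The partial results are combined in the parent process.
    return sum(counts)

# The guard keeps worker processes from re-running this section on
# platforms that start workers by importing the main module.
if __name__ == "__main__":
    docs = ["one two three", "four five", "six"]
    print(total_words(docs))
```

For inputs this small, the cost of starting processes outweighs any gain; the approach pays off when each call to the mapped function does substantial work.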