Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning 1st Edition

4.2 out of 5 stars 82 ratings

Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP. Through the course of the book, you’ll work through a sample business decision by employing a variety of data science approaches.

Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.

You’ll learn how to:

  • Automate and schedule data ingest, using an App Engine application
  • Create and populate a dashboard in Google Data Studio
  • Build a real-time analysis pipeline to carry out streaming analytics
  • Conduct interactive data exploration with Google BigQuery
  • Create a Bayesian model on a Cloud Dataproc cluster
  • Build a logistic regression machine-learning model with Spark
  • Compute time-aggregate features with a Cloud Dataflow pipeline
  • Create a high-performing prediction model with TensorFlow
  • Use your deployed model as a microservice you can access from both batch and real-time pipelines

There is a newer edition of this item:


From the Publisher

From the Preface

In this book, we walk through an example of this new transformative, more collaborative way of doing data science. You will learn how to implement an end-to-end data pipeline: we will begin with ingesting the data in a serverless way and work our way through data exploration, dashboards, relational databases, and streaming data all the way to training and making operational a machine learning model. I cover all these aspects of data-based services because data engineers will be involved in designing the services, developing the statistical and machine learning models and implementing them in large-scale production and in real time.

Who This Book Is For

If you use computers to work with data, this book is for you. You might go by the title of data analyst, database administrator, data engineer, data scientist, or systems programmer today. Although your role might be narrower today (perhaps you do only data analysis, or only model building, or only DevOps), you want to stretch your wings a bit: you want to learn how to create data science models as well as how to implement them at scale in production systems.

Google Cloud Platform is designed to make you forget about infrastructure. The marquee data services (Google BigQuery, Cloud Dataflow, Cloud Pub/Sub, and Cloud ML Engine) are all serverless and autoscaling. When you submit a query to BigQuery, it is run on thousands of nodes, and you get your result back; you don’t spin up a cluster or install any software. Similarly, in Cloud Dataflow, when you submit a data pipeline, and in Cloud Machine Learning Engine, when you submit a machine learning job, you can process data at scale and train models at scale without worrying about cluster management or failure recovery. Cloud Pub/Sub is a global messaging service that autoscales to the throughput and number of subscribers and publishers without any work on your part. Even when you’re running open source software like Apache Spark that’s designed to operate on a cluster, Google Cloud Platform makes it easy. Leave your data on Google Cloud Storage, not in HDFS, and spin up a job-specific cluster to run the Spark job. After the job completes, you can safely delete the cluster. Because of this job-specific infrastructure, there’s no need to fear overprovisioning hardware or running out of capacity to run a job when you need it. Plus, data is encrypted, both at rest and in transit, and kept secure. As a data scientist, not having to manage infrastructure is incredibly liberating.

The reason that you can afford to forget about virtual machines and clusters when running on Google Cloud Platform comes down to networking. The network bisection bandwidth within a Google Cloud Platform datacenter is 1 PBps, and so sustained reads off Cloud Storage are extremely fast. What this means is that you don’t need to shard your data as you would with traditional MapReduce jobs. Instead, Google Cloud Platform can autoscale your compute jobs by shuffling the data onto new compute nodes as needed. Hence, you’re liberated from cluster management when doing data science on Google Cloud Platform.

These autoscaled, fully managed services make it easier to implement data science models at scale, which is why data scientists no longer need to hand off their models to data engineers. Instead, they can write a data science workload, submit it to the cloud, and have that workload executed automatically in an autoscaled manner. At the same time, data science packages are becoming simpler and simpler. So, it has become extremely easy for an engineer to slurp in data and use a canned model to get an initial (and often very good) model up and running. With well-designed packages and easy-to-consume APIs, you don’t need to know the esoteric details of data science algorithms, only what each algorithm does and how to link algorithms together to solve realistic problems. This convergence between data science and data engineering is why you can stretch your wings beyond your current role.

Rather than simply read this book cover-to-cover, I strongly encourage you to follow along with me by also trying out the code. The full source code for the end-to-end pipeline I build in this book is on GitHub. Create a Google Cloud Platform project and after reading each chapter, try to repeat what I did by referring to the code and to the Readme file in each folder of the GitHub repository.

Editorial Reviews

About the Author

Valliappa (Lak) Lakshmanan is currently a Tech Lead for Data and Machine Learning Professional Services for Google Cloud. His mission is to democratize machine learning so that it can be done by anyone anywhere using Google's amazing infrastructure, without deep knowledge of statistics or programming or ownership of a lot of hardware. Before Google, he led a team of data scientists at the Climate Corporation and was a Research Scientist at NOAA National Severe Storms Laboratory, working on machine learning applications for severe weather diagnosis and prediction.

Product details

  • Publisher: O'Reilly Media; 1st edition (January 16, 2018)
  • Language: English
  • Paperback: 404 pages
  • ISBN-10: 1491974567
  • ISBN-13: 978-1491974568
  • Item Weight: 1.42 pounds
  • Dimensions: 7 x 0.83 x 9.19 inches
  • Customer Reviews: 4.2 out of 5 stars, 82 ratings

About the author

Valliappa Lakshmanan

Lak is Head for Data Analytics and AI Solutions on Google Cloud. His team builds software solutions for business problems using Google Cloud's data analytics and machine learning products. He is the author of Machine Learning Design Patterns, Data Science on GCP (O'Reilly), and BigQuery the Definitive Guide (O'Reilly). He founded Google's Advanced Solutions Lab ML Immersion program. Before Google, Lak was a Director of Data Science at Climate Corporation and a Research Scientist at NOAA. He's the original author of several Coursera specializations including Machine Learning on GCP, Advanced Machine Learning on GCP, and Data Engineering.

Follow him on Twitter at @lak_luster.

http://www.vlakshman.com/

Customer reviews


Top reviews from the United States

  • Reviewed in the United States on November 29, 2019
    I knew this book was for me just a few pages into the first chapter. This book by Lak is unlike many other books on data science and a particular technology that just enumerate the how-to's of that technology. Lak starts with a concrete user problem strongly anchored in probabilistic outcomes, and then steps through a typical data science process of discovery, refinement, and then converting to a production pipeline. While teaching about GCP technologies along the way, the book stays strongly anchored in the original user problem. There is not a corner of GCP that is needed for a full production data science product that goes untouched in this book. The material is well covered, with pointers to deeper material and user manuals.

    I received the first edition. As GCP technology evolved, Lak was posting updates to his blog on Medium so that everyone could understand the updates to GCP and how to use them. I was pleasantly surprised by these updates, and they made having the book that much more valuable.
    2 people found this helpful
  • Reviewed in the United States on January 16, 2018
    Wow. A true tour of data science and engineering on the cloud.

    It's been a few years since I've worked with tools in this field, but this book was a clear level-headed view for data engineers looking to derive and drive insights from data. Using a core example use case and following it end to end through the entire book (and indeed cloud tools integrated with each other) helped me keep track of what was going on, and kept things from becoming a book on theory rather than one of accomplishment and answers. The purpose and process for each tool was clear, and I also appreciated the explanations of trade-offs and the value added for the choices made. The practice of data science is a LOT easier now with cloud/serverless tools than eight or nine years ago, and I feel this brought me back to the state of the art.
    5.0 out of 5 stars
    Data analysis and engineering is democratized for all
    15 people found this helpful
  • Reviewed in the United States on August 2, 2019
    While Lak’s conversational style can be a turn-off to some who just want an answer and don’t care about how, I liked this book. Many times with books like this you get an answer or a recipe and you’re done. What happens when your answer or recipe isn’t right for the situation? I’m glad Lak explains his rationale and lets it be known that there’s more than one way to do it. Could the book have been condensed without the explanations? Yes. Would it have been like almost every other book in the space? Yes. Check out this book if you want a well-thought-out answer and maybe alternatives. If you just want the “right answer”, then buy something else.
    6 people found this helpful
  • Reviewed in the United States on January 29, 2021
    I do not understand the high reviews for this book, especially ones written in 2020. I'm only into chapter 2 and the code to download the files fails. There is a supplement on the github page that allowed me to copy the bucket. But, the explanation, like many things is vague and not accurate (you don't provide the path to your bucket, but just the name of the bucket). I assumed this book was an introduction to using the Google Cloud Platform for data science. So I am expecting an introduction. This book has detail where it doesn't need it, and lacks detail where it does. It just assumes you have already been using GCP, but if that were the case this book isn't really needed then.

    Major Problems:
    1. Code is not working.
    2. Code is not explained in any detail.
    3. Vague details about how to navigate GCP (chapter one has you create a bucket, but doesn't explain what a bucket is, and how to create it, yet there are three pages about the definition of a data engineer).
    4. Inconsistent assumptions about your background knowledge.

    Good parts:
    1. The use of a case study for learning.
    6 people found this helpful
  • Reviewed in the United States on June 11, 2019
    The book is easy to follow with detailed descriptions of each step followed to build a project from start to end on the Google Cloud Platform.

    The book is also accompanied by a code repository which lets the readers try out the project themselves.

    Strongly recommended for data scientists learning to use the platform.
    2 people found this helpful
  • Reviewed in the United States on January 7, 2020
    Narrative structure in a technical book is hard to find, and this was executed masterfully, with lots of code examples for you to follow along with on your own. Highly recommended.
    2 people found this helpful
  • Reviewed in the United States on May 21, 2020
    Excellent book for learning which GCP services can be used for what portions of data analytic pipelines. From data acquisition all the way to model revalidation.
  • Reviewed in the United States on May 5, 2021
    This product is more akin to a course than a reference book. I tried flipping over to the chapter on Cloud SQL (actually the author only goes into BigQuery, so I ended up scrolling through Stack Overflow anyway). When I finally found the relevant chapter, it was impossible to disentangle the SQL code from the class objects built in the preceding 6 chapters. Do not buy this book if you have any intention other than reading every single page in order. Otherwise, you'll end up doing what I did, which is reading Stack Overflow and Medium articles to mixed effect.

Top reviews from other countries

  • KMoreno8
    5.0 out of 5 stars For anyone who wants to get started with or learn about the subject with GCP
    Reviewed in Mexico on May 10, 2022
    Anyone who works in the data field will potentially use Google Cloud. This book gives you a good foundation for it.
  • 宗教好き
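    5.0 out of 5 stars Well suited to learning the essentials
    Reviewed in Japan on July 6, 2018
    A good product. It is in English, but it is written so that you can learn all the essentials.
    I am someone planning to do data science on Google Cloud, and it served as a good introductory book.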
  • Chandra Shekhar Singh
    4.0 out of 5 stars Get to the heart of data science straight away 😊👍
    Reviewed in India on October 10, 2019
    It sets a very clear direction for aspiring data engineers and scientists, as well as what is expected of them.
  • Amazon Customer
    5.0 out of 5 stars Great
    Reviewed in Canada on November 21, 2018
    Great
  • Wolfgang Giersche
    5.0 out of 5 stars Very knowledgeable author. Balanced and informative reasoning
    Reviewed in Germany on May 17, 2018
    Very knowledgeable author. Balanced and at times beautiful reasoning, presented in an understandable way. Definitely a must-read for Google Cloud practitioners.