Book review datasets.

Jul 22, 2025 · The GoodReads datasets are a foundational resource for building and evaluating book recommendation systems. They combine explicit ratings, implicit feedback (like user shelves), rich textual reviews, and detailed metadata, making them ideal for hybrid models that mix collaborative filtering with NLP.

GoodReads: These datasets contain reviews from the Goodreads book review website, and a variety of attributes describing the items. A distinguishing feature is their capture of multiple tiers of user interaction, ranging from adding a book to a "shelf" to rating and reading it. We collected three groups of datasets: (1) metadata of the books, (2) user-book interactions (users' public shelves), and (3) users' detailed book reviews. These datasets can be merged together by joining on book/user/review ids.

Basic Statistics of the Complete Book Graph: 2,360,655 books (1,521,962 works, 400,390 book series, 829,529 authors); 876,145 users; 228,648,342 user-book interactions.

About Dataset: The Goodreads Book Reviews dataset encapsulates a wealth of reviews and various attributes concerning the books listed on the Goodreads platform. The Goodreads Datasets contain three primary components: book metadata, user-book interactions, and book reviews. The book metadata includes details on 2,360,655 books, such as title, author, and publication date. The dataset also covers a vast collection of user-generated textual reviews, offering insights into reader sentiments and opinions. While the datasets vary in scope and format, they enable research into topics including social influence and genre.

It took me quite a lot of internet sleuthing to find an interesting, complete and large dataset to practice machine learning and more specifically recommender systems. Just thought I'd share this Goodreads dataset here. It contains detailed metadata information for 10 000 books (sorry about the typo in the title). This data was originally pulled from Goodreads in 2017 by Zygmunt Zając. This dataset contains six million ratings for the ten thousand most popular books (those with the most ratings). There are also books marked to read by the users, book metadata (author, year, etc.), and tags/shelves/genres.

Other Goodreads sources: data on 1M+ reviews from 13K+ books, collected in late 2017 (CSV, multiple tables); a free download of a sample data set for Goodreads Book Reviews; and Goodreads-books reviews and descriptions of each book, a dataset that will be updated every 2 days.

Amazon Reviews 2023 (Books Only): This is a subset of the Amazon Reviews 2023 dataset. Amazon Reviews 2023 is a large-scale dataset, collected in 2023 by McAuley Lab, and it includes rich features such as: User Reviews (ratings, text, helpfulness votes, etc.); Item Metadata (descriptions, price, raw image, etc.); Links (user-item / bought together graphs). What's New? In Amazon Reviews'23, we provide: Larger Dataset: We collected 571.54M reviews, 245.2% larger than the last version. Please visit amazon-reviews-2023.github.io/ for more details, loading scripts, and preprocessed benchmark files. [April 18, 2024] Update: This dataset was created and pushed for the first time.

A GitHub dataset of the most reviewed and best-selling books on Amazon. This Amazon dataset contains more than 190,000 best-selling books. Each book title in this dataset has gained 10,000 reader reviews or more, making these the most popular books available.

Oct 21, 2025 · A comprehensive Amazon books dataset featuring 20,000 books and 727,876 reviews spanning 26 years (1997-2023), paired with a complete step-by-step data science tutorial.

Nov 29, 2025 · Data sourced from the Google Books API. All information is publicly available book metadata. 📊 Updates: This is a one-time snapshot from November 2024. For the most current book data, refer to the Google Books API directly. 🔗 Related Datasets: Consider combining with Goodreads ratings datasets, Amazon book reviews, library catalogs, NYT

Built a content-based book recommendation engine using Python and Natural Language Processing (TF-IDF) to analyze and suggest titles from a dataset of 1,000+ books.

The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is now complete and will remain available online for free.
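The note that the three Goodreads groups can be merged by joining on book/user/review ids can be illustrated with a small sketch. This is not the datasets' official loading code: the field names (`book_id`, `user_id`, `review_id`, `is_read`, `rating`, `review_text`), the sample records, and the helper names `merge_tables` and `interaction_tier` are all assumptions made for illustration, and the sketch assumes at most one review per user-book pair.

```python
# Hypothetical sample records; field names and values are illustrative only
# and are not taken from the actual dataset files.
books = [
    {"book_id": "b1", "title": "Example Book One"},
    {"book_id": "b2", "title": "Example Book Two"},
]
interactions = [
    {"user_id": "u1", "book_id": "b1", "is_read": True, "rating": 4},
    {"user_id": "u1", "book_id": "b2", "is_read": False, "rating": 0},
    {"user_id": "u2", "book_id": "b1", "is_read": True, "rating": 5},
]
reviews = [
    {"review_id": "r1", "user_id": "u2", "book_id": "b1", "review_text": "Great."},
]


def merge_tables(books, interactions, reviews):
    """Left-join each interaction with book metadata and any matching review."""
    # Index the lookup tables by their join keys.
    title_by_book = {b["book_id"]: b["title"] for b in books}
    # Assumption: at most one review per (user, book) pair.
    review_by_pair = {(r["user_id"], r["book_id"]): r for r in reviews}

    merged = []
    for row in interactions:
        review = review_by_pair.get((row["user_id"], row["book_id"]))
        merged.append({
            **row,
            "title": title_by_book.get(row["book_id"]),
            "review_id": review["review_id"] if review else None,
            "review_text": review["review_text"] if review else None,
        })
    return merged


def interaction_tier(row):
    """Label the deepest interaction tier present in a merged row."""
    if row["review_text"] is not None:
        return "reviewed"
    if row["rating"] > 0:
        return "rated"
    if row["is_read"]:
        return "read"
    return "shelved"


merged = merge_tables(books, interactions, reviews)
for row in merged:
    print(row["user_id"], row["title"], interaction_tier(row))
```

Keeping the join as a left join on the interactions table preserves shelf-only rows, which is where the implicit-feedback signal lives; an inner join against reviews would discard them.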