This action will force synchronization from lcx666/WebScraping_AmazonReviews, which will overwrite any changes that you have made since you forked the repository, and can not be recovered!!!
Synchronous operation will process in the background and will refresh the page when finishing processing. Please be patient.
As a self-proclaimed tech-enthusiast, I've been following the tech review community, especially on YouTube, for quite a while. During that time, I recognized a certain pattern emerge after every new iPhone release: highly popular videos (as well as articles) would be released criticizing initial problems with the new iPhone.
However, Apple's sales numbers do not seem to be impacted by this negative atmosphere around its release date. This made me wonder who is most impacted by these videos and articles and whether or not they affect Apple's customer satisfaction. Should these reviews, in fact, impact Apple's customers, a more loosely defined release schedule, giving Apple more time to refine new features of the iPhone, might increase Apple's customer satisfaction.
I chose to scrape amazon.co.uk for several reasons. Firstly, Europe, besides China, is arguably the most important foreign market for the iPhone. Thus, customer reviews from the UK present relevant information for Apple. Secondly, Amazon enabled me to not only gather information regarding the reviews themselves (Rating, Title, Text, Helpful Votes) but also regarding the Amazon user that published the review. By clicking on the username, I was able to collect information about each user, such as the total number of helpful votes and reviews as well as all published reviews.
In conducting my research, I chose to exclusively focus on reviews for the iPhone X from November 2017 to September 2018 to only gather reviews for the newest iPhone at that point in time. Future research could apply the same concept to older iPhones.
I used Selenium to scrape amazon.co.uk, mainly because of its flexibility in navigating between different websites.
After having scraped the data, I prepared and cleaned the data, mostly using Pandas, NumPy, and RE. This step included identifying and appropriately dealing with missing values, adapting my code, and reformatting the gathered data to enable further processing and analysis.
Then, I manipulated and analyzed the data, breaking it down into subgroups and comparing their characteristics.
Finally, I visualized the results of my analysis with matplotlib, seaborn, and wordcloud.
此处可能存在不合适展示的内容,页面不予展示。您可通过相关编辑功能自查并修改。
如您确认内容无涉及 不当用语 / 纯广告导流 / 暴力 / 低俗色情 / 侵权 / 盗版 / 虚假 / 无价值内容或违法国家有关法律法规的内容,可点击提交进行申诉,我们将尽快为您处理。