Easy way to perform Reddit scraping with Python.



Below are several use cases showing how to scrape Reddit with the Python Reddit API Wrapper (PRAW), a Python library built for that purpose. PRAW wraps the Reddit API and lets you scrape data from subreddits, build bots, and much more.




IMPORTANT: don't forget to install PRAW with pip install praw or conda install -c conda-forge praw, depending on your environment.
Before you start, you need a Reddit account and a "script" application, which you create at https://www.reddit.com/prefs/apps. After that you are ready to run the Python scripts below; don't forget to fill in your own credentials (client ID, client secret, and a user agent string).
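Rather than pasting credentials into every script, one option is to read them from environment variables. The sketch below is only an illustration: the variable names (REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, REDDIT_USER_AGENT) and the helper load_reddit_credentials are hypothetical and not defined by PRAW, so rename them as you see fit.

```python
import os

# Hypothetical environment variable names, chosen for this example only.
REQUIRED_VARS = {
    "client_id": "REDDIT_CLIENT_ID",
    "client_secret": "REDDIT_CLIENT_SECRET",
    "user_agent": "REDDIT_USER_AGENT",
}


def load_reddit_credentials(env=None):
    """Return keyword arguments for praw.Reddit, read from the environment."""
    env = os.environ if env is None else env
    # Collect every variable that is unset or empty so the error lists them all.
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in REQUIRED_VARS.items()}


# Usage (requires praw to be installed and the variables above to be set):
#   import praw
#   reddit = praw.Reddit(**load_reddit_credentials())
```

Keeping the secret out of the source file also makes it harder to publish it by accident when sharing the script.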


Scrape a Reddit post and its comments:



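import praw

# Read-only client: no username or password is needed just to read posts.
reddit = praw.Reddit(
    user_agent="your agent name",
    client_id="your client ID",
    client_secret="your app secret",
)

url = "https://www.reddit.com/r/dadjokes/comments/sm6ikx/a_young_woman_was_standing_outside_her_car/"

post = reddit.submission(url=url)
print(post.title)
print(post.selftext)

# Drop the "load more comments" placeholders, which have no body attribute.
post.comments.replace_more(limit=0)

print(len(post.comments))  # number of top-level comments
for comment in post.comments:
    print(comment.body)

OUT: the post title, its text, the number of top-level comments, and the body of each top-level comment.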



Scrape fresh subreddit posts into a text file:



import praw
from datetime import datetime, timedelta, timezone

reddit = praw.Reddit(
    user_agent="your agent name",
    client_id="your client ID",
    client_secret="your app secret",
)

subreddit = reddit.subreddit("HistoryMemes")

posts24h = []

# UTF-8 so that titles with emoji or non-Latin characters are written safely.
with open("postoutput.txt", "w", encoding="utf-8") as file:
    # new() yields the newest posts first (100 by default).
    for post in subreddit.new():
        current_time = datetime.now(timezone.utc)
        # created_utc is the post's creation time as a Unix timestamp in UTC.
        post_time = datetime.fromtimestamp(post.created_utc, tz=timezone.utc)

        delta_time = current_time - post_time
        if delta_time <= timedelta(hours=24):
            posts24h.append((post.title, post.selftext, post_time))
            file.write(f"{post.title}\n{post.selftext}\n\n")

print("Subreddit posts for last 24H are saved to text file !!!")

OUT: Subreddit posts for last 24H are saved to text file !!!
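The 24-hour check can also be pulled out into a small helper, which makes it easy to test without touching Reddit at all. This is a minimal sketch: is_recent is a hypothetical name, and it assumes the timestamp is a Unix time in UTC seconds, which is what a PRAW submission's created_utc attribute holds.

```python
from datetime import datetime, timedelta, timezone


def is_recent(created_utc, hours=24, now=None):
    """Return True if a Unix timestamp (seconds, UTC) falls within the last `hours`."""
    now = datetime.now(timezone.utc) if now is None else now
    created = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    # Timestamps in the future are rejected as well.
    return timedelta(0) <= now - created <= timedelta(hours=hours)


# Example with a fixed "now" so the result is reproducible.
fixed_now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
ten_hours_ago = (fixed_now - timedelta(hours=10)).timestamp()
two_days_ago = (fixed_now - timedelta(days=2)).timestamp()

print(is_recent(ten_hours_ago, now=fixed_now))  # True
print(is_recent(two_days_ago, now=fixed_now))   # False
```

Inside the loop it would be called as is_recent(post.created_utc), or with hours=48 for a wider window.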



Submit a new post to a subreddit:



import praw

# Submitting requires an authenticated client, hence username and password.
reddit = praw.Reddit(
    user_agent="your agent name",
    client_id="YOUR REDDIT APP ID",
    client_secret="YOUR REDDIT APP SECRET",
    username="YOUR REDDIT USERNAME",
    password="YOUR REDDIT ACCOUNT PASSWORD",
)
# validate_on_submit is a setting of the Reddit instance, not of the subreddit.
reddit.validate_on_submit = True

subreddit = reddit.subreddit("WeirdJokes")

title = "It should be allowed )"
content = """
People who make sound while eating food must be slapped without asking why.
"""

subreddit.submit(title=title, selftext=content)

print("New post was submitted to selected Subreddit !!!")

OUT: New post was submitted to selected Subreddit !!!


Bot that replies to comments containing a certain word, under new subreddit posts whose title contains it:



import praw
from datetime import datetime, timedelta, timezone

# Replying requires an authenticated client, hence username and password.
reddit = praw.Reddit(
    user_agent="your agent name",
    client_id="YOUR REDDIT APP ID",
    client_secret="YOUR REDDIT APP SECRET",
    username="YOUR REDDIT USERNAME",
    password="YOUR REDDIT ACCOUNT PASSWORD",
)

subreddit = reddit.subreddit("Joker")

for post in subreddit.new():
    current_time = datetime.now(timezone.utc)
    post_time = datetime.fromtimestamp(post.created_utc, tz=timezone.utc)
    delta_time = current_time - post_time
    if delta_time <= timedelta(hours=48) and "joke" in post.title.lower():
        # To reply to the post itself instead: post.reply("Here's another joke !")
        # Drop the "load more comments" placeholders, which have no body attribute.
        post.comments.replace_more(limit=0)
        for comment in post.comments:
            if "joke" in comment.body.lower():
                comment.reply("Here's another joke !")

print("Replies are provided when needed !!!")

OUT: Replies are provided when needed !!!
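Note that a plain substring test such as "joke" in text also matches "Joker", which is easy to trip over in a subreddit with that name. One possible refinement, shown here as a sketch with a hypothetical helper mentions_word, is to match whole words only using the standard re module.

```python
import re


def mentions_word(text, word):
    """Return True if `word` appears in `text` as a whole word, ignoring case."""
    # \b marks a word boundary; re.escape keeps special characters literal.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(mentions_word("Tell me a JOKE please", "joke"))  # True
print(mentions_word("The Joker returns", "joke"))      # False
```

With this stricter test, "jokes" no longer counts as a match either, so whether it is an improvement depends on what the bot is meant to catch.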




