Hi! This is my first shot at anything like this. I had never coded much before completing the Python course here, just some basic MATLAB and a couple of incomplete Codecademy courses. It’s definitely not perfect, but it’s in a relatively complete state right now, so I’d like to share it. My project scrapes the 3200 most recent tweets from any Twitter user’s profile (as long as it is public), or the 3200 most recent tweets matching any search term or hashtag you enter. It then feeds these tweets into Codecademy’s Markov Chain generator function to generate a new tweet 20 words long! Have a look here:
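For anyone curious how a word-level Markov chain turns old tweets into a new one, here is a minimal sketch of the idea. It is not Codecademy's generator function and not the code from this project; the names `build_chain` and `generate` and the sample tweets are made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(words):
    """Map each word to the list of words that follow it in the source text."""
    chain = defaultdict(list)
    for current_word, next_word in zip(words, words[1:]):
        chain[current_word].append(next_word)
    return chain


def generate(words, length=20, seed=None):
    """Random-walk the chain to produce a string of `length` words."""
    if not words:
        return ""
    rng = random.Random(seed)
    chain = build_chain(words)
    word = rng.choice(words)
    output = [word]
    while len(output) < length:
        followers = chain.get(word)
        if followers:
            # Pick one of the words that followed the current word in the source.
            word = rng.choice(followers)
        else:
            # Dead end (the word only appears last): restart from a random word.
            word = rng.choice(words)
        output.append(word)
    return " ".join(output)


# Hypothetical stand-in for the scraped tweets.
sample_tweets = [
    "just finished my first python project",
    "my first project uses a markov chain",
    "a markov chain picks the next word at random",
]
# Lowercase and split on whitespace to get one flat list of words.
words = " ".join(sample_tweets).lower().split()
print(generate(words, length=20))
```

Because each step only looks at the current word, the output tends to be locally plausible but globally nonsensical, which is most of the fun.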
A couple of caveats: since this uses the Tweepy library to access Twitter’s APIs, you have to have Tweepy installed in order to run it. The bigger one, though, is that you also need API keys from Twitter. I can’t supply my own keys because that would give you access to muck around with my personal Twitter account. If you don’t have keys, you’ll have to apply for a Twitter dev account to get some before you can actually run this script. If you already have keys, just copy and paste them into the appropriate spot in fetch_data.py. I’m not really sure of a better way to do it without having to rethink how the script would work. Also, Twitter’s APIs are pretty heavily rate limited, meaning that running this too many times within a 15-minute window will produce HTTP error 429 (Too Many Requests). There is nothing I can do about this; you just have to wait 15 minutes before running it again.
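One common alternative to pasting keys into a source file is to read them from environment variables, so the script can be shared without anyone's credentials in it. The sketch below is only an illustration under that assumption: the variable names and the `load_twitter_keys` helper are hypothetical and are not part of fetch_data.py.

```python
import os

# Hypothetical environment variable names; use whatever names you prefer.
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_keys(environ=None):
    """Return the four keys as a dict, or raise if any are missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_KEYS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in REQUIRED_KEYS}


# Usage (assumes the variables were set in the shell beforehand):
# keys = load_twitter_keys()
# consumer_key = keys["TWITTER_CONSUMER_KEY"]
```

The failure message lists exactly which variables are missing, which is friendlier than an authentication error from Twitter later on.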
Overall I am very happy with how this turned out. Going into it I had no idea how to use APIs or install new libraries. Completing this project forced me to learn how to set up a dev environment on my Windows machine (Codecademy’s course assumes a Unix environment), how to manage libraries using pip, how to use a library to access Twitter’s APIs, and how to think about organizing my project and making it clean and readable so I could share it. I had also never looked up or read documentation online before, and boy did I have to do a lot of that to figure out how to use Tweepy and the Twitter APIs.
As for possible improvements: one big idea I had was to make this into an actual tweetbot that Twitter users could interact with rather than just printing output to the console, but doing so would require me to create a new public Twitter account, get new API keys for that account, and rethink a large portion of the code. Otherwise I would be tweeting random Markov chains to my personal account, which is not ideal. I may come back and try to do it in the near future though, since it would still be extra experience. There is one error I get only occasionally that I can’t explain, and it’s not consistently reproducible. When it happens, you can simply re-run the script with the exact same inputs and it goes away. I don’t think there’s much, if anything, I could do to fix it, similar to the rate limit error. There are also no capital letters or punctuation in the generated tweets, but I don’t really see this as a big problem considering their nature. Finally, I would like to make the script keep running and taking new inputs until the user chooses to stop, rather than having to be re-run each time. But again, as of now I’m happy with it, and I may come back to add this in the future.
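The last idea, together with the "just re-run it" workaround for the occasional error, could be sketched roughly as below. This is an assumption-heavy outline rather than the project's code: `generate_tweet` stands in for whatever function fetches tweets and builds the Markov tweet, and `run_with_retry` and `main_loop` are hypothetical names.

```python
def run_with_retry(action, attempts=2):
    """Call action(); if it raises, try again, up to `attempts` calls in total."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:
            if attempt == attempts:
                raise  # Out of attempts: let the caller see the error.
            print(f"Attempt {attempt} failed ({error}); retrying...")


def main_loop(generate_tweet, input_fn=input, output_fn=print):
    """Keep asking for a username or search term until the user types 'quit'."""
    while True:
        query = input_fn("Enter a username or search term (or 'quit'): ").strip()
        if query.lower() in ("quit", "q", "exit"):
            break
        if not query:
            continue  # Ignore empty input and ask again.
        try:
            output_fn(run_with_retry(lambda: generate_tweet(query)))
        except Exception as error:
            # An immediate retry will not help with a rate limit (429),
            # so report the problem and let the user decide what to do.
            output_fn(f"Could not generate a tweet: {error}")


# Usage, assuming generate_tweet(query) exists elsewhere in the project:
# main_loop(generate_tweet)
```

Passing `input_fn` and `output_fn` as parameters is only there so the loop can be tried out without a keyboard or a Twitter connection.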
Let me know what you think!