(Baleet was a side project, done in my own free time. Neither it nor this blog post represents the opinions of my employer.)
In this post I’ll be talking about a tool I made recently, Baleet, which allows you to peruse your entire Twitter archive and delete any unwanted tweets in a quick, easy, and low-friction manner.
Before I get into that, I want to talk about some of my motivations for making Baleet. One of the things that bugs me about most social media is that most services make it really hard to manage your archive. For services where you can switch your posts to private or just for a group (like Facebook–or, to date myself, LiveJournal), it’s a little bit less of an annoyance. As I add new Facebook friends, for example, I can choose the group I want to put them in: if I have a premade set of privacy settings for coworkers, I can put any new set of coworkers in that group and be pretty confident that they won’t see anything I don’t feel like letting them see.
But let’s say my Twitter account is public, then becomes private, then becomes public, then private, and then I get a new job with a whole set of coworkers or friends or students or whatever. Before I go public again, I have to feel confident that my archive doesn’t have any dumb shit that I don’t want those new coworkers or friends or students to see. (In my case, nothing particularly scandalous; I do have a penchant for tweeting while tipsy, though.) How do I complete that task?
One way of doing it is deleting my entire archive. But as someone who has been on Twitter for a while, and who uses the internet to ply her trade, that’s not attractive. My archive has fond memories and, more practically, proof that I published certain ideas or projects before other people. Deleting the whole thing isn’t an option, so I have to somehow go through it myself.
This is where I thought the Twitter archive downloader would come in. (In case you didn’t know, you can download your entire Twitter archive.) I got mine and got to work.
But, as it turns out, the web interface for the Twitter archive is a pain! Or at least it was at the time of writing. There was no delete button on the web interface, so every tweet I wanted to delete, I had to open in a new tab and delete manually. If I didn’t open it in a new tab, I risked losing my spot in the archive. Plus, after deleting it, there was no change in the archive, making it easy to get lost after putting it to the side and then coming back.
I was pretty annoyed that Twitter put me in between a rock and a hard place (namely: delete everything or spend ages opening a billion new tabs). Rather than just complain about it, I decided to make a tool myself.
How Baleet Works
Baleet is a Node.js app that uses ReactJS on the frontend. It’s a relatively small app, really. In order to use it, you need to get API keys from Twitter (these keys are what allow you to tell Twitter “hey, delete this thing”) and your Tweet archive by month. Those files are stored in the data/js/tweets directory in your archive.
Once you have those two things, you put your API keys into Baleet, and (after stripping out an unnecessary line with a Terminal command) put your tweet archives into Baleet’s tweets folder. Baleet loads up the earliest month it finds, and presents a page with all that month’s tweets on it with big old delete buttons. When you click a delete button and the tweet gets deleted, it is removed from the page. (If you refresh, it will come back, which is a bit annoying–but if you try to delete a tweet you’ve already deleted, it’ll just tell you so.) When you’re done with a month, you delete it, so you won’t see that month again. It’s pretty simple, really, but having it streamlined in that way saved me several days of mindless tab-opening and cursing after losing my place, I’m sure.
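The loading step can be sketched roughly as follows. This is a minimal illustration, not Baleet's actual code. It assumes each monthly archive file is named like 2012_03.js and begins with a single line of JavaScript assignment (something like "Grailbird.data.tweets_2012_03 =") followed by a plain JSON array. The function names parseMonthFile and earliestMonth are made up for this example.

```javascript
// Assumption: the first line of a monthly archive file is a JavaScript
// assignment, and everything after it is a JSON array of tweets.
function parseMonthFile(text) {
  const firstNewline = text.indexOf("\n");
  if (firstNewline === -1) {
    throw new Error("Expected a header line followed by JSON");
  }
  // Drop the header line and parse the rest as JSON.
  return JSON.parse(text.slice(firstNewline + 1));
}

// Assumption: files are named YYYY_MM.js. Zero-padded names sort
// chronologically as plain strings, so the first one is the earliest month.
function earliestMonth(fileNames) {
  const months = fileNames
    .filter((name) => /^\d{4}_\d{2}\.js$/.test(name))
    .sort();
  return months.length > 0 ? months[0] : null;
}

// Small in-memory sample standing in for a real archive file.
const sample =
  'Grailbird.data.tweets_2012_03 =\n[{"id_str":"1","text":"hello"}]';
const tweets = parseMonthFile(sample);
const first = earliestMonth(["2013_01.js", "2012_03.js", "notes.txt"]);

console.log(first, tweets.length);
```

Parsing the file in code like this would be an alternative to stripping the first line by hand in the Terminal; either way, what remains is ordinary JSON.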
What I Learned
Baleet was super interesting because its iterations taught me a ton about how Twitter works.
The most interesting revelation came from the GET statuses/user_timeline API endpoint. In the documentation, this endpoint has the following caveat:
This method can only return up to 3,200 of a user’s most recent Tweets.
“That’s fine,” I figured. “If I start deleting from the most recent tweets, that window will always be moving back. As long as I don’t have more than 3200 tweets, I can hit the end.”
Or so I thought! After deleting about 3200 tweets, however, I started hitting the end of the road. I couldn’t go back any further. I was super confused–why wouldn’t I be able to see my new 3200 most recent tweets, now that all the other crud had been deleted? Was it a caching issue? Had I hit a rate limit?
Then I read the documentation more closely:
The value of count is best thought of as a limit to the number of tweets to return because suspended or deleted content is removed after the count has been applied.
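A toy model makes the behavior easier to see. This is only a sketch of one way to read the two caveats together, not Twitter's implementation: it assumes the 3,200-tweet window is taken over all tweets, deleted or not, and that deleted ones are filtered out afterwards. The userTimeline function and the 10,000-tweet account are invented for the example.

```javascript
const WINDOW = 3200;

// Toy stand-in for the endpoint: take the window first, then drop
// deleted tweets. allTweets is ordered newest first.
function userTimeline(allTweets) {
  return allTweets.slice(0, WINDOW).filter((tweet) => !tweet.deleted);
}

// A made-up account with 10,000 tweets, newest first.
const tweets = Array.from({ length: 10000 }, (_, i) => ({
  id: 10000 - i,
  deleted: false,
}));

const before = userTimeline(tweets).length;

// Delete everything the timeline currently shows.
for (const tweet of userTimeline(tweets)) {
  tweet.deleted = true;
}

const after = userTimeline(tweets).length;
const remaining = tweets.filter((tweet) => !tweet.deleted).length;

console.log({ before, after, remaining });
```

In this model the window never slides back: once the newest 3,200 slots are all marked deleted, the timeline comes back empty even though thousands of older tweets still exist.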
Meaning: Twitter doesn’t actually delete your tweets, I guess. I suppose they just mark them as deleted, but keep those tweets you don’t want anymore to themselves. If that’s the case, that’s pretty creepy to me–and also kind of hilarious, since I’m pretty sure not deleting deleted tweets is against their TOS. Here’s what Twitter said while talking about Politwoops, a service that showed politicians’ deleted tweets until Twitter shut it down:
“Imagine how nerve-racking — terrifying, even — tweeting would be if it was immutable and irrevocable?” Twitter reportedly told the OSF. “No one user is more deserving of that ability than another. Indeed, deleting a tweet is an expression of the user’s voice.”
Baleet hasn’t had a proper code review yet, but it works! I used it to go through my Twitter archive and junk all the cruft before I un-privated myself. (Stalkers who are now desperately angry that they will never know my full archive: it’s not that interesting, I promise.) It took me about a day to go through about 4 years of tweets, or just under 10k tweets, which is far, far faster than the old method. Plus, now I have a tool I can share with other people, and some knowledge that Twitter will keep everything I say. That’s a pretty effective way to dissuade goofing around on the service.
I think services that let you manage your archives in a straightforward and easy way, especially those that allow you to have different intimacy levels on the same service (think Facebook lists), are going to get more and more important as time goes on. We haven’t totally figured out what public space or public knowledge is on the internet, and I’m hoping we’ll make more and more tools to help us make sense of it.