I've heard other similar tools only grab the first 1000 comments/posts. Does this grab everything, or does it have a similar limitation?
If you are requesting directly from Reddit, it will be everything. Just keep in mind they are going to be flooded with requests right now, so 30 days is probably optimistic.
Theoretically, they are required by the GDPR to respond within one calendar month from the day they receive the request. Let's see if they can keep up.
This is the official GDPR data request form, so it includes everything. Here's the link for convenience: https://www.reddit.com/settings/data-request
Thank you for the direct link.
I tested the tool out. Looks like I only got back 1000 comments but there are ways to get the full data. https://github.com/xavdid/reddit-user-to-sqlite
I had a three-pronged approach to getting my stuff:
the Python tool mentioned in the Wired article, the newly added export feature in reddit is fun (rif), and redditmanager.com.
Between those three, I was able to get my public stuff (comments and posts), my starred/saved stuff, and all my rif clicked links (the amount of data was quite surprising, though not that large in megabytes).
I used my clicked links from the rif export to help build my RSS feed list
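In case it helps anyone doing the same, here is a minimal sketch of how clicked links could be tallied by site to shortlist RSS candidates. The rif export format isn't described here, so this assumes the URLs have already been loaded into a plain list; `tally_domains` and the sample URLs are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def tally_domains(urls):
    """Count how often each host appears in an iterable of URL strings."""
    counts = Counter()
    for url in urls:
        host = urlparse(url.strip()).netloc.lower()
        # Treat www.example.com and example.com as the same site.
        if host.startswith("www."):
            host = host[4:]
        # Strings that do not parse as absolute URLs have no host; skip them.
        if host:
            counts[host] += 1
    return counts


# Hypothetical sample; real input would come from the export file.
clicked = [
    "https://www.example.com/article-1",
    "https://example.com/article-2",
    "http://blog.example.org/post",
]

for host, n in tally_domains(clicked).most_common():
    print(n, host)
```

The sites at the top of the list are the ones you visited most, so those are probably the first worth checking for a feed.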
The Python tool gave me 1000 comments and a separate 1000 posts.
Thanks for sharing. I put a request in and I'll see how comprehensive the data is.
I have 17 years of posts and comments and have been active. I don't have much trust in those kinds of tools, especially given that the API has hard limits on what it can reach. Still, I'll check.
Edit: Welp, for starters we have issues: it requires the absolute latest Python version. As a sysadmin, I hate when things demand the latest and greatest just to be installed. I'm on a version that still has full support...
Oh well, we'll see...
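For anyone hitting the same wall, a quick sketch for checking an interpreter against a tool's minimum version before trying to install it. The `(3, 9)` below is only a placeholder, not the tool's actual requirement; check its README or package metadata for the real number.

```python
import sys

# Placeholder minimum; replace with whatever the tool actually documents.
MINIMUM = (3, 9)


def meets_minimum(version_info, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if not meets_minimum(sys.version_info):
    print(
        "Python %d.%d is older than the assumed minimum %d.%d"
        % (sys.version_info[0], sys.version_info[1], MINIMUM[0], MINIMUM[1])
    )
```

If the check fails, a separate virtual environment with a newer interpreter avoids touching the system Python.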