admin

joined 1 year ago
[–] [email protected] 0 points 4 months ago

I'll consider your opinion.

[–] [email protected] 3 points 1 year ago

Looks like it's working. Time for a beer!

9
Migration complete (lemmit.online)
submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]
 

The good news: The migration is complete, and I've even managed to update the version to 0.18.5 (was stuck on 0.18.4-beta8 for the longest time).

~~The sad news: Cloudflare is having some issues, so nobody is able to access the new server at this time. Oh well. It'll probably be fixed Saturday morning, and I'll turn the bot back on.~~

Migration complete, and the bot has caught up on the 24 hour gap that it was offline. It only took like 12 hours this time, while in the past it was closer to taking an entire day. It probably helped that the new VM is dual core, even though the bot itself only ever makes 1 request at a time, so I didn't expect this much of an improvement.

 

The server is becoming a tad bit too big for the VM it's running on, so I'll be moving it this weekend. Until it's back up and running, I have paused the bot.

[–] [email protected] 1 points 1 year ago (1 children)
[–] [email protected] 2 points 1 year ago

Ah, I guess I must have overlooked that part. There are several reasons for not wanting to allow signups.

One is quite simple: cost. Right now this is running on a small, single core instance. It often stutters (especially when handling video updates), and that is not an issue, since that just means it's going to take a small while before updates are sent out. But you wouldn't want to have that delay for actual users. Right now the costs are quite manageable; if I have to scale up in order to provide a smooth experience for its users, not so much.

Most of the other reasons come down to the responsibility of having to provide a home to any outside users that sign up. I don't have the interest or time to maintain a community of people, nor to guarantee the uptime that such a server would require. It also wouldn't work. The largest Lemmy instance in existence, lemmy.world, has defederated from this instance. So any users that sign up here would be cut off from the content on there. And as you said, any other instance can decide to do so at any time (in fact, I very much suggest they do so in the FAQ).

I could go on, but I think you get my drift.

[–] [email protected] 2 points 1 year ago

Heya,

I still need to create some tools to easily add new subreddits to the bot. I'll probably get around to that this weekend, and then I'll add /r/theyknew and notify you. As far as I'm concerned, it's a great contender for synchronisation/archiving.

[–] [email protected] 3 points 1 year ago

Can't blame you for that. Personally, I still think it excels at content where communication with OP is irrelevant, like [email protected], [email protected] or [email protected]. And by far the best example of this, if you look at the subscriber count, is nsfw content.

[–] [email protected] 3 points 1 year ago

Nope. That would be very hard to implement, and probably very confusing and disliked by other lemmy users.

[–] [email protected] 4 points 1 year ago

> I don’t know how the karma thresholds work behind the scenes, but might I suggest for the bot to do a “top for” sort instead? Like it will only repost top content for the past 6 hours only. This will also help get more quality content as well and avoid reposting low effort/quality posts.

This is effectively already kinda how it works. For each subreddit it periodically (anywhere from every 30 minutes to every 12 hours, based on subscriber count and posts per day) requests the "hot" content feed. It then checks whether each post has at least 20 upvotes and an 80% upvote to downvote ratio. Those numbers are configurable, but that's what they're currently set to - I believe they're a good mix between filtering out the complete garbage while still making sure it doesn't miss good content.
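A minimal sketch of that kind of filter, for illustration only. The `Post` class and `passes_threshold` function are hypothetical names, not taken from the bot's code, and the sketch assumes the "80% ratio" means the share of upvotes among all votes.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical, simplified view of one post from the "hot" feed."""
    title: str
    upvotes: int
    upvote_ratio: float  # assumed: share of upvotes among all votes, 0.0 to 1.0


# The two numbers quoted in the comment above, kept as plain config values.
MIN_UPVOTES = 20
MIN_UPVOTE_RATIO = 0.80


def passes_threshold(post: Post,
                     min_upvotes: int = MIN_UPVOTES,
                     min_ratio: float = MIN_UPVOTE_RATIO) -> bool:
    """Return True when the post meets both the upvote and the ratio minimum."""
    return post.upvotes >= min_upvotes and post.upvote_ratio >= min_ratio


# Made-up feed entries, one per outcome.
hot_feed = [
    Post("popular and well liked", upvotes=150, upvote_ratio=0.95),
    Post("too few votes", upvotes=12, upvote_ratio=0.99),
    Post("controversial", upvotes=400, upvote_ratio=0.62),
]
to_mirror = [p.title for p in hot_feed if passes_threshold(p)]
print(to_mirror)
```

Keeping the thresholds as parameters mirrors the "those numbers are configurable" remark: a stricter or looser community could pass its own values without touching the check itself.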

[–] [email protected] 1 points 1 year ago (2 children)

@[email protected] @[email protected] @[email protected] @[email protected] @[email protected] @[email protected] @[email protected] @[email protected]: First of all: those are some wonderful usernames. Secondly: I have taken your concerns to heart and made some changes. See my update here: https://lemmy.ml/post/6190779.

 

A few months ago, I launched the Lemmit instance and bot (@[email protected]). Primarily, this was to help me stay up to date with some of the content I'd leave behind on Reddit. Additionally, I wanted to give back to the community, so I made it possible for anyone to request the archiving of subreddits to the Lemmit instance.

However, this came with some unintended consequences. Notably, the most subscribed community on the instance has been [email protected], even though it should have been obvious that there is no way to communicate with the Original Poster, given they're on Reddit.

The pushback against the bot and the instance has increased over time. A recent post, This bot is bad for Lemmy, highlighted these concerns. I've also received similar feedback from admins of major Lemmy Instances and through direct PMs.

As a response, last week I stopped accepting requests for archiving new subreddits. This weekend, I went a step further by discontinuing the archiving of a large amount of "interactive subreddits"—communities primarily centered around Q&A or communication with the Original Poster. This includes subs like [email protected] and [email protected], as well as niche and support communities. Such discussions are better hosted on Reddit or Lemmy's equivalent spaces.

I've also adjusted the post karma thresholds to curb spam posts. While this probably won't appease everyone, it should reduce the bot's posting frequency.

Perhaps this might prompt some admins to rethink their choice to defederate from the Lemmit instance, or the banning of the bot. I'm not expecting anyone to, and won't take it personally if you don't, but I wanted to give the community this update nonetheless.

In [email protected] there's a sticky post of all the Actively archived communities on the server (including NSFW ones, since that is not public without logging in), as well as the list of communities for which archiving is now disabled.

Cheers!

[–] [email protected] 2 points 1 year ago (4 children)

What.

You want to mirror a Lemmy community onto Lemmit? :s

Also, see sidebar.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago) (2 children)

Funnily enough, it initially was the intention to have the bot check up on everything it posted, to see if it got deleted. In that way, it would outsource moderation to Reddit. I never got around to that, and am not sure I ever will.

So for now, handing out moderation to others is a good workaround. ~~In order for me to make you a mod, you'll need to leave a comment in the community, and mention me @[email protected].~~

Actually, checking out the subreddit in question, that's exactly the kind of content I want to avoid on here. Most of the posts on there are to invoke discussion, either with the OP or other members. You'd be better off starting a new community on your own instance.

[–] [email protected] 2 points 1 year ago (1 children)

Voila:

2023-10-07 17:23:54,906 - root - INFO - Community Boise is ENABLED, has 67 subscribers and 0 posts per day.

I understand your argument, and fully agree. There's over 800 communities that I have to check though, so mistakes will be made.

10
Community cleanup (lemmit.online)
submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]
 

Since its inception, the Lemmit instance has been controversial. That might be an understatement, but let's roll with it for now. One of the major issues people have with the bot is the cross-posting of "interactive" Reddit posts, i.e. posts where the value lies in interacting with the OP, like AskMen, AskWomen, and AmITheAsshole. Personally, I fully agree with that viewpoint, but I didn't feel like interfering with supply and demand - in the sense that AmITheAsshole, for some reason, is the most subscribed community on this server.

That might change though. Earlier this week, I disabled the possibility to request new subreddits. This weekend I will follow that up by disabling the scraping of so-called interactive communities. So in order to facilitate that, I created a list of all the communities on this server (posted separately in [email protected]), and I will check each of them to see if they should be disabled. The goal is to keep a list of "content only" (or at least "content mostly") communities, where the value lies in the link that's provided or in the body of the self-post - not in the comment section. I'm sure this is going to be a disappointment to some people, but I do agree with the sentiment that this is better for Lemmy as a whole.

Edit: It is done. All 816 communities have been checked and 110 of those have been purged from updates. I am sure some mistakes were made - that some communities have been disabled or left intact when they shouldn't have. If that's the case, reach out to me, and I'll fix it.

9
List of communities. (lemmit.online)
submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]
 

Since it's impossible to see all the communities on here without logging in (mandatory NSFW filter), and I'm the only one with an account on here, here's a list of them. This list will be a snapshot, so the subscriber count will not be up to date, but I'm sure you'll figure it out.

Ident NSFW Subscribers
1000ccplus NSFW 14
2000ccplus NSFW 8
2007scape 13
2137 41
2meirl4meirl 29
2westerneurope4u 254
3dprinting 90
80sdesign 64
90sdesign 54
AskReddit 222
Boise 67
Erotica NSFW 153
IdiotsInCars 140
InternetIsBeautiful 73
MuseumOfReddit 71
PerfectlyCutMeows 54
ProgrammerHumor 275
SFWdeepfakes 57
Superstonk 295
Ultralight 85
abandonedporn 242
aboringdystopia 56
acmilan 26
adviceanimals 124
afterplayio 20
akaimpc 29
aleague 34
algarve 34
alternateangles 59
altgirls NSFW 362
amateurroomporn 11
amazonposition NSFW 62
anal NSFW 344
anal_yiff NSFW 50
analog 18
analogcommunity 20
androiddev 43
ani_bm 38
animalsbeingderps 15
anime 157
animemes 127
anormaldayinrussia 63
antimeme 15
antiwork 153
apksapps 45
apple 58
applevisionpro 35
appnotes 31
arizona 40
arizonatrail 51
armenia 8
arsenalwfc 30
artificial 10
artisanvideos 34
artsakh 7
asiansgonewild30plus NSFW 56
asiantraps NSFW 90
assettocorsa 8
atbge 6
audioproductiondeals 26
australia 40
autism 12
ayaneo 47
backyardfarmers 5
badscificovers 30
banano 33
bangmybully NSFW 32
bapcsalescanada 43
baseball 75
battlebots 54
battlestations 153
bbw NSFW 132
bbw_banging NSFW 18
bbwbutthole NSFW 50
bbwgw NSFW 33
bbwhardcore NSFW 75
bbwmilf NSFW 184
bbwtits NSFW 57
bdsm NSFW 70
bdsm_smiles NSFW 36
beamng 47
beautifulfemales 120
beefcurtains NSFW 70
beer 43
bestoflegaladvice 109
bestofredditorupdates 151
bicycletouring 42
bigareolalover NSFW 45
bigboobsgonewild NSFW 17
bigdickgirl NSFW 47
biggerbabes NSFW 34
bigtittygothgf NSFW 50
bikeporn 42
biketouring 43
biology 29
bioniclelego 22
blockedandreported 29
blunderyears 68
blursedimages 29
bmw 31
bobstavern 4
bois NSFW 129
bonnebouffe 9
boxing 38
brasil 47
brownchickswhitedicks NSFW 142
buildapcsales 156
burdurland 3
bustynaturals NSFW 275
buttsandbarefeet NSFW 198
cantonese 27
carpentry 5
cars 48
castiron 34
casualpt 39
celebhub 40
celebnsfw NSFW 199
celebnudedebut NSFW 55
celebritylegs 15
celebs 109
celebswithpetitetits 215
cfb 50
cfs 47
chastitycouples NSFW 81
chickflixxx NSFW 73
chloe 46
chubby NSFW 78
chubby_hentai NSFW 55
cirkeltrek 50
citiesskylines 122
citypop 53
codzombies 29
collapse 29
combatfootage 94
comedycemetery 7
comedyheaven 7
comedyhomicide 11
comedynecromancy 7
comedynecrophilia 10
comics 162
comicstriphistory 56
comicwalls 42
completeanarchy 4
condomtobarebackmfmt NSFW 34
coolguides 144
copypasta NSFW 17
corgigifs 7
coronavirus 51
couplesgonewildplus NSFW 33
cozyplaces 5
crackwatch 57
crboxes 34
creampie NSFW 46
cremposting 7
crossstitch 58
cruelcaptions NSFW 18
cryptocurrency 81
cuckold NSFW 158
cuckoldpregnancy NSFW 57
cuckoldstories2 NSFW 26
cumcoveredfucking NSFW 99
cumdumpsters NSFW 184
cumshotgifs NSFW 201
cumsluts NSFW 629
curatedtumblr 31
cursedcomments 65
daintywilderfans NSFW 41
dallas 48
dankmemes 28
daresgonewild NSFW 76
datahoarder 76
dataisbeautiful 364
dccomics 66
deadmau5 18
decor 3
deeprockgalactic 74
depthhub 33
destiny2 71
deusex 39
dewalt 8
dinosaure 5
discordvideos 8
diwhy 90
diwhynot 37
documentaries 69
dota2 73
dross NSFW 7
drugscirclejerk 59
edmonton 18
electricians 5
emogirlsfuck NSFW 402
enculerlesvoitures 10
engineeringporn 172
engorgedveinybreasts NSFW 127
entertainment 51
enterthegungeon 3
epicgamespc 55
erotichypnosis NSFW 66
eroticliterature NSFW 64
eu4 8
eu_nvr 23
everything_gripe 31
exmormon 58
extrafabulouscomics 24
extramile NSFW 213
eyebleach 90
facepalm 37
factorio 87
fairytaleasfuck 11
fakealbumcovers 36
fatwomenlove NSFW 42
fedora 91
feedthebeast 94
femboy 39
femboymemes 20
femboys NSFW 149
femboys4real NSFW 30
feral_yiff NSFW 41
feralpokeporn NSFW 49
fighters 6
fixedgearbicycle 52
flagporn 5
flatchests NSFW 88
food 124
forbiddensnacks 11
forgottenbookmarks 36
formula1 30
formuladank 133
fortcollins 39
fortyfivefiftyfive NSFW 247
fosscad 70
fossdroid 137
foundryvtt 32
framework 17
freegamefindings 118
frogbutt NSFW 115
functionalprint 103
funny 342
furry_irl 165
furryhits 30
futadomworld NSFW 7
fuzzypeeks NSFW 28
gadgets 53
gamedeals 212
gamedealsfree 82
games 164
gamingleaksandrumours 44
gardening 10
garfieldminusgarfield 42
gay_irl 10
gaytwerking NSFW 16
generative 31
genshin_impact 53
gentlemanboners 293
geocaching 37
germanshepherds 28
gfur NSFW 98
gfurcomics NSFW 16
ghosts 30
ghoststories 28
ginger NSFW 156
girlsfarting NSFW 8
girlsfinishingthejob NSFW 553
girlsjoy NSFW 105
girlsjustwanttobefuck NSFW 121
girlsmasturbating NSFW 239
girlswholovetobefuck NSFW 178
giscardpunk 9
globaloffensive 53
gme_meltdown 6
gmecanada 39
gnome 5
gocommitdie 7
godot 13
godpussy NSFW 401
gonemildplus NSFW 36
gonewild NSFW 508
gonewild30plus NSFW 424
gonewildaudio NSFW 254
gonewildchubby NSFW 89
gonewildcolor NSFW 40
gonewildhairy NSFW 29
gonewildplus NSFW 36
gonewildstories NSFW 278
goodanimemes 127
googleplaydeals 119
gooned NSFW 168
gothstyle 3
granturismo 33
gravelcycling 47
greenhouses 6
grimdank 70
grime 8
grool NSFW 120
guildwars 10
guildwars2 65
guiltygear 49
gunners 50
hackernews 18
hardware 53
haunted 29
havanese 31
hayastan 7
hearthstone 41
hentai NSFW 31
hentaihumiliation NSFW 75
hfy 58
highstrangeness 86
hmmm 5
hobbydrama 18
hockey 63
holdmybeer 7
hololive 57
holup 34
homeassistant 430
homedecorating 4
homelab 146
homelabsales 23
homestead 7
homesteading 8
honkaistarrail 22
horror 73
hotwifetexts NSFW 38
hoyas 40
humiliationcaptions NSFW 34
humongousaurustits NSFW 23
hyruleengineering 67
icecreamery 42
ich_iel 5
idm 5
imaginarybestof 77
imaginarywarhammer 72
impregnation NSFW 18
imsorryjon 26
imthemaincharacter 88
inceltear 48
incest_captions NSFW 104
incestsexstories NSFW 75
incorgnito 7
india 46
indonesia 6
interestingasfuck NSFW 40
interiordesign 4
iphone 68
itookapicture 314
jailbreak 38
japanese_adult_video NSFW 57
japantravel 22
jav NSFW 131
java 19
javdreams NSFW 49
jerkofftoceleb NSFW 58
karengillan 3
keep_track 30
kgbtr 35
kiernanshipka 3
kinkycaptions NSFW 23
koreannsfw NSFW 56
kpopfap NSFW 49
labiadangling NSFW 59
ladyladyboners 119
lasercutting 41
latvia 3
learncantonese 28
learnjapanese 17
legalcatadvice 32
leopardsatemyface 136
lesbianpov NSFW 69
lesbians NSFW 140
lifehacks 53
lifeprotips 138
link_dies 33
linustechtips 118
linux_gaming 225
liverpoolfc 32
livestreamfail 80
loveforboozecruisers 39
luxembourg 42
machinelearning 8
machineporn 40
mademesmile 93
magictcg 5
malelivingspace 60
maliciouscompliance 130
manga 79
mapporn 98
marchagainstnazis 87
margotrobbie 6
mathgifs 33
maybemaybemaybe 21
mazda3 36
mcmansionhell 54
me_irl 17
mealtimevideos 13
meirl 21
meme 14
memes 52
merdasdoolx 37
microsoft 43
middleeasternhotties NSFW 38
midjourney 80
mildlyinteresting 155
minecraft 5
misterfpga 7
mmgirls NSFW 81
modcoord 80
mommymilkersnsfw NSFW 152
monero 40
monstermusume 32
mortalkombat 52
mousereview 4
movieposterporn 63
movies 110
mpcusers 27
mre 32
musicthemetime 36
mylittlepony 7
mylittleredacted 10
nadinejansen NSFW 4
nanocurrency 42
natalee NSFW 11
nativeamericangirls2 NSFW 26
nba 57
neovim 14
newcastleupontyne 33
newzealand 83
nexdock 46
nextcloud 40
nfa 63
nfl 75
nicechips 34
nightofthefullmoon 25
nirvannatheband 6
nixos 42
noncredibledefense NSFW 10
nordictrackandroid 46
northkoreapics 27
nosleep 187
nostalgia 16
notkenm 36
nsfw_caption NSFW 105
nsfw_gif NSFW 308
nsfw_japan NSFW 85
nsfwcelebs NSFW 210
nsfwcosplay NSFW 148
nsfwcyoa NSFW 8
nudecelebsonly NSFW 102
nukedmemes 8
nyc 40
offgrid 11
offgridcabins 6
oilporn NSFW 106
okbuddyhololive 34
okbuddyphd 13
okbuddyretard 90
okdraudzindauni 3
onebag 14
onepiece 43
onepunchman 110
onguardforthee NSFW 53
onmww NSFW 63
onoffcelebs NSFW 84
opensource 77
opensourceapps 43
orgasms NSFW 116
osr 11
outmanga 10
owlhousemystery 5
paradoxplaza 5
paranormal 40
paranormalencounters 18
pastaemportugues 19
pathofexile 15
paymoneywubby 61
pcmasterrace 475
peloton 76
permaculture 9
perth 55
philadelphia 41
pillowtalkaudio NSFW 18
piracy 101
piracyarchive 42
plotterart 20
plumbing 5
plussizedhotwives2 NSFW 41
polandball 6
politicalcompassmemes 23
politicalhumor 55
polska 44
portugal 73
portugalcaralho 61
preggohentai NSFW 59
presscumference NSFW 67
prettygirls 144
prettygirlsuglyfaces 37
programminglanguages 43
projectceleste 36
pronebone NSFW 24
publicfreakout 106
pussywallet NSFW 56
quebec 59
quilting 46
raining 44
rance 13
randonneuring 23
rareinsults 36
rarepuppers 21
realcivilengineer 41
redlettermedia 46
redneckengineering 83
retroussetits NSFW 146
rg35xx 23
riae_ NSFW 10
riaesuicide NSFW 15
rickandmorty 17
ringfitadventure 2
roms 50
roomporn 5
rpclipsgta 44
rpi 39
rule34 NSFW 331
running 8
rust 36
sailing 5
salmacian 5
samsungdex 51
save3rdpartyapps 6
sbcgaming 24
scams NSFW 8
science 107
sdforall 36
sdnsfw NSFW 286
selfhosted 157
seltinsweety NSFW 48
sffpc 61
sfwredheads 92
sharktits NSFW 24
shecame NSFW 47
shefuckshim NSFW 382
shegothands 48
shibbysays NSFW 47
shitposting 16
shorthairchicks NSFW 137
shorthairedwaifus 52
simpsonsshitposting 110
singularity 140
sissyinspiration NSFW 54
sissyperfection NSFW 79
skamtebord 8
skiing 28
slimthick NSFW 55
slink NSFW 16
slutoon NSFW 31
sluttyconfessions NSFW 92
smallboobs NSFW 208
soccer 64
solardiy 5
solaropposites 12
solidworks 23
sonicporn NSFW 20
sonicthehedgehog 65
space 101
spanking NSFW 21
specializedtools 61
speedrun 61
spicykittens 29
squaredcircle 96
stablediffusion 95
starcraft 49
stationeers 37
steamdeals 192
steamdeck 266
stevenuniverse 54
stihl 5
stolendogbeds 24
strapon NSFW 144
strava 37
stuffers NSFW 8
submechanophobia 54
suctiondildos NSFW 53
surface 39
sweatypalms 12
sweden 16
swiftui 13
swtor 51
taboocaptions NSFW 60
talesfromretail 34
talesfromtechsupport 53
tasker 54
tech 74
technicallythetruth 70
technology 163
techsupportgore 16
telegrambots 38
television 148
tentai NSFW 106
teslamotors 46
tf_irl 45
thalassophobia 49
thanksihateit 8
thedeprogram 95
thegrandtour 34
thenetherlands 57
theowlhouse 7
therewasanattempt 95
therian 42
thick NSFW 110
thisismylifemeow NSFW 15
thisismylifenow 38
thomastheplankengine 43
threesome NSFW 192
throatpies NSFW 38
tifu 412
tihi 8
tiktokchallenge NSFW 25
tiktokthots NSFW 164
tili 2
tinyawoos 7
titanfall 32
todayilearned 786
transformation NSFW 51
transmedical 35
traps NSFW 148
truefmk NSFW 31
truescarystories 25
truespotify 15
truscum 52
tucson 56
turkeyjerky 5
tvplus 38
twinks NSFW 135
twisthearthstone 4
u_icky_peach NSFW 10
ufo 42
ufos 130
ukraine 131
ukrainewarvideoreport 40
ukrainianconflict 203
unbgbbiivchidctiicbg 117
undertoys NSFW 25
unexpected 76
unfilteredcaptions NSFW 19
unixporn 74
unixsocks 40
upliftingnews 107
vegetablegardening 34
videos 35
vim 7
visionpro 43
vore NSFW 6
vxjunkies 23
vyos 26
wagnervsrussia 16
wallstreetbets 8
walmart 12
washingtondc 38
watchitfortheplot NSFW 120
weightlifting 47
weightroom 23
wetpussys NSFW 512
wholesomeyuri 86
whoopsgoesthecondom NSFW 24
widaczabory 7
wildhearthstone 3
windsynth 27
woodworking 13
wordsonanimegirls 7
worldnews 593
wow 51
wowservers 32
wroclaw 28
wrx 45
wtf 51
xsome NSFW 210
yiff NSFW 159
yiffbondage NSFW 32
yiffcomics NSFW 42
youseeingthisshit 44
youtubedrama 36
yuri NSFW 48
yuri_jp 42
zeldass NSFW 67
zillowgonewild 27
ziplyfiber 25
 

I think there's enough for now. If anything - there's going to be some heavy pruning in the amount of subs that are being maintained.

 

As discussed here, I have implemented a minimum level of upvotes that a post needs to have on reddit, as well as a minimum ratio of upvotes to downvotes.

Right now I have those configured to require at least 5 upvotes, and more upvotes than downvotes (0.51). At first glance this already seems to be a great improvement. There might be some tweaking later.

As a side note, I have now switched from using the reddit RSS feed to using the JSON feed. This was required in order to get easy access to the upvote/ratio properties. So there might be some new and interesting bugs introduced because of that. It's a brave new world.

Needless to say, the first thing I'll do after releasing this, is plop down on the couch with a beer, and hope this doesn't crash. Fingers crossed!
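For illustration, a sketch of reading the upvote and ratio properties from a JSON feed and applying the 5 upvotes / 0.51 ratio minimums mentioned above. It assumes the usual Reddit listing layout (`data.children[].data`) and the `ups` and `upvote_ratio` field names; the sample payload is made up, and `eligible_posts` is a hypothetical helper, not the bot's own code.

```python
import json

# Trimmed, made-up sample in the shape of a Reddit listing response.
SAMPLE_LISTING = """
{
  "kind": "Listing",
  "data": {
    "children": [
      {"kind": "t3", "data": {"title": "First post",
                              "permalink": "/r/example/comments/abc/first_post/",
                              "ups": 42, "upvote_ratio": 0.97}},
      {"kind": "t3", "data": {"title": "Second post",
                              "permalink": "/r/example/comments/def/second_post/",
                              "ups": 3, "upvote_ratio": 0.4}}
    ]
  }
}
"""

MIN_UPVOTES = 5   # "at least 5 upvotes"
MIN_RATIO = 0.51  # "more upvotes than downvotes"


def eligible_posts(listing_json: str) -> list[dict]:
    """Parse a listing and keep the posts that meet both minimums."""
    listing = json.loads(listing_json)
    posts = [child["data"] for child in listing["data"]["children"]]
    return [
        post for post in posts
        # Missing fields count as zero, so incomplete entries are skipped.
        if post.get("ups", 0) >= MIN_UPVOTES
        and post.get("upvote_ratio", 0.0) >= MIN_RATIO
    ]


for post in eligible_posts(SAMPLE_LISTING):
    print(post["title"], post["ups"], post["upvote_ratio"])
```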

 

I'd like to hear some feedback on this, or approach vectors.

Right now the bot is rather spammy. I was hoping that by using Reddit's HOT feed, it would have some level of quality control (I know, right?). Unfortunately, it seems that in most cases, it will just return anything that's new. The downside of this is that a lot of garbage gets through, and the bot spends a lot of time scraping the underlying page to get the details.

I propose to only archive reddit posts that have a karma score of 5 or higher. In case of subs that hide the karma scores of posts for a certain time, they'd have to be at least 2 hours old, so that the Reddit moderators can weed out garbage on our behalf.

Do you folks have any thoughts on this?
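The proposed rule could be sketched roughly like this. It is only one reading of the proposal: a hidden score is represented as `None`, and in that case the post's age decides. The `should_archive` name and its signature are made up for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MIN_SCORE = 5                             # "a karma score of 5 or higher"
MIN_AGE_WHEN_HIDDEN = timedelta(hours=2)  # "at least 2 hours old"


def should_archive(score: Optional[int], created: datetime, now: datetime) -> bool:
    """Decide whether a post qualifies for archiving.

    `score` is None when the subreddit still hides the post's score
    (an assumption about how a hidden score would be represented).
    """
    if score is None:
        # No score to judge by, so wait until the post has been up long enough.
        return now - created >= MIN_AGE_WHEN_HIDDEN
    return score >= MIN_SCORE


now = datetime(2023, 7, 1, 12, 0, tzinfo=timezone.utc)
print(should_archive(7, now - timedelta(minutes=10), now))     # visible score, high enough
print(should_archive(None, now - timedelta(minutes=30), now))  # hidden score, still fresh
print(should_archive(None, now - timedelta(hours=3), now))     # hidden score, old enough
```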

Secondly, I want to put sticky comments on each community, with links to native Lemmy communities that cover the same subject. For this I would need some kind of API, or a master list of... oh, I see sub.rehab has just the thing I need. So expect that somewhere this week :).

 

See you on the other side!


So the update is done, but the bot was offline for 6 hours, and needed to catch up.

Unfortunately, another update slipped through, which switched the default feed from www.reddit.com to old.reddit.com, which has the side effect of changing all the urls in the posts as well. On one hand this is great, because new reddit sucks. On the other hand, this is terrible, because for every post the bot encounters, it checks if it already exists on lemmit... based on the url.

So for every post the bot encountered, it went like "old.reddit.com/r/blabla/123? Haven't seen that one yet, there's a www.reddit.com/r/blabla/123, but that must be something completely different, let's post it again!"

This also meant that the bot took over a minute and a half to update each community because it takes a couple of seconds per post. When I went to bed last night, I figured it was just posting a lot of content because it had so much catching up to do. But this morning I figured something was off because it still hadn't caught up.

Anyway, the fix is out now. Sorry for all the duplicates. I need coffee now.
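The duplicate check described above fails because the same post has two different URL strings. One way to guard against that, shown here as a sketch and not necessarily the fix that was deployed, is to compare a host-independent key instead of the raw URL. `canonical_key` and `REDDIT_HOSTS` are hypothetical names.

```python
from urllib.parse import urlsplit

# Hostnames treated as the same site for duplicate detection.
REDDIT_HOSTS = {"reddit.com", "www.reddit.com", "old.reddit.com"}


def canonical_key(url: str) -> str:
    """Reduce a post URL to a key that ignores the scheme and the Reddit subdomain."""
    # Allow bare "old.reddit.com/r/..." strings by adding a scheme first.
    parts = urlsplit(url if "://" in url else "https://" + url)
    host = (parts.hostname or "").lower()
    if host in REDDIT_HOSTS:
        host = "reddit.com"
    # Drop a trailing slash so both spellings of a path match.
    return host + parts.path.rstrip("/")


already_posted = {canonical_key("https://www.reddit.com/r/blabla/123")}

candidate = "old.reddit.com/r/blabla/123/"
if canonical_key(candidate) in already_posted:
    print("seen before, skipping")
else:
    print("new post, mirroring")
```

Storing the key alongside the original URL keeps the links in existing posts untouched while making the lookup immune to a change of default feed host.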

 

ChatGPT, write a post for the stuff that I have in my head and want to get out as an update.

Hmm. No brain implant yet. Guess I'll have to write this the hard way.

Syncing update

It has been an eventful week. I successfully deployed the initial version of smarter content syncing, and have made some adjustments to the algorithm since then. Most notably, communities with only 1 subscriber (the bot) will no longer receive updates, and communities with fewer than 5 subscribers or with a low posting frequency will only be updated twice a day. Furthermore, for the highest update priority (every 10 minutes), a community must have a minimum of 50 subscribers. Implementation details can be found in the decide_interval() method over here.
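As a rough illustration of those rules, and explicitly not the real decide_interval() method, a sketch might look like the following. The posting-frequency cutoffs and the 120-minute middle tier are assumptions; only the 1-subscriber, 5-subscriber, 50-subscriber, twice-a-day and 10-minute figures come from the text above.

```python
from typing import Optional

TWICE_A_DAY = 12 * 60   # minutes
FASTEST = 10            # minutes, the highest update priority
DEFAULT = 120           # minutes; assumed middle tier, not stated in the post


def sketch_interval(subscribers: int, posts_per_day: float,
                    low_frequency: float = 1.0,
                    high_frequency: float = 24.0) -> Optional[int]:
    """Return the update interval in minutes, or None for "do not update".

    `low_frequency` and `high_frequency` (posts per day) are made-up cutoffs.
    """
    if subscribers <= 1:
        # Only the bot itself is subscribed.
        return None
    if subscribers < 5 or posts_per_day < low_frequency:
        return TWICE_A_DAY
    if subscribers >= 50 and posts_per_day >= high_frequency:
        return FASTEST
    return DEFAULT


for subs, ppd in [(1, 40.0), (3, 40.0), (67, 0.0), (60, 30.0), (20, 30.0)]:
    print(subs, ppd, sketch_interval(subs, ppd))
```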

Being a developer is fun

Meanwhile... Damnit, bot is stuck again.

2023-07-08 10:13:39,945 - utils.syncer - INFO - Scraping subreddit: bustynaturals. Last time  2:30:48 ago, interval 120 minutes
2023-07-08 10:13:40,653 - utils.syncer - INFO - 'latina bodies are the best' at https://www.reddit.com/r/BustyNaturals/comments/14twww8/latina_bodies_are_the_best/ updated: 2023-07-08 07:14:13+00:00
2023-07-08 10:13:45,324 - utils.syncer - ERROR - Error trying to retrieve post details, try again in a bit; Couldn't retrieve post detail page
2023-07-08 10:13:46,333 - utils.syncer - INFO - Scraping subreddit: bustynaturals. Last time  2:30:54 ago, interval 120 minutes
2023-07-08 10:13:48,581 - utils.syncer - INFO - 'latina bodies are the best' at https://www.reddit.com/r/BustyNaturals/comments/14twww8/latina_bodies_are_the_best/ updated: 2023-07-08 07:14:13+00:00
2023-07-08 10:13:51,227 - utils.syncer - ERROR - Error trying to retrieve post details, try again in a bit; Couldn't retrieve post detail page
...

1 bugfix and deployment later:

2023-07-08 10:46:42,836 - utils.syncer - INFO - Scraping subreddit: bustynaturals. Last time  3:03:51 ago, interval 120 minutes
2023-07-08 10:46:43,573 - utils.syncer - INFO - 'latina bodies are the best' at https://www.reddit.com/r/BustyNaturals/comments/14twww8/latina_bodies_are_the_best/ updated: 2023-07-08 07:14:13+00:00
2023-07-08 10:46:48,327 - utils.syncer - ERROR - Couldn't find post on https://old.reddit.com/r/BustyNaturals/comments/14told8/latina_bodies_are_the_best/, skipping.

Defederation

Meanwhile, the folks at https://lemmy.world reached out to me to tell me they're defederating Lemmit. They are not fond of the high volume of posts made by the bot, and the fact that there are now (quick check) 462 communities on this server all being moderated by a single person. They have already received a couple of complaints about spam, and it didn't help that some requests for NSFW subreddits were not marked as NSFW. Occasionally, those subreddits had explicit thumbnails that appeared in the 'All feed' without warning.

I had a good talk with the LemmyWorld admin, wherein they explained their point of view, and I explained mine. I understand their decision to disassociate with Lemmit, and appreciate their attempt to contact me. Other instances like Beehaw, and some smaller ones have also reached the same decision.

This does mean that you will no longer be able to get new community updates on those servers. So make sure to check the blocked instances list on your home server if you were subscribed to Lemmit. At the same time I have removed all the subscriptions of users from those servers, in order to not affect the sync priority mentioned above. This does mean that if LemmyWorld, Beehaw, etc. ever decide to connect to Lemmit again (however unlikely), you will need to un- and re-subscribe from there.

Meanwhile, I've added a feature in the bot that will remove request posts for NSFW subreddits, if the post itself is not marked for NSFW. This should prevent explicit thumbnails showing up where they are not wanted.
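The check behind such a feature can be very small. A sketch, with made-up names (`SubredditRequest`, `should_remove`) and assuming the bot already knows whether the requested subreddit is NSFW:

```python
from dataclasses import dataclass


@dataclass
class SubredditRequest:
    """Hypothetical model of a request post asking for a subreddit to be archived."""
    subreddit: str
    subreddit_is_nsfw: bool  # assumed to be looked up from Reddit beforehand
    post_marked_nsfw: bool   # whether the request post itself carries the NSFW flag


def should_remove(request: SubredditRequest) -> bool:
    """Remove requests for NSFW subreddits whose post is not flagged NSFW."""
    return request.subreddit_is_nsfw and not request.post_marked_nsfw


requests = [
    SubredditRequest("example_sfw", subreddit_is_nsfw=False, post_marked_nsfw=False),
    SubredditRequest("example_nsfw", subreddit_is_nsfw=True, post_marked_nsfw=True),
    SubredditRequest("example_unflagged", subreddit_is_nsfw=True, post_marked_nsfw=False),
]
removed = [r.subreddit for r in requests if should_remove(r)]
print(removed)
```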

Server growth

Last night I got an alert from my server monitoring that the disk is 80% full. Unfortunately, the disk is only 60 GB, so that doesn't leave much room for expansion. On the bright side, a good chunk of that is from Lemmy's very verbose logging (like, 4 GB a day, which gets cleaned up daily), so it should last throughout the weekend if I tune that down. Furthermore, most of the storage growth is from pictrs, the image upload part of Lemmy, and that can utilize an S3 bucket, rather than using the VM's storage like it is now. Using an S3 bucket offers a cost-efficient solution for expanding storage. Initial estimates indicate a monthly cost of around $5 for 1000 GB of storage, which should be sufficient for a while *fingers crossed*.

In the early days of Lemmit (literally, as the server is less than a month old) image uploads were limited to a default setting, which was something around 40 megabytes. That did add up quickly (thanks to half-minute porn gifs), and so I had to limit the max filesize to 1 MB, and later 0.5 MB. Once the server has switched to S3 storage, I can probably up that limit a little, although not too much.

Finally, Lemmy v0.18.1 has been released, and it contains even more performance boosts compared to v0.18.0, so if there's time left this weekend (and I can verify the Lemmit Bot is compatible), I will probably perform the upgrade.

 

You know, on account of me upping that one setting in the admin which I should have thought of long ago.
