To keep #OpenStreetMap.org up and running while we're being deluged by scrapers, we've blocked 320,000+ primarily residential IPv4 addresses in the last 24 hours (+ 100,000 IPv6) involved in scraping.
If you need OSM data, please don't scrape the website - use the official downloads at https://planet.openstreetmap.org
🙏🌍 #AI #Bots #Abuse
@osm_tech question. Why do people scrape a server that makes its data freely available? And the data is probably better structured in the final product. I don't see the point.
@JonSaenzAgirre It is a good question, and we don't know the answer either. Our planet data is so much easier to process and use.
Might be a good idea to become an OSMF member now or just donate some money.
Membership starts at £15/year.
https://supporting.openstreetmap.org/
@osm_tech sounds familiar, last year I braved turning Cloudflare's "under attack" mode off for https://dnshistory.org/ and saw an extra 5 million requests/day (500k unique IPs) overloading things. It's still blocking >700k requests/day a month later...
Are you able to share if the user agent of these bots has some commonality across requests? Because this looks like what I have started experiencing recently with my small, personal Mastodon instance: https://mastodon.n41.lat/@j/115966451464875347
@osm_tech FWIW, if the robots attacking you are anything like the ones attacking me (~790k unique IP addresses at peak, about half of them residential), you can mitigate a large majority of them with three ifs in a trenchcoat, implementable in nginx and Caddy, and probably Apache and others too.
I'm able to mitigate ~60 million requests/day on a €4/month VPS. I had to scale that up once during a larger wave to an €11/month VPS, and that barely blinked when I was hit by a 2.5k requests/sec wave that lasted ~4 days. (My bottleneck on the cheaper VPS was TLS; the defense mechanisms I employ are very lightweight.)
Happy to help if you need a hand, just give me a shout.
@algernon FYI Mull on Android just got me mazed - x-request-id: 4J6to06jHMROFKYFdO0GL but you have more important things on the go
@arichtman WTF are Mull doing. Chrome, but no sec-ch-ua.
I'm not having much luck in finding their Android browser... I'm seeing Mullvad VPN, and the browser in alpha for win/mac/linux, but not android. Can you point me in the right direction?
Not going to dive into it now, but I'd like to save it for my records.
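For readers wondering what a check like "Chrome, but no sec-ch-ua" could look like, here is a minimal sketch in Python. It is not the actual nginx/Caddy rule discussed in this thread, which isn't shown here; the function name `looks_like_fake_chrome` and the exact matching logic are assumptions made purely for illustration. The idea is that recent Chromium-based browsers normally send a `sec-ch-ua` header over HTTPS, so a request whose User-Agent claims Chrome but lacks that header is suspicious.

```python
from typing import Mapping


def looks_like_fake_chrome(headers: Mapping[str, str]) -> bool:
    """Hypothetical heuristic: User-Agent claims Chrome, but sec-ch-ua is missing.

    This is an illustrative assumption, not the rule used by anyone in the thread.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    user_agent = normalized.get("user-agent", "")
    claims_chrome = "Chrome/" in user_agent
    has_client_hints = "sec-ch-ua" in normalized

    return claims_chrome and not has_client_hints


if __name__ == "__main__":
    suspicious = {
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    }
    print(looks_like_fake_chrome(suspicious))
```

As the exchange above shows, a heuristic like this can catch real browsers too (privacy-focused builds, older Chrome versions, or plain-HTTP requests may not send the header), so it is probably safer as one signal among several than as a hard block.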
@arichtman Meanwhile: I found a way to identify this particular browser, at least as long as they add an extra /<version number> part to the Chrome component.
I'll try to deploy that... uhh... sometime soon.
What do I do once I download the osm data? Is there a first party recipe for mapnik and mod_tile that is approachable and usable?
I've seen some 3rd party docker-based efforts in this vein, with varying levels of activity and documentation, probably the best being
https://github.com/Overv/openstreetmap-tile-server
I haven't tried it yet but it seems like a good start.
AI companies can piss off, I just want to be able to serve map tiles over http easily.