While the built-in per-instance "trends" feature is kind of useful for seeing what's trended in the past couple of days, what I'm working on will have adjustable time windows (e.g. last 5 minutes, last 30 minutes, last 3 hours, etc.).

There will also be the ability to filter to top links for a particular tag - so for example you could say "show me the top links people have posted in the last 12 hours for #sports"

@ummjackson will it cache the data, or actively query against the instances?

@colossus It caches the links, tags and languages but no identifying data about who shared it. All from public timeline APIs.
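A minimal sketch of what caching only links, tags and languages could look like, assuming each status arrives as a JSON dict shaped like a Mastodon public timeline entry (`card.url`, `tags[].name`, `language`, `created_at`). The function name `to_cache_rows` and the row layout are hypothetical, not the tool's actual code.

```python
from typing import Any


def to_cache_rows(status: dict[str, Any]) -> list[tuple[str, str, str, str]]:
    """Reduce one public status to (link, tag, language, created_at) rows.

    Assumption: the input looks like a Mastodon Status entity. The
    account, status id and post text are never read, so nothing that
    identifies the poster reaches the cache.
    """
    card = status.get("card") or {}
    link = card.get("url")
    if not link:
        return []  # no shared link, nothing to cache

    language = status.get("language") or "und"  # "und" = undetermined
    created_at = status.get("created_at", "")

    # One row per tag; an untagged post still yields one row with an empty tag.
    tags = [t["name"].lower() for t in status.get("tags", [])] or [""]
    return [(link, tag, language, created_at) for tag in tags]
```

Because the function only ever returns the four cached fields, whatever else the timeline API sends (account, content, mentions) is dropped before storage.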

@ummjackson that's a pretty neat idea, I like that this makes the data accessible without just being a crawler feeding full text search. In your screenshot, is that a query to be executed by your tool or running against the cache?

@colossus Yeah the goal is to do it in a way that’s still useful for discovery, without compromising privacy or storing any data tied to an individual. It’s just a SQL query executed directly against the cache, after it crawled for a few minutes.
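For illustration, a query like that could be sketched as follows, assuming a hypothetical SQLite cache table `shares(link, tag, seen_at)` with `seen_at` stored as Unix seconds. The table, its columns and the `top_links` helper are assumptions made for this example, not the actual schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def top_links(conn, window, tag=None, limit=10, now=None):
    """Return (link, count) rows for the given time window, most shared first.

    Assumes a hypothetical table shares(link TEXT, tag TEXT, seen_at INTEGER).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = int((now - window).timestamp())

    sql = "SELECT link, COUNT(*) AS shares FROM shares WHERE seen_at >= ?"
    params = [cutoff]
    if tag is not None:
        # Accept "#sports" or "sports"; tags are assumed stored lowercase.
        sql += " AND tag = ?"
        params.append(tag.lower().lstrip("#"))
    sql += " GROUP BY link ORDER BY shares DESC, link LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()
```

Calling `top_links(conn, timedelta(hours=12), tag="#sports")` would correspond to "top links in the last 12 hours for #sports". In this simplified layout, a post carrying several tags is counted once per tag when no tag filter is given.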

@ummjackson @colossus
I appreciate your efforts are well intentioned, but I can't help but wonder if #analysis tools end up being bad (socially) for #fediverse.

You mention #privacy a lot, which implies consideration for the users' data you are going to analyse, but "#trending" tools will still possibly (probably?) herd fediverse members into filter bubbles.

Can you consider that what you are doing is replicating the existing "social media" structures in the fediverse?

@fragrancesensitive I'm not sure @ummjackson's intent is to build those tools, but they're going to get built.

Someone is going to make sites that replicate those existing structures on data pulled from the #fediverse. E.g., Google's already indexing a bunch of instances, and they'll decide what to prioritize in search results.

@colossus @fragrancesensitive

It seems to me that an important part of the #fediverse is the ability to de-federate.

As someone involved in attempts to build networks, I find this feature attractive. I support the creation of small instances, where most users would have some kind of real world connection and can carry on their business without having to participate in the aspects of the internet.

@colossus @fragrancesensitive

As @ummjackson mentions elsewhere on this thread, disabling the public timeline is a potential workaround to avoid getting scraped, as is not as doesn't have federation rules.

However, I don't see why disabling this feature for everyone should be the answer to a few bad actors. iptables block lists, perhaps? Or maybe a network-wide request to respect something more HTTP-ish like a robots.txt, with firewall blocking reserved for violators?
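As a rough sketch of the robots.txt idea, a well-behaved crawler could check an instance's rules before touching its public timeline, using Python's standard `urllib.robotparser`. The bot name `trendbot` and the helper `may_crawl` are hypothetical, and whether any given crawler honours the file remains voluntary.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules an instance admin might publish (illustrative only):
# block one named crawler from the timeline API, allow everyone else.
EXAMPLE_ROBOTS = """\
User-agent: trendbot
Disallow: /api/v1/timelines/

User-agent: *
Disallow:
"""
```

A crawler that ignores a refusal from `may_crawl` would be the kind of violator that firewall rules could then be reserved for.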

Telecomunicaciones Indígenas Comunitarias

Experimental server for R&D on intranets.