Hopefully I’d mitigate this issue by just not surfacing the old data anymore. The only place you’d see it is on users’ profiles and shelves.
For sure. I have a new game page designed for the IGDB entries, so it would look way different than the existing entries.
Yeah, I would definitely make the old games read only.
I like this. I can create some kind of system that keeps track of migrations.
There’s just so many ways to screw this up! It’s why I haven’t finished anything. It’s sad too, because the data linkage in IGDB is pretty fun to use. I have downloaded a bunch of the data on my development machine, and clicking around between games is fun. It’s fun to try and play six degrees of Kevin Bacon with games.
Would it be feasible to set up a prompt, both via email and when people log in, that sends them to a place where they can shift their data to IGDB as of a certain date? That might lessen the impact of mmuffins’ first point: *One is aggregated data like user ratings or completion times will be incorrect because that data will be split between existing game entries and new ones.*
Basically a cutover date? Those who care about preservation of their info can help ensure its safety or trust the work of others?
I’ve been meaning to respond to this for a few days.
I think I understand what you’re asking. I will message the hell out of the IGDB data being available.
Seems like the idea of trying to just download the IGDB stuff and say “go use that and we’ll merge all our old stuff in gradually” is just a bad idea. It’d probably be better to do it the opposite way, where you can’t even use the IGDB stuff unless it’s been “vetted” as matching one to one with one of our existing entries, or it’s been vetted as a new entry with no existing match. I think that would alleviate your worry about confusing aggregated data. Some way or another, I’m going to have to download everything from IGDB as a separate entry, and then we just start the awful process of merging them together. I’ll bribe people to help somehow.
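To make the “vetted” idea a bit more concrete, here is a minimal sketch in Python of how each downloaded IGDB entry could carry a vetting status. Everything in it (`VetStatus`, `IgdbEntry`, `matched_game_id`, the helper functions) is a made-up name for illustration, not the actual Grouvee schema or anything from the IGDB API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class VetStatus(Enum):
    """Hypothetical vetting states for a downloaded IGDB entry."""
    UNVETTED = "unvetted"  # downloaded, but nobody has reviewed it yet
    MATCHED = "matched"    # confirmed one-to-one with an existing entry
    NEW = "new"            # confirmed to have no existing match


@dataclass
class IgdbEntry:
    igdb_id: int
    name: str
    status: VetStatus = VetStatus.UNVETTED
    matched_game_id: Optional[int] = None  # id of our existing game, if matched


def vet_as_match(entry: IgdbEntry, existing_game_id: int) -> None:
    """Record that a curator confirmed this entry matches an existing game."""
    entry.status = VetStatus.MATCHED
    entry.matched_game_id = existing_game_id


def vet_as_new(entry: IgdbEntry) -> None:
    """Record that a curator confirmed this entry has no existing match."""
    entry.status = VetStatus.NEW
    entry.matched_game_id = None


def is_usable(entry: IgdbEntry) -> bool:
    """Only vetted entries (matched or new) can be used on the site."""
    return entry.status in (VetStatus.MATCHED, VetStatus.NEW)
```

The point of the explicit `UNVETTED` state is that ratings and completion times never get split across two entries for the same game: an IGDB entry only starts collecting data once someone has said what it corresponds to.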
Hope that answers the question. If not, message me, or come on a podcast episode again and yell at me.
This is one of those times that I wish I had some concept of how to use an LLM. This is one of those use cases where if I could use an API for ChatGPT or something and figure out how to tune it, it could do a lot of the heavy lifting for us. This seems like a really good case for machine learning, but I only have a vague idea how to do it.
Just throwing a couple more screenshots up. I think I’m fairly happy with what this looks like. Now to just figure out the logistics of the merge tools.
I’m working on it. This Giant Bomb thing sucks. If I gave a date I’d be lying. The first thing I am going to implement ASAP is getting all of the IGDB data downloaded, but not visible on the site to anyone but the curator crowd at first. I don’t think that will take me too long. I’ve kind of thought about just downloading all of the IGDB data, and then just flipping a switch and only showing that data too. Then anyone new that joins the site is getting the “stable” data, and we can work over time to transition the rest of the site’s users data over. I don’t know. It all sucks.
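Roughly, the two options above (curators-only at first, then a switch that shows only IGDB data) could look like the sketch below. The `Game` class, the `source` field, and the `igdb_public` flag are all assumptions made up for this example, not how the site actually works.

```python
from dataclasses import dataclass


@dataclass
class Game:
    name: str
    source: str  # hypothetical field: "legacy" or "igdb"


def visible_games(games, is_curator, igdb_public=False):
    """Return the games a user should see.

    igdb_public is the hypothetical "switch": once flipped, everyone
    sees only IGDB data. Before that, only curators see IGDB entries.
    """
    if igdb_public:
        return [g for g in games if g.source == "igdb"]
    if is_curator:
        return list(games)
    return [g for g in games if g.source == "legacy"]
```

Before the switch, a regular user sees only the old entries and a curator sees both. After the switch, everyone sees only the IGDB entries.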
OK, I have officially stopped downloading anything from Giant Bomb and am in the midst of downloading the entire IGDB dataset right now. I’ve matched up all of their genres and platforms with ours the best I can, and the rest we’re just going to have to merge as we can. I’m not done with the IGDB pages, but I’ll get to them.
Just in the nick of time it seems. Giant Bomb is currently in a state of meltdown after Fandom, their parent company, actively tried to push it towards more and more effective monetization. Prominent staff left, the community is imploding, the words “brand safety” are now a meme on the site, and no one really knows what comes next.
Edit: Just to clarify, that means that new games added to the Giant Bomb wiki are not added to the Grouvee database, right? Is there an interim method to add new games to Grouvee until the IGDB data becomes (widely) available?
I don’t think anyone is actively working on the wiki there anymore. I think there were 3 games updated a couple days ago. I don’t think anything else has happened. We’re just speedrunning adapting to the new dataset now.
Not to keep beating a dead horse, but the Giant Bomb wiki right now is “locked down” to prevent vandalism. Probably not a bad idea. We will be pulling data from IGDB going forward. Once all the data is pulled in from IGDB, I’m probably going to make the switch and only display that data in the games list and the search. It’s going to be a little weird for a minute, but I think it’ll be just fine.
It’s a little overwhelming, but probably around 70k. I think I’ll be doing an interface where it’s sorted by our most popular games, and then have suggested IGDB games to match them up. I’m hoping to make it pretty simple, where what I’m calling “super curators” can just check a couple of boxes and say this game matches with this one, and then set off a background job that will merge the games together. I’m also trying to think about how I can crowdsource votes from the entire Grouvee population to suggest merges too.
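As a rough idea of how the “suggested IGDB games” part might work, here is a small Python sketch that walks our games from most to least popular and ranks IGDB candidates by title similarity, using `difflib.SequenceMatcher` from the standard library. The dictionary keys (`name`, `popularity`), the `top_n` and `min_score` values, and the function names are all assumptions for illustration. Comparing every title against every other title like this would be far too slow for 70k games, so a real version would need some kind of index or pre-filter.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(title.lower().split())


def similarity(a, b):
    """Title similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def suggest_matches(existing_games, igdb_games, top_n=3, min_score=0.6):
    """Build a review queue: most popular existing games first, each
    paired with its best-scoring IGDB candidates (possibly none)."""
    queue = []
    by_popularity = sorted(
        existing_games, key=lambda g: g["popularity"], reverse=True
    )
    for game in by_popularity:
        scored = [(similarity(game["name"], c["name"]), c) for c in igdb_games]
        scored = [(s, c) for s, c in scored if s >= min_score]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        queue.append((game, [c for _, c in scored[:top_n]]))
    return queue
```

A curator would then see each game with a short list of candidates and tick the one that matches. The score only orders the suggestions; the actual decision to merge stays with a person.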
That’s quite a bit. Streamlining it as much as possible will definitely help, both for the actual merging of the pages and also for finding the dupes. Maybe we can make it a contest to motivate people to help hahaha.
I need to do a query to find out how many games we have in the database that just don’t have any activity on them. I bet there’s a lot. I could probably just delete those games. I feel like as long as we get the top 30% or so merged properly we’ll be in pretty good shape.
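The query itself would be something along these lines. This is a self-contained sketch using Python’s `sqlite3` with an in-memory database; the `games` and `shelf_entries` tables and their columns are invented for the example and are not the real Grouvee schema, and “activity” here only means “appears on at least one shelf.”

```python
import sqlite3

# Hypothetical, simplified schema with a few sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE shelf_entries (
    id INTEGER PRIMARY KEY,
    game_id INTEGER REFERENCES games(id),
    user_id INTEGER
);
INSERT INTO games (id, name) VALUES (1, 'Game A'), (2, 'Game B'), (3, 'Game C');
INSERT INTO shelf_entries (game_id, user_id) VALUES (1, 10), (1, 11), (3, 10);
""")


def games_without_activity(conn):
    """Return (id, name) for every game that is on nobody's shelf."""
    return conn.execute("""
        SELECT g.id, g.name
        FROM games AS g
        WHERE NOT EXISTS (
            SELECT 1 FROM shelf_entries AS s WHERE s.game_id = g.id
        )
        ORDER BY g.id
    """).fetchall()


inactive = games_without_activity(conn)
total = conn.execute("SELECT COUNT(*) FROM games").fetchone()[0]
print(f"{len(inactive)} of {total} games have no activity")
```

In practice “activity” would probably also need to cover ratings, reviews, and statuses before anything gets deleted, so the `NOT EXISTS` check would be repeated for each of those tables.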