I'm currently working on the Strathspey Archive with a view to not only reinstating timely updates, but also improving the user experience with regard to threading etc. This is proving more time-consuming than I thought because I need to re-index all 65,000+ past Strathspey messages, and the early parts of the Archive are in a fairly terrible state when it comes to metadata.
In particular, many messages have the same message ID (which theoretically isn't supposed to happen at all), and rather a lot of messages either refer to one of those duplicates (which needs resolving by hand) or else have something weird in their »In-Reply-To« header (which could be resolved by hand but I don't have the time just now).
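To make the three problem cases concrete, here is a minimal sketch in Python of how such an audit could look. It is not the code the Archive actually uses: the function name `audit_threading` and the sample messages are made up for illustration, and it assumes each message is available as a raw RFC 822 string. Only the standard library `email` module is used.

```python
from collections import defaultdict
from email import message_from_string


def audit_threading(raw_messages):
    """Classify threading problems in a list of raw message strings.

    Returns a tuple (duplicates, ambiguous, dangling):
      duplicates: Message-ID -> list of positions sharing that ID
      ambiguous:  positions whose In-Reply-To names a duplicated ID
      dangling:   positions whose In-Reply-To matches no known ID
    """
    positions_by_id = defaultdict(list)
    parents = []

    for position, raw in enumerate(raw_messages):
        msg = message_from_string(raw)
        message_id = (msg.get("Message-ID") or "").strip()
        parent = (msg.get("In-Reply-To") or "").strip()
        parents.append((position, parent))
        if message_id:
            positions_by_id[message_id].append(position)

    duplicates = {
        message_id: positions
        for message_id, positions in positions_by_id.items()
        if len(positions) > 1
    }

    ambiguous, dangling = [], []
    for position, parent in parents:
        if not parent:
            continue  # not a reply, nothing to resolve
        if parent in duplicates:
            ambiguous.append(position)  # needs resolving by hand
        elif parent not in positions_by_id:
            dangling.append(position)  # something weird in the header
    return duplicates, ambiguous, dangling


# Hypothetical sample data, purely for illustration.
SAMPLE = [
    "Message-ID: <a@example.org>\n\nfirst message",
    "Message-ID: <a@example.org>\n\nsame ID again",
    "Message-ID: <b@example.org>\nIn-Reply-To: <a@example.org>\n\nreply",
    "Message-ID: <c@example.org>\nIn-Reply-To: your message of Tuesday\n\nreply",
    "Message-ID: <d@example.org>\nIn-Reply-To: <b@example.org>\n\nreply",
]

duplicates, ambiguous, dangling = audit_threading(SAMPLE)
```

In this sample the first two messages share an ID, the third refers to that duplicated ID, and the fourth carries free text instead of a message ID in its »In-Reply-To« header. Only the fifth can be threaded automatically.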
The very early parts of the Archive have been moved from one machine to the next over a very long time, and it seems that the archiving software of the early days had a few nasty bugs. I hope that things will improve once I'm finished with the problems within the first few thousand messages. (Right now I'm in the 1800s.)
The strategic importance of this work is that I'm trying to move full-text search into the Django implementation of the mailing list archive (and in the process get rid of the external Sphinx search engine), and that we'll also get archives for the other mailing lists hosted on the system, most notably dancedata-friends.
Posted by Anselm Lingnau · 5 May 2014 11:38