ISFDB:Community Portal/Archive/Archive09

Archive of the Community Portal - August-December 2007

Protected ISFDB pages and editing blocks

Note to ISFDB contributors. Throughout 2007 ISFDB and many other wiki sites have been under persistent spam attacks. We are working on getting some blocks in place but in the meantime have taken these steps to slow down the damage.

  1. You need to create a Wiki account and be logged in to edit pages.
  2. Protecting Pages - The spammers seem to hit certain pages over and over and so we have used the Wiki's page protection mechanism with these pages. ISFDB moderators can still edit these pages but editors cannot. If you try to edit a page and get a "Page is Protected" error then please leave a message here on the Community Portal or the Moderator Noticeboard. One of the ISFDB moderators will either unprotect the page so you can edit it directly or copy/paste your edits.
  3. Blocking the spammer accounts - Technically this should not bother anyone else at all but unfortunately due to the way the ISFDB server is hosted there is a small chance that these blocks will also block ISFDB moderators, editors, and contributors. Equally unfortunate is that there is no quick and easy way for most people to bypass these blocks once they are hit by one. The ISFDB moderators monitor the Block List and try to remove the automatic blocks as quickly as possible. You should be able to edit again within a few hours. You can also e-mail the ISFDB moderators by using the address isfdb.moderators followed by @gmail.com and the first available person will remove the autoblock. Marc Kupper (talk) 18:20, 18 Jul 2007 (CDT)

ISBN-10 to/from ISBN-13 converter

Re: the ISBN-10 vs. ISBN-13 discussion, there is a handy online ISBN converter that lets you quickly convert from ISBN-13 to ISBN-10 and back. Ahasuerus 15:53, 11 May 2007 (CDT)

One of the things I frequently do is to take an ISBN and convert it into a link on Amazon.com, Amazon.co.uk, or an Amazon image. Less common is to convert from an old style SBN into an ISBN (computing and appending the checksum), and once in a great while I have an ISBN, usually via a cover scan on the Internet, where I'm uncertain of a digit. I decided to play around with some Javascript and added http://marc.kupper.googlepages.com/isbn. There's still stuff to be added such as ISBN-13 support, and if you hit <enter> it takes you to the same page with an ISBN argument and I need to figure out how to get that back into the input field. As a moderator I frequently need to get from a title or publication record # back to the page on ISFDB. The lower part of the page has an input field that lets me do this. Marc Kupper (talk) 23:03, 29 May 2007 (CDT)
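
For readers who want to see the arithmetic behind such a converter, here is a minimal Python sketch of the standard ISBN-10 and ISBN-13 check-digit rules. It is not the code behind either tool mentioned above; all function names are made up for illustration, and the SBN helper assumes the usual convention that an old SBN becomes an ISBN-10 by prefixing a zero and (re)computing the check character.

```python
def clean(code: str) -> str:
    """Drop hyphens and spaces and upper-case a trailing 'x'."""
    return code.replace("-", "").replace(" ", "").upper()


def isbn10_check_digit(first9: str) -> str:
    """Check character for the first nine digits (weights 10 down to 2, mod 11)."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12: str) -> str:
    """Check digit for the first twelve digits (alternating weights 1 and 3, mod 10)."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Prefix 978, drop the old check character, compute the new one."""
    body = "978" + clean(isbn10)[:9]
    return body + isbn13_check_digit(body)


def isbn13_to_isbn10(isbn13: str) -> str:
    """Only 978-prefixed ISBN-13s have an ISBN-10 equivalent."""
    digits = clean(isbn13)
    if not digits.startswith("978"):
        raise ValueError("only 978-prefixed ISBN-13s map to an ISBN-10")
    body = digits[3:12]
    return body + isbn10_check_digit(body)


def sbn_to_isbn10(sbn: str) -> str:
    """Assumed convention: prefix '0' to the first eight SBN digits,
    then compute and append the check character."""
    body = "0" + clean(sbn)[:8]
    return body + isbn10_check_digit(body)


if __name__ == "__main__":
    print(isbn10_to_isbn13("0-306-40615-2"))
    print(isbn13_to_isbn10("978-0-306-40615-7"))
    print(sbn_to_isbn10("306-40615"))
```

Because the check character is recomputed rather than copied, the same helpers can be used to test candidate digits when one digit of a scanned ISBN is hard to read: only a candidate whose computed check character matches the printed one is plausible.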

Worldcat via FirstSearch

FictionFinder has been flaky lately, so I went trawling for other ways to access WorldCat data that would return more data elements than the rather limiting official WorldCat interface at www.worldcat.org. Eventually I found a back door to the complete WorldCat database via an Illinois library which very kindly makes its WorldCat/FirstSearch gateway available to non-patrons.

If you go to its Electronic Resources page and click on FirstSearch, you will be presented with a search form and a dropdown list. Enter the search string (e.g. "Lin Carter" or "Dark Carnival") in the first field and select "WorldCat" from the dropdown list and click on "Search". You are now a proud owner of a FirstSearch session with access to all of WorldCat and some other (less interesting to a SF bibliographer) databases. It may take a few minutes to become familiar with the user interface, but it's very much worth it. Keep in mind that you will be searching a lot of records and proper search limits will quickly become important. Ahasuerus 01:30, 19 May 2007 (CDT)

Data Validation Part I

In my never ending attempts to learn enough Python to be dangerous (why did they have to abandon PDP-11s anyway?), I have written a script that checks all ISFDB Title records against the Publication records that they point to. Here are the preliminary results:

Total cross-reference records: 353815

Number of editor mismatches:  13
Number of omnibus mismatches:  106
Number of anthology mismatches:  72
Number of collection mismatches:  256
Number of novels in magazines/fanzines:  47
Number of other novel mismatches:  705
Number of shortfiction mismatches:  1057
Number of poem mismatches:  4
Number of nonfiction mismatches:  1235
Number of nongenre mismatches:  95
Number of serial mismatches:  48
Number of Chapterbook Titles: 11
Bad title records:  85
Bad publication records:  2
Missing Publication type:  3
Bad cross-references:  748

This is by no means final, but it suggests that only about 1% of our records are grossly out of sync, which is better than I would have guessed. I will continue refining the script and will create separate lists of records for each mismatch type. I will then post them on separate "project" pages so that we could clean them up in a semblance of organized fashion :) Ahasuerus 22:59, 15 Sep 2007 (CDT)
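
As a quick check of the "about 1%" figure, the counts listed above can simply be added up and divided by the total. This is only a back-of-the-envelope sketch: it assumes every listed count is a flagged record and that no record is counted in two categories, neither of which the post states.

```python
# Counts copied from the preliminary results above.
counts = {
    "editor mismatches": 13,
    "omnibus mismatches": 106,
    "anthology mismatches": 72,
    "collection mismatches": 256,
    "novels in magazines/fanzines": 47,
    "other novel mismatches": 705,
    "shortfiction mismatches": 1057,
    "poem mismatches": 4,
    "nonfiction mismatches": 1235,
    "nongenre mismatches": 95,
    "serial mismatches": 48,
    "chapterbook titles": 11,
    "bad title records": 85,
    "bad publication records": 2,
    "missing publication type": 3,
    "bad cross-references": 748,
}
total_xrefs = 353815

# Assumption: categories do not overlap, so a plain sum is meaningful.
flagged = sum(counts.values())
percent = 100 * flagged / total_xrefs

print(f"{flagged} flagged of {total_xrefs} cross-references ({percent:.2f}%)")
```

Under those assumptions the flagged share comes out a little above 1%, which is consistent with the estimate in the post.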

A pointer would be handy. My finding them has been by chance and I've fixed them whenever possible. Kraang 23:12, 15 Sep 2007 (CDT)
Sure, I will post the results and the pointers as soon as I have them. It took me most of the day to get the program to run in under 1 minute as opposed to over 15 minutes, but with that out of the way, I should have something workable on Sunday. Ahasuerus 01:25, 16 Sep 2007 (CDT)
I have added a Serials-specific table to the Data Consistency project page. Please post any comments/suggestions here and I will incorporate them into the script before I run it for other title/pub types. Thanks! Ahasuerus 00:41, 18 Sep 2007 (CDT)
Doh! I just posted my comments THERE! :-(
Oh well - comments for HERE can be "Please share the SQL" (and/or Python bits - I'm sure I can convert them to PL/SQL), although I'm still waiting for a fresher backup before I will bother doing that. BLongley
Sure, I will post the Python code over on the other page, although it's still pretty raw. I ended up loading 3 MySQL tables into memory and processing everything in Python since individual MySQL queries were too slow to be practical. Thankfully, Python dictionaries are optimized for speed rather than memory use and I have more memory in this laptop than a thousand PDPs 30 years ago :)
One thing that we may want to discuss on this page is the policy/help implications of the script's findings that you raised on the other page, e.g. whether that "creative use" of SERIALs really constitutes serial abuse. I could then update the script based on our decisions to minimize subsequent false positives. Ahasuerus 23:32, 18 Sep 2007 (CDT)
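
The approach described above (load the tables into memory once, then do every lookup through Python dictionaries instead of issuing one query per record) might look roughly like the sketch below. It is not the actual script: the row layouts, the type names and the ALLOWED table are hypothetical stand-ins, and in practice the three lists would be filled from three bulk SELECTs.

```python
# Hypothetical rows standing in for three tables fetched in bulk.
titles = [(1, "SERIAL"), (2, "NOVEL"), (3, "SHORTFICTION")]     # (title_id, title_type)
pubs = [(10, "MAGAZINE"), (11, "NOVEL"), (12, "COLLECTION")]    # (pub_id, pub_type)
xrefs = [(1, 10), (1, 11), (2, 11), (3, 12), (4, 12)]           # (title_id, pub_id)

# Hypothetical policy: which publication types may contain which title types.
ALLOWED = {
    "SERIAL": {"MAGAZINE"},
    "NOVEL": {"NOVEL", "OMNIBUS"},
    "SHORTFICTION": {"MAGAZINE", "COLLECTION", "ANTHOLOGY"},
}


def find_mismatches(titles, pubs, xrefs, allowed):
    """Return (mismatches, bad_xrefs) using dictionary lookups only."""
    title_type = dict(titles)   # title_id -> title_type
    pub_type = dict(pubs)       # pub_id -> pub_type
    mismatches, bad_xrefs = [], []
    for title_id, pub_id in xrefs:
        # A cross-reference pointing at a missing record is reported separately.
        if title_id not in title_type or pub_id not in pub_type:
            bad_xrefs.append((title_id, pub_id))
            continue
        t_type, p_type = title_type[title_id], pub_type[pub_id]
        if p_type not in allowed.get(t_type, set()):
            mismatches.append((title_id, pub_id, t_type, p_type))
    return mismatches, bad_xrefs


if __name__ == "__main__":
    mismatches, bad_xrefs = find_mismatches(titles, pubs, xrefs, ALLOWED)
    print("mismatches:", mismatches)
    print("bad cross-references:", bad_xrefs)
```

Keeping the policy in a single table such as ALLOWED means that a decision reached in the policy discussion (for example, about "creative use" of SERIALs) can be reflected by editing one entry rather than the loop.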

Nominating Dcarson for moderatorship

Ref: Moderator Qualifications#Becoming a moderator for the nomination process.

Nomination statement

I nominate Dcarson (Dana Carson) (talk | contribs) for moderatorship; he has accepted the nomination. Dana has 1,753 edits as of today and has contributed to a number of discussions in the last few months, demonstrating good communication skills and the ability to work with other editors. I believe that he is qualified.

Support

  1. Support, as nominator. Ahasuerus 21:28, 23 Sep 2007 (CDT)
  2. Support, in total agreement with the above statement. I feel the ability to communicate well is the most important attribute of a moderator, and Dana certainly fills the bill on that count. Mhhutchins 22:02, 23 Sep 2007 (CDT)
  3. Support, most certainly. Dana has displayed a knowledge of important data entry issues and is a good communicator.--swfritter 22:21, 23 Sep 2007 (CDT)
  4. Support, seems good to me. Cautious enough not to do any damage and always willing to learn and communicate. BLongley
  5. Support, communicates well and has a good knowledge of how the db works. Will make a good addition to the moderators pool. Kraang 17:12, 24 Sep 2007 (CDT)

Oppose

Comments/Neutral

  1. Neutral - I'm sorry but I've been out of the ISFDB loop quite a bit lately though would lean towards "support" of the nomination as I don't recall Dana as vexing. Marc Kupper (talk) 13:45, 27 Sep 2007 (CDT)

Outcome

Nomination passes. Congratulations, Dana! Ahasuerus 17:05, 1 Oct 2007 (CDT)

Welcome aboard! BLongley 17:51, 1 Oct 2007 (CDT)

Strange Printing Statement on Ballantine edition of Lester del Rey's Nerves

Can anyone make sense of the printing statement on this edition of the novel? What printing is this one and what date? Jeez! Mhhutchins 17:14, 24 Sep 2007 (CDT)

I believe you did the best possible thing in that you simply documented what's stated in the publication. It is curious that Ballantine seems to have been publishing both the original and revised editions at nearly the same time. On their own, copyright statements are often difficult to decode, and so I just document what I see and hope that some day we gather enough data to make sense of what we found in individual publications.
It turns out I have a Del Rey / Ballantine edition of Nerves that's very similar to yours except the last line that says "Sixth Printing: September 1977" has been replaced by one that says "Seventh Printing: November 1979" [1].
Does your 6th printing have the Historical Note on page 177? That seems to explain what's up but I don't have time to summarize this for the ISFDB title notes. It does seem like we need to separate these into Nerves (1942) (the short story), Nerves (1956) (first novelization) and Nerves (1976) (revised novel).
Even with this additional data I'm still confused by Ballantine's copyright statement. I suspect they ran into a time crunch in that they had the revised edition in the pipeline but had run out of stock of the original 5th printing. They may have published old, new, and special copies all in 1976 trying to bridge the gap and scrambled up the copyright statements. Marc Kupper (talk) 14:42, 27 Sep 2007 (CDT)
I've split out the 1976 and later editions of Nerves into their own title record. It's still a little awkward as it's not a "variant title" though I'm tempted to make it so to get the new title record sorted up next to the 1956 edition. Mhhutchins, I'm assuming, but would first want to confirm, that your copy has the Historical Note on pages 177-180? Marc Kupper (talk) 00:04, 29 Sep 2007 (CDT)

Etiquette Question

I use the "Recent Changes" a lot to see what's going on in the Wiki, and it usually works well. But when somebody's responding to an old entry, I have to go find what changed in "history" for that page. Even new entries under old titles are tricky - does nobody but me get annoyed by the latest "Nomination statement" linking to the FIRST entry with that title? (We're voting on DCarson but the link goes to WimLewis' nomination... :-/ )

Anyway, the simple answer to me looks as though moving the active section to the bottom of the page means everybody will go find the right discussion quite easily. But reordering a user's talk page seems to be taboo, and even a general community page still gets edits to entries made weeks or months back. I've even wanted to respond to things now Archived even though there's more dead-and-buried threads on this page than there are in some archives. BLongley 16:49, 25 Sep 2007 (CDT)

Am I being over-cautious again, or is there really a Wiki-Convention about NOT reordering talk pages? I was never a Wikipedia editor, so there might be some conventions I'm not aware of: but as usual, even if there ARE any there, I'd challenge them HERE as inappropriate for such a small base of Editors and Mods. BLongley 16:49, 25 Sep 2007 (CDT)

I'm with you completely on this and often find it difficult to follow when a wiki section morphs into multiple sub-threads, all with their own nested series of indents. Most of the time discussion occurs on talk pages. The tab things at the top of talk pages include a "+" tab to the right of edit. If you click on it that'll add a new section at the bottom of the page divided off with a == section header. Presumably this "+" is a shortcut to an established wiki-convention and the implication is if someone wants to add a new section to a talk page they do so by appending it to the bottom of the page and sectioned off with ==.
Community Portal is unusual in that it's a "talk page" but it's not on the "talk" or "discussion" tab meaning we don't get the "+" support that wikimedia provides for talk pages. Thus we need to add new sections manually and hopefully follow the same convention.
As to how to reply to old or multiple discussions: I feel this is the largest failing of wiki and why I've pushed at times to adopt something like a message board. Anyway, the "bad" part about the wiki model is that casual participants need to wade through a lot of redundant/painful stuff to keep track of new comments. Active participants don't care as they already have a mental model of the threads and sub-threads. The "good" part is that when someone comes along later the threads and their sub-threads are all nicely laid out with a minimum of overhead needed on the reader's part to follow threads. You see comment/reply/counter/agreement all in context and the main thing you may lose is that sometimes an active out of band (OOB) conversation gets carried out on the subject/summary lines and that's only visible on the page's history. (So far OOB conversations have not occurred on ISFDB.)
As for rearranging a talk page to move items to the bottom. That's quite rare. I'll do it if a new-to-wiki person "top-posts" a question in that I'll move it to the bottom of the page with a new section header and then answer it. I've never moved a thread from the middle to the bottom of a talk page and rather I just reply to it in the middle of the page and yes, the only way to detect this is via recent-changes, watch-lists, and history.
Sometimes on wikipedia people will grab a section of a page and move it (sometimes editing it at the same time). These are difficult for other people to deal with as the history shows a section of text getting added/deleted but it's difficult to tell what got modified within the section. Thus I'd guess the convention would be to not move talk sections threads to the bottom of a page.
I've grown to like IMDB's model which is a mix of message board and wiki. Individual threads are wiki-style with indentation for sub-threads and the list of messages is sorted message board style with the most recently updated threads at the top of the list. The main downside is someone who wants to read all of the threads ends up opening many separate tabs/windows/pages (depending on how you use your web browser). One thing I don't like is that very long threads get split across multiple pages. I wish the page size was much larger (perhaps 100K or 1 megabyte). Of course, I often don't even read the long threads simply because I find that they tend to be a silly back-n-forth or people just speculating about stuff they haven't a clue on and so a [1][2][3][4]... is a sign to me to not even bother with looking at the thread. :-) Marc Kupper (talk) 16:49, 27 Sep 2007 (CDT)
Thanks for the comments! I've worked with various sorts of computerised discussion forums on and off for... well, almost a quarter of a century I guess. (Yes, I know that predates the WWW, but we DID get by without that somehow.) Over the last five years or so people have made me responsible for ORGANISING suchlike, and that's where I want to admit I don't really want to be responsible for such if it means I have to learn a load more social rules for forums I don't really care about. The ISFDB is something I still care about, but Social Rules for it are a bit beyond what I want to do here. BLongley 17:29, 27 Sep 2007 (CDT)
And why don't we use the Community Portal talk page for discussions? It has the little '+' thing for adding new messages.--swfritter 17:50, 27 Sep 2007 (CDT)
I'd have no problems with switching to the talk page though one comment is that Community Portal is in the left navbar and thus always available while a talk page would take two clicks. Marc Kupper (talk) 23:13, 30 Sep 2007 (CDT)

Nominating Chris J for Moderatorship

I nominate Chris J (talk | contribs) for moderatorship; he has accepted the nomination. There are very few people that have contributed more, as you can see at the Top Contributors list and he has been helpful and responsive in discussions with other editors. A special mention must be made of his work on some of the largest, most complicated Series Hierarchies we have: ones that daunt many other people. I believe that he is qualified.

Support

  1. Support, as nominator. BLongley 14:23, 26 Sep 2007 (CDT)
  2. Support. Very prolific and conscientious. Areas of strength include series (re)organization and non-US publications. Ahasuerus 15:16, 26 Sep 2007 (CDT)
  3. Support. Chris has been doing a good job for a long time. --Unapersson 17:23, 26 Sep 2007 (CDT)
  4. Support. The only thing I can add is "It's about time!" Welcome aboard, Chris! Mhhutchins 17:49, 26 Sep 2007 (CDT)
  5. Support. I supported Chris' nomination before and have not heard of any tragedy that should change this opinion. Pin a moderator's star onto his hat. :-) Marc Kupper (talk) 14:45, 27 Sep 2007 (CDT)

Oppose

Comments/Neutral

Outcome

Nomination passes.

And there was much rejoicing that now we won't have to stay up all night working on Chris' submissions! :-) Ahasuerus 17:09, 1 Oct 2007 (CDT)

Not that there was anything WRONG with them, there were just so MANY! I blame him for my Moderator Score being higher than my Contributor Score. BLongley 17:47, 1 Oct 2007 (CDT)
Thanks for all your kind words. I shall do my best to keep the high standards required. --Chris J 15:24, 2 Oct 2007 (CDT)
I just realised the Link from the Moderator screen to its Help is broken - use this instead, if you haven't spotted that already. I don't know when the "tamu.edu" problem will be fixed, but for now, always try changing the "tamu.edu" to "org" when a link seems broken. BLongley 16:20, 2 Oct 2007 (CDT)

Slow night?

The Wiki has been really slow for me, to the extent that I seem to have lost some edits. (Which is probably good news for some people I was trying to talk to, as I was in full "Sergeant-Major" mode at times. Got to keep the new recruits on their toes ;-) ...) Anyone else been suffering? BLongley 16:44, 27 Sep 2007 (CDT)

Smooth as it can be connecting from the Rockies tonight. Ahasuerus 17:59, 27 Sep 2007 (CDT)

Pseudonymous stories in author collections?

Here's an example of what I've been seeing too much of lately. We all know that in single-author collections, authors will reprint stories that were originally published pseudonymously, but I've NEVER seen one who will keep the original pseudonym. And I can't imagine that the original submission gave the credits to someone other than the author of the collection. So how could something like this or this happen? Am I just being more observant lately, or is someone allowing wholesale changes to contents that affect all pubs of a story? This evening I had to clean up Anthony Boucher's The Compleat Werewolf which I had updated with contents and verified months ago. In the meantime, two of the stories had somehow become credited to "H. H. Holmes". Someone must have entered the magazine containing the original byline and then merged them with ALL OTHER PRINTINGS of the story, even those credited to Anthony Boucher. How could I respond to someone questioning my verification of a collection containing pseudonymous works when I can't control modifications after the fact? It's times like this when I wish there was a "freeze" option to prevent further updating without the original verifier being in on the process. Vent over. Mhhutchins 22:04, 29 Sep 2007 (CDT)

This is a HUGE issue, as I have stated elsewhere. There is also data that has been corrupted in the magazines. This issue was manageable when there were fewer Editors/Moderators and there was less specialization. If there is only one programming change that can be accomplished this year it has to be this one. There is no need to attempt to assess blame to Editors or Moderators when the problem is with the software design. Nor will I assess blame to the software designer who does not currently have the time to address this issue. The software is still essentially in beta mode as are the Editors and Moderators. To reiterate, this is important.--swfritter 11:40, 30 Sep 2007 (CDT)
As a reminder to everyone: If you see incorrect title/authorship information in a magazine/collection/anthology please follow the process described in this Help. We need also to consider the possibility that the changes are being made at the title level and not within magazines/collections/anthologies.--swfritter 11:40, 30 Sep 2007 (CDT)
Ouch! Yes, those titles look a real mess. :-( I've checked a few books I own that could have been affected by such edits, and those seem thankfully free of such problems. I agree it's probably a problem with approvals outside the approver's specialisation: personally I don't want to mess with a title that's got a magazine publication unless it's to correct the date to that of the magazine (and even that was a bit controversial when that started, as magazine editors were happy to keep the year alone). BLongley 15:32, 30 Sep 2007 (CDT)
Mike, there ARE collections that do give story-level credits to pseudonyms while the overall pub goes to the canonical name: e.g. The Best of Kuttner 2 but that is the only exception I can recall: and The Best of Kuttner 1 confirms that this was a rare exception indeed, as they didn't keep it up for the same series. But it's so rare that it's correct to make a change like that that I'd be happy to have such edits banned until two mods or more approve them. I know, big change, requires Al, etc. :-/ BLongley 15:32, 30 Sep 2007 (CDT)
There are a number of ways for an inexperienced editor to mess up pseudonym attributions, e.g.:
  1. change the story in a magazine/collection
  2. merge the pseudonymous version of the story with the canonical name version
  3. merge the pseudonym with the canonical name -- I have rejected two attempts to merge "Robert Heinlein" with "Robert A. Heinlein" so far (ouch!)
Inevitably, some of these edits will get past the moderators due to their sheer number. The easiest software fix, as we discussed earlier, would be to change the default behavior of the Publication Edit logic so that it would create a new Title -- unless a special box was checked, perhaps. Hopefully, this change will help address at least 80% of the problem submissions.
Having said that, things may not be as bad as they appear. As far as I can tell, the reason why a significant number of old collection Publications attribute stories to pseudonyms is that our program that imported collection data from Contento years ago was flawed and didn't understand Contento's conventions. For example, when Contento described the contents of Pohl/Kornbluth's Before the Universe, he listed the pseudonyms that the authors originally used in the early 1940s pulps, which was not the way the stories were attributed in the 1980 collection. Our record, derived from Contento, uses the early 1940s pseudonyms and apparently has been that way for a long time. (The record was verified by Scott Latham back in February when he was new to the project.)
This is not a good thing, but at least it suggests that in many/most cases we are dealing with pre-existing problems as opposed to newly introduced data corruption occurrences. Ahasuerus 16:23, 30 Sep 2007 (CDT)
One comment would be when reporting a problem to note exactly what you saw and exactly what you expected to see. I can click on the links you provided but without some context I can't tell what the problem is. Granted - your report has far more detail than a bug report someone sent me the other day which said "When I try to access this page its not there. Strange...."
I believe someone has already corrected the records as I did not see anything that I'd say is "wrong." If I knew what to look for I could look at an old download of the database to see if what you spotted was a long standing issue or something created by a recent edit.
I do agree that one of the #1 priorities is to add more code to protect ISFDB records, particularly ones that have been verified, from indirect modification. Marc Kupper (talk) 23:07, 30 Sep 2007 (CDT)
I'm not sure we even need a checkbox for author/title data; it should perhaps be done automatically with a warning. I cannot remember the last time I edited a pub where it was appropriate to do anything but create a new title. A further complication - the removed title is usually a variant title. We at least need a log of 'un-merges' so the data can be analyzed.--swfritter 12:48, 1 Oct 2007 (CDT)
Good point about an "unmerge log"! As far as the checkbox goes, there are occasional typo fixes that do not require creating a new Title record. Ahasuerus 14:01, 1 Oct 2007 (CDT)
Like ellipsis which should have spaces all around. More than 1000 entries done incorrectly and only about 250 done correctly. Learned the hard way when copying and pasting 'According to You...' rather than 'According to You . . .' throughout the Fantastic series.--swfritter 13:19, 2 Oct 2007 (CDT)
Oh no, not another rule I'd not noticed! An ellipsis should be entered as the sequence "space", "period", "space", "period", "space", "period". If the ellipsis is in the middle of the title, it should be entered with a space after it as well, prior to the start of the following word. :-( It seems easier to change the help than change all our entries though, we seem to prefer the opposite convention by about 4-1. And even if we don't, pointing out this "regularisation" rule should allow us to merge some variants, e.g. go search for the title Breeds There a Man - I'd personally prefer to reduce those to Three variants, if not Two. (I don't see the quotes as adding much value, personally.) Before you go edit a few hundred more, perhaps this should go to Rules and Standards? BLongley 16:13, 2 Oct 2007 (CDT)
I agree that we will want to discuss all pluses and minuses of the competing approaches on the Rules and Standards page. The good news is that whatever standard we agree upon, it will be easy to change the data programmatically. Ahasuerus 16:19, 2 Oct 2007 (CDT)
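
To illustrate what changing the data programmatically could involve, here is a small Python sketch that rewrites the ellipsis variants discussed above into the spaced form quoted from the Help ("space", "period", "space", "period", "space", "period", plus a trailing space when more text follows). The function name is made up, and dropping the leading space when a title starts with an ellipsis is an assumption of this sketch, since the quoted rule does not cover that case. The opposite convention would be an equally short substitution.

```python
import re

# Three periods (with or without spaces between them) or the single
# ellipsis character, together with any whitespace around them.
_ELLIPSIS = re.compile(r"\s*(?:\u2026|\.(?:\s*\.){2})\s*")


def normalize_ellipsis(title: str) -> str:
    """Rewrite every ellipsis in a title to the spaced ' . . .' form."""
    def repl(match: re.Match) -> str:
        # Add the trailing space only when the title continues afterwards.
        tail = " " if match.end() < len(title) else ""
        return " . . ." + tail

    # Assumption: a title that starts with an ellipsis gets no leading space.
    return _ELLIPSIS.sub(repl, title).lstrip()


if __name__ == "__main__":
    for sample in ("According to You...", "According to You . . .", "Wait...for Me"):
        print(repr(normalize_ellipsis(sample)))
```

The function is idempotent, so it can be run over titles that already follow the convention without changing them.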
OK, discussion started. My apologies if my previous question (still unanswered satisfactorily) messes up your display, move that to archive if nobody but me cares. BLongley 16:56, 2 Oct 2007 (CDT)
Good that it defaults to "unmerge". That will force editors to put some thought into their decision.--swfritter 14:40, 1 Oct 2007 (CDT)
Marc, I think Mike is pointing out that a Collection by "John W. Campbell, Jr." shouldn't have "[as by Don A. Stuart ]" on all its entries. They would more normally all be credited to "John W. Campbell, Jr.", although there are very rare exceptions like the Kuttner I mentioned above. Anthologies are more likely to contain pseudonymous entries - especially if there's another title in the same collection by the same author, some editors seem to like it to appear as though all the stories are by different authors. Sometimes the editor hides his own contributions under a pseudonym too. BLongley 14:05, 1 Oct 2007 (CDT)
I think there is a sort of word-blindness that can be acquired here after too many edits/approvals as I slipped up myself recently and verified all the right contents but a couple with the wrong attribution. :-/ No sweeping changes involved there, but I do feel dumb about missing the "[as by]". :-/ BLongley 14:05, 1 Oct 2007 (CDT)

(Unindent) Is this something that a Data-Cleanup script would be useful for? Mike has identified some Collections where ALL entries are credited differently, those seem a good target. Collections with SOME variant author names may be a bit suspicious (although not proven wrong). I don't think we can spot Anthology problems or sweeping-edit problems that easily, but there should be a good start with the Contento-copying problem. BLongley 16:09, 1 Oct 2007 (CDT)

I take a day off and come back to see what a hornet's nest I created. Ahasuerus's explanation about the importing from Contento goes far to explain how these situations arose. Even an experienced bibliographer as myself (ha!) has to look twice when I check Contento. As for the Campbell example that I cited, here's what it looks like on Contento. Look familiar? At least now I feel assured that the current crop of moderators here on the ISFDB didn't have anything to do with these occurrences, and want to apologize if anyone felt that I was making any (indirect, albeit) accusations. Now let's go out thar and round up them critters. Mhhutchins 16:11, 1 Oct 2007 (CDT)
I'll write a script that will look for collections with pseudonymous stories once I take care of my current bugs (not the computer variety). Ahasuerus 14:20, 2 Oct 2007 (CDT)

reCAPTCHA

Lets you both filter spammers and help decode old texts. reCAPTCHA Dana Carson 16:49, 2 Oct 2007 (CDT)

We seem to have defeated the Spambots for now (touch wood, cross fingers!) but thanks for the pointer. Hey, I guess that we Old Moderators can talk about the Spambot Wars in future, just to make us feel more battle-hardened than these newbie Mods... ;-) BLongley 17:01, 2 Oct 2007 (CDT)

Data Validation Part 2

I have finally finished checking all Title records against Publication records in the ISFDB. All suspected inconsistencies have been posted on the data consistency project page. Some are easy to fix, e.g. we have a bunch of EDITOR Titles that should be clearly changed to ESSAY. Others are trickier, e.g. some Collection Titles are legitimately contained within Anthology Publications. And some are just tedious to process, e.g. we have almost 250Kb worth of messed up Nonfiction Titles. Ahasuerus 21:09, 3 Oct 2007 (CDT)

And now the Pseudonyms in Collections script is done. It generated a 145Kb target list, so feel free to break it up into more manageable Wiki pages if it takes too long to load. Also, quite a few of the suspect Collection records use "James Gunn" vs. "James E. Gunn" and similar legitimate variations, which I can't easily filter out in the script :( Ahasuerus 23:31, 3 Oct 2007 (CDT)
Breaking it up into Collections and Anthologies would help - Anthologies don't seem to be as much of a problem. BLongley 08:40, 4 Oct 2007 (CDT)
Yes, in retrospect I should have done a number of things differently, including sorting. I don't think we necessarily want to change the structure now, what with numerous comments already added to the pages, but I will make some changes to my scripts before I analyze the next backup snapshot. Ahasuerus 11:12, 4 Oct 2007 (CDT)
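
For anyone curious how such a target list can be produced, the sketch below shows one possible way to split suspect collections into the two groups suggested earlier in the thread: collections where ALL entries are credited to someone other than the collection's author, and collections where only SOME are. It is not the script that generated the list; the row layout and the placeholder titles are invented for illustration, and a plain string comparison like this cannot tell a pseudonym from a legitimate name variation such as "James Gunn" vs. "James E. Gunn".

```python
# Hypothetical sample rows: (collection, collection_author, [(story, credited_author), ...]).
collections = [
    ("Collection 1", "John W. Campbell, Jr.",
     [("Story A", "Don A. Stuart"), ("Story B", "Don A. Stuart")]),
    ("Collection 2", "Anthony Boucher",
     [("Story C", "Anthony Boucher"), ("Story D", "H. H. Holmes")]),
    ("Collection 3", "Anthony Boucher",
     [("Story E", "Anthony Boucher")]),
]


def classify_collections(collections):
    """Return (all_differ, some_differ) lists of collection names."""
    all_differ, some_differ = [], []
    for name, author, contents in collections:
        # Stories whose credit is not an exact match for the collection's author.
        differing = [story for story, credited in contents if credited != author]
        if not differing:
            continue
        if len(differing) == len(contents):
            all_differ.append(name)    # strongest suspects
        else:
            some_differ.append(name)   # suspicious, but may be legitimate
    return all_differ, some_differ


if __name__ == "__main__":
    all_differ, some_differ = classify_collections(collections)
    print("all entries credited differently:", all_differ)
    print("some entries credited differently:", some_differ)
```

Sorting the output into these two groups would put the likeliest import errors first and leave the ambiguous cases for manual review.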

Communicating with editors: have we reached a dead end?

In a discussion with Bill Longley, I've come to the conclusion that the lack of communication between moderators and editors is leading us down the strange and crooked path. Looking back at the user talk page of several editors, I'm seeing moderators' questions going unanswered and a general lackadaisical response when they are answered. Even when I reject submissions in order to get some kind of response, more often than not I get zilch in return. Is there a tutorial (or a link to one) that can be placed in the Welcome message for new editors? One that stresses not only the skills of communicating in a Wiki environment, but the NECESSITY of communication among the ISFDB community? It feels like I'm slamming my head against a wall when trying to talk to some editors when the only result is a headache for me. Or am I being too harsh about a situation when the only reward for any of us is generated from within? Anybody else feel this way? (And I don't blame the tamu.edu problem. Most internet users should know how to get around that problem.) Mhhutchins 18:30, 9 Oct 2007 (CDT)

It may not be the tamu.edu problem as far as responding to moderators but that problem does make it harder to access the Help screens while in edit mode and it does give the site a less than stellar feel and might be creating a lack of confidence in the system. Along with non-responsiveness I recently have seen a comment along the line of "if it's too much bother just reject it" and cases where editors seem to be repeating the same errors after being repeatedly told of their mistakes. The ISFDB is a complex system with elaborate rules. It requires more than a casual interest. If there is something we as moderators are doing to discourage editors I would hope they will apprise us of their concerns.--swfritter 20:15, 9 Oct 2007 (CDT)
There are at least two things going on here:
  1. The communications action is significantly complicated by the nature of wiki as implemented here. There's no ONE place where it takes place. That so badly fragments the process that it becomes nearly impossible to follow successfully.
  2. There's no place an editor can actually see (without reading code and who, exactly, wants to go to THAT trouble) what the rules and relationships are. Where's the explanation of the referential integrity enforced on ISFDB entries? Where's the description of what's in which table? How the tables are linked? You get the picture. For the average editor (non computer buff), this stuff is a dark and intimidating mystery.
I guess the upshot of it all is that you shouldn't be surprised that you don't get questions answered. Where and how did you ask them and where and how should the reply be placed are not trivial questions. In general, the process is broken.

I suppose I'm also guilty as charged. At least I've got an excuse, I've been out of the country since mid-August.
It might be interesting to ask (the question arises: how?) the editors as a body, if they have trouble with the communications process.
--Dsorgen 22:32, 9 Oct 2007 (CDT)

As a moderator I have no need to know about "referential integrity" (whatever that is!) When I ask someone a question about the book they've just entered, they only have to answer the question. Period. If I can place a question on their user page, they can just as easily answer it on the same page. I have no experience in the technical side of computers. I'm a reader (well, I used to be before I came upon the ISFDB.) Anyone who has the passion to become an editor doesn't need to be a computer wiz either, just the ability to fill in the fields. And to answer a question when it's asked of them. It's not that complicated. Or am I wrong to think that the average reader has a higher than average intellect? I'm not tarring every editor with the same brush. There are quite a few who have gone through the mentoring process, and without exception they are the ones who communicate with the moderators. It's that simple. Mhhutchins 22:49, 9 Oct 2007 (CDT)
I suspect that a big part of the problem is that the ISFDB is indeed a "complex system with elaborate rules". Ideally, much of the complexity would be handled by our software, from simple ISBN checksum validation rules to "lastname, firstname" checks and all the way to dynamic database lookups on Authors and Titles a la Web 2.0. Al has suggested that he will work on adding validation logic once he takes care of the known bugs, but we have to keep in mind that implementing complex bibliographic rules is necessarily complex, so it will take some time. If anything, we have been relaxing existing validation rules, e.g. the "page" field used to require an integer number, but then it had to be changed to free text because of entries like "xii+345+238".
In the meantime we have to handle this complexity manually, one submission at a time, which requires extensive communications during our fairly steep learning curve. That's easier said than done under the best of circumstances, and the .tamu.edu problem has made it particularly painful. Al has access to various communications channels with TAMU and I hope that he will have the problem taken care of once he comes back (which was supposed to be this week), but until then I expect it to be headache-inducing :( Ahasuerus 22:07, 10 Oct 2007 (CDT)
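The "simple ISBN checksum validation" mentioned above can be sketched as follows. This is only an illustration of the standard ISBN-10 and ISBN-13 check-digit arithmetic, not the ISFDB software's actual code; the function names are hypothetical.

```python
def isbn10_check_digit(first9: str) -> str:
    """Check digit for the first 9 digits of an ISBN-10 (weights 10 down to 2, mod 11)."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12: str) -> str:
    """Check digit for the first 12 digits of an ISBN-13 (alternating weights 1 and 3, mod 10)."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def is_valid_isbn(raw: str) -> bool:
    """Return True if raw is a well-formed ISBN-10 or ISBN-13 with a correct check digit."""
    s = raw.replace("-", "").replace(" ", "").upper()
    if len(s) == 10 and s[:9].isdigit():
        return s[9] == isbn10_check_digit(s[:9])
    if len(s) == 13 and s.isdigit():
        return s[12] == isbn13_check_digit(s[:12])
    return False
```

A check like this catches a single mistyped digit, but it cannot tell whether a valid ISBN belongs to the book being entered, so it would complement moderator review rather than replace it.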
I think the tamu.edu problem MUST be putting off more people than we expected. Even the communication between moderators has dropped a lot, as we look at another "tamu.edu" broken link, sigh, think it's not worth the bother of changing the URL, and leave the message unsent. We may not be supporting the newer moderators well either - it's not an instant switch from "beginner" to "expert", but we may be putting unnecessary pressure on new mods that get the power but not the knowledge. I know I've seen some of the newest moderators leave some controversial merges or make-pseudonyms or such-like on the submission queue - no questions asked on the Wiki, but the delay makes me think they're either researching a lot longer than they should need to or could still do with some advice. I know I could still do with some advice at times! I used to get it too - but there is more of a "let's leave it to someone else" attitude now. Or an unspoken "You're a Moderator yourself now, you sort it" attitude. Things were a lot clearer when I started - there were two or three Bureaucrats, a handful of Mods, and they all helped out. Now we have multiple handfuls of Mods, and most Bureaucrats are mostly unavailable, some Mods are deliberately specialising and will NOT progress further than their speciality in the short term, maybe not in the long term either. That doesn't mean they're not useful - they, along with editors mostly under consideration for Moderatorship, actually have made some DECISIONS recently! But we have to move from "Al is God of ISFDB, the Bureaucrats are his chosen Disciples, the Moderators are his Angels sent to help all mankind editing here" to a more realistic model: the Moderators aren't perfect (I'm certainly not!) but should still be actively helping others. 
Not immediately: we still need to get new Mods over the Learning-Curve of how to deal with approvals, etc, and I haven't seen any new Mod even discuss the Help Screen advice for that recently, and it's needed! But communications are breaking down - even BEYOND the tamu.edu problem - as we lose the sort of "Chain of Respect" we once had. It's more of a "Tree of Respect" now - I don't mess with US magazines, Swfritter doesn't mess with British Books, for instance. I think we both do good work here, but I tend to have slightly more respect for him at the moment as he's mentoring a good new editor and I'm not. I'm happy to do so though. BLongley 17:03, 14 Oct 2007 (CDT)
Bit of a rant there, Sorry! Particularly if anyone was offended by the Christian mythology - I'm not Christian now, just have a lot of that history to overcome. :-/ I don't want "Archangel" level Mods or such, just a recognition that accepting Moderatorship doesn't mean "you're on your own now" and that Mods still help Editors. And ideally that good Editors help new Editors too. (How did Wikipedia get over this stage?) BLongley 17:15, 14 Oct 2007 (CDT)
Remember the blocking thing? That may have also discouraged a few people. In any case the more experienced moderators need to maintain a sense of cool when dealing with editors and each other - and thankfully the moderators seem to be having no problem with each other. --swfritter 17:47, 14 Oct 2007 (CDT)
Oh yes, that blocking thing may too be a factor, particularly as it blamed people that didn't know the Wiki software was using their name on rejections. :-( Hopefully we've got over that problem now, but we may have lost a few editors at the time though. BLongley 18:49, 14 Oct 2007 (CDT)
I'm not entirely sure all Mods ARE cool with each other yet though: and frankly, I'd never expect them all to be now that we have this many. There's a lot of etiquette to be resolved - e.g. I dealt with a few Dissembler submissions today, and we all "know" Al deals with those, don't we? I'm hard-skinned enough to accept a complaint if he meant to deal with them himself and didn't want interference. Most active mods have stepped on each other's toes at times. I once suggested we add "specialities" to the Moderators table, although that may mean yet more work directed at individuals until we get multiple Specialisers resolved. It's a transition time here - I've finished one or two passes of every SF book I own now, and to some people that might be time to quit. I'm sticking around as I feel I can still contribute (and not just from new book arrivals) - but sometimes I will be the annoying person that asks if a Verified publication really DOES say that. Or I'll be the person that helps clear a Submission queue so long that you'd have to go through four pages before you got to the edits you want to approve. I'm always willing to TALK though, and even if I get twenty "you're a PITA" comments one week it won't stop me helping. Twenty "you're not helping" comments though, and I might retire. But I'd at least like to see the days of twenty comments a week returning! BLongley 18:49, 14 Oct 2007 (CDT)

(unindent)I suspect that we may be discussing a couple of different issues here. First, the fact that more submissions are left in the queue for longer periods of time may be simply an attempt to defer to a moderator who is known to be mentoring the submitting editor or who is working on a particular area, which is not a bad thing in itself. On the other hand, if the mentoring editor is not available for a few days, it can lead to a pile of stale submissions sitting in the queue. I am not sure if this has been a problem lately, but in the past some moderators dealt with these situations by asking another moderator to take a look at a specific set of "held" submissions before gafiating for a bit.

Second, what (in early 2007) may have appeared to be a hierarchical "Al-bureaucrats-moderators-editors" structure was actually a rather ad hoc affair. The first few people to sign up in mid-2006 were made Bureaucrats so that they could run the administrative side of things while Al concentrated on fixing bugs. And there was a ton of them, BTW. By the time the beta phase of the project opened at the end of 2006 and outside submissions were enabled, we had stopped bureaucratizing new moderators. Naturally, Bureaucrats were more experienced than new editors in early 2007, so they may have appeared to be "experts". Eventually, the original Bureaucrats became less active and I think I am the only one constantly monitoring the project at the moment (when I am not sick, that is). Since the only thing that distinguishes a Bureaucrat from a Moderator is the ability to promote editors to Moderators, I am not sure if we need more of them, but I can always wave the magic wand if need be.

Third, our inability to communicate with some editors due to (we assume) the tamu.edu problem could conceivably be addressed by changing our Wiki settings to ask all new editors to provide an e-mail address at sign-up time. The reason why we didn't do it in the past was that we didn't want to scare potential editors away, but it may be the lesser of two evils under the circumstances. BTW, Al indicated the other day that he will be mostly unavailable until the end of the month, but once he is back, he may be able to address the issue programmatically by changing all www.tamu.edu pointers in the software to www.isfdb.org. Ahasuerus 20:02, 14 Oct 2007 (CDT)

That would be a prerequisite before a move to another host, so a good clean-up to do just in case that is needed at some point (probably a simple global search and replace too).
You're right, there's several different issues here: editors not using the wiki, editors not listening to mods, new mods being left to fend for themselves, etc. And now you're worrying me about the "ton" of bureaucrats that have disappeared! What's the half-life of a Moderator these days? Or do we Spontaneously Combust when stress-levels and frustration reach a certain point? BLongley 14:25, 15 Oct 2007 (CDT)
Actually, "And there was a ton of them, BTW" referred to bugs, not bureaucrats :) Ahasuerus 17:49, 15 Oct 2007 (CDT)
I'm not too keen on email as the backup communication method, I've deliberately tried to keep my e-dress pretty secret since I moved to the latest ISP and have a nice quiet spamless time now. But some better directions at sign-up would be a good start, I think. BLongley 14:25, 15 Oct 2007 (CDT)
Oops, Blongley, I just gave you my email address on your talk page. Feel free to ignore it. I think part of the movement of editors/moderators is that people have entered all the data they are interested in seeing in the database. It was my goal to fill in the gaps for the 50's magazines this year and I think, with the data I have added, there will be entries for nearly all the sf-only American mags from that era by the end of this year at which time I will probably slow down. Or maybe not. Magazines are the gift that keep on taking.--swfritter 18:09, 15 Oct 2007 (CDT)

Nominating Rkihara for moderatorship

Ref: Moderator Qualifications#Becoming a moderator for the nomination process.

Nomination statement

I nominate Rkihara (talk | contribs) for moderatorship; he has accepted the nomination. Rkihara has been working on 1950's magazines and recently on some 1930's magazines. He has shown attention to detail and extreme caution in protecting data. He has a capacity to communicate promptly, concisely, and logically. He has recently started getting involved in discussions where he has provided valuable opinion and insight. I believe that he is qualified.

Support

  1. Support, as nominator. Swfritter 13:41, 11 Oct 2007 (CDT)
  2. Support: although I've left approving his edits to those mentoring more closely, the ones I've reviewed have been of high quality, and his communications very clear. BLongley 14:08, 11 Oct 2007 (CDT)
  3. Support. A careful and thoughtful editor. He has mostly concentrated on magazines, but I am sure that if he runs into any issues with non-magazine material, he will not hesitate to ask questions. Ahasuerus 15:45, 11 Oct 2007 (CDT)
  4. Support. Qualified? Hell, yeah! I wish more editors were able to catch on to the complexities and nuances of database entry as well as Rkihara. And he knows how to communicate as well. Mhhutchins 17:47, 11 Oct 2007 (CDT)
  5. Support. Communication is the key to learning how the database works. Rkihara has my full support. Kraang 20:21, 11 Oct 2007 (CDT)

Oppose

Comments/Neutral

Outcome

Nomination succeeded. Congratulations, Ron! Ahasuerus 12:19, 17 Oct 2007 (CDT)

Thanks!--Rkihara 16:20, 17 Oct 2007 (CDT)

Data Cleanup Part 3 - Serial Dates

As promised, I have created a Wiki table that lists all Serial records whose dates do not match the dates of the Publication records that they appear in. In some cases the problem is a missing month in the Serial record and in some other cases the Serial record itself is suspect, e.g. Cherry's "Untitled" pieces, which we probably want to change to a Series of short stories. Still, there is a substantial number of Serial records whose dates are significantly out of synch according to our latest Help changes. Ahasuerus 16:44, 13 Oct 2007 (CDT)
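For anyone curious how such a list might be produced, here is a minimal sketch of the comparison, not the script that was actually run. It assumes dates are stored as "YYYY-MM-DD" strings with "00" for unknown parts; the record layout and field names (title, date, pub_id) are hypothetical.

```python
def find_serial_date_mismatches(serials, pub_dates):
    """List serials whose date differs from the date of the publication they appear in.

    serials:   list of dicts with hypothetical keys "title", "date", "pub_id"
    pub_dates: dict mapping pub_id to the publication's date string
    """
    mismatches = []
    for serial in serials:
        pub_date = pub_dates.get(serial["pub_id"])
        if pub_date is None or serial["date"] == pub_date:
            continue  # unknown publication, or the dates already agree
        same_year = serial["date"][:4] == pub_date[:4]
        missing_month = same_year and serial["date"][5:7] == "00"
        reason = "missing month" if missing_month else "different date"
        mismatches.append((serial["title"], serial["date"], pub_date, reason))
    return mismatches
```

Separating the "missing month" cases from the rest would let editors deal with the easy fixes first and leave the genuinely suspect records for closer review.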

"Fix-up's" A possible short term solution?

I just did E. C. Tubb's Alien Dust [2]. It's a bit of work but this may be a solution for other fix-ups. Kraang 23:09, 13 Oct 2007 (CDT)

I like it a lot, but as you point out, it takes some work. I see on Ahasuerus' user page that he will be asking Al about creating a new "fix-up" relationship, similar to the contents to collection/anthologies relationship but for novels. I'd hate to see so much work go into fixing-up the fix-ups, just to see a new standard established somewhere down the line. If I were sure that no such relationship could be easily created, I'd readily agree to your solution. Mhhutchins 23:39, 13 Oct 2007 (CDT)
As Bill pointed out a few days ago, one of the issues with this workaround is that title merges can do major violence to it :( Ahasuerus 03:17, 14 Oct 2007 (CDT)
Well, my example still seems to have all its links working still, but on less stable stories I think they'd all be broken by now, although the text should still remain to give the pointer. Note that I did mine the lazy way, using "view source" on a pub that had all the contents listed, lifting the relevant HTML section to put in the title record, and trimming a little. That's why mine has so many redundant links to Silverberg, but it was far faster than creating from scratch! I think I'll continue to do that when I find suchlike, but probably not bother with creating links otherwise. BLongley 03:32, 14 Oct 2007 (CDT)
Actually, it's almost as easy to take it from the source of the Author's page - more trimming, but complete lines rather than extracting columns. I just tried that with Sister Alice, which was a surprise (to me) fix-up. BLongley 14:03, 15 Oct 2007 (CDT)
I'll try it from view source on the next one. The one thing mine has is also a link from the short stories back to the fix-up novel. This way anyone that looks at the short stories in isolation will find the expansion into the novel. Kraang 17:10, 15 Oct 2007 (CDT)

Mentoring

I touched on this subject briefly with Blongley. We do not have a protocol for mentoring new editors. It is a responsibility we need to spread around. I became a default mentor for Rkihara because I was the only moderator doing extensive work in magazines. From that experience I think I can propose some basic standards:

1) The mentor should be solely responsible for the first few submissions. Instead of just informing the new editor about mistakes, the mentor should make the changes and then explain every modification.

2) After the new editor has some experience their submissions can be processed by any moderator in the usual manner. The mentor should monitor the new editor's talk page and step in when they think it is appropriate. It is especially important that conflicting advice be explained. Extended discussions of issues that extend beyond the scope of a specific submission should be continued at a more appropriate area.

3) All conversations with the new editor should be on the new editor's talk page so that they can be easily monitored. If another moderator thinks that they have valuable input about ongoing dialogue between the mentor and the new editor they should post messages about the subject matter on the mentor's talk page so that he can filter the information.--swfritter 16:52, 16 Oct 2007 (CDT)

You've got some good ideas there, Swfritter. The results you achieved in bringing Rkihara up to the level of moderator are remarkable enough to show just what mentoring can do. I think the active moderators should consider taking on a new editor, depending on their ability to devote the time and effort to do so. What's important is that while some of us just seem to talk the talk, you've actually walked the walk. Three cheers! Mhhutchins 16:01, 18 Oct 2007 (CDT)
Sounds good. The reason I didn't respond here earlier is that except in the case of magazines, there doesn't seem to be a natural way of quickly deciding who the best mentor should be (and even that will become less clear soon enough, I expect): for instance, I've talked to Clarkmci a lot recently, but I'm by no means the only one that has since he started, and I'm actually asking almost as many questions as I'm answering!
Natural fits seem to be:
  • for those Editors whose native language isn't English, matched with a Mod that speaks the Editor's main language
  • similar activity times, where people are submitting at the same times you're usually moderating the queue
  • similar specialist interests
The last is particularly hard to determine early, unless the speciality is so strange no other Mod will touch it. I suggest therefore, that when an active editor turns up, the Greeter should be the one to handle the first few submissions (gently of course) and then either offer to continue as Mentor or ask around for someone else to take over, based on the sort of submissions made so far. Basically, when we say "Welcome" with the standard template and so say "If you need help, check out the community portal, or ask me on my talk page" then you DO mean yourself at first, and other Mods and Editors should just add to the Welcome rather than give conflicting advice at first. (Of course, Mods should hold a destructive edit if they see one, but they should ask the Greeter to convey the reasons.) Hopefully this won't put off people greeting new editors too much, and might encourage a "he/she's MINE, I found him/her!" protective attitude, that, if used right, might make this an even more supportive place. BLongley 16:37, 18 Oct 2007 (CDT)

Life after Verification

One of the issues mentioned above had to do with the kinds of activities that ISFDB editors could most usefully engage in after Verifying their collections. Here is a brief list that came to mind while I was mulling the question over:

  1. Review all ISFDB Publications published in a given month. Recently published books have been often entered by Dissembler and can use some TLC, especially at the Title level, i.e. Awards, Series, Synopsis and related data. We have a project page to document these issues, but it has been largely inactive since January when I began spending most of my time on moderation as opposed to data entry.
  2. Various data consistency projects, especially our latest additions, which list suspected discrepancies in a relatively easy-to-use format.
  3. Other cleanup projects, including Project:Repair Awards, cleanup of authors, series cleanup, etc.
  4. Add and clean up data based on print and online bibliographies that are listed on the Sources of Bibliographic Information page. For example, Michael is working on Tuck's 3 volume encyclopedia in his plentiful spare time (who needs sleep anyway?). OCLC, Contento and Locus are all obvious candidates for reconciliation with ISFDB. Unfortunately, the libraries that form the OCLC cooperative tend to be in the US (with a few UK, Polish, Oz, etc exceptions) and foreign publications are not always well represented. There are other free sources of cataloging information, e.g. Sigla, a search engine that will search about 2,000 library catalogs on your behalf, but it's rather buggy and its user interface leaves much to be desired, e.g. it doesn't work with Firefox. There are also commercial solutions, e.g. Bookwhere, but cheaper versions (e.g. Bookwhere Academic, $99/year) are not always fully functional, while the fully functional versions are not always cheap (almost $500/year for the full version of Bookwhere). Nor are the commercial products necessarily bug free and particularly user friendly either. I have been working on a library search engine of my own the last few months, but I am not as good at this computer stuff as Al and I don't have the resources that OCLC, Bookwhere, etc have, so it's been slow going. I'll post more about it once I have some real data to demo.

Any other obvious areas that I am missing? Ahasuerus 13:35, 17 Oct 2007 (CDT)

As one of the post-Verification Mods, I've dabbled in some of the above. Quite often accidentally!
For 1, I just noticed Al wasn't clearing all the Dissembler submissions as fast as usual this week (Satellite delays?), so I dived in and fixed a few of the ones I didn't think he'd be getting around to for a while. Quite fun, and as we haven't been up to date fixing them all, one latest edition sometimes led to fixing two or three previously-entered books in a rapidly-growing series.
I see Eve Bunting is on the project page - I know I've worked on pushing some of those into Non-Genre, and created a series of Definitely-Genre but Juvenile Dinosaur/Time-Traveller Stories that might help people look for suspected Genre ones out of the rest. Nora Roberts I've come across while sorting out some other "Paranormal Romance" author. There's a Series for those too that I should have noticed.
For 2, I've had a look at the ones that can be done without Primary references, or that I had Primary References for, but some clean-ups will just move a problem from one category to another unless we clarify/clean-up/agree a few conventions first: or suggest what the clean-up script SHOULD be looking for. Comments on the processes would be useful.
For 3: CAN we repair Awards? I do look at my nominated Authors occasionally, and have whacked a few Series into shape to some extent. I keep forgetting some are listed there, but as I've COMPLETED none then there's no need to remove them, but I could comment and add some more I guess. Definite RPG non-books I do delete occasionally, and non-SF Comics, and some Juveniles that are clearly not SF, but Deletion is such a pain I'm not overzealous about it. Give me a BFG for Deletions and I'll be more aggressive, but deleting a pub, then going back to delete the title, is painfully slow...
For 4: I considered buying Tuck once (well, I've only once had the opportunity to do so in person since I got to the ISFDB, and I already had 40 other books to take home that day), but I've concluded that the Static Canonical References can be left to others. If two or three others verify Tuck and leave notes about where Tuck is WRONG, that'll do me - soon WE should be the definitive reference for what Tuck covered. For Contento, it's still two-way, both improving the other. It's satisfying when he admits a mistake and thanks you for the correction! :-) For a lot of the sources - well, does the name "Sisyphus" mean anything to you? ;-) BLongley 15:14, 17 Oct 2007 (CDT)
As to other stuff that I've found quite satisfying, and/or frustrating, but worth doing:
1) Go improve/fix the Amazon entry relating to the pubs you've verified. We link back to them, and while it's quite satisfying to see how much better at data than Amazon WE are, I find it even better to add the cover-art and some comments on the exact edition. (I steal the link back to the cover afterwards anyway!) I've been borrowing art from Jim Gardner for months, pointing out where Amazon can't even get a pair of authors right, etc, and it's nice to get some ego-boo from them too. Don't do too many though, we WANT to stay ahead! ;-)
2) Rather than the expense of (4) above, I've visited every second-hand book-shop, charity shop and junk-shop in my area: scoured them of SF, and found a) a lot of new-to-me SF books I'm quite happy to keep, b) better copies of books I have that are falling apart, and c) am now swapping some ten-for-a-quid titles for RARE books at Read It, Swap It.
3) Pick a random word or even a SYLLABLE and try it in an Author, Title or Pub search here. I've found mistyped words all over the place that can be tidied up without touching a verified pub. Or that will take you to an unknown author or series that needs tidying - we're GREAT on the big-name authors, but we have plenty of unknowns where you can easily find say, an illustrator that needs moving from an author spot, but might also show the way from one set of deletable juveniles to another... or to a trilogy where we have one book of three, and Amazon have the rest...
4) Go read all the other conversations that people are having. Help out when you have something to contribute.
5) Go read all the ARCHIVED talks. Some things were never resolved at the time, but could be now with current active people.
6) Go publicise ISFDB a bit more elsewhere - you all had lives BEFORE here, didn't you?
7) Download a backup of the database and play with it.
8) Write Help for a specialist area that you're a/the specialist in. I started on "The Problems with Printing Numbers over Multiple imprints of the same family of British Publishers" tonight, but then my Internet access got restored. Still needs doing though, but I'm talking here instead... :-/
9) Talk here! When we reshuffled the forums, it was my plan at least that this bit could remain sociable and we'd do serious stuff in other areas.
I'm never bored here. Sometimes frustrated, sometimes a bit lonely (different time-zones) but there's always something to do. BLongley 15:14, 17 Oct 2007 (CDT)
Oh, and as a last resort, go become a Wikipedia Editor and correct all the references to us as the "Internet Science Fiction Database" back to the "Internet Speculative Fiction Database". :-/ BLongley 16:32, 17 Oct 2007 (CDT)

Life with Verification

The practical aspects of the various definitions of Verification are starting to have an impact, at least with magazines. As I fill in the month part of the dates for the stories in magazines I am coming across occurrences of multiple verifiers having merged collection/anthology appearances with the magazine versions without changing the date of the story to the original date of publication including the month. Is this a level of change that requires notification of multiple editors?--swfritter 16:36, 17 Oct 2007 (CDT)

Well, the record merging algorithm as it affects dates is pretty simple - "use the earliest correct date" - so it sounds like this is either an education problem or a "misclick" problem. As long as the change results in a mismatch between the Title date and the Publication date of the magazine record, I think the easiest way to address the problem is periodically to run a script that identifies these discrepancies. Ahasuerus 19:34, 17 Oct 2007 (CDT)
Perhaps it should state "choose the earliest correct date, and if two of the choices are from the same year, choose the one with a month designation over one without it." Mhhutchins 21:18, 17 Oct 2007 (CDT)
Yes, that's a very good point. It's second nature for me now, but I remember that when I was just starting working on ISFDB-2, I had to stop for a second and think whether I wanted to select 1967-00-00 or 1967-09-00. Ahasuerus 00:02, 18 Oct 2007 (CDT)
The primary question I was asking is whether I should notify editors that their verified pubs have been updated with more precise data? I have been dealing with stories that have appeared in collections that have been verified by as many as three different editors. And there can be multiple stories in each issue which appear elsewhere. To my mind updating the months in publications, as long as it is the same year, should not require notification but considering the problems with variants it is very easy to get a little paranoid about changes made to verified pubs.--swfritter 13:57, 18 Oct 2007 (CDT)
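The date-selection rule discussed in this thread ("choose the earliest correct date, and if two of the choices are from the same year, choose the one with a month designation over one without it") can be sketched as a sort key. This is an illustration only, assuming "YYYY-MM-DD" strings with "00" for unknown parts; the function names are hypothetical, and deciding which candidate dates are actually correct remains the editor's job.

```python
def merge_date_key(date: str):
    """Sort key: known years first, earliest year wins, then a known month beats '00'."""
    year, month, day = (int(part) for part in date.split("-"))
    # Each "== 0" flag sorts an unknown part after any known value at that level.
    return (year == 0, year, month == 0, month, day == 0, day)


def pick_merge_date(candidates):
    """Pick the date to keep when merging titles, given candidate date strings."""
    return min(candidates, key=merge_date_key)
```

With this key, choosing between 1967-00-00 and 1967-09-00 keeps 1967-09-00, while an earlier year still wins even if its month is unknown.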

Disappearing talk updates

Am I the only one whose attempts to update/respond to stuff on my talk page results in NO returned page verifying that the update has "arrived"? Just askin'...
--Dsorgen 21:15, 17 Oct 2007 (CDT)

Guess what, it ain't just the talk page. THIS page disappears too...

No that's happening to everyone. Your edit was accepted, it's only that the correct page isn't being reloaded. Just send your browser back twice and reload the page that you were editing. Mhhutchins 21:17, 17 Oct 2007 (CDT)
Alternately you can just edit the "isfdb.tamu.edu" in the blank page's url to "www.isfdb.org". This generally results in a working url for me (and it's useful when following a link to the wiki from the editing/moderation interface, for example). --WimLewis 04:11, 20 Oct 2007 (CDT)
I have also been bookmarking commonly used pages, My watchlist being the most useful.--swfritter 13:58, 20 Oct 2007 (CDT)
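The manual workaround described above (swapping "isfdb.tamu.edu" for "www.isfdb.org" in the address) can also be expressed as a small helper, for example in a bookmarklet-style script. This is a sketch under the assumption that only the host name needs to change; the function name and the sample path in the usage note are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "isfdb.tamu.edu"
NEW_HOST = "www.isfdb.org"


def fix_isfdb_url(url: str) -> str:
    """Rewrite a link that points at the old host so it points at the new one."""
    parts = urlsplit(url)
    if parts.hostname != OLD_HOST:
        return url  # leave every other link untouched
    # Keep an explicit port if the original link had one.
    netloc = NEW_HOST if parts.port is None else f"{NEW_HOST}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

For instance, `fix_isfdb_url("http://isfdb.tamu.edu/wiki/index.php/Some_Page")` would return the same path on www.isfdb.org, and links to other sites pass through unchanged.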

Uploading Images?

Are there any plans to host images on the ISFDB server? I have a fairly large stockpile of images from scans I did for Visco and for my eBay listings. It would add a little more time to scan in an image while entering in the magazine data. I would host the images myself, but I have worries about the amount of traffic, and if my ISP changes its name (three times in ten years) or I move to another, that would break all of the links.--Rkihara 11:40, 29 Oct 2007 (CDT)

The issue has come up repeatedly, primarily because outside sites like Amazon.com can't be relied upon to keep the images that we link to in the same place (or sometimes at all). Unfortunately, there are a few problems with hosting images locally: intellectual property considerations, bandwidth issues (especially if we ever move to a commercial host) and disk space issues. Al would probably know more about our current options since he is our point of contact with TAMU, so we may want to ask him when he re-emerges (hopefully) in early November. Ahasuerus 22:40, 29 Oct 2007 (CDT)
Other than reducing the size of the image, some of the bandwidth problems can be taken care of by using a graphics program with a "Save for Web" function. I use it when hosting images for eBay listings and get a 10x-15x reduction in file size with no noticeable loss of quality. Gordon van Gelder gave permission to Visco to use cover images for the asking and even supplied images for some of the later issues. Other magazines may be willing to grant permission if asked.--Rkihara 01:21, 30 Oct 2007 (CDT)
We definitely have our individual preferences. I wish we had a profile option to turn them off. It will require some individual to volunteer the time - and perhaps even the money - to accomplish such a task.--swfritter 14:51, 30 Oct 2007 (CDT)
Several of us have actually volunteered money to solve certain problems - we want to keep this site useful, whatever server it is located on. If we could usefully use a separate "images.isfdb.org" site I'd help - if Visco and the Ace covers site have got away with hosting such for so long I'm not too worried about the "fair use" or "copyright theft" issues for OLD books or magazines. And we DO have permissions for a lot more than we currently explain in help - see here for instance. It wouldn't be too difficult to create a site for image-links or image-art that couldn't respond to a take-down notice from an artist or publisher very fast. BLongley 17:03, 30 Oct 2007 (CDT)

Front-Page Embarrassments

I know a lot of us don't look at it, going directly to our specialist areas, but today we show (amongst other stuff):

Authors Who Died On This Day:
    * Wallace Wood (1927-1981)
    * Wally Wood (1927-1981) 

I'm not an expert on these people/this person at all, but can someone take a look and merge/separate/fix/spindle/mutilate as required? BLongley 16:09, 2 Nov 2007 (CDT)

Thanks for the heads up, all fixed now. Ahasuerus 18:15, 2 Nov 2007 (CDT)

We could also do with some more work on the Covers and ISBNs on other forthcoming titles. :-/ BLongley 16:09, 2 Nov 2007 (CDT)

Good News, Bad News

The good news is that we exceeded 250,000 titles last Sunday and are adding new ones at the most impressive rate of about 500 every couple of days. Congratulations all around!

The bad news is that my main notebook PC with 3 databases, 5 different types of "copy cataloging" software, various homegrown data cleanup scripts, hundreds of megabytes of partially processed bibliographic records, etc, etc, took a hit the other day. Almost everything was backed up, so the data loss is minimal (probably a couple of recent data cleanup scripts) and I will likely be able to salvage the data from the hard drive anyway, but it may take me some time to get up to speed. The lesson is that backups are a VERY good thing.

In related news, I am working with Al on getting the raw TAMU backup (320Mb) downloaded to another computer. Ahasuerus 12:17, 6 Nov 2007 (CST)

Update: The latest ISFDB snapshot (they are taken every 24 hours) has been downloaded and burned on a CD, so even if TAMU goes belly up, we should be able to recover. I will work on rebuilding my stuff tonight; once I am fully functional, I will extract the core ISFDB backup file out of the raw 1.2Gb monstrosity and post it here, hopefully later in the week. Ahasuerus 14:43, 6 Nov 2007 (CST)
Yay! I look forward to unleashing my fearsome SQL Skills on it! ;-)
(Hopefully I WON'T find that we have to delete/merge several tens of thousands of titles though.) BLongley 17:01, 6 Nov 2007 (CST)
I have restored the database from the full backup and confirmed that the data that I entered on Monday is there, which means that TAMU's backup software is working and we have everything in the master backup file. I will delete all Wiki tables and personal information (like e-mail addresses) tomorrow and then post the core backup file here.
Whew! It's good to see your data on a CD and not some flimsy magnetic device of unknown reliability. Need I mention a certain Leiber story and what it disclosed about the vast electric conspiracy?.. Ahasuerus 00:14, 7 Nov 2007 (CST)
Probably. I'm thinking it might be "Conjure Wife", but without re-reading (and if I made time for that, I wouldn't be here) it might be "The Silver Eggheads" or somesuch. I probably have it here somewhere, whatever it is.
Still, the first good thing about my memory loss problem is that I can reread books again and find them new and exciting, except that I don't seem to have lost enough memory at times, and remember the rest half-way through. :-(
The SECOND good thing about my memory loss problem is that I can reread old books and find them fresh and new and exciting.
The THIRD good thing about my memory loss problem is - hang on, I have a memory loss problem? I don't remember that... BLongley 16:59, 7 Nov 2007 (CST)
(And do you have a backup on something reliable like paper-tape? When the Aliens come, those of us with ASR-33 teletypes in the attic may be the human race's last saviours!) BLongley 16:59, 7 Nov 2007 (CST)
The conspiracy was described in Leiber's The Man Who Made Friends with Electricity. And no paper-tape, please! It was an awful medium and its disappearance is one of the (few and precious) signs that the human race may not have been a total waste of valuable protein after all! Ahasuerus 18:05, 7 Nov 2007 (CST)
But that and punched cards were the last computer media where we could recover the data visually! These kids that have never had to debug a program with a sharp pencil and sticky-tape will never make proper IT people... BLongley 14:52, 8 Nov 2007 (CST)
Proper IT people? I am yet to be convinced that we really need them, proper or otherwise. Back in the day if you had a problem, you wrote the code yourself, then bought/stole/borrowed some computer time and tinkered with the code until your problem was solved (or you were out of time/money). And if you were really good at this stuff and didn't like the way the compiler was handling your code, then by golly, you changed the compiler! Admittedly, there were some problems with that paradigm, but it felt good, which should count for something, right? :) Ahasuerus 15:15, 8 Nov 2007 (CST)
Shhh! Please don't convince people they don't need an IT person, or I'm out of a job! (Well, temporarily... it only takes a year or two for a little user-written Excel spreadsheet or Access database to become a critical application that won't scale, then they call people like me.) BLongley 15:29, 8 Nov 2007 (CST)

Fresh Backup File Uploaded

The latest and greatest backup file, which contains ISFDB data as of late 2007-11-06, has been uploaded. I believe it contains all of the tables that are needed to drive the applications, but this is my first attempt, so please let me know if there is anything missing or otherwise funky. I will try to post a new backup file weekly, various gods willing. Ahasuerus 23:59, 7 Nov 2007 (CST)

I don't know if it's enough to drive the application, but it's certainly good enough for my SQL investigations. Thanks! BLongley 14:45, 8 Nov 2007 (CST)
And now I can see things like the reference_id in "verification" NOT being the same as the reference_id in "reference"... strange, but useful to know. BLongley 15:50, 8 Nov 2007 (CST)
I don't think the name of the verifier gets wiped out if the pub is unverified. The name probably doesn't change until someone else verifies it.--swfritter 16:34, 8 Nov 2007 (CST)
My pubs moved from "Primary" to "Primary (transient)" seem OK. I can't recall anything I've unverified that I didn't verify another way though. But it seems that the reference_id in the "verification" table is the RELATIVE number, not the ABSOLUTE one: so although "Primary (transient)" is reference_id 17 in the "reference" table, it's actually just the 12th entry there, and "12" is what you'll see in the "verification" table. References 1-10 seem to match, but "Currey" (13, but use 11) and my invention (17, but use 12) don't.
I really should learn Python sometime. But as work demands mastery of XSL by the end of next week, and some XSL-FO, and I need to have five years Java or .NET experience by the middle of next year, it's a bit low down the list of things I have time to learn. Fortunately my SQL has translated from Oracle to MySQL reasonably easily, although I miss PL/SQL here. (Which is why I need Python or Perl or something to make up for that lack.) BLongley 17:12, 8 Nov 2007 (CST)
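The relative-versus-absolute numbering described above can be sketched in a few lines. This is only an illustration, assuming the "reference" table holds exactly the ids mentioned in this thread (1-10, 13 for "Currey", 17 for "Primary (transient)"); the function names are hypothetical and nothing here reflects the actual ISFDB code.

```python
# Assumed contents of the "reference" table's reference_id column,
# taken from the observations in this thread: 1-10 contiguous, then 13 and 17.
REFERENCE_IDS = list(range(1, 11)) + [13, 17]

def ordinal_to_reference_id(ordinal, reference_ids=REFERENCE_IDS):
    """Map the 1-based position seen in "verification" to the absolute id."""
    ordered = sorted(reference_ids)
    if not 1 <= ordinal <= len(ordered):
        raise ValueError(f"no reference at position {ordinal}")
    return ordered[ordinal - 1]

def reference_id_to_ordinal(reference_id, reference_ids=REFERENCE_IDS):
    """Map an absolute reference_id to the position used in "verification"."""
    return sorted(reference_ids).index(reference_id) + 1

print(ordinal_to_reference_id(11))   # "Currey"
print(reference_id_to_ordinal(17))   # "Primary (transient)"
```

Positions 1 through 10 map to themselves, which would explain why the mismatch only shows up for the later entries.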

(unindent)I have uploaded the latest backup file, which contains data as of this (early) morning ISFDB time. Even after streamlining the process, it remains rather time consuming, about 2 hours from start to finish. I'll try to stick to the weekly schedule, but I expect it will slip from time to time :( Ahasuerus 22:36, 12 Nov 2007 (CST)

Our Backup Discussion Group on Google

As some of you know, we have a backup Google Group in case the Wiki crashes and stays down for a long time. It is currently configured to "hold messages from non-members for moderation". Unfortunately, the volume of spam messages has been steadily climbing and I have to wade through 30-40+ messages advertising Viagra, lotteries, etc every day.

We haven't used the group since August and we have had only one legitimate message sent from a non-standard e-mail address since we began using it in April, so I am about to change the group's settings to "only members can post". If you think that you may need to use the group in the foreseeable future, please apply for membership and I will approve it within a few hours. Ahasuerus 16:41, 11 Nov 2007 (CST)

I think I asked this once before, but does such a membership require that I reveal an email address to the world? I usually create an (incoming-only) email address for any company that I deal with that may want to "helpfully inform me of other products and services I may be interested in" - I can redirect those to the Company's complaints department, or their CEO, or uce.gov, or equivalent, if they get misused. Redirecting all of USENET or readers thereof isn't as easy, and I know my last e-dress used for Newsgroups now gets 30-40 messages per HOUR. BLongley 17:41, 11 Nov 2007 (CST)
Well, you have to have a Google Groups account to access Google Groups and I don't think you can get one without giving an e-mail address to Google. Google scrambles your e-mail address when it displays messages from you, but I suppose it could be unscrambled if somebody really wanted to. However, I can change group settings so that the whole group would only be accessible to members, which should make it much harder for spammers to get in.
You can never be 100% sure, of course, but on the plus side, my email.com address has been used on Usenet over the last few years and I get 0-2 spam messages per day, so it's not that bad any more. Perhaps it's just that Usenet is no longer as popular as it once was. Ahasuerus 05:25, 12 Nov 2007 (CST)

Creating a list of prospective Essay series in Magazines

I have been approving Hall3730's cleanup submissions of various issues of Isaac Asimov's Science Fiction Magazine and noticed that many essays and interior art entries could be organized in series, e.g. "Mooney's Module" or Baird Searles' "On Books". I am also thinking about writing a script that would identify suspected duplicate titles, which will identify a lot of these latent series, and I am wondering whether we want to have a Wiki page where we could briefly document these series as we find them? Ahasuerus 22:53, 17 Nov 2007 (CST)

Some of the series lists are now on the magazine Wiki pages. See the Analog Wiki where Davecat has been doing an excellent job of filling in the series data and also the Fantastic Wiki where some series data has been started. As previously discussed these series lists can be quite long but sorting issues with series make it difficult to break the series down by year.--swfritter 12:18, 18 Nov 2007 (CST)
Thanks, that makes sense, I must have missed parts of that discussion. I will add these series to the IASFM's Wiki page. Ahasuerus 15:04, 18 Nov 2007 (CST)

Isaac Asimov's Science Fiction Magazine

I have a "Special Issue" December 1982 in which 2 chapters of "Foundation's Edge" appear. Scattered throughout the chapters are 16 commentaries by other authors. Do I capture them in the magazine's listings as separate essays? rbh 23:05, 20 Nov 2007 (CST)

Yes, that's what I would be inclined to do. Ahasuerus 00:08, 21 Nov 2007 (CST)

New Backup File Uploaded

A brand new backup file has been uploaded. Ahasuerus 23:05, 21 Nov 2007 (CST)

Arbitrary (Placeholder) Page Numbers?

I ran across this G. C. Edmondson Ace Double the other day and was surprised to see the following comment in the Note field: "The page numbers for the collection are unknown so arbitrary numbers have been assigned". And sure enough, the page numbers in the collection half of the dos-a-dos were 5,15,25,35,45,55 and 65. I can check my copy on December 1 and enter the actual page numbers, but I was wondering if this type of placeholder information is useful (as opposed to potentially misleading)? Ahasuerus 21:19, 22 Nov 2007 (CST)

This is something I did with the idea they would be replaced by myself from my copy (if I have it) or by someone else. I mostly did it to put them into a more visual order. I had put off dealing with the Ace Doubles until I had the right way of recording them. My first run to put them in a more consistent state has left me with the solution (novel/collection types) I was looking for. Kraang 22:16, 22 Nov 2007 (CST)
I think I'd find it misleading. Admittedly I'm usually dealing with magazines (so far, anyway), but I often look at the listing without so much as glancing at the metadata, including the notes. And if I'm entering data where someone else has partially entered content, I'd much rather enter into an empty field than overwrite data that's already there. Just my $0.02 cheap. -- davecat 11:22, 23 Nov 2007 (CST)
I found one of these that was verified by an editor who has not been active for awhile so obviously he verified it without even owning it; it was earlier this year when verification expectations were less stringent. I entered the correct data from my own copy. In another case this caused a great deal of confusion. I don't remember the title, but Robert Silverberg wrote two unrelated stories with the same title - one of which was a novella and one a short story. The bogus numbers made it difficult to determine whether the collection contained the short story or the novella and I did not own the Ace double which contained the collection.--swfritter 13:26, 24 Nov 2007 (CST)
Verification is one of those things we've never yet agreed on, so I can't be so sure "obviously he verified it without even owning it" - somebody may have added data to a verified (with minimal data) pub after the fact, without leaving notes as to the post-verification edits. That's why we started discussions about when and how it's OK to mess about with verified pubs... and that keeps petering out too. Mostly, I just leave queries on the verifier's talk page for now - and if the verifier isn't active any more then that's probably not going to be of any use in the end, so I can understand people going ahead anyway even if I don't think it's RIGHT to do so without a note at least. I do add Cover-art and THEN query rather than wait for answers before making the edit, on infrequently published books: if it's totally unverified I'll probably edit and leave notes in the publications that MIGHT have come from a better source. "Bogus page numbers" are OK to me so long as there's a note. I can't see a reason to make them 5,15,25,35,45,55 and 65 - I'd rather see them all OBVIOUSLY bogus in case I missed the note though. But I can understand making the collection's contents sort after the other entries. BLongley 18:08, 24 Nov 2007 (CST)
Just a quick note to the effect that American participation in any non-trivial discussions may be limited this week. Many (most?) American editors are still trying to recover from devouring 30lb turkeys and surviving travel/family reunions. Having said that, I agree that the "5,15,25,etc" scheme seems to be potentially confusing. Ahasuerus 18:38, 24 Nov 2007 (CST)
Will I have my decoder ring taken away for causing all this confusion for the "mag people"? :-)Kraang 19:32, 24 Nov 2007 (CST)
No, but we may tickle you until you enter every last one of R. L. Fanthorpe's Badger Books in the database! ;-) Ahasuerus 20:15, 24 Nov 2007 (CST)
I came across this site once before and found it interesting. A quick sampling revealed some variant titles that aren't in the database and there don't appear to be many of this publisher's books in the database either. Once I'm finished with the novel-anthology mismatches I think I'll have a crack at entering some of the Badger Books. When I was in London recently I saw a couple of them in Camden Town; I would have bought them but I was already lugging too much stuff home. Kraang 21:31, 24 Nov 2007 (CST)
Clearly, not only does tickling work, it works retroactively! :) Ahasuerus 02:59, 25 Nov 2007 (CST)
Whereabouts in Camden? And how much? I've only come across 3 of them so far. BLongley 12:43, 25 Nov 2007 (CST)
The old Horse Hospital (beside the catacombs that are under renovation) where the antique stalls are, top level. One of the booths has a lot of books, that's where I saw them. The booth is a bit of a mess and you'll have to have a good look around. I think the asking price was about 3-5? pounds. Good luck. WARNING: Don't go into "CyberDog", you may come out looking like a "Borg" from Star Trek! :-)Kraang 14:03, 25 Nov 2007 (CST)
Ah, a bit pricey then - I've only paid 1-3 quid for Badger Fanthorpes so far. (And now I find my cleaner has sorted the pseudonyms out, so I've no idea where the others are - but I must have more than three as I can only find the R. L. Fanthorpes now, and know there were at least two more under pseudonyms.) Still, a new place to visit is good, even if they aren't doing the "25p each, eight for a pound" or "10p each, 12 for a pound" deals that have made this house a bit overloaded recently! BLongley 16:09, 25 Nov 2007 (CST)
By the way, what's WRONG with looking like a Borg from Star Trek? I'm sure there's more people than me alone that have fantasies over Seven of Nine. Of course, I don't really want to have an arm replaced with a kitchen blender unit or suchlike, but most UK kids have played "I AM A DALEK!" at some point. I still want to be a cyborg when I grow up. IF I grow up. ;-) BLongley 16:09, 25 Nov 2007 (CST)
Anybody remember that old 50's movie - "The Attack of the Mag People"? Nertzel from the planet Froidonker boiled down old pulp magazines and formed semi-intelligent creatures whose mental capacities were even further diminished by chemicals in the glue that held them together. Wrestlers from the Amphitheater in Chicago were sent to protect the nation but could not get a good grip on the creatures because the edges of the monsters were not trimmed. The hero of the movie finally created some creatures from a better grade of paper. They easily digested the pulp monsters.--swfritter 16:47, 25 Nov 2007 (CST)
But seriously, I would go with Blongley's suggestion of using an obviously made up numbering system. The books in question here appear to all be collections that were packaged as part of Ace Doubles. The Locus Index to Pre-1984 Anthologies and Collections probably lists all of them and based on comparing a sampling of books in my collection they also list the stories in the order they were printed in the books. Perhaps it might be better to use a sequential numbering system starting with 1 and having no gaps. The page numbers would then be used to list the titles in the same order that they appear in the collection. The methodology could be documented in notes as it is now. Such a methodology might also make more sense to the casual user - I certainly hope that people other than editors are accessing The ISFDb. --swfritter 16:47, 25 Nov 2007 (CST)
Point taken, as I work my way through the Ace Doubles I have, I'll change the unknown ones after consulting "Contento" to the sequential numbering system and adjust the note to explain it better.Kraang 19:36, 25 Nov 2007 (CST)
It looks like your solution for the Ace Doubles is a good one. Eventually an editor with copies of the missing titles will show up.--swfritter 14:29, 26 Nov 2007 (CST)
Yes, "1,2,3,4,5,6,7" would be more clearly bogus than "5,15,25,35,45,55,65" - but so would "99910,99920,99930,99940,99950,99960,99970", leaving room to adjust things with fewer edits. (Who else remembers renumbering lines of a computer program and wishing there'd been larger gaps left?) Still, so long as it's bogus and sorts well there's no particular benefit to any particular numbers. Any suggestions for the NEXT way we'll misuse this field? ;-) What happens if we extend "fep" and "bep" to create "bfep" and "ffep" and then "fbfep" and "bbfep", like NNE and NNW for compass points? (Apart from driving Al COMPLETELY mad of course!) BLongley 17:07, 26 Nov 2007 (CST)
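Both placeholder schemes suggested in this thread come down to a start value and a step. The sketch below is purely illustrative; the function name and its parameters are made up here and are not part of the ISFDB software.

```python
def placeholder_pages(count, start=1, step=1):
    """Return `count` obviously bogus page numbers that still sort in order."""
    if count < 0 or step <= 0:
        raise ValueError("count must be non-negative and step positive")
    return [start + i * step for i in range(count)]

# Gapless scheme: preserves story order and nothing more.
print(placeholder_pages(7))

# High, gapped scheme: leaves room to slot in entries with fewer edits.
print(placeholder_pages(7, start=99910, step=10))
```

Either way, the note on the publication still has to say that the numbers are placeholders.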
Is there a glossary somewhere which explains "fep", "bep", etc., etc.? -- davecat 20:27, 26 Nov 2007 (CST)
Oh yes, it's all documented on the "New Publication with Contents" Help page:
  1. c -- front cover
  2. fep -- front end paper, or inside front cover of a magazine
  3. bp -- unpaginated pages that precede pagination
  4. ep -- unpaginated pages that follow pagination (although generally we would expect people to count forward to find a page number)
  5. bc -- back cover
  6. bep -- back end paper, or inside back cover of a magazine
  7. ## (actual count) -- actual handcounted pages, not counting the front cover or front endpaper, for a book without page numbers
  8. <text> -- descriptive of the location in some other way. E.g. "Inset artwork on poster inserted with this magazine".
Ahasuerus 22:52, 26 Nov 2007 (CST)
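For anyone scripting against the page field, the glossary above fits in a small lookup table. This is a rough sketch only: the descriptions are copied from the list, while the function name and the treatment of everything else as either a plain page number or free text are assumptions made for illustration.

```python
# Codes and descriptions taken from the Help page list quoted above.
PAGE_CODES = {
    "c": "front cover",
    "fep": "front end paper, or inside front cover of a magazine",
    "bp": "unpaginated pages that precede pagination",
    "ep": "unpaginated pages that follow pagination",
    "bc": "back cover",
    "bep": "back end paper, or inside back cover of a magazine",
}

def describe_page(value):
    """Give a human-readable description of a page field value."""
    cleaned = value.strip()
    if cleaned.lower() in PAGE_CODES:
        return PAGE_CODES[cleaned.lower()]
    if cleaned.isdigit():
        return "page number"
    # Assumption: anything else is a free-text location description.
    return "descriptive text"

print(describe_page("fep"))
print(describe_page("65"))
```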
In the meantime, the cause célèbre has been Verified. We'll presumably want to let Contento know that "The Misfit" is a short story and not a novelette. Ahasuerus 00:31, 1 Dec 2007 (CST)
Contento's database has been modified and its Web version will be updated shortly. Ahasuerus 22:14, 11 Dec 2007 (CST)

2007-11-27 Backup File Uploaded

The latest backup file has been uploaded. Have at it! Ahasuerus 23:48, 28 Nov 2007 (CST)

Database Entropy

For what it's worth, it occurred to me while I was entering some data, that a lot of the cleanup work we do lowers the entropy or disorder in the system. Work such as entering missing pubs referenced by book reviews, page numbering stories and articles in digests, title merge, and so on. It seemed logical to me that a database should have some way to measure and display this. If it's not built in, maybe we should ask for it to be implemented? This way we would be able to quantify the effects of work done on the database. Maybe charts showing entropy change over time, and rate of change, for the whole database, and maybe for selected areas, like books, or magazines?--Rkihara 22:31, 29 Nov 2007 (CST)

That's a very good question -- or rather a whole bunch of related questions -- and I have been thinking along similar lines for some time. We have a variety of Wiki pages linked from the Bibliographic Projects in Progress portal, notably the ISFDB:Data Consistency page. I wrote a number of scripts to analyze Title vs. Publication Type Consistency a couple of months ago and I plan to write even more when I get a chance. There are other paths that we could pursue to ensure that our data is both complete and accurate, e.g. automated validation of our records against OCLC and other library catalogs, but that can get tricky. I am traveling today and will be doing Verifications over the weekend while I have access to my collection, but may post at greater length on the subject early next week. Ahasuerus 15:05, 30 Nov 2007 (CST)
It's a bit hard to measure entropy. E.g. if we could accurately measure the titles that still needed merging, then we could do the merges automatically. Unnumbered contents sometimes don't matter, if there's only a Novel and its introduction. Titles with no contents are something I'm working on, but I couldn't yet measure the ones that NEED contents accurately. We can provide some measures: but unless we know what the final result should be, we can't measure how far away from it we are. Do we have too many authors due to a lack of merges, or too few as we have missing variants? Should we have zapped a few publications as non-genre to clear some up? (Only today I discover we have a book by a former England Football manager, about football, co-authored by someone that might be the same person that wrote the book that the film "Straw Dogs" was based on, and if so, has been credited for such on the cover of Micronauts books that do qualify for entry here...) I'm all for some encouraging statistics, but how to make them meaningful I don't know... I suspect tying up dead links is GOOD, and therefore that number decreasing is GOOD too: titles increasing may be good, pubs increasing may be good, but decreasing some of these for the right reasons is good too. Who can suggest statistics a) we CAN generate and b) ARE meaningful? BLongley 16:35, 30 Nov 2007 (CST)
The entropy measurement need only be qualitatively accurate, since this measurement will not be used to rework the data. I would suggest something like the following: For magazines, titles entered without page numbers, book reviews unlinked to book data, missing features in magazines known to have them, empty fields that would usually be filled with data, dates that read xxxx-00-00. For books, no price, date, page count, etc. Authors could be checked by similarity, pseudonyms by checking against a master list. Verified pubs are at zero entropy (or at least minimum entropy) by definition, and as their number increases, the entropy calculation will become more refined.--Rkihara 18:37, 30 Nov 2007 (CST)
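A few of the checks listed above are easy to express as simple counts. The sketch below is only a rough illustration: it assumes publication records are available as dictionaries with hypothetical "price", "pages" and "date" keys, which is not necessarily how the ISFDB tables are laid out.

```python
def is_incomplete_date(date):
    """True for missing dates and for year-only dates such as 1965-00-00."""
    return not date or date.endswith("-00-00")

# Each check returns True when the record adds to the "entropy" count.
CHECKS = {
    "no price": lambda pub: not pub.get("price"),
    "no page count": lambda pub: not pub.get("pages"),
    "incomplete date": lambda pub: is_incomplete_date(pub.get("date")),
}

def entropy_counts(pubs):
    """Count how many records fail each check."""
    return {name: sum(1 for pub in pubs if check(pub))
            for name, check in CHECKS.items()}

# Made-up sample records, for illustration only.
sample = [
    {"price": "$1.50", "pages": "192", "date": "1965-00-00"},
    {"price": "", "pages": "", "date": "1982-12-00"},
    {"price": "$2.95", "pages": "256", "date": "1984-06-01"},
]
print(entropy_counts(sample))
```

Run against successive backups, counts like these would give the trend over time without anyone having to decide what the "final" state of the database should be.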
I see that Bill has begun working on it. Thinking back, I recall that I posted some stats of our binding types in October. Perhaps we could create a new Project page where we could document our findings?
Sure, you create the pages and I'll add to them as and when I create/find something related. I'm not sure whether my SQL experiments are being ignored or just haven't been found - some seem to have proved quite useful, some still need more work as they only provide guidelines to dodgy areas that need human eyes - much like the Data Consistency pages, where an entry may indicate a problem or maybe not. Some just lead to a standards discussion: e.g. a lot of the binding types can obviously be cleaned up immediately, the audiobooks probably need a discussion on how many categories we need. (If anyone cares, I think the Cassette/Audio CD/ MP3 CD categories should be kept separate for a start, as they're not necessarily interchangeable for a user. Number of Cassettes/CDs might be useful, could we use "Pages" for that? And Unabridged/Abridged is quite an important distinction in audiobooks... not that I own any, hence me not starting the discussion.)BLongley 13:38, 4 Dec 2007 (CST)
Also, it occurs to me that there are two basic types of data integrity issues in the database. The first type is limited to individual fields, e.g. we have 3507 occurrences of "unk" and 4 occurrences of "paperback" in the binding field. The second type has to do with discrepancies between two or more fields in one or more tables, e.g. back in the late 1990s our automated data harvesting resulted in hundreds of alleged "tp" publications from the 1980s even though their price is mostly in the $1.50-$2.95 range.
I looked at "unk" for a bit and most seemed to be audiobooks. Not good enough for an update script though, unless someone thinks it would be an improvement to switch them over and leave a note. BLongley 13:38, 4 Dec 2007 (CST)
I also created a few "crib sheet" tables of "reasonable prices for format and year" - again, these could be used as reference for "this is suspicious" type queries, but there'll always be an occasional "World Book Day" pub priced suspiciously low or a special edition priced suspiciously high. Still, we can create the scripts and see if they're useful, it's better than just working through title-id s 1,2,3,4,5,etc.... BLongley 13:38, 4 Dec 2007 (CST)
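A "crib sheet" check of this kind might look roughly like the following. Note that the price ranges in the table are invented placeholders to show the mechanics, not researched figures, and the binding codes and function name are likewise assumptions.

```python
# (binding, decade) -> (lowest plausible price, highest plausible price).
# PLACEHOLDER values for illustration only; a real crib sheet would be
# built from verified publications.
PRICE_RANGES = {
    ("pb", 1980): (1.50, 3.95),
    ("tp", 1980): (4.95, 14.95),
}

def is_suspicious(binding, year, price):
    """Flag a price outside the expected range for its binding and decade."""
    decade = year - year % 10
    bounds = PRICE_RANGES.get((binding, decade))
    if bounds is None:
        return False  # no crib sheet entry, so nothing to compare against
    low, high = bounds
    return not (low <= price <= high)

# A 1984 "tp" priced like a mass-market paperback gets flagged.
print(is_suspicious("tp", 1984, 2.50))
print(is_suspicious("pb", 1984, 2.50))
```

A flag here only means "worth a human look"; special editions and promotional pubs will always produce some false alarms.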
I can relatively easily (time permitting) write a bunch of scripts that would identify both types of data integrity issues and post the results here. However, I think it would be more useful to spend a little more time on each script and create lists of potentially discrepant records for subsequent cleanup in addition to posting the entropy numbers. Anybody have a few dozen waking hours that I could borrow? :) Ahasuerus 00:28, 4 Dec 2007 (CST)
Some of Rkihara's suggestions are very simple and easy, not very useful if people WANT to clean them up (where are you going to get a missing page count for a book from? Hopefully not Amazon), but the statistics should show our improvements. I've no idea where the proposed "master list" would come from though. BLongley 13:38, 4 Dec 2007 (CST)
As to waking hours: Nope, I'm limited myself till the 15th December. BLongley 13:38, 4 Dec 2007 (CST)

(Unindent) Update on the first three measurements:

Magazine titles entered without page numbers shows improvement: less bad, more good: Before:

Bad	53750	
Good	48353	

After:

Bad	52599
Good	52163

Pubs without pages shows improvement: less bad, more good: Before:

Bad	19535
Good	82131

After:

Bad	19502
Good	84153

Pubs without Prices is mixed: more bad, more good: Before:

Bad	13118
Good	96196

After:

Bad	13468
Good	97749

These may all be distorted by inaccurate SQL, of course. I'm not an expert here, hence me still dumping thoughts here rather than "definitive" answers anywhere else. BLongley 14:02, 9 Dec 2007 (CST)

Pretty interesting. Reworking your numbers as percentages.
Magazines without page numbers, 52.6% to 50.2%
Pubs without pages, 19.2% to 18.8%
Pubs without prices, 12.0% to 12.1%
--Rkihara 15:58, 9 Dec 2007 (CST)
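The percentages can be recomputed from the raw Bad/Good counts posted above as bad / (bad + good). A small sketch, using only those counts; the function name is made up, and hand-rounded figures may differ in the last digit.

```python
def bad_share(bad, good):
    """Percentage of records counted as 'Bad', rounded to one decimal place."""
    return round(100.0 * bad / (bad + good), 1)

# (label, (bad, good) before, (bad, good) after), copied from the post above.
MEASUREMENTS = [
    ("Magazines without page numbers", (53750, 48353), (52599, 52163)),
    ("Pubs without pages", (19535, 82131), (19502, 84153)),
    ("Pubs without prices", (13118, 96196), (13468, 97749)),
]

for label, before, after in MEASUREMENTS:
    print(f"{label}: {bad_share(*before)}% to {bad_share(*after)}%")
```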
More entropy numbers:
ISFDB:Invalid characters in Publication titles: 14 as of the 2007-11-27 backup
Malformed ISBNs as of the 2007-11-27 backup:
Valid ISBN - 86910 (83.95%)
Catalog ID - 6608 (6.38%)
No ISBN - 7955 (7.68%)
B000 (from Amazon.com) - 1019 (0.98%)
ISBN is not 10 digits - 836 (0.81%)
Fails checksum validation - 200 (0.19%)
Once I double check Bill's numbers, I will finally create a Project page just for the entropy numbers. Ahasuerus 10:22, 10 Dec 2007 (CST)
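For reference, the ISBN-10 checksum test works by weighting the first nine digits 10 down to 2 and choosing a check digit (with X standing for 10) that makes the total divisible by 11. The sketch below applies that rule; the order of the checks and the handling of the other categories are assumptions, not the script that produced the numbers above, and it makes no attempt to recognise catalog IDs.

```python
DIGITS = set("0123456789")

def isbn10_check_digit(first9):
    """Compute the ISBN-10 check character for nine leading digits."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)

def classify_isbn(value):
    """Sort a raw ISBN field into one of the categories listed above."""
    cleaned = value.replace("-", "").replace(" ", "").upper()
    if not cleaned:
        return "No ISBN"
    if cleaned.startswith("B000"):
        return "B000 (from Amazon.com)"
    # Assumption: wrong length or stray characters both land in this bucket.
    if len(cleaned) != 10 or not set(cleaned[:9]) <= DIGITS:
        return "ISBN is not 10 digits"
    if isbn10_check_digit(cleaned[:9]) != cleaned[9]:
        return "Fails checksum validation"
    return "Valid ISBN"

print(classify_isbn("0-306-40615-2"))
print(classify_isbn("0306406153"))
```

The same check-digit function covers the SBN case as well, since a nine-digit SBN becomes an ISBN-10 by prefixing a zero before computing the checksum.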

Merovingen Nights

There is a Usenet discussion of the finer points of Merovingen bibliography if anyone wants to follow up. I won't be able to do much about it until December 29. Ahasuerus 07:00, 3 Dec 2007 (CST)

I happen to be entering some of those. I'll see what I can do. Dana Carson 15:27, 3 Dec 2007 (CST)
Synchronicity strikes again! Where is Lionel Fanthorpe when you need him? :) Ahasuerus 15:38, 3 Dec 2007 (CST)
I've just been ignoring the problem and letting Dana get on with it until everyone sorts out the adjectives - Merovingan, Merovingian, Merovinian, Merovingain? The "platoon of crack bibliographers" is awaiting further guidance. OK, I mean THIS cracked bibliographer is awaiting some sort of impulse to go drag the books out of the spare room. Or the cassette tapes out of wherever my cleaner put them, to read J-Cards. I do know I'm a little wary of pubs like this where I seem to be verifying a load of stuff I never actually entered and actually have little interest in. Still, I own a few (I think) and can at least moan from a primary verification POV, or just unverify and make it Someone Else's Problem. BLongley 17:19, 3 Dec 2007 (CST)
As to Fanthorpe, I understand he's still alive and well and living in South Wales. I'm alive and not so well and within driving distance: perhaps Reverend Me could meet Reverend Him and get some research done? BLongley 17:19, 3 Dec 2007 (CST)

Bookscans.com

Has anyone stumbled upon this website? It catalogs an extremely large library of cover scans from vintage paperback books. I'm thinking we could ask permission to link to those covers from the ISFDB. Mhhutchins 12:29, 9 Dec 2007 (CST)

I've seen it before, but it hasn't covered many of the titles I work on. As he's soliciting contributions, maybe some of our scanners can help with contributions before we ask for his bandwidth? If he has to sell CDs to cover the site costs, our extra traffic may not be too welcome. BLongley 12:44, 9 Dec 2007 (CST)

Prioritizing Al's time in December

As Al recently indicated, he will have some time to work on fixing the software in late December. We may want to review our outstanding bugs and prioritize them before Al starts so that we would use his precious ISFDB time most effectively. Also, he is a very fast coder -- for example, the whole voting system was coded in one morning -- so we probably want more bugs lined up rather than fewer in case he runs out of things to do. As you can see on his Talk page, I have requested two things so far: adding support for ISBN-13s and addressing the problem with Contents edits affecting multiple Publication records. It's probably best to have things discussed here first instead of cluttering Al's Talk page.

Please note that we have an "official" bug list at SourceForge (and a deprecated one at ISFDB Bug List) and an "official" feature request list at ISFDB Feature List. Ahasuerus 15:58, 9 Dec 2007 (CST)

P.S. We may want to reconcile our old, now deprecated, bug list with the list at SourceForge to save Al time. Also, the last time I went over the bug list (6 months ago?), some of the bugs had been ninja-fixed, so another review may be in order. Ahasuerus 16:16, 9 Dec 2007 (CST)

Two I'd really like to see fixed are the tamu.edu links & the "ANDNOT" (shd be "AND NOT") in the advanced search. (The latter is almost certainly as simple as changing "ANDNOT" to "AND NOT" in one line of code.)
Linking back to Amazon.co.uk (rather than Amazon.com) when we have an ISBN and a price starting "£" rather than "L" should be pretty quick too, and I'd like that as editors are using the correct currency symbol now. There's also an Amazon.Ca that we might be able to link to if we had a standardised way of denoting Canadian dollar prices? We can also remove the broken links to B&N. Maybe add some to Fantastic Fiction, now we're using their cover images? BLongley 16:47, 10 Dec 2007 (CST)
Well, I've seen the change in the list of "Other Sites" Links now, and I for one am finding it very useful - thanks Al! BLongley 17:06, 18 Dec 2007 (CST)
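For illustration only, the price-based choice of Amazon site discussed above might look like the sketch below. This is not the ISFDB code. The function name is hypothetical, the "C$" prefix for Canadian dollars is an assumed convention (the thread notes that none had been standardised), and the older "L" prefix is assumed to remain valid alongside "£".

```python
def amazon_host_for_price(price: str) -> str:
    """Pick an Amazon host name from the currency prefix of a price string.

    Assumptions: "£" (or the older "L") marks a UK price, and a
    hypothetical "C$" prefix marks a Canadian price. Anything else
    falls back to Amazon.com.
    """
    price = price.strip()
    if price.startswith("£") or price.startswith("L"):
        return "www.amazon.co.uk"
    if price.startswith("C$"):
        return "www.amazon.ca"
    return "www.amazon.com"
```

The function only returns the host. Building the full link from the ISBN is left out because the exact URL pattern is not given here.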
I'm not sure how quick this would be, but one that's been annoying me often is the lack of multiple series support. At two levels, but tonight this one: e.g. this pub is part of the "Master SF Series" and this one is part of the "Corgi SF Collector's library". (Where the "F" may be Fiction, Fantasy or Fact.) Worth recording, I feel, but it's a publication level series and we don't have that. I'm not sure if it's easier to add pub-level series that won't interfere with title-level series, or just enable multiple series for titles that need it (e.g. many Moorcock books). I think it's worth a look though, and "you have to be kidding!" is a valid response! BLongley 15:57, 12 Dec 2007 (CST)
There has been some talk about publication series, which could help with "Ace Science Fiction Specials", "Corgi SF Collector's library", etc. I don't know how easy it would be to implement, but I think it's really something that needs to be done at the Publication level as opposed to the Title level. It does resurrect the ever vexing "edition vs. printing" question, though. Ahasuerus 16:58, 12 Dec 2007 (CST)
Being able to define sort orders for parent series would be great but it might be enough, and relatively simple to implement, if the parent series sorted alphabetically. In many cases the only difference in the series text is the year data - for instance "Editorial (Fantastic - 1963)", "Editorial (Fantastic - 1964)", etc.--swfritter 17:22, 12 Dec 2007 (CST)
I think there are a few data elements that we could add quickly (I hope) which would enable us to capture and display our data much better. "Series order within parent series" would be one such element and "Printing number" would be another, the latter helping with organizing Title bibliographies, especially when the title has been reprinted multiple times. Adding a new field for "non-genre" (and decoupling existing Novel records accordingly) would enable us to support non-genre short fiction, which is currently a major headache.
Other low hanging (I hope) fruit would be:
  • changing the Summary Bibliography display logic to show short fiction series even when they don't contain book length entries
  • adding a "Remove Author Pseudonym" option with a list of existing pseudonyms and check boxes
More significant than initially realized. Once somebody makes the decision that a certain name should be the canonical name there can be a cascade of editors assigning titles to that canonical name. This is particularly true for artists. We should not really assign a canonical name for them until the data is in the system.--swfritter 16:19, 17 Dec 2007 (CST)
  • changing the Series display screen not to display variant titles (a major pain when moving series data to the canonical name)
  • changing the Summary Bibliography display logic to display Non-genre series (e.g. Sherlock Holmes in Arthur Conan Doyle's bibliography)
  • changing all isfdb.tamu.edu links to our Help pages to www.isfdb.org -- a major issue for our new editors who think that our Help pages are gone
Ahasuerus 23:01, 12 Dec 2007 (CST)
A thing high on my list would be the ability to save entries as you are working on them. Working on a magazine today, I entered all the artwork, some missing essays, and the reviews. 30 minutes work, went to submit only to lose it all to the ISFDB being down. In the past, lost it due to internet problems. Typically, I enter page numbers and missing essays, submit, wait for moderator review, add artwork, submit, wait for moderator review, enter the reviews, submit. Lose less work that way but it takes a lot of days to get through a single magazine. Over the past year, I have lost almost 10 hours of work. Having some way to save and then continue with editing until all the entries are made would be a real help to me and increase my productivity. Thx, rbh 19:20, 16 Dec 2007 (CST)
If you use Firefox and the ISFDB is down when you click the Submit button, all data is saved by Firefox. Just wait for the server to come back up and then hit F5. Firefox will ask you if you want to resubmit the form and when you answer "Yes", it will send the data to the server. Internet Explorer isn't as forgiving or at least it wasn't when I last used it. Ahasuerus 20:06, 16 Dec 2007 (CST)
I'll try that on the home computer but when I am on the road (about half the time), I am using the company laptop and downloading software to it is not permitted. Thx, rbh 20:31, 16 Dec 2007 (CST)
I have also had better luck with Firefox than Explorer when I use the back button to bring up a screen I thought I might have lost. Using the back button might work sometimes on Explorer.--swfritter 16:19, 17 Dec 2007 (CST)
I can't see there being a possible solution to your particular problem. If the ISFDB is down, it can't save your work for you either. You'd need a local application that can save it on your laptop, and submit saved changes later - but you can't install such an application. BLongley 18:19, 18 Dec 2007 (CST)
I wasn't as clear as I could have been. Normally when you are creating something complex or long, you save frequently and then continue; that is what I would like to see, rather than getting to the end and then not being able to save. I have started running two log-ins to ISFDB and, before saving, I use the second to check that I have a working internet connection and that the ISFDB is up. Takes longer but so does typing in the same thing two or more times. If it is not up, I just leave the browser up until it comes back before I save. rbh 15:24, 19 Dec 2007 (CST)
OK, you're thinking that the ISFDB (while up) should be able to save a partial submission for you to return to, without it going into the Moderator queue for approval (and therefore making you wait)? That should be possible, but I can't imagine it being quick to code though. You'd still risk SOME loss of data even then, so the local application still sounds a bit better. (I'm STILL not definitely offering to write such though!) BLongley 16:35, 19 Dec 2007 (CST)
If I couldn't use Firefox and was about to start entering a thick anthology/collection, I would probably do it in chunks of 3-4 Contents Titles. Of course, I can approve my own submissions and RBH can't. Hm, how close is RBH to self-sufficiency anyway? :) Ahasuerus 16:54, 19 Dec 2007 (CST)
However, I think I would LIKE such an application where we can work off-line in an ISFDB editing style (not necessarily with the same user-interface, but working from a recent backup of the data at least) and submit the updates later when we can connect. I know I occasionally visit a book-store that will let me check my book-list against their shelves for several hours, knowing that I'm going to buy a shelf-load before I leave: and they wouldn't mind me doing some "Primary (Transient)" verifications in the mean-time. Or adding some missing books that I wouldn't buy myself. I could probably write such an application, given time and preferably some help with Submission Formats - even a subset of the possible submissions would be useful. (E.g. I don't think we'd want Variant creations, or pub or title Deletions, available off-line.) BLongley 18:19, 18 Dec 2007 (CST)
Dup Candidates - Do not match Collections/Shortfiction, etc unless in similar title mode. Perhaps don't match items at all unless they are of the same type. In the case of Shortfiction do match even if the story lengths are different.--swfritter 16:21, 18 Dec 2007 (CST)
Agreed, definitely don't offer to match the long multiple-content titles with short ones like Collections/Anthologies with Shortfiction. With Shortfiction, do offer to match with Essay and Poem though - I've found that useful. Consider Nongenre Shortfiction as an option, and maybe "Song" as another option to "Poem" - although we could cover that with better help guidance. BLongley 18:19, 18 Dec 2007 (CST)
Of course, it's always possible to merge anything with anything using advanced search.--swfritter 15:55, 19 Dec 2007 (CST)
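The matching rules suggested in this sub-thread can be written down as a small sketch. It is not the existing Dup Candidates logic. The function name and the upper-case title type names are assumptions made for illustration.

```python
# Types that the thread suggests Shortfiction may still be matched against.
SHORTFICTION_PARTNERS = {"ESSAY", "POEM"}


def may_offer_as_duplicates(type_a: str, type_b: str,
                            similar_title_mode: bool = False) -> bool:
    """Decide whether two titles should be offered as duplicate candidates.

    Story length is deliberately not an input: Shortfiction records are
    matched even when their lengths differ.
    """
    a, b = type_a.upper(), type_b.upper()
    if a == b:
        return True  # same type always qualifies
    pair = {a, b}
    if "SHORTFICTION" in pair and pair & SHORTFICTION_PARTNERS:
        return True  # Shortfiction vs. Essay or Poem was found useful
    # Any other cross-type pair (e.g. Collection vs. Shortfiction) is only
    # offered in similar title mode.
    return similar_title_mode
```

Under these rules a Collection and a Shortfiction record with the same title are not offered as candidates in the normal mode, but they are still offered in similar title mode.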

(unindent)One other very quick (and very useful) thing to do would be to hyperlink Variant Titles from the canonical Title page. They can be hard to get to since they are not always linked from the Summary page, so you have to use the Titles page for the author or, even worse, for the pseudonym. This makes it very hard for new editors to Add Publication to the appropriate Title (or just edit them), so we end up with bad submissions that need to be intercepted and redone. Ahasuerus 19:14, 19 Dec 2007 (CST)

Low Hanging Fruit Part 2: Adding the name of the Verifier to the "WARNING: This publication has been verified against the primary source" message which moderators receive when approving changes to verified publications would be very useful since we could then immediately tell whether an editor is changing "his" or "someone else's" publication. Ahasuerus 00:26, 20 Dec 2007 (CST)

Tracking relationship of magazine cover art to interior story

Reposting a question from a new editor for general discussion:

Hi! I'm new. Question for you - are we tracking whether or not publications were illustrated in the original, and especially which stories inside a magazine have associated cover and interior illustrations, by what artists?

I'm particularly interested in the question of tracking printings, reprintings, and online availability of short fiction and am in conversation with some folks about automatically saving TOC - related info for collections that are being digitally scanned. Would love to coordinate efforts.

Netmouse 13:10, 10 Dec 2007 (CST)

(My answer is on Netmouse's Talk page). Ahasuerus 16:10, 10 Dec 2007 (CST)

We do not have a specific field designed to document the story illustrated by the cover of a magazine. As a result the only place to document this information is the magazine notes field which means that the inclusion of such data is not required of the ISFDb editor and not accessible for listings. I always try to include such data.--swfritter 20:20, 10 Dec 2007 (CST)
As I've been entering all these Analogs I've wished many times for this; all (or very, very nearly all) of those covers are intended to illustrate a specific story or article, & it would be good to put this in. I have not, however, been putting this in the notes. (For Analog, at least in the years I've got, in almost all cases, the first story or article after the editorial is the cover feature. And it wouldn't take very much to determine this just from the pub listings, which include the online cover images. Not much need to go back to the original magazines.)
It would really be nice to have a cover-title field in the pub record; I think that would be better than a separate content item, but I could be wrong. Dave (davecat) 11:07, 11 Dec 2007 (CST)
Well, the Coverart Title record is editable after entry, you don't have to leave it as "Cover: Pub Title". In which case making this visible on certain other screens could be the feature request, rather than adding a new field. I've noticed this isn't tightly coupled, as if you correct a title the Coverart record doesn't change. There's a 'title_ttype' of 'BACKCOVERART' allowed for too, which doesn't appear to be used. I wonder what Al intended for that? BLongley 13:21, 11 Dec 2007 (CST)
Aha. I think I'd once observed that there is a coverart record for magazines, but forgotten it. But the format does seem quite standardized. What would be a reasonable replacement? Maybe:
 Cover: Satan's World (Part 1 of 4) (Analog, May 1968)
or something like that? Dave (davecat) 05:46, 12 Dec 2007 (CST)
Online fiction. I have been using tags to document Project Gutenberg stories. See my User Page. There has been a certain hesitancy to document online sources for a number of reasons chief among them being the transient nature of web sites and the possibility that the ISFDb could be used as a portal for downloading copyrighted material. Baen's Universe is included only because they have downloadable versions and an official ISSN number. If they were a website only we would not include them.--swfritter 20:20, 10 Dec 2007 (CST)

Synopsis Data

I am beginning to wonder whether we want to police our synopsis data a bit better. For example, the synopsis that we currently have on file for Edmond Hamilton's The Star of Life is nearly incomprehensible in addition to being a review rather than a plot summary:

Classic. One of my all time favorites. Kirk Hammond is frozen in space and then enters the Earth 10,000 years in the future. He encounters three species of man kind: humankind who make up most of the galaxy, the Vramen who never die because they live on the restricted planet whose sun is the Star of Life. Mankind wants desperately to enjoy eternal life too, but the Vramen will not allow it. Because their children, and their children's children. The ending is superb along with the story.

Should we ask editors questions about these types of submissions rather than automatically approve them? Ahasuerus 16:29, 10 Dec 2007 (CST)

A lot of synopsis data is showing up on Wikipedia where articles are more thoroughly vetted. Wikipedia is a more appropriate solution for biographies and it may be a more appropriate venue for synopsis data also.--swfritter 17:06, 10 Dec 2007 (CST)
Yeuch! I can't recall ever approving a synopsis submission title-edit alone, and would have rejected that one if that was all it contained. Reworked it maybe if it came with other data. I see very little use of the synopsis field though and would support removing it - we link to Wikipedia for significant titles (this is used), and have support for "User Rating" (very little-used). We're bibliographers, not reviewers, aren't we? That field's existence alone suggests we welcome copy'n'paste entries from other sources (Copyright Problems) or long entries about why this book is "the bestest-ever best book you should all read!!!!" (self-publisher abuse). No use to me, no use to our users. (We keep forgetting them - try showing our "simple" interface to real people some time and watch them get confused by coverart entries and reviews and other stuff they do NOT want to see.) BLongley 17:18, 10 Dec 2007 (CST)
I believe there were two reasons why the Synopsis field was included in the original design specs. First, we had neither Wikipedia nor tag support in 1995. Second, libraries (from the Library of Congress down) do include brief (1-3 sentences) descriptions, typically provided by the book's publisher. The practice is particularly well established in the field of children's/YA literature, where a brief synopsis is almost de rigueur. Over the years, I have occasionally found these brief summaries useful when trying to determine whether a particular book was "non-genre" for our purposes. When working on OCLC reconciliation, I typically cut-and-paste "publisher's descriptions" in the Synopsis field since it takes all of 5 seconds. I also seem to recall a few Title records where the Synopsis field was used for borderline bibliographic purposes, e.g. to explain the difference between different versions of the text, so if we decide to zap it, we will want to carefully review what we have first. Ahasuerus 22:18, 10 Dec 2007 (CST)
Hmmm... I'm sure I've seen editors questioned about notes that seem to come from publisher's blurb, what's the cut-off point for fair-use? Or are you just letting OCLC decide for you? BLongley 12:29, 11 Dec 2007 (CST)
Whenever I quote from a publisher's description cited in OCLC and/or other library catalogs, I add "Publisher's description: " to the synopsis. I am not particularly worried about legal issues in this case since anything that is printed on standard 3 by 5 cards typically comes from the Library of Congress (or a similar national library) and is presumably fair game. However, I believe it is important to distinguish publishers' words from our users' contributions in part because publishers may embellish things (imagine that!) and in part because they have been known to make mistakes describing their own books (hey, we just publish the stuff!).
As an aside, I should note that not all libraries participate in OCLC and even when they do, some library records are considerably more detailed than what OCLC has on file. It's always a compromise between the convenience of using just one (or at most a few) records per edition in OCLC vs. being able to get much more comprehensive information by examining dozens of individual library records. We just need more hours in the day... Ahasuerus 12:59, 11 Dec 2007 (CST)
I seem to have a natural 26-27 hour cycle, so I'd support that. I wonder if that gives away which planet my ancestors came from? BLongley 13:27, 11 Dec 2007 (CST)
Having said that, I agree that tags have partially superseded our existing synopsis data, at least for our purposes. I am less sanguine about Wikipedia and its ability to do a consistently good job describing books (see our previous discussions for details), but we are already using it for various other purposes, so we might as well continue. Ahasuerus 22:18, 10 Dec 2007 (CST)
If I ever get free time, something I wanted to explore is wiki storage of the current description fields (title synopsis, title notes, publication notes, etc.). This would allow for wiki formatting and also audit logging of edits to the fields. It's lower priority but I'd like to add support for reviews. I had put about a dozen reviews on ISFDB in the synopsis field (though not the one for The Star of Life that triggered this discussion) and also had linked the reviews using the "Reviewed" tag. Tonight I realized these went against the spirit of "synopsis" and so for now cut/pasted the text to http://marc.kupper.googlepages.com/bookreviews while I decide what to do with the reviews. I probably can copy/paste the synopsis part of each review back into ISFDB but that'll take time to look at each review and to do any edits needed to generate a synopsis.
Regarding that "natural 26-27 hour cycle" - I'm similar and have been able to put it to good use when I was working on ships that were headed to the west. There we would set the clock back one hour each night meaning the days were 25 hours long. Fortunately, my job at the time involved flying to Europe and taking a ship back to the USA meaning I did not need to work on ships headed east and with 23 hour days. Marc Kupper (talk) 03:23, 13 Dec 2007 (CST)

Backup update notifications?

I uploaded the latest backup file on Monday and updated the backup date on the ISFDB Downloads page, but didn't post about it here. I know that at least half a dozen editors use the backup file to run searches, but I am not sure whether they check the Recent Changes list or whether I should continue posting here whenever I upload a new file, which usually happens around once a week. Ahasuerus 22:27, 12 Dec 2007 (CST)

Would it also be possible to keep the old backup files? Sometimes I want to look at what the db looked like at some point in the past (usually when trying to recover from an accidental edit). Marc Kupper (talk) 03:06, 13 Dec 2007 (CST)
I keep all of our officially posted (and some unposted) backup files on CDs, but I can't make them available online since there is no publicly accessible place for them on the TAMU server. Al may know more about these limitations, but for now the best way to get to our historical data is to download new backup files as they get posted. If there is interest, I could burn a few CDs worth of 2006-2007 backups and mail them to the interested parties. Ahasuerus 12:40, 13 Dec 2007 (CST)
It strikes me that it should be possible to load multiple copies of the backup and run comparisons: create ISFDB2/ISFDB3/ISFDB4/etc database versions and SOURCE different files into each one? There don't seem to be references to the database name in the backup; the first mention of 'ISFDB' alone is for the backup of this data - some unimportant record, right at the bottom. ;-) BLongley 13:31, 13 Dec 2007 (CST)
Oh yes, it is possible to have multiple copies of the database running in parallel in MySQL. I think I have 4 at the moment, although I haven't tried comparing their respective data yet. Ahasuerus 13:42, 13 Dec 2007 (CST)
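As a rough sketch of the side-by-side idea, the helper below writes out a script for the mysql command-line client that creates one database per backup and loads a dump into each. The function name and the dump file names are hypothetical. It also relies on the observation above that the backup does not name its own database, so each dump loads into whichever database is currently selected.

```python
import re


def build_load_script(backups: dict) -> str:
    """Build mysql client commands to load each dump into its own database.

    `backups` maps a database name (e.g. "ISFDB2") to the path of a dump
    file. The paths are placeholders supplied by the caller.
    """
    lines = []
    for db_name, dump_path in backups.items():
        # Refuse names that are not plain identifiers, since the name is
        # pasted into the statements unquoted.
        if not re.fullmatch(r"[A-Za-z0-9_]+", db_name):
            raise ValueError(f"unsafe database name: {db_name!r}")
        lines.append(f"CREATE DATABASE IF NOT EXISTS {db_name};")
        lines.append(f"USE {db_name};")
        # SOURCE is a mysql client command that reads statements from a file.
        lines.append(f"SOURCE {dump_path}")
    return "\n".join(lines) + "\n"
```

Once two backups are loaded this way, the same count query can be run against each database and the results compared by hand.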
OK, you're DEFINITELY the person to ask for the Database Entropy numbers then! BLongley 14:41, 13 Dec 2007 (CST)
Well, yes, but have you seen my (partial) To Do List lately? :( Ahasuerus 14:47, 13 Dec 2007 (CST)
It's a long list, indeed. Perhaps if you didn't keep working on adding to it you might have got one of them done? ;-)
So far, my ratio has been "three new tasks added for every one completed" :-\ Ahasuerus 00:03, 14 Dec 2007 (CST)
Seriously though - you could delegate some of those, or ask for help. BLongley 15:19, 13 Dec 2007 (CST)
It so happens that I have been thinking about posting some of these tasks on the Verification page to see if anybody would be interested. Not all of them are easily delegatable, though. How many of our editors know enough SQL to write data cleansing scripts (quickly) and how many know enough Polish to sort out Lem's biblio? Ahasuerus 00:03, 14 Dec 2007 (CST)
I have no Polish - strangely though, the Pole I knew best actually nicknamed himself after Lem, he's such a big fan.
I think we have people with SQL, but probably not with much ISFDB schema/usage knowledge. That's why I keep posting examples as I find out something new: it's a very strange and unintuitive design, IMO. I'm still not at the stage where I could write all the documentation, but I'm getting better. BLongley 11:45, 14 Dec 2007 (CST)
