Publisher talk:Project Gutenberg

From ISFDB

Searchability

Is there a way Advanced Search can be used to retrieve a list of Project Gutenberg published stories by a particular author? I think I elsewhere suggested putting the short stories in a virtual collection by author which would provide a central repository for all the stories by an individual author. -- The preceding unsigned comment was added by Swfritter (talk • contribs) 16:47, 8 Feb 2008

The publication section of the advanced search form lists publisher as a field to search on, and author as another. An "and" search on these two fields ought to show all publications for a given author published by a given publisher, such as PG. However a search by publisher just now returned an error. -DES Talk 16:07, 8 Feb 2008 (CST)
Hopefully it will get fixed. I had a little experience merging chapbooks with shortfiction and that works fine so it looks like the chapbook method is workable.--swfritter 16:56, 8 Feb 2008 (CST)

Author credits

As you know, I've entered two or three PG titles. Maybe I've handled the author wrong; but in any case the logic I've used doesn't apply quite so nicely to the one I was going to do next.

So, for example, I did Randall Garrett's "The Bramble Bush". The PG text is derived from the Analog publication, which was under the name Randall Garrett. If you search on "Garrett" at PG, you find "Garrett, Randall, 1927-1987". Within the pub, the text showing "Randall Garrett" is unchanged. So, notwithstanding the fact that PG's metadata shows the author as "Gordon Randall Garrett", I thought it most reasonable to enter it as being by Garrett & put a note in the pub. (If you don't think that was the right way to handle it, say so here & I'll go fix it & "A Spaceship Named McGuire".)

Now I'm looking at "...After a Few Words...". (Hmm. I'll go fix the ellipses right now while I'm thinking about it, both variant & parent. OK, submitted.) This was, in Analog, as by Seaton McKettrig, & that's reproduced; but again the metadata shows "Gordon Randall Garrett". (If they're going to use a form of the name he didn't use professionally, AFAIK, why not the full legal name, which we at least have as Gordon Randall Phillip David Garrett? But that's another issue entirely. Anyway.)

My tendency here is to enter this as a pub containing the McKettrig variant, as by McKettrig, with a pub note about the metadata. I think that's the right approach here; but I thought I'd check with you before doing it, as that would be just a little messier to fix up. And would the tag be pgg or pgm? Comments? Thanks. -- Dave (davecat) 10:14, 20 Mar 2008 (CDT)

One other weird thing about that particular story. The illustration is credited to "Summer" (not "Summers"). I can't spot any "LRS" signatures; he usually (but not always?) put them in (sometimes they got cropped off (I've seen half-cropped ones)); & these aren't to my unartistic eye obviously his work. I don't happen to have that issue, I think, but I presume PG didn't mess the transcription up on that point. -- Dave (davecat) 10:22, 20 Mar 2008 (CDT)

The text in the PG editions normally reproduces with good accuracy whatever was in the original text, and on those occasions when what the transcribers thought was an obvious typo in the original is corrected, a transcriber's note to that effect is provided. The metadata, however, may derive from other sources, and may use whatever PG thinks an author's canonical or legal name is, which may not match the sources the ISFDB already has. Therefore, I would follow the form of the name in the actual text, and either ignore the metadata (as I have done on several occasions) or add a publication-level note.
The artwork was attributed to Summer in the original so it should be treated as a pseudonym if it is fairly certain that the work actually is by Summers. I would use the author name as printed in the text - the standard is that the name on the title page of the story takes precedence.--swfritter 10:20, 22 Mar 2008 (CDT)
(My off-hand guess is that "Summer" is not LRS; but given the possibility of a mistranscription I thought I'd mention it. Thanks for confirming that it was transcribed properly.) -- Dave (davecat) 17:47, 23 Mar 2008 (CDT)
As for the tag, I would follow the canonical name, and therefore would use "pgg" for all of Garrett's work, whatever pseudonym may have been used at the original publication. For one thing, the PG edition itself is cataloged at PG under the "real" name, not under "McKettrig".
If you are adding with Project Gutenberg as publisher do not worry too much about the tag. I think we can dispense with them once the titles have been added with that publisher - but you are right "pgg" would be appropriate.--swfritter 10:20, 22 Mar 2008 (CDT)
I am going to copy this discussion to Publisher talk:Project Gutenberg, and I suggest continuing the discussion there. -DES Talk 10:31, 20 Mar 2008 (CDT)
Above copied from User talk:Swfritter. -DES Talk 10:33, 20 Mar 2008 (CDT)
I will continue to tag items just so we don't miss anything.--swfritter 18:45, 29 Mar 2008 (CDT)

Price field

(Discussion copied and refactored from ISFDB:Community Portal#Project Gutenberg) I don't think we have documented a standard for the price field when entering free books yet. The last time a related issue came up, the consensus seemed to be that we should be leaving the field blank as opposed to entering "npp" for books with no printed price, but free books are different. Should we use "free" or "$0.00"? "$0.00" seems to be too US-centric since Gutenberg is accessible worldwide. Ahasuerus 12:08, 6 Feb 2008 (CST)

I'll follow whatever the consensus is, of course. I think that leaving the field blank is a mistake, because that is what we do for items with unknown prices, and here the price is known -- also, if anyone runs a query on price, it would be better to be able to distinguish free books. I would prefer some form of 0 to "free" so that if we ever develop stats on average prices or the like these will work properly. I was using ($0.00) because we use a currency with all numeric prices, and while Project Gutenberg is accessible worldwide, they make a significant point of being a US-based project -- specifically, they look only to US copyright law in determining what is in the public domain, and have posted works where someone has claimed that a non-US copyright is still in force. Note that there is a separate Project Gutenberg Australia (which carries a number of works that are not PD in the US, but are in Oz), and a separate Project Gutenberg EU, and I think that a separate Project Gutenberg Canada is in existence or being formed. I would mark works published by each of those as zero in their respective native currencies (euros for the EU PG). -DES Talk 13:46, 6 Feb 2008 (CST)
Those last five words decided it for me - I prefer zero without a currency symbol. No way would I want hard-working British contributors to have their works revalued in foreign money. ;-) And zero converts to zero worldwide - well, for currency anyway, it's not like Centigrade to Fahrenheit to Kelvin. BLongley 13:57, 6 Feb 2008 (CST)
Actually the PG-Europe site has been concentrating on works not in English, particularly works where full Unicode representation is desired to handle accented characters. British works are mostly being done by either PG-Aus or PG-US, because of the ways in which the copyright laws interact. But that is just a tendency, not an invariable rule (to misquote Prof Parkinson). -DES Talk 14:15, 6 Feb 2008 (CST)
I realize this has been lying fallow as a discussion for many months, but after trying it different ways, I have come to the conclusion that, as the ISFDB stands today, it is better to include a currency symbol than not. I think from here on out it would be better to standardize on using the currency symbol of the country associated with the Gutenberg site cited as the source of the ebook: $0 for Project Gutenberg, A$0 (or AUS$0) for PG Australia, and €0 for PG Europe (though there is likely little SF in English at PG-EU that is not also at PG or PG-AU). As to whether or not to include the trailing ".00", that is still up in the air, though to me "$0" appears cleaner. Kevin 17:33, 20 September 2008 (UTC)
  • 0: When looking at the Title Listing, the presence of a currency symbol makes it 'human readable' to a casual user that the '0' displayed refers to price. $0 Example 0 Example Kevin 17:33, 20 September 2008 (UTC)
  • 0.00: When looking at the Title Listing where 0.00 is used without a symbol, there may be an implied currency notation associated with x.xx for the casual database user, but it appears incomplete when viewed in a larger list of publications. $0.00 Example 0.00 Example Kevin 17:33, 20 September 2008 (UTC)
My own (fairly strong) preference would be to not include a currency symbol. -- Dave (davecat) 22:11, 21 September 2008 (UTC)
Why not? "Wherefore is this price different from all other prices?" :) I have always used a currency symbol on all PG publications I have entered. -DES Talk 22:45, 21 September 2008 (UTC)
As Bill said (in effect), it's different in that it's the same no matter what currency is used. The dollar sign at the beginning of "$27.95" gives useful information about the price; at the beginning of "$0.00" it doesn't. PG is not pricing in US dollars as opposed to pounds or euros.
That said, Kevin certainly has a point that, in the listing of pubs for a title, the dollar sign flags it as a price. But in a list where other pubs have prices, it's not exactly rocket science to identify the position in the list.
At any rate, that's the way I've been doing it, consistently. (Either there was more discussion somewhere else which seemed to come to more of a consensus (reasonably possible), or Bill's was the last comment I saw back in February, & I thought it was the last word. I didn't see Dave's (DES's) comment or else didn't see it as relating to this question & so forgot it.) -- Dave (davecat) 13:10, 22 September 2008 (UTC)
You should know by now that I never get the last word! ;-) (Well, if I do, it probably means people are just leaving the insane person in the corner alone out of sheer pity or incomprehension.) BLongley 20:49, 22 September 2008 (UTC)
If the consensus is FOR a currency symbol, I'd prefer it to indicate the source publication's original currency. (Which I suspect will not be easily available, but I haven't read many Gutenberg titles.) It would work fairly easily for the US site, it's less consistent for Australia, and for Europe it's absolute chaos - I doubt you could ever be sure when it was a Euro or a Peseta or a Franc or a Mark or a Crown, and British Crowns don't match Swedish ones either... BLongley 20:49, 22 September 2008 (UTC)
The problem with a convention for the Projects Gutenberg is that while it works consistently for the US site, it's less consistent for Australia (who moved from pounds to dollars in the 1960s I think?), and for those of us vaguely in Europe that haven't adopted the Euro, it's also a bit insulting. (And I'm not upset just because a third of the world used to use the pound, shillings and pence and then a small fraction of the world invites us to join THEIR "common currency" - I still vote NO currency symbol be imposed on people that don't actually use such.) BLongley 20:49, 22 September 2008 (UTC)
As to "0" or "0.00" - I agree "0.00" probably makes it look slightly more like a money field to most of us, but again that's coloured by our current perceptions. I could argue "0/0" (zero shillings and zero pence) would make more sense for pre-decimal British publications. Or that 0.000 is more suitable for countries that divided their main unit into thousandths rather than hundredths. (Has anyone ever wondered what the "Mils" I sometimes enter for "Other prices" in notes actually mean?) BLongley 20:49, 22 September 2008 (UTC)
Having said all that - I don't actually care that much, Gutenberg is an area of ISFDB I don't usually enter pubs for, nor moderate. But if we want a good number of moderators to cover such, they ought to be in agreement and the simplest conventions are the ones that moderators learn. The complicated ones just get left on the queue for someone else. So while I would take it as a mild personal insult if people DO start using "€" despite all the above, I would have some small revenge by leaving such on the queue for someone else to check if it was for the right Project. ;-) BLongley 20:49, 22 September 2008 (UTC)
I will point out that all Project Gutenberg publications have dates associated with them that are long after decimalization, or dollarization in the case of Australia. Can a publication actually be issued in a defunct currency? Kevin 23:40, 22 September 2008 (UTC)
I meant the source publication as in the source for the Gutenberg edition. So no euro price symbols at all yet as it hasn't been around long enough for pubs to get old enough to qualify for Project Gutenberg. Nor has the Australian Dollar, but the Australian pound (as opposed to the pound sterling) is old enough. These pubs already have the price (0) and the publisher, if you're going to put a currency symbol on it make it for a useful reason, indicate something that isn't otherwise already obvious. Otherwise it isn't worth the extra typing. BLongley 18:46, 23 September 2008 (UTC)
I will also point out that each Gutenberg has a country of origin for which a currency symbol exists and is expected to be used where clarity is desired. Kevin 23:40, 22 September 2008 (UTC)
If the Gutenberg editions have got a Country of origin, and you want to use country of Gutenberg edition rather than original edition, why do you want to lump all the European currencies together? Or do you mean each PROJECT has a country of origin? In which case it's still not adding anything. Where is Project Gutenberg Europe anyway? BLongley 18:46, 23 September 2008 (UTC)
As we are treating Gutenberg publications as new editions, not simply reprints of their "source" texts, it seems to me that the relevant date is the PG release date. But I do think you have a point about using the currency of the country of origin of the work, when known, at least for things from PG Europe, since that project covers multiple countries and does not claim to be located exclusively in one. PG-US is clearly located in the US, and wherever its authors may have lived or written, should IMO get a $0.00 price, just as a Baen reprint of a work originally published in England, or Poland, will get a $-labeled price. Similarly, PG-AUS works should IMO all get A$0.00 prices. -DES Talk 22:45, 23 September 2008 (UTC)
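For anyone weighing the query and stats concerns raised above, here is a minimal sketch of how a script might read the price field under any of the proposed conventions. It is an illustration only, not ISFDB code: the function names and the pattern are hypothetical, and it assumes a price is an optional leading symbol followed by a plain decimal number.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: optional non-digit symbol (e.g. "$", "A$", "€"),
# then a plain decimal amount. Pre-decimal forms such as "0/0" do not match.
_PRICE = re.compile(r"^(?P<symbol>[^\d\s]*)\s*(?P<amount>\d+(?:\.\d+)?)$")


def parse_price(field: str) -> Optional[Tuple[str, float]]:
    """Return (symbol, amount), or None for a blank or unrecognized field."""
    text = field.strip()
    if not text:
        return None  # blank is treated as "price unknown", not as free
    match = _PRICE.match(text)
    if match is None:
        return None
    return match.group("symbol"), float(match.group("amount"))


def is_free(field: str) -> bool:
    """True only when the field holds a known price of zero."""
    parsed = parse_price(field)
    return parsed is not None and parsed[1] == 0.0
```

Under these assumptions "$0.00", "A$0", "€0" and a bare "0" all count as free, while a blank field stays distinguishable as unknown, which is the distinction argued for above. The symbol only matters if someone later wants to group by currency.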

Tags

Oh, and on a related note swfritter has been using Tags for Gutenberg titles, it might be good to merge the effort - I'd like to have ONE solution. Do Gutenberg have only one edition of a book, or do they have multiple versions of some titles? If the latter, multiple publications would be better than single tags. BLongley 13:41, 6 Feb 2008 (CST)

On looking at User:Swfritter#Project Gutenberg Science Fiction, it appears that he is using only 26 distinct tags: all PG pubs by authors whose last name begins with A get the tag "pga", all PG pubs by authors whose last name begins with B get the tag "pgb", and so on. This is in effect a hack to provide a search by publisher/author for PG titles only. I have no objection to adding these tags, and they in no way conflict with what I have been doing with PG pubs. -DES Talk 14:16, 6 Feb 2008 (CST)
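The 26-tag scheme described above can be sketched as follows. This is only an illustration of the convention, not anything the ISFDB software does; the function name is hypothetical, and it assumes the surname is the last word of the canonical name and begins with an unaccented Latin letter.

```python
import string


def pg_tag(canonical_author: str) -> str:
    """Return "pg" plus the lowercase first letter of the author's surname.

    Assumption: the surname is the last whitespace-separated word of the name.
    """
    words = canonical_author.split()
    if not words:
        raise ValueError("empty author name")
    initial = words[-1][0].lower()
    if initial not in string.ascii_lowercase:
        raise ValueError(f"surname does not start with a letter a-z: {words[-1]!r}")
    return "pg" + initial
```

Given the canonical name "Randall Garrett" this yields "pgg", so a story published as by Seaton McKettrig would still be tagged "pgg" if the tag follows the canonical name rather than the pseudonym.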

Pages fields

I have been leaving all page number fields and all fields for page count of the work blank. It has been suggested that for an ebook collection or anthology "placeholder" page numbers of 1, 2, 3... be entered to preserve the order of the contents, but there is not yet any consensus on this, as far as I know. -DES Talk 14:36, 6 Feb 2008 (CST)

I have started entering such "placeholders", when, and only when, it seems to me that the order of the items in a work is significant to the overall effect of the work. Discussion on whether, and if so how, to make this a common practice is in progress. -DES Talk 13:29, 13 Feb 2008 (CST)
I notice this entry mentions number of lines rather than pages. Is that an emerging standard or a past experiment? BLongley 20:23, 9 September 2008 (UTC)
(Not on my part, either way.) -- Dave (davecat) 21:50, 9 September 2008 (UTC)
Thinking about it, I'm inclined to question the number of lines idea, anyway. The lines are likely to display differently, depending on one's browser & browser settings & resolution; a line break in the HTML code probably is being ignored, isn't it? -- Dave (davecat) 15:11, 10 September 2008 (UTC)
I'm entering the page count as "unpaginated" if there are no page numbers, after seeing a bunch of others. That at least gets rid of an error message. For page numbers, if they're there I'm using them. If they're invisible-buried-in-HTML numbers, I'm still using them but saying so in the notes. I don't know whether anyone else is bothering with that. -- Dave (davecat) 21:50, 9 September 2008 (UTC)
Thanks for the comment - it's a rather vague field database-wise, so if people stick to some standards* then I can treat "unpaginated" as "0" or "unknown" and exclude them from any sort of questions like "the average size of a paperback in 19xx was..." queries. BLongley 22:12, 9 September 2008 (UTC)
* We LOVE standards at the ISFDB. That's why we have so many of them! BLongley 22:12, 9 September 2008 (UTC)
Note the verification date. I think that must have been one of the very first PG pubs I did; I certainly haven't done anything like that recently. I have always been leaving the page count blank when I can't enter a numerical figure. I will start entering "unpaginated" if there is a consensus for it -- it would distinguish these from the many pubs entered from secondary sources where there is a page count, but we don't know what it is. -DES Talk 00:39, 10 September 2008 (UTC)
Oh, I, like the other Dave, use page numbers in HTML unless they are so well hidden that I miss them. -DES Talk 00:41, 10 September 2008 (UTC)
So we are using 'unpaginated' now? Should we add this to the page instead of the blank that is stated right now? Kevin 04:57, 10 September 2008 (UTC)
(I wonder whose idea it was? I kind of thought it was Dave's (DES's); as I say, I found a bunch of them.) But the reason Dave gives (makes clear that it's not just "unknown") is why I thought this was a good idea to follow. And as Bill says, if we're consistent about it, batch things can be adjusted to handle it correctly, whatever "correctly" is in a given context. Unless someone raises some problem with it, I'm for making it a standard practice & documenting it. -- Dave (davecat) 15:03, 10 September 2008 (UTC)
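As a rough illustration of the batch handling Bill mentions, here is a sketch of an average-page-count calculation that skips both blank and "unpaginated" entries. The function name is hypothetical and the input is assumed to be a plain list of page-count field values; anything that is not a plain number (for example "xii+180") is simply skipped here.

```python
from typing import Iterable, Optional


def average_page_count(page_fields: Iterable[str]) -> Optional[float]:
    """Average the numeric page counts, ignoring blank and "unpaginated" values."""
    counts = []
    for raw in page_fields:
        value = raw.strip().lower()
        if value == "" or value == "unpaginated":
            continue  # unknown count, or an ebook with no page numbers
        if value.isascii() and value.isdigit():
            counts.append(int(value))
        # anything else (roman numerals, "xii+180", etc.) is skipped in this sketch
    if not counts:
        return None
    return sum(counts) / len(counts)
```

The point of a consistent "unpaginated" value is that a filter like this stays a one-line test, and blank fields remain available to mean "there is a page count, but we don't know it".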

Separate editions?

I've got a question here. I've been entering quite a few PG ebooks now, most scanned from magazines, some from books. In general what I'm actually looking at is a HTML version; usually it includes the illustrations from the original magazine or the book's covers, sometimes a magazine cover when the story was the subject of the cover art. I've been entering the artwork in ways that have seemed most appropriate on a case-by-case basis.

I've just realized that these ebooks are in fact available in different forms. In particular, if there's an HTML version, there's usually (maybe always) also a plain text form as well. (There is also sometimes something called Plucker that I don't have any way to read, but which is supposedly generated from the HTML version if it exists; I'll ignore it for now.) But, of course, the text version lacks the illustrations I've been entering.

I'm not sure whether we should be treating these as separate editions or printings, or what. I guess for now I'll continue as I've been doing, but I'd appreciate others' thoughts on this. (Maybe this should have been entered in the rules & standards discussion instead of here. Dave, if you think so, feel free to move it.) -- Dave (davecat) 12:28, 11 Apr 2008 (CDT)

Ezines and ebooks from Fictionwise come in about fifteen different formats. I use the most universal format available - in the case of the Fictionwise releases that is PDF (they don't have HTML versions). I further clarify the binding as 'ebook: PDF'. Some of the other formats exclude artwork. In the case of Jim Baen's Universe I use HTML because I consider it more universal than PDF. Plucker is for use on the Palm PDA platform.--swfritter 12:43, 11 Apr 2008 (CDT)
I would suggest making a notation in the notes stating the source as 'HTML' and perhaps even stating that it is also available in text and Plucker versions. You certainly have no obligation to enter all three formats.--swfritter 20:38, 16 April 2008 (UTC)
I have started putting an entry in notes such as "This ebook is available in ASCII and HTML formats" together with a link to the page where all the formats are listed. -DES Talk 23:24, 26 April 2008 (UTC)

Page counts/numbers?

What about cases where marginal page numbers are given in the text, corresponding to page numbers in the source publication? So far I've put in a note but ignored these otherwise; however, particularly when there are multiple contents, it seems to me it might make sense to use them. And if they are used to generate a page count, it would turn off that irritating warning in the biblio. Anyone? -- Dave (davecat) 20:45, 26 April 2008 (UTC)

Where these exist, I have been treating them exactly as if they were page numbers in a printed volume. Note that PG is doing this more often in recent works than it used to -- it is becoming routine. Note also that when this is done, the page numbers always match those in the volume/edition from which the transcription is made, pretty much exactly (sometimes plus or minus one line). -DES Talk 23:21, 26 April 2008 (UTC)
Good; that suits my instincts on this. I also view the cross-reference to the original source pub as a big advantage; one possible exception is when there are illustrations. Particularly an illustration that spans two pages, which just winds up split into two separate illustrations, since the pages are no longer side by side.
(And not so good that I'll eventually have to revisit some I've already done, I guess.) -- Dave (davecat) 17:38, 27 April 2008 (UTC)
I think you will find that large illustrations are not split in such cases -- the PG books are not precise facsimiles or sets of page images, after all. -DES Talk 13:15, 28 April 2008 (UTC)
I've definitely encountered some that were split. (I had the original text to compare.) I'd have to dig some to turn up an example, though. -- Dave (davecat) 18:38, 28 April 2008 (UTC)
Then I am mistaken. I believe that the favored practice is otherwise, at least recently, but I might be mistaken again, or it might depend on whether a single unified illustration can be made at a size that will be clear within the width of the rest of the formatted text. -DES Talk 18:42, 28 April 2008 (UTC)
An excellent example, which I just happened on, is in Ebook #23561, "Anchorite" by Randall Garrett. The first two illustrations, separated by text, are the second & first (in that order!) halves of an illustration which occupies the whole of pp. 100-101 in the original magazine. The one given first, taken solely by itself, just looks like a confusing muddle, IMNAAHO. -- Dave (davecat) 21:32, 5 May 2008 (UTC)
I think the best option for ebooks, at least for now, is to make one entry for the artwork as is currently the case for the above Garrett story. An entry can be made in the notes indicating artwork anomalies. I am not sure that page numbers are really necessary for ebooks unless they are collections, anthologies, or omnibuses.--swfritter 18:08, 6 May 2008 (UTC)
As for a single artwork notation, I tend to agree, with a possible exception when individual pieces of interior art have separate titles or captions. In such cases, it might be desirable to capture them in detail; this would be a judgment call.
As to page numbers, I agree as far as "placeholder" numbers go. But when the transcribers have indicated the original page numbers, I think they should be entered in the length field, just as if the ebook were a paper book. I see no downside to doing so, and it can help determine whether an older edition is the same as the edition from which the PG text was transcribed. It may not be desirable in such a case to note page numbers for interior art, but that is a judgment call even on paper books. -DES Talk 18:21, 6 May 2008 (UTC)

Links

First of all: Dave (DES), good on moving the discussion items to the talk page. And, mostly, on the new items reflecting stuff that was discussed.
Second: in the item titled "Link" (Publisher:Project_Gutenberg#Link), you say:

Please include a link to the actual Project Gutenberg edition in the notes field. For example:
This ebook edition is available in HTML, ASCII, and iso-8859-1 formats as <a HREF="http://www.gutenberg.org/etext/20836">Ebook #20836</a>.
The link should, as in the example, go to the root page for the etext, rather than to any of the actual texts. The root page is always at an address like "http://www.gutenberg.org/etext/nnnn", where "nnnn" is the etext number.
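The note format quoted above could be generated along these lines. This is a sketch only: the function name is hypothetical, and the URL pattern is simply the one given in the quoted text, not something checked against the current Project Gutenberg site.

```python
def pg_note(etext_number: int, formats: list[str]) -> str:
    """Build the notes-field text linking to the root page for an etext."""
    if etext_number <= 0:
        raise ValueError("etext number must be positive")
    if not formats:
        raise ValueError("at least one format is required")
    # Root/metadata page, not any of the actual text files.
    url = f"http://www.gutenberg.org/etext/{etext_number}"
    if len(formats) == 1:
        listing = formats[0]
    elif len(formats) == 2:
        listing = " and ".join(formats)
    else:
        listing = ", ".join(formats[:-1]) + ", and " + formats[-1]
    noun = "format" if len(formats) == 1 else "formats"
    return (
        f"This ebook edition is available in {listing} {noun} "
        f'as <a HREF="{url}">Ebook #{etext_number}</a>.'
    )
```

Calling pg_note(20836, ["HTML", "ASCII", "iso-8859-1"]) reproduces the example note above character for character.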

I can't track it down at the moment, but I thought someone (Ahasuerus? Swfritter?) had argued against doing this directly, on the grounds that some of these titles may be copyright outside the US. I'm sort of inclined to agree, though I don't feel real strongly about it. I may be misrepresenting what was said, too. -- Dave (davecat) 18:52, 28 April 2008 (UTC)

I don't buy that argument, for several reasons. It is pretty clear that there is no legal liability in such cases, or PG itself could not continue to operate. (I understand that PG has never yet had to take down a work because of a legal demand.) Our servers are located in the US (I presume this is still true after the move) and so US copyright law is all that legally concerns the ISFDB itself. Also, the links should be to the root/metadata page, which itself contains a warning to the user to check the copyright laws where the user resides if the user is not in the US. (The page whose URL is given in the example includes the text "Copyright Status: Not copyrighted in the United States. If you live elsewhere check the laws of your country before downloading this ebook.") It is fairly well established that providing a link for informational purposes, particularly when there is no profit derived by the link provider, is not contributory infringement. I can dig up caselaw if anyone really wants. As to moral grounds, most works posted by PG are long out of print everywhere, even if technically under copyright in some countries.
All of that being said, if people want to discuss this, fine. If the consensus is for a different policy, I will comply, but I will continue inserting such links unless a fairly clear consensus against them emerges. I strongly object to copyright paranoia, while I am just as strongly in favor of respecting actual copyrights. (If a living author personally objected, I might well feel differently about that case, even if such an objection had no legal standing.) -DES Talk 19:14, 28 April 2008 (UTC)
No problem in my mind of linking an American site to legal American downloads. Probably alright to link to legitimate foreign sites. A responsible site is going to have an obvious disclaimer about the legal issues. For primarily ethical and perhaps legal reasons we should avoid linking to any pubs from even remotely questionable sites - which is why we only use Project Gutenberg. We are not actually providing downloads. We are only linking to pages where a user can download at their own discretion.--swfritter 20:00, 28 April 2008 (UTC)
I think the main concern was to links to OTHER Project Gutenbergs, Australia as a particular example. Currently, the US seems to have the most restrictive copyright laws of the English-speaking countries represented here (the "copyright renewal needed" issue aside) so if it's OK there, Al is safe. If we linked to a Russian Project Gutenberg I think we'd see our site disappear pretty fast. :-/ Still, we might have an ISFDB editor's meet at Guantanamo Bay then? ;-) BLongley 19:45, 28 April 2008 (UTC)
Actually I am pretty sure that linking even to an acknowledged pirate site (say one that posted copies of all of Lois Bujold's work) would not be illegal as long as we did not urge people to follow the link and download the pirated content, but merely provided it in an informational context ("A pirate edition has appeared online at...") and as long as we in no way profit from the link, in particular not getting commissions from the destination, nor getting revenue that depends on volume (such as per view rates on ads) where there might be reasonable grounds to think that the presence of the links increases that volume and thus our revenue. Since we do neither, we are safe even from links to PG-USSR or Pirates-R-Us :). (A true pirate link might well be considered immoral, but that is of course a judgment call.) Still, I do see your point about PG-AUS. What do others think about this? -DES Talk 19:54, 28 April 2008 (UTC)
People in the UK have already been arrested and had their computers seized for creating a site that LINKS to sites that MAY be offering illegal downloads. That's not something I want for myself, so "lowest common denominator" of safety from such is what I'd recommend. "Legally right, financially unable to prove such" is an issue I don't want to get involved with. BLongley 22:19, 28 April 2008 (UTC)
By the way, in some ways EU/UK rules are more restrictive than US ones. US copyright law only applies the "life+70" rule to works published after 1977 (see http://www.copyright.cornell.edu/public_domain/) while the UK applies it retroactively to all works published after 1909, if I am not mistaken. (Of course the US currently applies the foreign rule to foreign authors in many cases, due to the Uruguay round trade agreement). A work published in the US by a US author (who died in 1950) in 1942 with copyright not renewed is now PD. A work of the same date by a UK author who died at the same time is still/again copyrighted. -DES Talk 20:01, 28 April 2008 (UTC)
Yes, in some ways EU/UK rules ARE more restrictive than US ones - that's why I said 'the "copyright renewal needed" issue aside'. "Lapsed copyright" in the US seems to have created some legal battles, or opened up opportunities for bringing lost works back to the public. I have no idea whether UK rules are retroactive rules or not. I just don't think we should be doing anything that puts ANY of us at personal risk of lawyer attention. BLongley 22:19, 28 April 2008 (UTC)
I am reasonably sure that some UK works that were in the public domain in both the UK and the US (such as those of R. Austin Freeman, which I looked into for PG) had their copyrights restored both in the UK and in the US.
As to not risking legal jeopardy, I don't disagree. I am personally convinced that the risk of putting up a link is so small that I wouldn't hesitate to do it on a personal website, indeed that it is smaller than the risk of driving through urban traffic on any given day. But I am not a lawyer, and other people's ideas of risk may not be mine. I have looked into this matter somewhat, but if we know a friendly lawyer who would be willing to give an opinion for a low fee or pro bono, it might be a good idea. We probably could use a legal opinion on what we can safely do in regard to image hosting anyway, if we are going to start doing that. -DES Talk 22:40, 28 April 2008 (UTC)
We more than likely have Australian users who can download the pubs legally from Project Gutenberg Australia. Of course, if we had an Australian editor who was entering the Project Gutenberg Australian pubs that would show a true interest in the value of such data.--swfritter 20:07, 28 April 2008 (UTC)

(unindent) It sounds like there may be a few separate classes of issues here. The first one has to do with what our main server can store legally and/or ethically according to the laws of the country that it is located in, which has always been the US in our case. This can range from legal problems as US laws evolve over time to getting SF writers/artists upset even if what we are doing is not illegal. We can certainly use an IP lawyer's opinion re: links and hosting images and I seem to recall that Al did something in this area a few years ago. Perhaps we should ask him first?

The second class of issues has to do with what our contributors may face in their own countries of residence based on their contributions to the ISFDB. I suppose if a Freedonian contributor tries to make the ISFDB a mini-repository of links to Web-based texts that the government of Freedonia objects to (say, Salman Rushdie's works, which are arguably SF), he may find himself in trouble in Freedonia. I am not sure what, if anything, we can do about this short of warning our contributors to be careful and noting that although we don't make our editors' private information available in the backup file, we can't be 100% sure that our host's servers won't be compromised at some point. On a somewhat related note, the ISFDB data has been occasionally mirrored by other servers. If one of those mirrored servers happens to be in another country, there is no telling what the legal implications for its owners may be, but that's not something that we have any control over.

And then, of course, there is the ever-present issue of the mutability of online sources. A site that was run by pirates just a few years ago may have become a legitimate site (I have seen reports of this happening in Central Europe under pressure from EU/US/local government) or, conversely, a previously legitimate site may have lowered its standards and now be hosting works without permission. The number of possible permutations is almost endless, which was one reason why we were reluctant to link to Web-only works in the past. Ahasuerus 17:55, 29 April 2008 (UTC)

Project Gutenberg America is a significant enough entity that any of the concerns we have about their content would have been addressed long ago. As we know from what is happening in China the capacity for people from other countries to link to them can be controlled by the other countries.--swfritter 18:31, 29 April 2008 (UTC)
Oh sure, Gutenberg-US has been "in" for a while now, I am just mulling over various issues that we may face if we link to other, potentially less established and problem-free online sources. I don't expect Gutenberg-US to go feral any time soon :) Ahasuerus 19:25, 29 April 2008 (UTC)
On my mind was the legal validity of linking to Project Gutenberg which was the original unresolved subject of this discussion. It seems like PG would have gotten into trouble with other countries a long time ago so it doesn't seem that we could get into much trouble for merely linking to a site that seems to have a sparkling reputation.--swfritter 22:51, 29 April 2008 (UTC)
I quite agree. -DES Talk 23:08, 29 April 2008 (UTC)

(Unindent) As to what editors/users/operators of mirrors in other countries may face in the form of legal risks, I think all we can do is warn people that there may be such risks, depending on the laws and policies of the countries in which they reside, and that we can't guarantee the privacy of ISFDB registration information unconditionally, although we do take all the measures we can to keep it secure.

As to our own legal risks, while I am confident that linking from a bibliographic site to online texts is not a risk under US copyright law, it would be a good idea if we checked that with a qualified lawyer. As to cover images, since many of them are under copyright, our use has to come under the tricky "fair use" doctrine. This is an area where a legal opinion would be a very good idea, although we could perhaps piggyback off the Wikimedia Foundation. They have paid for legal advice, and have developed strict guidelines for the use of images under "fair use". One of their guidelines is that every image must have attached to it text that identifies its source and presumed copyright status/ownership. Another is that there must be a "fair use rationale" attached to every such image for every such use (i.e. every page on which the image is linked to/displayed). Among the reasons they consider acceptable is "an image of a book cover, to identify the book, in connection with an article/information about the book" (this is from memory, so the wording may not be exactly correct). Obviously that would cover pretty much all of our uses. While we do not have to have exactly the same rules as Wikipedia, following something like them may well be fairly safe, as they are a much larger target, and if they haven't gotten into legal trouble while following those rules (and have drafted them after consulting an attorney) we should be OK following similar rules, at least pending getting a legal opinion of our own.

On ethical issues, that is more of a judgment call. I doubt that any author would object to images of the covers of their books -- after all that may help sell books, which authors generally like. Much the same should be true for artists -- we are one of the few places where an artist's portfolio or something like it is publicly visible. As to links to PG or similar sites -- I suspect that most authors, even if still living, would not object, but some might. I don't see any ethical problems with providing such links in the absence of objections, but others might feel differently. Obviously Al should be consulted about this, but I didn't want to press so soon after the move. -DES Talk 22:25, 29 April 2008 (UTC)

Data for other editions

This could be generalized, it's not just a Project Gutenberg issue. Sources should be noted ESPECIALLY carefully in this case - from my experience, prices noted for similar publications to the one you're holding may be only available direct from the publisher, they are NOT necessarily the Recommended Retail Price. Even if they are, they may be for last year's edition (not yet sold out) or probable prices for forthcoming books. There have been periods of price stability in the UK where I'd be 95% sure, for instance, that a 2/6 paperback referring to other 2/6 paperbacks from the same publisher was accurate. But 95% isn't enough to use the data indiscriminately, and I'd be slightly less confident with Gutenberg titles as the transcription adds another level of possibility of error. In other years I would assume that the prices suggested weren't worth the paper they were printed on, prices changed just too fast, several times a year - and when not printed on paper, even less reliable. BLongley 19:34, 28 April 2008 (UTC)

Quite true, but that is why I am thinking of this particularly in the PG case. When PG has a source text for an entire book (not just a magazine excerpt) it is pretty much always prior to 1964 and usually prior to 1923. At those distances in time the distinctions between "retail" and "publisher" price pretty much disappear, and other sources of info on price are also much sparser. I have recently been working a lot on the publications of Frank R. Stockton, who died in 1902, and who has several texts in PG. I have been able to find data for several entries in the copyright and ad data present in the PG editions.
All that said, a generalized version of this advice, with your caveats, might well be a good idea. What page do you think it would belong on? -DES Talk 19:47, 28 April 2008 (UTC)
The Help Page for "Entering Data from Secondary Sources", of course! ;-) Not that we have one. :-( I do recall a discussion where we agreed that such was a good idea, with certain caveats - Prior publication info was considered useful, as it sometimes gave pointers to missed editions/printings, or dated prior editions/printings better than the original publications themselves. I know that such can mislead on Imprint at least, although probably not Publisher. BLongley 20:38, 28 April 2008 (UTC)
For the issue of Price, Template:PublicationFields:Price seems the obvious place to adjust (it needs updating anyway, the Canadians should rebel against US bias and the Europeans (but not the British) should have the Euro properly respected, and Australians and New Zealanders probably have a few issues too, and these days most books are printed for a world market with multiple prices on each and we're duplicating unnecessarily or putting the wrong "Canonical" price on, IMO). But adjusting such templates just tends to lead to longer Help pages that are even less likely to be read. :-/ "Help" is a mess as most of us figure out what we should be doing via actual PRACTICE and advice - I freely admit I've learnt 90% of the "right" things to do (currently) from direct talking with the perceived experts in the area, rather than the help pages, and I suspect help is very out of date in many areas. BLongley 20:38, 28 April 2008 (UTC)
Oh, and we should probably exclude "Cover Price" as an Entropy stat for books over a certain age, by country, by format as well. I know my 1920s H. G. Wells hardcovers would NEVER have such a thing - FAR too vulgar! Whereas the "Everyman's Library" books were proud to say that they were quality books, but ALWAYS a shilling or less. And those Pesky Australians often demanded that British books sent there had to specially say that for them ONLY, the Cover price was just a guideline, whereas the "Net Book Agreement" back home was mercilessly enforced. BLongley 20:50, 28 April 2008 (UTC)

Credits

Most PG editions have a production credit, identifying the person or people most responsible for creating the etext edition. I invariably copy this to the "notes" field of the entry for a PG edition. Does anyone object to my mentioning this in the "guidelines" on the publisher page? -DES Talk 19:35, 28 April 2008 (UTC)

I've been doing that, too. I've wondered whether it's really worth it, but I've viewed it as potentially useful bibliographic information. -- Dave (davecat) 14:44, 29 April 2008 (UTC)
Some producers are more likely to do things like include page number indications in the HTML. That can be worth noting. Besides, I feel the credit is worth giving. -DES Talk 15:08, 29 April 2008 (UTC)

Talk vs publisher

I have moved all the threaded discussion onto this talk page, and left the items on the main publisher page in the form of guidelines or requests -- I hesitate to say "standards" much less "rules". I have left these signed in most cases, since I drafted the language currently present for many of them. However, it would IMO be better if signatures were removed so that anyone felt free to edit the guideline text without feeling that s/he was altering someone else's signed statement. I ask people to look over the page, and start or continue discussions on any that seem dubious. Any that appear to have consensus -- and I will take silence for consent -- I plan to remove the signatures from and leave as the pronouncements of a "voice from on high" after a reasonable time has elapsed. -DES Talk 19:41, 28 April 2008 (UTC)

Link in Title Vs. Publication

Should the link to a Gutenberg publication also be put into the notes for a Title in addition to the notes for a publication?

Example: Mary Shelley's Frankenstein has LOTS of publications. Expecting a casual user to notice that one of the publications is a Gutenberg e-text may be expecting too much. But putting the link in the Title Notes lets anyone who looks at the Frankenstein record see the link. I'm not saying the link should not be in the publication record... it definitely goes there... but is it also reasonable to put it in the title record? Thoughts? Kevin 06:01, 23 August 2008 (UTC)

It could be placed there. But since the publisher is shown in each publication record listed on the title page display, the PG name is fairly visible on the title display already. But if you want to add such links to title records I won't object. (It is probably more useful in a case where there are many pubs and the PG listing could easily be overlooked, and less useful where there are only 2-5 pubs, say.) What will you do where there is more than one PG etext for a given work (rare but it does happen)? -DES Talk 06:35, 23 August 2008 (UTC)
List them both, but probably give an editorial nudge that one link is of higher quality (more formats, newer standards, etc.) than the other, unless each link is merely based off of different sources. Kevin 07:00, 23 August 2008 (UTC)
Fair enough. -DES Talk 07:14, 23 August 2008 (UTC)

Interior Artwork Promoted to Cover Artwork? (Ebook shortfiction)

I just put in Peter Baily's short story Accidental Death. In the original magazine publication, there is an interior artwork credit for Schoenherr. Should that be added as an 'Interior Artwork' credit for the ebook, or (since it was 'opening artwork' for the story) should it be entered this time as Coverart?

The item that brought this to my attention was that Frank Banta's Droozle ebook has a 'coverart' entry for uncredited... but in the original magazine version the artwork didn't even get a mention (probably because it was both uncredited and background artwork; see the ebook to see what I mean).

Thoughts? Kevin 16:19, 31 August 2008 (UTC)

Not really. Such a change obscures the identity of the artwork a bit, but we aren't merging art pubs at this time anyway, and a note can handle the matter. It is at least arguable that in such cases the art does serve the PG edition as cover art, and I think I have so listed it in at least one case. I'd say this is a judgment call, to be made case-by-case. -DES Talk 19:43, 31 August 2008 (UTC)
Agreed that it's a judgment call. I've been treating all art in PG pubs as interior art, even when it was originally a magazine or book cover, & identifying it with something like "[magazine cover]" or "[book cover]" after the title. (Now, actually, I'm doing things like "[magazine cover] (reprint)", but that raises a whole different issue.) My reasoning is that it's not like cover art, where you can look at the outside of the work & see it without opening the thing up. That's my own judgment; if there's a lot of discussion with a clear consensus the other way, I'll be game to follow.
About the "(reprint)" thing: I started doing this after seeing the discussion on rkihara's talk page (mostly him & swfritter, I think); but the more I think about what I've seen PG do to illustrations, the more I think this is a Good Idea whenever artwork is reprinted, at least unless the editor can really verify that the artwork is fully & faithfully reproduced. (PG's usual fault is breaking illustrations that crossed the gutter into separate illustrations, but there are other changes such as cropping magazine cover art to remove text, as well.) -- Dave (davecat) 21:27, 1 September 2008 (UTC)

Gutenberg Links - Redux

The current suggested link language is

This ebook edition is available in HTML, ASCII, and iso-8859-1 formats as <a HREF="http://www.gutenberg.org/etext/99999">Ebook #99999</a>.

I am going to start using this instead

This ebook edition is available from <a HREF="http://www.gutenberg.org/">Project Gutenberg</a> in HTML and other formats as <a HREF="http://www.gutenberg.org/etext/99999">Ebook #99999</a>.

When I come across the rare item that does not have an html edition, I will change it to read "in ASCII format" or "in iso-8859-1 and others". This way, I list the most desired (well formatted) version of the ebook, but EVERY Edition has ASCII, so why list it, and only some editions have an ISO version... and lastly the formats available can change (new ones added) and I don't think we want to be responsible for being complete. Just letting them know an HTML version plus others should be enough.

Thoughts? Kevin 17:01, 31 August 2008 (UTC)
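
As an illustration only, the proposed note wording could be generated rather than typed by hand. This is a minimal sketch, not an ISFDB or PG tool: the function name `pg_note` and its parameters are hypothetical, and the URL pattern is simply the one quoted in the wording above.

```python
from html import escape

# Base URL as quoted in the suggested link language above.
PG_HOME = "http://www.gutenberg.org/"


def pg_note(ebook_number: int,
            formats_phrase: str = "HTML and other formats",
            link_home: bool = True) -> str:
    """Build the note text for a PG ebook (hypothetical helper).

    formats_phrase is the human-readable format list, e.g.
    "ASCII format" for the rare item without an HTML edition.
    link_home controls whether "Project Gutenberg" itself is a link.
    """
    ebook_url = f"{PG_HOME}etext/{ebook_number}"
    if link_home:
        source = f'<a HREF="{PG_HOME}">Project Gutenberg</a>'
    else:
        source = "Project Gutenberg"
    return (
        f"This ebook edition is available from {source} "
        f"in {escape(formats_phrase)} "
        f'as <a HREF="{ebook_url}">Ebook #{ebook_number}</a>.'
    )


if __name__ == "__main__":
    print(pg_note(99999))
    print(pg_note(99999, "ASCII format", link_home=False))
```

Keeping the format phrase as a parameter means the same helper works whichever format-listing convention an editor prefers.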

It is the case that every PG ebook has an ascii version, or pretty nearly so. It is not the case for other ebook publishers, and it is not safe to assume that all our users know this about PG. I am going to continue listing all available formats for all ebook editions, and I urge others to do the same. Yes versions available can change, but this is rare. I don't think we should promise to be complete, but I think being as complete as we can be is a good thing. I think this is particularly important when there is only one non-ascii format, as mentioning multiple formats makes the general multi-format nature clearer.
As to linking to the PG main page, there is no harm in it, but since there is always a link from the ebook's canonical page to the PG main page, I don't think it adds much either. -DES Talk 18:13, 31 August 2008 (UTC)
Agree to disagree on listing ISO, ASCII, Plucker (which is autogenerated)? If they can see the database, they can read HTML, so the HTML version is always relevant. If we were to list .MOBI, .LIT, .PDF, that I could get behind... but no one honestly wants to read in ASCII or an ISO format. I think the phrase "and other formats" communicates the availability of 'other formats' just fine. In regards to it being 'particularly important' when there is only one non-ascii format, I agree (and didn't make myself clear enough):
  • Only ASCII (ignoring plucker) - List 'ASCII format'
  • ASCII plus one other something(ignoring plucker) - List 'SOMETHING and other formats'
  • ASCII, ISO, HTML - List 'HTML and other formats'
  • ASCII, ISO, HTML, .MOBI, ETC - List 'HTML, Mobipocket, and other formats'
It was my intention to always list higher level formats instead of ASCII (except where ASCII was the only one). If I find an interesting format that someone might want, I intend to list it, and if I ever find ePub format I will sing and dance, and then say 'EPub, HTML, and other formats' and put epub first. Kevin 19:03, 31 August 2008 (UTC)
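
The selection rules in the list above can be sketched as a small function. This is only one reading of those rules, under stated assumptions: the name `formats_phrase` is hypothetical, format names are assumed to be spelled exactly as in the constants below, and the handling of an edition with a single non-Plucker format that is not ASCII (for example HTML only) is a guess, since the list does not cover it.

```python
# Assumed preference order for "higher level" formats, EPUB first.
PREFERENCE = ["EPUB", "HTML", "Mobipocket"]
# Plain-text formats that are only named when nothing better exists.
LOW_LEVEL = ("ASCII", "iso-8859-1")


def _join(items):
    """Join items as 'A', 'A and B', or 'A, B, and C'."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def formats_phrase(available):
    """Return the format wording for a note (hypothetical helper)."""
    # Plucker is ignored throughout, as in the rules above.
    names = [f for f in available if f != "Plucker"]
    if not names:
        raise ValueError("no formats to describe")
    if len(names) == 1:
        # Covers "Only ASCII"; also used for any other lone format.
        return f"{names[0]} format"

    high = [f for f in names if f not in LOW_LEVEL]
    if high:
        # Preferred formats first; unknown ones keep their given order.
        lead = sorted(
            high,
            key=lambda f: PREFERENCE.index(f)
            if f in PREFERENCE else len(PREFERENCE),
        )
    else:
        # Only ASCII and iso-8859-1 are present: name the ISO version.
        lead = ["iso-8859-1"]

    if len(lead) == len(names):
        # Nothing is left over to be called "other formats".
        return _join(lead) + " formats"
    return _join(lead + ["other formats"])


if __name__ == "__main__":
    print(formats_phrase(["ASCII", "iso-8859-1", "HTML", "Mobipocket"]))
```

An editor who prefers to list every format would not need any of this; joining the full list of available formats is enough.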
Yes, I understood your intent. I don't intend ever to use "and other formats". I intend always to list all available formats, including plucker and including ascii. That way I don't have to guess what formats a user might find interesting, or whether a user knows that PG pretty much always includes an ASCII version, and very often a plucker version. I urge others to do the same: always list all formats. If you aren't persuaded, I'm not going to come along behind and change your records to my preferred method, but I will hope to persuade you to change your mind on this point in time. I am not asserting that my method has a clear consensus here behind it -- most of the regular editors have not done much with or about PG editions. This is simply my view, to which no one previously had offered explicit objections, although one editor had suggested creating a separate publication record for each format -- although I don't think anyone ever actually did that. I might add that I often do want the ascii format from PG, even when there is a "higher" format available, particularly if I want a searchable version. -DES Talk 19:39, 31 August 2008 (UTC)
<sigh> I somehow missed this discussion when it was new. I'd been listing all formats, except Plucker (on the ground that it's generated on the fly, not a stable version), but had just recently started saying only "available in various formats". Dave's convinced me, though, so I'll start listing them all. I won't try to clean up my older stuff unless I'm looking at it for another reason, though. -- Dave (davecat) 16:40, 11 September 2008 (UTC)
As to linking Project Gutenberg.... The statement "is available at Project Gutenberg" is a correct statement, and making "Project Gutenberg" clickable in that statement is just good practice when publishing an HTML document. It is the first mention of an internet resource in the text (defining the text to be the note). Kevin 19:03, 31 August 2008 (UTC)
I think making the first mention of something in an html document clickable by default leads to a lot of over-linking. In this particular case the link is close to redundant, since we know that a more useful link to the same domain, (a page one click away from the page for the link suggested) is going to be provided a few words later. I won't object to such a link if you or others put one in, and I won't strongly object if it is mentioned as an alternative form on the publisher page. I don't plan to use that form myself. -DES Talk 19:39, 31 August 2008 (UTC)
Interestingly ebook #10339 has no ascii version, only HTML and Plucker. This is admittedly rare, but I think it is a further argument for listing all formats, not just the "most important" ones. -DES Talk 21:54, 10 September 2008 (UTC)

Date Field

I updated (read: added) instructions and examples of what to do on the date field for Project Gutenberg editions. Comments and revisions are welcome. Kevin 19:04, 7 September 2008 (UTC)

I'm not sure what you put in (using original release date) is the best way to go, though I can see some reason for it. My main reservation is that, in the example you cited, anyone with a copy of the work itself-- which is our normal basis for entering & verifying data--can't tell what the correct date is. The work itself only says "June 23, 2008". Is this really different from the kind of thing where we show separate editions/printings all the time? Especially when the book itself says "Updated editions will replace the previous one--the old editions will be renamed.", & under the files directory there's a directory called "old" which has what I believe are the previous versions - so those versions still exist & can be examined?
Moreover, in other data items (though maybe not this one), the catalog data found on a page such as http://www.gutenberg.org/ebooks/62 sometimes is wrong, not to say bizarre. I recently cataloged one work, from 1950s or early 1960s Astounding or Analog, in which the illustrator was (correctly) listed as "Douglas" - but the catalog page showed it as Frederick Douglass, the 19th-century abolitionist. (I think this has since been fixed, though.) -- Dave (davecat) 22:56, 7 September 2008 (UTC)
What I am saying is that when someone puts a Gutenberg book into the ISFDB, they are cataloging the 'EBook #nnnnn' and the information about that ebook number. That #Number only gets published once, and will never get reused for another different bibliographical 'work' or 'title' or 'author' (the three primary things we document). Just as we require an ISBN or an ISSN for 'for profit' released electronic publishing in order for it to be in... we are in a sense canonizing the ebook #Number for this same purpose for Gutenberg's not-for-profit publishing, which does not spend money on ISBNs or ISSNs. Kevin 23:48, 7 September 2008 (UTC)
If Gutenberg issues a /new/ version with a different bibliographical history, then they issue a new #Number. A different bibliographical history usually means that it was based on a different source, and hence we should catalog it separately (which we allow due to the different #Number). Kevin 23:48, 7 September 2008 (UTC)
As to the Catalog being wrong... this happens (more frequently than I would desire/hope). You can submit a correction by emailing cataglog@pglaf.org; I put the ebook number in the subject and my references (often the ISFDB) in a list in the body (I also try to describe the error in a single sentence right at the top). Most items get fixed in about 48 hours. I was primarily trying to put our house in order with this; the fact that the catalogers there are just as fallible (and volunteers) as we are is just a fact of life. Kevin 23:48, 7 September 2008 (UTC)
My point was that the catalog is much more subject to error & change than the text, & I cited a case of an extreme example (which, on digging it up & checking, has indeed been fixed.) It's also true that in quite a number of cases, in quite significant ways, the catalog & the book disagree; exact form of the author's name is definitely one of these, & for those I'd argue that we should follow the text. (And even "the text" is ambiguous here. In many cases PG has put a wrapper on a book saying, for example, that it is by Henry Beam Piper, while (correctly) transcribing the author's name on the title page as H. Beam Piper.)
If the date in the catalog is consistently the original release date, that's a useful piece of information, & it's also one I'd judge much less prone to error. Granting all that, I'd still say we should be looking at the ebook itself, not the catalog entry. -- Dave (davecat) 16:11, 8 September 2008 (UTC)
As to 'is this different from separate printings' which we do catalog... Yes it is, on two fronts. We do not (currently) catalog multiple formats as separate entries for electronic works (Baen produces in 6-7 formats, and Gutenberg in 1-5 formats per title). Separate formats as entries would be a much better use of our time than "The June 23 edition fixed a typo on line 1038 from 'at' to 'as'", as nothing else will or even can change bibliographically. Secondly, we catalog different printings in order to build a complete catalog of pricing, covers, publishers, and catalog numbers. In the case of Gutenberg editions, there already exists a complete catalog of the various 'printings' at Gutenberg, and this effort would be duplicative and wasteful. Kevin 23:48, 7 September 2008 (UTC)
In the case of a Gutenberg edition and what is in hand.... 'EVERYONE' has the same copy in hand.... the copy at #nnnnn. If you can't get to Gutenberg, you can't get to the ISFDB, so you don't have either. In the case of Gutenberg, I feel we are cataloging not what I downloaded, but what is available for users of the ISFDB to 'go' download. If someone already has a copy of the ebook downloaded and they want to see if it's been updated, they should go to Gutenberg... not to us. Thanks for raising all these points. The more holes we try to poke in this, the fewer holes we will find later! Kevin 23:48, 7 September 2008 (UTC)
Um. It's not the case that everyone has the same copy in hand. I personally almost never read an ebook on line; I download it & read it from my own copy at my own leisure. I'm sure that there are plenty of others who do the same, though we probably are a relatively small minority. (I eventually get around to burning collections of them to CDs for my later use.) -- Dave (davecat) 16:11, 8 September 2008 (UTC)
It used to be the case that PG would, occasionally, issue a "revised" version with the same ebook number, mostly to correct errors discovered in the transcription. I don't know if they still do that. In such a case, it is arguable that the later date should be considered a different publication. Also, I think that there have been cases where an older ebook was revisited for the purpose of adding additional formats, particularly HTML, and in such cases the same ebook number was retained. In such a case, the revised date could be considered to be the equivalent of a second printing, where no text had changed. But all of these cases are rare. Still, part of the reason we consider PG works catalogable at all is that they are downloadable, and that they are not simply available from the website, but copies may be stored by many people, just as many people may have copies of paper-based books. In a sense I think that I am cataloging, and still more verifying, the ebook as it was when I accessed/downloaded it. I am certainly not cataloging any revisions that might be done in future. If I verified a PG ebook, and it was later revised in a non-trivial way, I think I would probably record it as a new entry, just as if there was a new printing. So, although it was not my previous practice, I am more and more inclined to suggest that the latest revised date ought to be the date cataloged, or else entries should be made for each date. This would be particularly true if revisions had been made to all formats (to correct a textual error, for example). -DES Talk 00:36, 8 September 2008 (UTC)
Are you seriously proposing that we waste our time cataloging versions 10, 11, 12, and 13 (See [1]) of A Princess of Mars, and that this level of detail is a more useful dedication of time and resources than listing as separate publications the HTML and the txt versions of this work that are currently available in the catalog under ebook #62? That documenting, as a printing, an older version of the text that has been deemed to be in error (and not a Big Brother 'in error' kind of revision) is more correct than entering Baen's ebooks as separate publications in .Mobi, HTML, Sony reader, RTF, etc.? Format changes actually impact the way a user will interact with them and are more worthy of our time. Text corrections are of almost zero interest to the nth degree to any bibliographer, except where they are first edition points (which only matter because of the collectability of print editions), or 'evidence' of Big Brother type revisions (at which point we should document them - don't get me wrong). Kevin 01:15, 8 September 2008 (UTC)
I doubt that I'll ever go out of my way to track down & catalog older versions of PG ebooks. But if I download a PG ebook, go to check whether it's been cataloged, & see that I have a later version, I'd be inclined to catalog it as such. In fact, I think I did so once, only noticing after the fact that there was an earlier PG version listed. (I missed it in the long list of pubs, as there were probably five or six other things between the dates in question.) -- Dave (davecat) 16:11, 8 September 2008 (UTC)
In the real world... when someone buys or finds a book of SpecFic, they might come here to learn more about it. In the real world, if someone finds an old ebook on their hard drive from Gutenberg... they are going to go to Gutenberg to learn more about it. In what world would anyone who has the Gutenberg ebook #62, A Princess of Mars, even /care/ that there were prior /incorrect/ versions of the text, and then /not/ know to go to Gutenberg to look at them, and instead come to the ISFDB for this information? My argument is essentially that the information you just proposed cataloging does not serve any community in existence, and will not serve any community in the future. Hence if it serves no purpose, it's not worth our effort to document. Kevin 01:15, 8 September 2008 (UTC)
You have just convinced me to use the revised date in all cases. A) We shouldn't assume what people do and don't know; the ISFDB should be self-sufficient. B) Why should the ordinary person care what the date of any PG edition is? Anyone who does care about the date at all may well care about the date that a specific text was issued. No one is required to enter multiple printings; anyone who doesn't choose to do so is free not to. But if a single entry is made, then I would think that the latest revised date would be better than the first issued date. But let's hear what others think about the matter. -DES Talk 01:27, 8 September 2008 (UTC)
I misunderstood your position. You want to use only the latest date.... but there's a problem with your statement of using the latest date. Does that mean previous 'revisions' are not suitable for cataloging and that after a revision your previously verified entry should be deleted from the database? Or does it mean you consider all revisions suitable for cataloging? I think you have to choose... all dates, or first date. The latest date can't be the only allowable choice, can it? (Honestly perplexed here) Kevin 02:49, 8 September 2008 (UTC)
No, what I am saying is that IF an editor chooses to catalog only a single publication they should use the "then" latest date. A subsequent revision can either be ignored or else cataloged as a separate printing. Similarly, when the publication is first cataloged, the editor may choose to either ignore previous revisions or catalog them as separate printings. In the past I had sometimes ignored later revisions and used the earliest date. I won't do that any more. -DES Talk 03:00, 8 September 2008 (UTC)
I think I'm in agreement with Dave (DES) here, if I understand him correctly. No one's under any obligation to seek out & catalog other editions/printings (whatever we want to call them). Similarly, if I have a PG ebook, & see that someone's already cataloged an earlier (or later) version, I can decide not to bother entering the one I've got. But I may do so (either of those things) if I choose. -- Dave (davecat) 16:11, 8 September 2008 (UTC)
If "it was later revised in a non-trivial way" it would be given a new ebook #. That's the point of the ebook #'s. The only revisions are going to be trivial. Kevin 01:15, 8 September 2008 (UTC)
Point noted, but still. -DES Talk 01:27, 8 September 2008 (UTC)
I'm not even sure I agree on that point. I think pretty clearly Kevin's idea of what is a "trivial" revision is somewhat different from mine. Certainly, if PG goes & transcribes a different source edition, they give it a new number. And those may have very extensive differences in text or especially illustrations. (Consider Piper's Ullr Uprising vs. Uller Uprising.) But if they change a few words in the underlying text (as opposed to in PG's legal text), I view that as non-trivial. Clearly Kevin doesn't, though. -- Dave (davecat) 16:11, 8 September 2008 (UTC)
The only thing I care about is that they are available in Project Gutenberg editions, which is why I have continued to maintain my tagging system. Since the changes involved do not represent authorial changes, it is only important that the ebook editions be linked with the original source material, with the data from the original source taking precedence. The transient nature of e-editions was the strongest argument against including Project Gutenberg entries in the first place. Creating complicated data entry rules for them will only discourage editors from entering the data and could result in future resistance to emerging formats.--swfritter 18:34, 8 September 2008 (UTC)
I am, I think, arguing that on the date field we treat PG editions just like any book, which may or may not have multiple printings, and we may or may not record all such printings. -DES Talk 20:02, 8 September 2008 (UTC)
Dave - It's not that 'they change a few words', it's that they 'correct a few words' to match the print edition. If the Author shows up and edits the work, that's a revision. If the work is revised to 'correct factual errors', that is a revision. If Chapter 3 is placed correctly between chapters 2 and 4 (instead of between chapters 29 and 30), that is a revision. If someone replaces 7 commas with periods to match the original published text, that is trivial. If the word 'be' is replaced once with 'he' where the print edition has 'he', that is trivial. Kevin 03:05, 9 September 2008 (UTC)

UNINDENT I have revised the date field instructions to remove references to 'Original Release Date'. They now say only that you should use the ebook date (from the details page or from the ebook) and not the date of original publication (1912 in the example given). The result of this is that all Project Gutenberg dates should be from 1991 forward. (The first Speculative Fiction release by PG was ebook #11, Alice's Adventures in Wonderland by Lewis Carroll, initially released in Jan 1991, most recently re-released (and updated) June 27 2008.) The fact that this ebook is listed with a new release date /proves/ that the release dates were not as canonical as I thought. That was the final tipping point, but all of your arguments helped sway me. Thanks for the discussion! Kevin 03:27, 9 September 2008 (UTC)

(Am I also correct in thinking that we have agreement that no Gutenberg work should be listed under the date of the print publication it was based on? Or do we need to beat that one into the ground? I don't recall anyone voicing an opinion on that aspect, so I'm thinking we are good there, but realized I should ask.) Kevin 03:27, 9 September 2008 (UTC)

Sounds fine to me, and your revision looks OK at first glance. I don't think anyone has ever advocated listing PG pubs under the original pub dates of the titles -- we don't do that for any other publications, after all. We either use the actual publication date or 0000-00-00 if it is unknown. -DES Talk 03:53, 9 September 2008 (UTC)
Actually, what prompted me to add this section was a Gutenberg ebook put in by a Mod with a publication date of 1896. That's what made me realize we needed the entry date defined somewhere. Kevin 01:30, 10 September 2008 (UTC)

Title and Author fields

I added a section on the Title and Author fields, noting our general rule that we follow the title page, not any other source, when there is a conflict and the title page is available. The PG metadata page, IMO, is more or less equivalent to a DJ or spine for this purpose. I hope no one disagrees with this aspect. -DES Talk 20:12, 8 September 2008 (UTC)

I concur. I too have been indexing based on the Author name and title as listed in the ebook, and not the canonical author in the details, nor the details page title. I amended the text for clarity, and also to mention scanned title pages. Feel free to re-amend as needed, I just tried to make it clearer. We seem to be in agreement on this issue. Kevin 03:41, 9 September 2008 (UTC)
I agree, too, though:
  1. my metaphor for the metadata page would be a catalog card in a library (which some of you may be too young to remember) - that is, something actually separate from the book but intended to describe it, &
  2. I did add a couple of sentences to Kevin's section on Places for Possibly Wrong Information, intended to encourage pub notes when PG's info does not match the info we actually use. (I've been including quite a few such notes, but I wouldn't (say) reject a submission because an editor didn't include one.) If this is controversial, pull it out & we can discuss it. -- Dave (davecat) 18:35, 9 September 2008 (UTC)
Looks good to me. Neither metaphor is quite perfect, but both indicate that the metadata page is not the primary source. -DES Talk 00:33, 10 September 2008 (UTC)
I remember card catalogs. I remember wanting one at home. (Then again, I also remember being a geek before we were cool - Go Figure). Kevin 01:29, 10 September 2008 (UTC)
February 28 1997: Last day libraries could order catalogue cards from the Library of Congress. Dana Carson 21:33, 10 September 2008 (UTC)

Definition of Speculative Fiction - Is it non-rational?

I was working on ebook #64, and had just (above) been working with ebook #62. I wondered what #63 was and whether it should be in the ISFDB... So I looked. Does 'Not Rational' equal SF? If so... should we index pi #50, e #63, or a revised director's cut of e to a million digits, #127? Sorry, I couldn't resist the very bad pun. Kevin 04:06, 9 September 2008 (UTC)

Gutenberg Audio

As if you weren't busy enough. Guess these are no different than any other audiobooks.--swfritter 16:53, 10 September 2008 (UTC)

I don't see why they should be. List formats as usual. -DES Talk 18:59, 10 September 2008 (UTC)

Bibliographic listing via pub

Very doable but, as I suspected, with extra processing for variant titles/pseudonyms. I am trying to set a daily ISFDB time limit (clock on the wall says I am close) so it may take a while to get the PG pubs in. Meanwhile, I have a list of the PG titles that are in the system as pubs but are missing title level PG tags; left joins are so much fun. The tags are valuable whether or not the titles are listed in pubs with PG as the publisher. Once I get those in, the tag-based list, which was easy to implement, will be more complete; it will eventually be replaced.--swfritter 14:31, 21 September 2010 (UTC)
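
For anyone curious, the kind of left join mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the table and column names (pubs, titles, pub_content, title_tags) are a simplified, hypothetical schema and not the actual ISFDB database layout, and the sample rows are made up for the demo.

```python
import sqlite3


def titles_missing_tag(conn, publisher, tag):
    """Return (title_id, title) rows for titles that appear in a pub from
    `publisher` but carry no `tag` at the title level.

    The LEFT JOIN keeps every title, tagged or not; the IS NULL filter then
    keeps only the titles for which no matching tag row was found.
    """
    sql = """
        SELECT DISTINCT t.title_id, t.title
        FROM pubs p
        JOIN pub_content pc ON pc.pub_id = p.pub_id
        JOIN titles t ON t.title_id = pc.title_id
        LEFT JOIN title_tags tt
               ON tt.title_id = t.title_id AND tt.tag = ?
        WHERE p.publisher = ? AND tt.title_id IS NULL
        ORDER BY t.title_id
    """
    return conn.execute(sql, (tag, publisher)).fetchall()


def demo_connection():
    """Build an in-memory database using the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pubs (pub_id INTEGER PRIMARY KEY, publisher TEXT);
        CREATE TABLE titles (title_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE pub_content (pub_id INTEGER, title_id INTEGER);
        CREATE TABLE title_tags (title_id INTEGER, tag TEXT);
        INSERT INTO pubs VALUES
            (1, 'Project Gutenberg'), (2, 'Project Gutenberg'), (3, 'Ace Books');
        INSERT INTO titles VALUES
            (10, 'The Bramble Bush'),
            (11, 'A Spaceship Named McGuire'),
            (12, 'Ullr Uprising');
        INSERT INTO pub_content VALUES (1, 10), (2, 11), (3, 12);
        INSERT INTO title_tags VALUES (10, 'Project Gutenberg');
    """)
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    for title_id, title in titles_missing_tag(
            conn, "Project Gutenberg", "Project Gutenberg"):
        print(title_id, title)
```

Putting the tag condition in the ON clause rather than the WHERE clause is what makes this work: a title that has other tags but not the one being checked still comes back as untagged.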

Use "Project Gutenberg" as tag

The three character PG tags served their purpose of narrowing the listings but will become obsolete if we have PG biblio pages. I would suggest that "Project Gutenberg" be the tag used to designate PG titles. The vast majority were entered by myself but I am not sure if I should replace them or keep them and add a separate "Project Gutenberg" tag. Of course, I have no access to tags entered by others and could only add the full tag in those cases.--swfritter 14:37, 21 September 2010 (UTC)

For items where we have an indexed publication listing "Project Gutenberg" as the publisher, what is the purpose served by having a single tag? How would it be used? I don't object to it, but I don't see what value it adds, since we can do a publisher search, or an advanced search by publisher & author. Perhaps I'm missing something.
It does occur to me that the system where each individual's tags are private, in the sense that no one else can edit them, is perhaps non-optimal. -DES Talk 22:28, 21 September 2010 (UTC)
Al originally borrowed this implementation from Amazon, Goodreads and other social cataloging sites which let users create their own tags. They tend to work best when a lot of people tag the same book so that you end up with "crowd-sourcing". It's not as useful when you have just one or two tags per book. Typically, the upside is that user-specific tagging is instantaneous, otherwise they would require moderator approval. The downside is that some bad, irrelevant or confusing data can get in easily. Ahasuerus 04:35, 22 September 2010 (UTC)
Since tags show up in the title listing they are a good form of documentation and, in this case, actually work better when there are fewer tags. The user will immediately know that there is a Project Gutenberg version of the story. My intent is to mark all stories that have Project Gutenberg pub entries. If a user regularly accesses PG pubs through our system they will know there is a PG pub they can go to in order to pick up the link. It is also a way for editors to immediately know that the data has been entered at the pub level. An alternative method would be to place an "Online and downloadable versions available at Project Gutenberg." notation as the first entry in the title notes field, which would serve my purposes just as well; I basically need some consistent manner of determining which titles have been processed.--swfritter 13:11, 22 September 2010 (UTC)
I think it will serve my purposes to use the "Project Gutenberg" tag to indicate titles that are contained in PG pubs; tags are not a reliable form of data entry so this will be only a temporary solution. In the interests of getting the primary data into the system the pub entries will have only required data. Thanks for the input.--swfritter 12:56, 24 September 2010 (UTC)

Listing formats available at PG

The publisher page states: "Project Gutenberg etexts are always made available in a pure ASCII format. Frequently other formats, such as HTML, Plucker, and the like are also available for a given text. Please include an entry in the notes field documenting the formats available for a given etext. For example: 'This ebook is available in ASCII and HTML formats'." The formats available at PG are under constant revision. It would seem that a more generic statement would be optimal. Perhaps "Online and downloadable versions available at Project Gutenberg"?--swfritter 13:36, 22 September 2010 (UTC)

I strongly disagree. In my experience, the list of formats that a given work is available in rarely changes, and when it does, formats are added but not withdrawn. The standard formats for new works do change, although which are implemented depends on the "producer" who handles a particular book. I could see adding "Other formats may be made available in future." or some similar text to the standard message on the publisher page, but I think the specific listing of formats is valuable and should be retained. ---DES Talk 14:07, 22 September 2010 (UTC)
The Project Gutenberg site has gone through a major upgrade lately with standardized title entries and formats. Epub and Kindle appear now to be the major downloadable formats and I don't think any of our note entries list them.--swfritter 12:52, 24 September 2010 (UTC)
I have listed EPUB on many many entries -- it has been common on newer titles for at least two years, and I have always listed it. My more recent PG submissions have all included Kindle. A Google search on EPUB+Gutenberg returns 45 results. I still regard HTML as the primary downloadable format and it is what I always download. -DES Talk 13:08, 24 September 2010 (UTC)

Link entry in Publisher page

Since users can now link to PG pages from PG pub entries the Link entry seems to be obsolete.--swfritter 13:44, 22 September 2010 (UTC)

It may be, although I think that it is not as clear as it might be that the entry in the nav bar links to the specific work as opposed to the PG site in general -- indeed there seems to be no indication on screen that the specific work is the target. ---DES Talk 14:09, 22 September 2010 (UTC)
Yes, it is non-intuitive. It took me a while to figure out that I could use the link button to access specific titles at Amazon, etc.--swfritter 15:05, 22 September 2010 (UTC)
And on a PG pub none of the other "stores" appear (because there is no ISBN), so there is less of a clue, IMO. -DES Talk 15:09, 22 September 2010 (UTC)
It was non-intuitive enough for me that even though I had seen mention elsewhere of this feature... I didn't figure out where the link was until reading the comment above about other store links not being available. And as fresh eyes, it looks to me like a link to just PG in general. Perhaps just mark it as extra optional at this point? Kevin 23:00, 22 September 2010 (UTC)
The question is, is that programmed-in link sufficient that the Publisher page should no longer recommend a link in the notes field? -DES Talk 23:25, 22 September 2010 (UTC)